From sunny.chan at clsa.com Fri Jun 1 04:41:41 2018
From: sunny.chan at clsa.com (Sunny Chan, CLSA)
Date: Fri, 1 Jun 2018 04:41:41 +0000
Subject: Question regarding JEP 8204089 Timely Reducing Unused Committed Memory
Message-ID: <6f9228a19d3c4deca170b2032b887dde@clsa.com>

(resending for hotspot-gc-dev)

Hello,

I have a number of questions about the proposed changes for the JEP, and I would like to make the following suggestions and comments:

1) Are we planning to make this change the default behavior for G1? Or is this an optional switch for people who run in a container environment?

2) In trading systems, there are sometimes periods of quiescence and then suddenly a burst of activity towards market close or a market event. The loadavg you suggest using to "monitor" the activity in the system only reports up to 14 minutes of history, and it might not necessarily be a good measure for this type of application, especially since it would trigger a Full GC.

3) You haven't filled in the details for "The GCFrequency value is ignored and therefore, i.e., no full collection is triggered, if:"

4) If we are triggering a full GC with this, we should make sure the GC reason is populated and logged properly in the GC log so we can track it down.

5) I have not heard of J9 Gencon/Shenandoah providing similar functionality. Can you point me to further documentation on which feature you model upon?

Thanks.

Sunny Chan
Senior Lead Engineer, Executive Services
D +852 2600 8907 | M +852 6386 1835 | T +852 2600 8888
5/F, One Island East, 18 Westlands Road, Island East, Hong Kong
clsa.com
Insights. Liquidity. Capital.
A CITIC Securities Company

The content of this communication is intended for the recipient and is subject to CLSA Legal and Regulatory Notices.
These can be viewed at https://www.clsa.com/disclaimer.html or sent to you upon request. Please consider before printing. CLSA is ISO14001 certified and committed to reducing its impact on the environment.

From kirk.pepperdine at gmail.com Fri Jun 1 05:21:56 2018
From: kirk.pepperdine at gmail.com (Kirk Pepperdine)
Date: Fri, 1 Jun 2018 08:21:56 +0300
Subject: Question regarding JEP 8204089 Timely Reducing Unused Committed Memory
In-Reply-To: <6f9228a19d3c4deca170b2032b887dde@clsa.com>
References: <6f9228a19d3c4deca170b2032b887dde@clsa.com>
Message-ID: <597AC801-7DD0-4E3E-BDBE-8A4FA9860501@gmail.com>

Hi,

I've tuned dozens of applications where I wished that G1 returned to a minimum memory configuration when it could, so I strongly support the notion of returning uncommitted memory to the OS. This email flows into my previous arguments about speculatively triggered collections. In my experience, speculatively triggered GC, such as the one proposed in this JEP, has very, very rarely worked out to be a good thing. The trigger in every case has been based on some assumption that is fundamentally flawed.
I would point to the two main culprits: speculative calls to System.gc() or Runtime.gc(), and DGC (RMI) resetting the guaranteed GC interval value (of max long ms by default). These have a high probability of triggering unnecessary full collections that often create havoc in the runtime. The assumption in this case is that the application is quiet, but as we can see in this email, and as I've witnessed in many other applications, runtimes are rarely quiet, and the machine can decide to do something at the most inappropriate time.

Additionally, there is a suspicion here, especially after discussions that included Gil (Tene), that the benchmark used to "validate" this proposed implementation may have issues. It would be useful if that benchmark could be released so that we can actually examine it to determine what, if any, issues may exist. I'm very happy to help vet this benchmark.

In an era when the GC engineers are working so very hard to make as much of the collection process as concurrent as possible, it just feels very wrong to rely on a huge STW event to get something done. In that spirit, it behoves us to explore how committed memory may be released at the tail end of a normally triggered GC cycle. I believe the end of a mixed cycle was mentioned. I believe this would give those who want or need to minimize their JVM's footprint an even greater cost advantage than attempting to reduce the footprint at some (un?)random point in time.

Kind regards,
Kirk

> On Jun 1, 2018, at 7:41 AM, Sunny Chan, CLSA wrote:
>
> (resending for hotspot-gc-dev)
>
> Hello,
>
> I have a number of questions about the proposed changes for the JEP and I would like to make the following suggestions and comments
>
> 1) Are we planning to make this change the default behavior for G1? Or is this an optional switch for people who run in a container environment?
> 2) In trading systems, there are sometimes periods of quiescence
> and then suddenly a burst of activity towards market close or a market event.
> The loadavg you suggest using to "monitor" the activity in the system only
> reports up to 14 minutes and it might not necessarily be a good measure for this
> type of application, especially since it would trigger a Full GC
> 3) You haven't filled in the details for "The GCFrequency value is
> ignored and therefore, i.e., no full collection is triggered, if:"
> 4) If we are triggering a full GC with this we should make sure the GC reason
> is populated and logged properly in the GC log so we can track it down.
> 5) I have not heard of J9 Gencon/Shenandoah providing similar functionality.
> Can you point me to further documentation on which feature you model upon?
>
> Thanks.
>
> Sunny Chan
> Senior Lead Engineer, Executive Services

From jini.george at oracle.com Fri Jun 1 08:13:39 2018
From: jini.george at oracle.com (Jini George)
Date: Fri, 1 Jun 2018 13:43:39 +0530
Subject: RFR (round 4), JEP-318: Epsilon GC
In-Reply-To: <094e1093-5a13-4853-aa34-d4e987a069b0@redhat.com>
References: <094e1093-5a13-4853-aa34-d4e987a069b0@redhat.com>
Message-ID: <0b5c827a-2763-98aa-06af-24df9028aed7@oracle.com>

Thank you very much, Aleksey, for making the SA changes. Some comments on those.
==> share/classes/sun/jvm/hotspot/oops/ObjectHeap.java

444   liveRegions.add(eh.space());

We would need to add an object of type 'Address' to the liveRegions list, instead of type VirtualSpace. Not doing so results in exceptions of the following form from the compare() method for various clhsdb commands like 'jhisto':

Exception in thread "main" java.lang.ClassCastException: jdk.hotspot.agent/sun.jvm.hotspot.memory.VirtualSpace cannot be cast to jdk.hotspot.agent/sun.jvm.hotspot.debugger.Address

==> share/classes/sun/jvm/hotspot/oops/ObjectHeap.java

445   } else {
446     if (Assert.ASSERTS_ENABLED) {
447       Assert.that(false, "Expecting GenCollectedHeap, G1CollectedHeap, " +
448                   "or ParallelScavengeHeap, but got " +
449                   heap.getClass().getName());
450     }
451   }

* Please add EpsilonGC also to the assertion statement.

==> share/classes/sun/jvm/hotspot/tools/HeapSummary.java

The run() method would need to handle Epsilon GC here to avoid the Unknown CollectedHeap type error with jhsdb jmap --heap.

==> share/classes/sun/jvm/hotspot/HSDB.java

In showThreadStackMemory(), we have:

1101       }
1102     } else if (collHeap instanceof ParallelScavengeHeap) {
1103       ParallelScavengeHeap heap = (ParallelScavengeHeap) collHeap;
1104       if (heap.youngGen().isIn(handle)) {
1105         anno = "PSYoungGen ";
1106         bad = false;
1107       } else if (heap.oldGen().isIn(handle)) {
1108         anno = "PSOldGen ";
1109         bad = false;
1110       }
1111     } else {
1112       // Optimistically assume the oop isn't bad
1113       anno = "[Unknown generation] ";
1114       bad = false;
1115     }
1116

We would need to add the case of collHeap being an instance of EpsilonHeap too. Otherwise it would display "Unknown generation" while viewing the stack memory for the Java threads.

==> It would be great if test/hotspot/jtreg/serviceability/sa/TestUniverse.java were enhanced to add a minimalistic test for EpsilonGC.

Thank you,
Jini.

On 5/31/2018 11:42 PM, Aleksey Shipilev wrote:
> Hi,
>
> This is the fourth (and hopefully final) round of code review for Epsilon GC changes.
> It includes the fixes done as a result of the third round of reviews, all of them inside gc/epsilon or epsilon tests.
>
> Webrev:
> http://cr.openjdk.java.net/~shade/epsilon/webrev.08/
>
> If we are good, I am going to push this with the following changeset metadata:
>
> 8204180: Implementation: JEP 318: Epsilon A No-Op Garbage Collector
> Summary: Introduce Epsilon GC
> Reviewed-by: rkennke, ihse, pliden, eosterlund, lmesnik
>
> Builds:
>   server X {x86_64, x86_32, aarch64, arm32, ppc64le, s390x}
>   minimal X {x86, x86_64}
>   zero X {x86_64}
>
> Testing: gc/epsilon on x86_64
>
> Thanks,
> -Aleksey

From rkennke at redhat.com Fri Jun 1 09:54:38 2018
From: rkennke at redhat.com (Roman Kennke)
Date: Fri, 1 Jun 2018 11:54:38 +0200
Subject: RFR (round 4), JEP-318: Epsilon GC
In-Reply-To: <094e1093-5a13-4853-aa34-d4e987a069b0@redhat.com>
References: <094e1093-5a13-4853-aa34-d4e987a069b0@redhat.com>
Message-ID: <286fbcd2-cc38-7b99-ba44-878a3373a613@redhat.com>

Hi Aleksey,

Very nice! The changes look good to me! I haven't looked at the SA stuff, though.

Thanks!
Roman

> This is the fourth (and hopefully final) round of code review for Epsilon GC changes. [...]
> Thanks,
> -Aleksey
From erik.osterlund at oracle.com Fri Jun 1 11:54:27 2018
From: erik.osterlund at oracle.com (Erik Österlund)
Date: Fri, 1 Jun 2018 13:54:27 +0200
Subject: RFR: JDK-8198285: More consistent Access API for arraycopy
In-Reply-To: <3bb2415e-4d15-fbd3-dde2-73a25c7bd65b@redhat.com>
References: <5AE09F74.7050005@oracle.com> <5AE1F062.3020303@oracle.com> <3bb2415e-4d15-fbd3-dde2-73a25c7bd65b@redhat.com>
Message-ID: <5B1133F3.4040704@oracle.com>

Hi Roman,

Thanks for doing this. A few comments...

This:

  ArrayAccess<>::arraycopy_from_native(...

...won't reliably compile without saying "template", like this:

  ArrayAccess<>::template arraycopy_from_native(...

...because the template member is qualified on another template class. My suggestion is to remove the explicit template argument and let it be inferred from the address instead.

And things like this:

  bool RuntimeDispatch::arraycopy_init(arrayOop src_obj, size_t src_offset_in_bytes, const T* src_raw, arrayOop dst_obj, size_t dst_offset_in_bytes, T* dst_raw, size_t length) {

...as a single line, could do with some newlines to keep it reasonably wide.

Now that we copy from heap to heap, from heap to native, and from native to heap, we have to be a bit more careful in the modref oop_arraycopy_in_heap. In the covariant case of arraycopy, you unconditionally apply the pre and post barriers on the destination. However, if the destination is native (ArrayAccess::arraycopy_to_native), that does not feel like it will go particularly well, unless I have missed something.

Also, your new ArrayAccess class inherits from HeapAccess, which is nice. But its member functions call HeapAccess::arraycopy, which will miss out on the IN_HEAP_ARRAY decorator, which might lead to not using precise marking in barrier backends.
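The `template` disambiguation issue Erik raises above can be reproduced with a minimal, self-contained sketch. Everything below is a stand-in for illustration, not the actual HotSpot Access API: inside another template, a qualified call to a member template with explicit template arguments must be spelled with the `template` keyword.

```cpp
#include <cassert>
#include <cstddef>

// Stand-in for a templatized access class (hypothetical, not HotSpot code).
template <int Decorators = 0>
struct ArrayAccess {
  template <typename T>
  static size_t arraycopy_from_native(const T* src, T* dst, size_t len) {
    for (size_t i = 0; i < len; i++) dst[i] = src[i];
    return len;
  }
};

// Inside a template, ArrayAccess<D> is a dependent name, so an explicit
// template-argument call must use the "template" keyword:
template <int D>
size_t copy_ints(const int* src, int* dst, size_t len) {
  return ArrayAccess<D>::template arraycopy_from_native<int>(src, dst, len);
  // Writing ArrayAccess<D>::arraycopy_from_native<int>(...) here is
  // ill-formed: the '<' would parse as a less-than operator.
}
```

As Erik suggests, letting the element type be deduced from the pointer arguments (dropping the explicit `<int>`) sidesteps the need for the disambiguator at call sites.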
If I were you, I would probably typedef a BaseAccess for HeapAccess inside of ArrayAccess, and call that in the member functions.

Thanks,
/Erik

On 2018-06-01 00:25, Roman Kennke wrote:
> Hi Erik,
>
> It took me a while to get back to this.
>
> I wrote a little wrapper around the extended arraycopy API that allows
> to basically write exactly what you suggested:
>
>> ArrayAccess::arraycopy_from_native(ptr, obj, index, length);
>
> I agree that this seems the cleanest solution.
>
> The backend still sees both obj+off OR raw-ptr, but I think this is ok.
>
> Differential:
> http://cr.openjdk.java.net/~rkennke/JDK-8198285/webrev.02.diff/
> Full:
> http://cr.openjdk.java.net/~rkennke/JDK-8198285/webrev.02/
>
> What do you think?
>
> Roman
>
>> Each required parameter is clear when you read the API.
>>
>> But I am open to suggestions of course.
>>
>> Thanks,
>> /Erik
>>
>> On 2018-04-25 17:59, Roman Kennke wrote:
>>>> 1) Resolve the addresses as we do today, but you think you get better
>>>> Shenandoah performance if we either
>>>> 2) Have 3 different access calls, (from heap to native, from native to
>>>> heap, from heap to heap) for copying, or
>>>> 3) Have 1 access call that can encode the above 3 variants, but looks
>>>> ugly at the call sites.
>>> There's also the idea to pass Address arguments to arraycopy, one for
>>> src, one for dst, and have 2 different subclasses: one for obj+offset
>>> (heap access) and one with a direct pointer (raw). Your comment gave me
>>> the idea to also provide arrayOop+idx. This would look clean on the
>>> caller side, I think.
>>>
>>> It would also be useful on the GC side: BarrierSets would specialize
>>> only in the variants that they are interested in, for example, in the case
>>> of Shenandoah:
>>> 1. arraycopy(HeapAddress,HeapAddress) Java->Java
>>> 2. arraycopy(HeapAddress,RawAddress) Java->native
>>> 3.
arraycopy(RawAddress,HeapAddress) native->Java
>>>
>>> other barriersets can ignore the exact type and only call the args'
>>> address->resolve() or so to get the actual raw address.
>>>
>>> This would *also* be beneficial for the other APIs: instead of having
>>> all the X() and X_at() variants, we can just use one X variant that
>>> either takes RawAddress or HeapAddress.
>>>
>>> I made a little (not yet compiling/working) prototype of this a while ago:
>>>
>>> http://cr.openjdk.java.net/~rkennke/JDK-8199801-2.patch
>>>
>>> What do you think? Would it make sense to go further down that road?
>>>
>>> Roman
>>>
>>>> You clearly went for 3, which leaves the call sites looking rather hard
>>>> to read. It is, for example, not obvious to me what is going on here
>>>> (javaClasses.cpp line 313):
>>>>
>>>> HeapAccess<>::arraycopy(NULL, 0, reinterpret_cast<const jbyte*>(utf8_str), value(h_obj()),
>>>>                         typeArrayOopDesc::element_offset(0), NULL, length);
>>>>
>>>> ...without looking very carefully at the long list of arguments encoding
>>>> what is actually going on (a copy from native to the heap). What is worse
>>>> is that this will probably not compile without adding the template
>>>> keyword to the call (since you have a qualified template member function
>>>> behind a template class), like this:
>>>>
>>>> HeapAccess<>::template arraycopy(NULL, 0, reinterpret_cast<const jbyte*>(utf8_str), value(h_obj()),
>>>>                                  typeArrayOopDesc::element_offset(0), NULL, length);
>>>>
>>>> ...which as a public API leaves me feeling a bit like this: :C
>>>>
>>>> May I suggest adding an array access helper. The idea is to keep a
>>>> single call through Access, and a single backend point for array copy,
>>>> but let the helper provide the three different types of copying as
>>>> different functions, so that the call sites still look pretty and easy
>>>> to read and follow.
>>>>
>>>> This helper could additionally have load_at and store_at use array
>>>> indices as opposed to offsets, and hence hide the offset calculations we
>>>> perform today (typically involving checking whether we are using compressed
>>>> oops or not).
>>>>
>>>> I am thinking something along the lines of
>>>> ArrayAccess<>::arraycopy_to_native(readable_arguments),
>>>> ArrayAccess<>::arraycopy_from_native(readable_arguments), and
>>>> ArrayAccess<>::arraycopy(readable_arguments), which translate to some
>>>> form of Access<>::arraycopy(unreadable_arguments). And for example
>>>> ArrayAccess<>::load_at(obj, index) would translate to some kind of
>>>> HeapAccess::load_at(obj, offset_for_index(index)) as a
>>>> bonus, making everyone using the API jump with happiness.
>>>>
>>>> What do you think about this idea? Good or bad? I guess the question is
>>>> whether this helper should be in access.hpp, or somewhere else (like in
>>>> arrayOop). Thoughts are welcome.
>>>>
>>>> Thanks,
>>>> /Erik
>>>>
>>>> On 2018-04-11 19:54, Roman Kennke wrote:
>>>>> Currently, the arraycopy API in access.hpp gets the src and dst oops,
>>>>> plus the src and dst addresses. In order to be most useful to garbage
>>>>> collectors, it should receive the src and dst oops together with the src
>>>>> and dst offsets instead, and let the Access API / GC calculate the src
>>>>> and dst addresses.
>>>>>
>>>>> For example, Shenandoah needs to resolve the src and dst objects for
>>>>> arraycopy, and then apply the corresponding offsets. With the current
>>>>> API (obj+ptr) it would calculate the ptr-diff from obj to ptr, then
>>>>> resolve obj, then re-add the calculated ptr-diff. This is fragile because
>>>>> we also may resolve obj in the runtime before calculating ptr (e.g. via
>>>>> arrayOop::base()). If we then pass in the original obj and a ptr
>>>>> calculated from another copy of the same obj, the above resolution logic
>>>>> would not work. This is currently the case for obj-arraycopy.
>>>>>
>>>>> I propose to change the API to accept obj+offset, in addition to ptr, for
>>>>> both src and dst. Only one or the other should be used. Heap accesses
>>>>> should use obj+offset and pass NULL for the raw ptr; off-heap accesses (or
>>>>> heap accesses that are already resolved... use with care) should pass
>>>>> NULL+0 for obj+offset and the raw ptr. Notice that this also allows the
>>>>> API to be used for Java<->native array bulk transfers.
>>>>>
>>>>> An alternative would be to break the API up into 4 variants:
>>>>>
>>>>> Java->Java transfer:
>>>>> arraycopy(oop src, size_t src_offs, oop dst, size_t dst_offs, size_t len)
>>>>>
>>>>> Java->Native transfer:
>>>>> arraycopy(oop src, size_t src_offs, D* raw_dst, size_t len)
>>>>>
>>>>> Native->Java transfer:
>>>>> arraycopy(S* src_raw, oop dst, size_t dst_offs, size_t len)
>>>>>
>>>>> 'Unsafe' transfer:
>>>>> arraycopy(S* src_raw, D* dst_raw, size_t len)
>>>>>
>>>>> But that seemed to be too much boilerplate copy+pasting for my taste.
>>>>> (See how having this overly complicated template layer hurts us?)
>>>>>
>>>>> Plus, I had a better idea: instead of accepting oop+offset OR T* for
>>>>> almost every Access API, we may want to abstract that and take an
>>>>> Address type argument, which would be either HeapAddress(obj, offset) or
>>>>> RawAddress(T* ptr). GCs may then just call addr->address() to get the
>>>>> actual address, or specialize for HeapAddress variants and resolve the
>>>>> objs and then resolve the address. This would also allow us to get rid
>>>>> of almost half of the API (all the *_at variants would go) and some
>>>>> other simplifications. However, this seemed to explode the scope of this
>>>>> RFE, and would be better handled in another RFE.
>>>>>
>>>>> This change makes both typeArrayKlass and objArrayKlass use the changed
>>>>> API, plus I identified all (hopefully) the places where we do bulk
>>>>> Java<->native array transfers and made them use the API too. Gets us rid
>>>>> of a bunch of memcpy calls :-)
>>>>>
>>>>> Please review the change:
>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8198285/webrev.00/
>>>>>
>>>>> Thanks, Roman

From zgu at redhat.com Fri Jun 1 12:10:43 2018
From: zgu at redhat.com (Zhengyu Gu)
Date: Fri, 1 Jun 2018 08:10:43 -0400
Subject: Question regarding JEP 8204089 Timely Reducing Unused Committed Memory
In-Reply-To: <6f9228a19d3c4deca170b2032b887dde@clsa.com>
References: <6f9228a19d3c4deca170b2032b887dde@clsa.com>
Message-ID: <4e8867c0-8e32-949e-e731-280d1f9cbe07@redhat.com>

Hi,

> 5) I have not heard of J9 Gencon/Shenandoah providing similar
> functionality. Can you point me to further documentation on which
> feature you model upon?

Shenandoah does provide similar functionality under experimental flags:

-XX:+ShenandoahUncommit
-XX:ShenandoahUncommitDelay

It does not need a full GC before uncommitting (it actually calls madvise(MADV_DONTNEED) on Linux). Also, it has a "compact" heuristic that targets low memory footprint by aggressively uncommitting heap.

Thanks,
-Zhengyu

> Thanks.
>
> Sunny Chan
> Senior Lead Engineer, Executive Services
From shade at redhat.com Fri Jun 1 12:41:41 2018
From: shade at redhat.com (Aleksey Shipilev)
Date: Fri, 1 Jun 2018 14:41:41 +0200
Subject: Question regarding JEP 8204089 Timely Reducing Unused Committed Memory
In-Reply-To: <4e8867c0-8e32-949e-e731-280d1f9cbe07@redhat.com>
References: <6f9228a19d3c4deca170b2032b887dde@clsa.com> <4e8867c0-8e32-949e-e731-280d1f9cbe07@redhat.com>
Message-ID: 

On 06/01/2018 02:10 PM, Zhengyu Gu wrote:
>> 5) I have not heard of J9 Gencon/Shenandoah providing similar functionality. Can you point me to
>> further documentation on which feature you model upon?
>
> Shenandoah does provide similar functionality under experimental flags:
>
> -XX:+ShenandoahUncommit
> -XX:ShenandoahUncommitDelay

Which are also enabled by default:

  bool  ShenandoahUncommit         = true    {experimental} {default}
  uintx ShenandoahUncommitDelay    = 300000  {experimental} {default}
  bool  ShenandoahUncommitWithIdle = false   {experimental} {default}

> It does not need full gc before uncommitting

To be obnoxiously precise, in Shenandoah the decision to uncommit memory is only very loosely tied to the GC cycle. It just uncommits the regions that have stayed empty for more than ShenandoahUncommitDelay milliseconds. This naturally captures the post-GC cleanup when GC has freed up enough memory, and the lag keeps free space committed for the cases where the free memory would be used by the active application itself.

As a bonus, Shenandoah uncommits all free memory on explicit GC (e.g. System.gc()), regardless of which mode (concurrent or full STW) the cycle was performed in.
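The delay-based policy described here can be modeled with a short sketch. This is an illustrative toy, not Shenandoah source code; the Region type and function name are invented for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy model of a delay-based uncommit pass (not Shenandoah source code).
struct Region {
  bool committed;
  bool empty;
  int64_t empty_since_ms;  // when the region last became empty
};

// Uncommits regions that have stayed empty longer than delay_ms.
// Returns the number of regions uncommitted in this pass.
size_t uncommit_stale_regions(std::vector<Region>& regions,
                              int64_t now_ms, int64_t delay_ms) {
  size_t uncommitted = 0;
  for (Region& r : regions) {
    if (r.committed && r.empty && (now_ms - r.empty_since_ms) >= delay_ms) {
      r.committed = false;  // the real VM would release the backing memory here
      uncommitted++;
    }
  }
  return uncommitted;
}
```

With the 300000 ms default quoted above, a region emptied by a GC cycle stays committed for five minutes before becoming eligible for uncommit, giving the application a window to reuse the space before it is returned to the OS.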
That is the only loose tie to the GC cycle.

> (actually it calls madvise(MADV_DONTNEED) on Linux)

Shenandoah has this capability under ShenandoahUncommitWithIdle, which is not enabled by default. It uses the plain old os::uncommit_memory by default.

-Aleksey

From thomas.stuefe at gmail.com Fri Jun 1 13:03:34 2018
From: thomas.stuefe at gmail.com (Thomas Stüfe)
Date: Fri, 1 Jun 2018 15:03:34 +0200
Subject: Question regarding JEP 8204089 Timely Reducing Unused Committed Memory
In-Reply-To: 
References: <6f9228a19d3c4deca170b2032b887dde@clsa.com> <4e8867c0-8e32-949e-e731-280d1f9cbe07@redhat.com>
Message-ID: 

On Fri, Jun 1, 2018 at 2:41 PM, Aleksey Shipilev wrote:
> On 06/01/2018 02:10 PM, Zhengyu Gu wrote:
>> Shenandoah does provide similar functionality under experimental flags:
>>
>> -XX:+ShenandoahUncommit
>> -XX:ShenandoahUncommitDelay
>
> Which are also enabled by default:
>
>   bool  ShenandoahUncommit         = true    {experimental} {default}
>   uintx ShenandoahUncommitDelay    = 300000  {experimental} {default}
>   bool  ShenandoahUncommitWithIdle = false   {experimental} {default}
>
>> It does not need full gc before uncommitting
>
> To be obnoxiously precise, in Shenandoah the decision to uncommit memory is very loosely tied to
> the GC cycle. It just uncommits the regions that stayed empty for more than ShenandoahUncommitDelay
> milliseconds. This naturally captures the post-GC cleanup when GC freed up enough memory, and the
> lag keeps free space committed for the cases where the free memory would be used by the
> active application itself.

Just curious, how expensive is re-committing? I.e. how important is it to avoid unnecessary uncommits?

> As a bonus, Shenandoah uncommits all free memory on explicit GC (e.g. System.gc()), regardless of
> which mode (concurrent or full STW) the cycle was performed in. That is the only loose tie to the GC cycle.
>
>> (actually it calls madvise(MADV_DONTNEED) on Linux)
>
> Shenandoah has this capability under ShenandoahUncommitWithIdle, which is not enabled by default. It
> uses the plain old os::uncommit_memory by default.
>
> -Aleksey

..Thomas

From shade at redhat.com Fri Jun 1 13:20:42 2018
From: shade at redhat.com (Aleksey Shipilev)
Date: Fri, 1 Jun 2018 15:20:42 +0200
Subject: Question regarding JEP 8204089 Timely Reducing Unused Committed Memory
In-Reply-To: 
References: <6f9228a19d3c4deca170b2032b887dde@clsa.com> <4e8867c0-8e32-949e-e731-280d1f9cbe07@redhat.com>
Message-ID: <2bf5d271-af0a-cedb-40b2-b8157780ea4a@redhat.com>

On 06/01/2018 03:03 PM, Thomas Stüfe wrote:
> On Fri, Jun 1, 2018 at 2:41 PM, Aleksey Shipilev wrote:
>> To be obnoxiously precise, in Shenandoah the decision to uncommit memory is very loosely tied to
>> the GC cycle. It just uncommits the regions that stayed empty for more than ShenandoahUncommitDelay
>> milliseconds.
>
> Just curious, how expensive is re-committing? I.e. how important is it
> to avoid unnecessary uncommits?
I don't think it is ignorable, especially if you are not sure what you exactly you are ignoring :) This is a handy exercise: $ time java -Xms100g -Xmx100g -XX:+AlwaysPreTouch -XX:+UseTransparentHugePages -XX:+UseSerialGC -version real 0m15.189s user 0m0.253s sys 0m14.924s // <--- time spent in Linux kernel, clear_page_erms -Aleksey -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: OpenPGP digital signature URL: From synytskyy at jelastic.com Fri Jun 1 13:21:48 2018 From: synytskyy at jelastic.com (Ruslan Synytsky) Date: Fri, 1 Jun 2018 16:21:48 +0300 Subject: Question regarding JEP 8204089 Timely Reducing Unused Committed Memory In-Reply-To: References: Message-ID: Hi Sunny. On 1 June 2018 at 07:36, Sunny Chan, CLSA wrote: > Hello, > > > > I have a number of question about the proposed changes for the JEP and I > would like to make the following suggestions and comments > > > > 1) Are we planning to make this change the default behavior for G1? > Or this is an optional switch for people who runs in a container > environment? > Optional. It will be enough for beginning. > 2) In a trading systems, sometimes there are period of quiescence > and then suddenly a burst of activity towards market close or market event. > The loadavg you suggest to ?monitor? the activity in the system only > reports up to 14mins and it might not necessary a good measure for this > type of applications, especially you would trigger a Full GC > I'm sure certain number of workloads can't afford it. Such kind of projects or other mission critical environments simply should not use this option. > 3) You haven?t fill in the details for ?The GCFrequency value is > ignored and therefore, i.e., no full collection is triggered, if:? > Thanks. It's a missed part. Here are the "if" rules: - GCFrequency is zero or below - the average load on the host system is above MaxLoadGC. 
The MaxLoadGC is a dynamically user-defined variable. This check is ignored if MaxLoadGC is zero or below
- the committed memory is below MinCommitted bytes. MinCommitted is a dynamically user-defined variable. This check is ignored if MinCommitted is zero or below
- the difference between the current heap capacity and the current heap usage is below MaxOverCommitted bytes. The MaxOverCommitted is a dynamically user-defined variable. This check is ignored if MaxOverCommitted is zero or below

The doc will be updated.

> 4) If we are triggering a full GC with this we should make sure the GC reason
> is populated and logged properly in the GC log so we can track it down.

Good point.

> 5) I have not heard of J9 Gencon/Shenandoah providing similar
> functionality. Can you point me to further documentation on which feature
> you model upon?

What we know so far is that OpenJ9 provides:

- -XX:+IdleTuningCompactOnIdle - this option controls garbage collection processing with compaction when the status of the JVM is set to idle
- -Xsoftmx - this option sets a "soft" maximum limit for the initial size of the Java heap

We have not tested it yet, but the idea looks similar. Do we have anyone in the group involved in OpenJ9 to confirm or refute this statement?

Thank you
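For illustration, the "if" rules listed here might compose as in the following sketch. The types and function are assumptions, not the JEP's actual implementation, and it reads the MinCommitted guard as skipping collection when committed memory is already below the threshold, i.e. when there is little to reclaim.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical knobs, named after the variables discussed in the thread.
struct PeriodicGC {
  int64_t GCFrequency;       // ms between periodic collections; <= 0 disables
  double  MaxLoadGC;         // skip when the host load average is above this
  int64_t MinCommitted;      // skip when committed memory is below this
  int64_t MaxOverCommitted;  // skip when capacity - usage is below this
};

// Returns true only when none of the "ignore" rules apply.
// Each guard is disabled when its variable is zero or below.
bool should_trigger_full_gc(const PeriodicGC& cfg, double load_avg,
                            int64_t committed, int64_t capacity, int64_t used) {
  if (cfg.GCFrequency <= 0) return false;                                       // feature disabled
  if (cfg.MaxLoadGC > 0 && load_avg > cfg.MaxLoadGC) return false;              // host too busy
  if (cfg.MinCommitted > 0 && committed < cfg.MinCommitted) return false;       // little to reclaim
  if (cfg.MaxOverCommitted > 0 &&
      (capacity - used) < cfg.MaxOverCommitted) return false;                   // little slack
  return true;
}
```

This shape also makes Sunny's point 4 easy to honor: the single decision point is where a distinct GC reason could be recorded for the log.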
> A CITIC Securities Company
>
> The content of this communication is intended for the recipient and is
> subject to CLSA Legal and Regulatory Notices. These can be viewed at
> https://www.clsa.com/disclaimer.html or sent to you upon request.
> Please consider before printing. CLSA is ISO14001 certified and committed
> to reducing its impact on the environment.

--
Ruslan
CEO @ Jelastic

From synytskyy at jelastic.com Fri Jun 1 13:44:40 2018
From: synytskyy at jelastic.com (Ruslan Synytsky)
Date: Fri, 1 Jun 2018 16:44:40 +0300
Subject: Question regarding JEP 8204089 Timely Reducing Unused Committed Memory
In-Reply-To: <597AC801-7DD0-4E3E-BDBE-8A4FA9860501@gmail.com>
References: <6f9228a19d3c4deca170b2032b887dde@clsa.com> <597AC801-7DD0-4E3E-BDBE-8A4FA9860501@gmail.com>
Message-ID:

Hi Kirk, thank you for the additional important highlights.

On 1 June 2018 at 08:21, Kirk Pepperdine wrote:

> Hi,
>
> I've tuned dozens of applications where I wished that G1 returned to a
> minimum memory configuration when it could.
> Thus I strongly support the
> notion of returning uncommitted memory to the OS. This flows into my
> previous arguments about speculatively triggered collections. IME,
> speculatively triggered GC, such as the one proposed in this JEP, has
> very, very rarely worked out to be a good thing. The trigger in every
> case has been based on some assumption that is fundamentally flawed. I
> would point to the two main culprits: speculative calls to System.gc() or
> Runtime.gc(), and DGC (RMI) resetting the guaranteed GC interval value
> (of max long ms by default). These have a high probability of triggering
> unnecessary full collections that often create havoc in the runtime. The
> assumption in this case is that the application is quiet, but as we can
> see in this email, and as I've witnessed in many other applications,
> runtimes are rarely quiet and the machine can decide to do something at
> the most inappropriate time.

As we discussed, this is optional, so when somebody is not sure it is the right thing to do, they can simply ignore the existence of this option, like many others.

> Additionally, there is a suspicion here, especially after discussions
> that included Gil (Tene), that the benchmark used to "validate" this
> proposed implementation may have issues. It would be useful if that
> benchmark could be released so that we can actually examine it to
> determine what, if any, issues may exist. I'm very happy to help vet this
> benchmark.

Your help with additional review of the benchmarking process will be useful and highly appreciated. Rodrigo will share specifics.

> In an era when the GC engineers are working so very hard to make as much
> of the collection process as concurrent as possible, it just feels very
> wrong to rely on a huge STW event to get something done. In that spirit,
> it behoves us to explore how committed memory may be released at the tail
> end of a normally triggered GC cycle. I believe the end of a mixed cycle
> was mentioned.
> I believe this would give those that want/need to minimize their JVM's
> footprint an even greater cost advantage than attempting to reduce the
> footprint at some (un?)random point in time.

How hard/expensive would it be to implement in G1 something similar to what Shenandoah does?

Thanks

> Kind regards,
> Kirk
>
> On Jun 1, 2018, at 7:41 AM, Sunny Chan, CLSA wrote:
>
> (resending for hotspot-gc-dev)
>
> Hello,
>
> I have a number of questions about the proposed changes for the JEP, and I
> would like to make the following suggestions and comments.
>
> 1) Are we planning to make this change the default behavior for G1, or is
> this an optional switch for people who run in a container environment?
> 2) In trading systems there are sometimes periods of quiescence, and then
> suddenly a burst of activity towards market close or a market event. The
> loadavg you suggest using to "monitor" activity in the system only reports
> up to 14 minutes, and it might not necessarily be a good measure for this
> type of application, especially since you would trigger a Full GC.
> 3) You haven't filled in the details for "The GCFrequency value is ignored
> and therefore, i.e., no full collection is triggered, if:"
> 4) If we are triggering a full GC with this, we should make sure the GC
> reason is populated and logged properly in the GC log so we can track it
> down.
> 5) I have not heard of J9 Gencon/Shenandoah providing similar
> functionality. Can you point me to further documentation on which feature
> you model upon?
>
> Thanks.
>
> Sunny Chan
> Senior Lead Engineer, Executive Services
> D +852 2600 8907 | M +852 6386 1835 | T +852 2600 8888
> 5/F, One Island East, 18 Westlands Road, Island East, Hong Kong
>
> clsa.com
> Insights. Liquidity. Capital.
>
> A CITIC Securities Company
>
> The content of this communication is intended for the recipient and is
> subject to CLSA Legal and Regulatory Notices.
> These can be viewed at https://www.clsa.com/disclaimer.html or sent to
> you upon request.
> Please consider before printing. CLSA is ISO14001 certified and committed
> to reducing its impact on the environment.

--
Ruslan
CEO @ Jelastic

From HORIE at jp.ibm.com Fri Jun 1 15:08:31 2018
From: HORIE at jp.ibm.com (Michihiro Horie)
Date: Sat, 2 Jun 2018 00:08:31 +0900
Subject: RFR(M): 8154736: enhancement of cmpxchg and copy_to_survivor for ppc64
In-Reply-To: <4dddb18526a745cc83941a0f58af77f5@sap.com>
References: <5625a595-1165-8d48-afbd-8229cdc4ac07@oracle.com> <7e7fc484287e4da4926176e0b5ae1b64@sap.com> <339D50B6-09D5-4342-A687-918A9A096B39@oracle.com> <6D2268EC-C1E7-4C90-BCD3-90D02D21FA08@oracle.com> <3e05cc86c9d4406d8a9875b705fbf1fc@sap.com> <5B0D40F7.5050807@oracle.com> <4dddb18526a745cc83941a0f58af77f5@sap.com>
Message-ID:

Hi Kim, Erik, and Martin,

Thank you very much for reminding me that an acquire barrier in the else-statement for "!test_mark->is_marked()" is necessary under the criterion of not relying on consume.

I uploaded a new webrev: http://cr.openjdk.java.net/~mhorie/8154736/webrev.13/

This change uses forwardee_acquire(), which would generate better code on ARM. Necessary barriers are located in all the paths in copy_to_survivor_space, and the returned new_obj can be safely handled at the call sites.

I measured SPECjbb2015 with the latest webrev. Critical-jOPS improved by 5%. Since my previous measurement with implicit consume showed a 6% improvement, adding acquire barriers degraded performance a little, but 5% is still good enough.
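The publication pattern under discussion here (a release CAS that installs the forwarding pointer, paired with an acquire load in forwardee_acquire()) can be illustrated in Java with VarHandles. This is only a sketch of the protocol, not HotSpot code; the class and method names are made up:

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;

// Illustrative model of the copy_to_survivor publication protocol:
// the winning thread installs a forwarding pointer with a release CAS,
// and readers load it with acquire so the copied object's fields are
// visible without relying on implicit consume ordering.
public class ForwardeeDemo {
    static final VarHandle FWD;
    static {
        try {
            FWD = MethodHandles.lookup()
                    .findVarHandle(ForwardeeDemo.class, "forwardee", Object.class);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    volatile Object forwardee; // null until the object has been copied

    /** Publish a copy; returns the winning copy (ours or a racing thread's). */
    Object install(Object copy) {
        // Release CAS: all initializing stores to 'copy' happen-before
        // any acquire load that observes the new pointer.
        Object witness = (Object) FWD.compareAndExchangeRelease(this, null, copy);
        return witness == null ? copy : witness;
    }

    /** Reader side: acquire load instead of a plain (consume-style) read. */
    Object forwardeeAcquire() {
        return (Object) FWD.getAcquire(this);
    }

    public static void main(String[] args) {
        ForwardeeDemo d = new ForwardeeDemo();
        Object copy = new Object();
        System.out.println(d.install(copy) == copy);       // true: we won the race
        System.out.println(d.forwardeeAcquire() == copy);  // true: readers see it
        System.out.println(d.install(new Object()) == copy); // true: CAS returns the witness
    }
}
```

The release/acquire pairing guarantees that the copied object's initializing stores happen-before any load that observes the forwarding pointer, which is exactly why a plain read of the forwardee is insufficient on weakly ordered hardware.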
Best regards, -- Michihiro, IBM Research - Tokyo From: "Doerr, Martin" To: "Erik ?sterlund" , Kim Barrett , Michihiro Horie , "Andrew Haley (aph at redhat.com)" Cc: "david.holmes at oracle.com" , "hotspot-gc-dev at openjdk.java.net" , "ppc-aix-port-dev at openjdk.java.net" Date: 2018/05/30 16:18 Subject: RE: RFR(M): 8154736: enhancement of cmpxchg and copy_to_survivor for ppc64 Hi Erik, the current implementation works on PPC because of "MP+sync+addr". So we already rely on ordering of "load volatile field" + "implicit consume" on the reader's side. We have never seen any issues related to this with the compilers we have been using during the ~10 years the PPC implementation exists. PPC supports "MP+lwsync+addr" the same way, so Michihiro's proposal doesn't make it unreliable for PPC. But I'm ok with evaluating acquire barriers although they are not required by the PPC/ARM memory models. ARM/aarch64 will also be affected when the o->forwardee uses load_acquire. So somebody should check the impact. If it is not acceptable we may need to introduce explicit consume. Implicit consume is also bad in shared code because somebody may want to run it on DEC Alpha. Thanks and best regards, Martin -----Original Message----- From: Erik ?sterlund [mailto:erik.osterlund at oracle.com] Sent: Dienstag, 29. Mai 2018 14:01 To: Doerr, Martin ; Kim Barrett ; Michihiro Horie Cc: david.holmes at oracle.com; Gustavo Bueno Romero ; hotspot-dev at openjdk.java.net; hotspot-gc-dev at openjdk.java.net; ppc-aix-port-dev at openjdk.java.net Subject: Re: RFR(M): 8154736: enhancement of cmpxchg and copy_to_survivor for ppc64 Hi Martin and Michihiro, On 2018-05-29 12:30, Doerr, Martin wrote: > Hi Kim, > > I'm trying to understand how this is related to Michihiro's change. The else path of the initial test is not affected by it AFAICS. > So it sounds like a request to fix the current implementation in addition to what his original intend was. 
I think we are just trying to nail down the correct fencing and go for that. And yes, this is arguably a pre-existing problem, but in a race involving the very same accesses that we are changing the fencing for, so it is not completely unrelated I suppose.

In particular, hotspot has code that assumes that if you, on the writer side, issue a full fence before publishing a pointer to newly initialized data, then the initializing stores and their side effects should be globally "visible" across the system before the pointer to it is published, and hence elide the need for acquire on the loading side, without relying on retained data dependencies on the loader side. I believe this code falls under that category. It is assumed that the leading fence of the CAS publishing the forwarding pointer makes the initializing stores globally observable before publishing a pointer to the initialized data, hence assuming that any loads able to observe the new pointer would not rely on acquire or data-dependent loads to correctly read the initialized data.

Unfortunately, this is not reliable in the IRIW case, as per the litmus test "MP+sync+ctrl" described in "Understanding POWER multiprocessors" ( https://dl.acm.org/citation.cfm?id=1993520 ), as opposed to "MP+sync+addr", which gets away with it because of the data dependency (not IRIW). Similarly, an isync does the job too on the reader side, as shown in MP+sync+ctrlisync. So while I believe the previous reasoning was that the leading sync of the CAS would elide the necessity for acquire on the reader side without relying on data-dependent loads (implicit consume), I think that assumption was wrong in the first place and that we do indeed need explicit acquire (even with the previous conservative CAS fencing) in this context to not rely on implicit consume semantics generating the required data-dependent loads on the reader side.
In practice though, the leading sync of the CAS has been enough to generate the correct machine code. Now, with the leading sync removed, we are increasing the possible holes in the generated machine code due to this flawed reasoning. So it would be nice to do something more sound instead that does not make such assumptions. > Anyway, I agree with that implicit consume is not good. And I think it would be good to treat both o->forwardee() the same way. > What about keeping memory_order_release for the CAS and using acquire for both o->forwardee()? > The case in which the CAS succeeds is safe because the current thread has created new_obj so it doesn't need memory barriers to access it. Sure, that sounds good to me. Thanks, /Erik > Thanks and best regards, > Martin > > > -----Original Message----- > From: Kim Barrett [mailto:kim.barrett at oracle.com] > Sent: Dienstag, 29. Mai 2018 01:54 > To: Michihiro Horie > Cc: Erik Osterlund ; david.holmes at oracle.com; Gustavo Bueno Romero ; hotspot-dev at openjdk.java.net; hotspot-gc-dev at openjdk.java.net; ppc-aix-port-dev at openjdk.java.net; Doerr, Martin > Subject: Re: RFR(M): 8154736: enhancement of cmpxchg and copy_to_survivor for ppc64 > >> On May 28, 2018, at 4:12 AM, Michihiro Horie wrote: >> >> Hi Erik, >> >> Thank you very much for your review. >> >> I understood that implicit consume should not be used in the shared code. Also, I believe performance degradation would be negligible even if we use acquire. >> >> New webrev uses memory_order_acq_rel: http://cr.openjdk.java.net/~mhorie/8154736/webrev.10 > This is missing the acquire barrier on the else branch for the initial test, so fails to meet > the previously described minimal requirements for even possibly being sufficient. Any > analysis of weakening the CAS barriers must consider that test and successor code. 
> > In the analysis, it?s not just the lexically nearby debugging / logging code that needs to be > considered; the forwardee is being returned to caller(s) that will presumably do something > with that object. > > Since the whole point of this discussion is performance, any proposed change should come > with performance information. > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From erik.osterlund at oracle.com Fri Jun 1 15:18:03 2018 From: erik.osterlund at oracle.com (=?UTF-8?Q?Erik_=c3=96sterlund?=) Date: Fri, 1 Jun 2018 17:18:03 +0200 Subject: RFR(M): 8154736: enhancement of cmpxchg and copy_to_survivor for ppc64 In-Reply-To: References: <5625a595-1165-8d48-afbd-8229cdc4ac07@oracle.com> <7e7fc484287e4da4926176e0b5ae1b64@sap.com> <339D50B6-09D5-4342-A687-918A9A096B39@oracle.com> <6D2268EC-C1E7-4C90-BCD3-90D02D21FA08@oracle.com> <3e05cc86c9d4406d8a9875b705fbf1fc@sap.com> <5B0D40F7.5050807@oracle.com> <4dddb18526a745cc83941a0f58af77f5@sap.com> Message-ID: <5B1163AB.3080201@oracle.com> Hi Michihiro, Looks good to me. Thanks, /Erik On 2018-06-01 17:08, Michihiro Horie wrote: > > Hi Kim, Erik, and Martin, > > Thank you very much for reminding me that an acquire barrier in the > else-statement for ?!test_mark->is_marked()? is necessary under the > criteria of not relying on the consume. > > I uploaded a new webrev : > http://cr.openjdk.java.net/~mhorie/8154736/webrev.13/ > > This change uses forwardee_acquire(), which would generate better code > on ARM. > > Necessary barriers are located in all the paths in > copy_to_survivor_space, and the returned new_obj can be safely handled > in the caller sites. > > I measured SPECjbb2015 with the latest webrev. Critical-jOPS improved > by 5%. 
Since my previous measurement with implicit consume showed 6% > improvement, adding acquire barriers degraded the performance a > little, but 5% is still good enough. > > > Best regards, > -- > Michihiro, > IBM Research - Tokyo > > Inactive hide details for "Doerr, Martin" ---2018/05/30 16:18:09---Hi > Erik, the current implementation works on PPC because of "Doerr, > Martin" ---2018/05/30 16:18:09---Hi Erik, the current implementation > works on PPC because of "MP+sync+addr". > > From: "Doerr, Martin" > To: "Erik ?sterlund" , Kim Barrett > , Michihiro Horie , "Andrew > Haley (aph at redhat.com)" > Cc: "david.holmes at oracle.com" , > "hotspot-gc-dev at openjdk.java.net" , > "ppc-aix-port-dev at openjdk.java.net" > Date: 2018/05/30 16:18 > Subject: RE: RFR(M): 8154736: enhancement of cmpxchg and > copy_to_survivor for ppc64 > > ------------------------------------------------------------------------ > > > > Hi Erik, > > the current implementation works on PPC because of "MP+sync+addr". > So we already rely on ordering of "load volatile field" + "implicit > consume" on the reader's side. We have never seen any issues related > to this with the compilers we have been using during the ~10 years the > PPC implementation exists. > > PPC supports "MP+lwsync+addr" the same way, so Michihiro's proposal > doesn't make it unreliable for PPC. > > But I'm ok with evaluating acquire barriers although they are not > required by the PPC/ARM memory models. > ARM/aarch64 will also be affected when the o->forwardee uses > load_acquire. So somebody should check the impact. If it is not > acceptable we may need to introduce explicit consume. > > Implicit consume is also bad in shared code because somebody may want > to run it on DEC Alpha. > > Thanks and best regards, > Martin > > > -----Original Message----- > From: Erik ?sterlund [mailto:erik.osterlund at oracle.com] > Sent: Dienstag, 29. 
Mai 2018 14:01 > To: Doerr, Martin ; Kim Barrett > ; Michihiro Horie > Cc: david.holmes at oracle.com; Gustavo Bueno Romero > ; hotspot-dev at openjdk.java.net; > hotspot-gc-dev at openjdk.java.net; ppc-aix-port-dev at openjdk.java.net > Subject: Re: RFR(M): 8154736: enhancement of cmpxchg and > copy_to_survivor for ppc64 > > Hi Martin and Michihiro, > > On 2018-05-29 12:30, Doerr, Martin wrote: > > Hi Kim, > > > > I'm trying to understand how this is related to Michihiro's change. > The else path of the initial test is not affected by it AFAICS. > > So it sounds like a request to fix the current implementation in > addition to what his original intend was. > > I think we are just trying to nail down the correct fencing and just go > for that. And yes, this is arguably a pre-existing problem, but in a > race involving the very same accesses that we are changing the fencing > for. So it is not completely unrelated I suppose. > > In particular, hotspot has code that assumes that if you on the writer > side issue a full fence before publishing a pointer to newly initialized > data, then the initializing stores and their side effects should be > globally "visible" across the system before the pointer to it is > published, and hence elide the need for acquire on the loading side, > without relying on retained data dependencies on the loader side. I > believe this code falls under that category. It is assumed that the > leading fence of the CAS publishing the forwarding pointer makes the > initializing stores globally observable before publishing a pointer to > the initialized data, hence assuming that any loads able to observe the > new pointer would not rely on acquire or data dependent loads to > correctly read the initialized data. 
> > Unfortunately, this is not reliable in the IRIW case, as per the litmus > test "MP+sync+ctrl" as described in "Understanding POWER > multiprocessors" (https://dl.acm.org/citation.cfm?id=1993520), as > opposed to "MP+sync+addr" that gets away with it because of the data > dependency (not IRIW). Similarly, an isync does the job too on the > reader side as shown in MP+sync+ctrlisync. So while what I believe was > the previous reasoning that the leading sync of the CAS would elide the > necessity for acquire on the reader side without relying on data > dependent loads (implicit consume), I think that assumption was wrong in > the first place and that we do indeed need explicit acquire (even with > the precious conservative CAS fencing) in this context to not rely on > implicit consume semantics generating the required data dependent loads > on the reader side. In practice though, the leading sync of the CAS has > been enough to generate the correct machine code. Now, with the leading > sync removed, we are increasing the possible holes in the generated > machine code due to this flawed reasoning. So it would be nice to do > something more sound instead that does not make such assumptions. > > > Anyway, I agree with that implicit consume is not good. And I think > it would be good to treat both o->forwardee() the same way. > > What about keeping memory_order_release for the CAS and using > acquire for both o->forwardee()? > > The case in which the CAS succeeds is safe because the current > thread has created new_obj so it doesn't need memory barriers to > access it. > > Sure, that sounds good to me. > > Thanks, > /Erik > > > Thanks and best regards, > > Martin > > > > > > -----Original Message----- > > From: Kim Barrett [mailto:kim.barrett at oracle.com] > > Sent: Dienstag, 29. 
Mai 2018 01:54 > > To: Michihiro Horie > > Cc: Erik Osterlund ; > david.holmes at oracle.com; Gustavo Bueno Romero ; > hotspot-dev at openjdk.java.net; hotspot-gc-dev at openjdk.java.net; > ppc-aix-port-dev at openjdk.java.net; Doerr, Martin > > Subject: Re: RFR(M): 8154736: enhancement of cmpxchg and > copy_to_survivor for ppc64 > > > >> On May 28, 2018, at 4:12 AM, Michihiro Horie wrote: > >> > >> Hi Erik, > >> > >> Thank you very much for your review. > >> > >> I understood that implicit consume should not be used in the shared > code. Also, I believe performance degradation would be negligible even > if we use acquire. > >> > >> New webrev uses memory_order_acq_rel: > http://cr.openjdk.java.net/~mhorie/8154736/webrev.10 > > > This is missing the acquire barrier on the else branch for the > initial test, so fails to meet > > the previously described minimal requirements for even possibly > being sufficient. Any > > analysis of weakening the CAS barriers must consider that test and > successor code. > > > > In the analysis, it?s not just the lexically nearby debugging / > logging code that needs to be > > considered; the forwardee is being returned to caller(s) that will > presumably do something > > with that object. > > > > Since the whole point of this discussion is performance, any > proposed change should come > > with performance information. > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: image/gif Size: 105 bytes Desc: not available URL: From HORIE at jp.ibm.com Fri Jun 1 15:37:04 2018 From: HORIE at jp.ibm.com (Michihiro Horie) Date: Sat, 2 Jun 2018 00:37:04 +0900 Subject: RFR(M): 8154736: enhancement of cmpxchg and copy_to_survivor for ppc64 In-Reply-To: <5B1163AB.3080201@oracle.com> References: <5625a595-1165-8d48-afbd-8229cdc4ac07@oracle.com> <7e7fc484287e4da4926176e0b5ae1b64@sap.com> <339D50B6-09D5-4342-A687-918A9A096B39@oracle.com> <6D2268EC-C1E7-4C90-BCD3-90D02D21FA08@oracle.com> <3e05cc86c9d4406d8a9875b705fbf1fc@sap.com> <5B0D40F7.5050807@oracle.com> <4dddb18526a745cc83941a0f58af77f5@sap.com> Message-ID: >Hi Michihiro, > >Looks good to me. Thanks a lot, Erik! Best regards, -- Michihiro, IBM Research - Tokyo From: "Erik ?sterlund" To: Michihiro Horie , "Doerr, Martin" Cc: "Andrew Haley (aph at redhat.com)" , "david.holmes at oracle.com" , "hotspot-gc-dev at openjdk.java.net" , Kim Barrett , "ppc-aix-port-dev at openjdk.java.net" Date: 2018/06/02 00:15 Subject: Re: RFR(M): 8154736: enhancement of cmpxchg and copy_to_survivor for ppc64 Hi Michihiro, Looks good to me. Thanks, /Erik On 2018-06-01 17:08, Michihiro Horie wrote: Hi Kim, Erik, and Martin, Thank you very much for reminding me that an acquire barrier in the else-statement for ?!test_mark->is_marked()? is necessary under the criteria of not relying on the consume. I uploaded a new webrev : http://cr.openjdk.java.net/~mhorie/8154736/webrev.13/ This change uses forwardee_acquire(), which would generate better code on ARM. Necessary barriers are located in all the paths in copy_to_survivor_space, and the returned new_obj can be safely handled in the caller sites. I measured SPECjbb2015 with the latest webrev. Critical-jOPS improved by 5%. Since my previous measurement with implicit consume showed 6% improvement, adding acquire barriers degraded the performance a little, but 5% is still good enough. 
Best regards, -- Michihiro, IBM Research - Tokyo Inactive hide details for "Doerr, Martin" ---2018/05/30 16:18:09---Hi Erik, the current implementation works on PPC because of "Doerr, Martin" ---2018/05/30 16:18:09---Hi Erik, the current implementation works on PPC because of "MP+sync+addr". From: "Doerr, Martin" To: "Erik ?sterlund" , Kim Barrett , Michihiro Horie , "Andrew Haley (aph at redhat.com)" Cc: "david.holmes at oracle.com" , "hotspot-gc-dev at openjdk.java.net" , "ppc-aix-port-dev at openjdk.java.net" Date: 2018/05/30 16:18 Subject: RE: RFR(M): 8154736: enhancement of cmpxchg and copy_to_survivor for ppc64 Hi Erik, the current implementation works on PPC because of "MP+sync+addr". So we already rely on ordering of "load volatile field" + "implicit consume" on the reader's side. We have never seen any issues related to this with the compilers we have been using during the ~10 years the PPC implementation exists. PPC supports "MP+lwsync+addr" the same way, so Michihiro's proposal doesn't make it unreliable for PPC. But I'm ok with evaluating acquire barriers although they are not required by the PPC/ARM memory models. ARM/aarch64 will also be affected when the o->forwardee uses load_acquire. So somebody should check the impact. If it is not acceptable we may need to introduce explicit consume. Implicit consume is also bad in shared code because somebody may want to run it on DEC Alpha. Thanks and best regards, Martin -----Original Message----- From: Erik ?sterlund [mailto:erik.osterlund at oracle.com] Sent: Dienstag, 29. 
Mai 2018 14:01 To: Doerr, Martin ; Kim Barrett ; Michihiro Horie Cc: david.holmes at oracle.com; Gustavo Bueno Romero ; hotspot-dev at openjdk.java.net; hotspot-gc-dev at openjdk.java.net; ppc-aix-port-dev at openjdk.java.net Subject: Re: RFR(M): 8154736: enhancement of cmpxchg and copy_to_survivor for ppc64 Hi Martin and Michihiro, On 2018-05-29 12:30, Doerr, Martin wrote: > Hi Kim, > > I'm trying to understand how this is related to Michihiro's change. The else path of the initial test is not affected by it AFAICS. > So it sounds like a request to fix the current implementation in addition to what his original intend was. I think we are just trying to nail down the correct fencing and just go for that. And yes, this is arguably a pre-existing problem, but in a race involving the very same accesses that we are changing the fencing for. So it is not completely unrelated I suppose. In particular, hotspot has code that assumes that if you on the writer side issue a full fence before publishing a pointer to newly initialized data, then the initializing stores and their side effects should be globally "visible" across the system before the pointer to it is published, and hence elide the need for acquire on the loading side, without relying on retained data dependencies on the loader side. I believe this code falls under that category. It is assumed that the leading fence of the CAS publishing the forwarding pointer makes the initializing stores globally observable before publishing a pointer to the initialized data, hence assuming that any loads able to observe the new pointer would not rely on acquire or data dependent loads to correctly read the initialized data. Unfortunately, this is not reliable in the IRIW case, as per the litmus test "MP+sync+ctrl" as described in "Understanding POWER multiprocessors" (https://dl.acm.org/citation.cfm?id=1993520), as opposed to "MP+sync+addr" that gets away with it because of the data dependency (not IRIW). 
Similarly, an isync does the job too on the reader side as shown in MP+sync+ctrlisync. So while what I believe was the previous reasoning that the leading sync of the CAS would elide the necessity for acquire on the reader side without relying on data dependent loads (implicit consume), I think that assumption was wrong in the first place and that we do indeed need explicit acquire (even with the precious conservative CAS fencing) in this context to not rely on implicit consume semantics generating the required data dependent loads on the reader side. In practice though, the leading sync of the CAS has been enough to generate the correct machine code. Now, with the leading sync removed, we are increasing the possible holes in the generated machine code due to this flawed reasoning. So it would be nice to do something more sound instead that does not make such assumptions. > Anyway, I agree with that implicit consume is not good. And I think it would be good to treat both o->forwardee() the same way. > What about keeping memory_order_release for the CAS and using acquire for both o->forwardee()? > The case in which the CAS succeeds is safe because the current thread has created new_obj so it doesn't need memory barriers to access it. Sure, that sounds good to me. Thanks, /Erik > Thanks and best regards, > Martin > > > -----Original Message----- > From: Kim Barrett [mailto:kim.barrett at oracle.com] > Sent: Dienstag, 29. Mai 2018 01:54 > To: Michihiro Horie > Cc: Erik Osterlund ; david.holmes at oracle.com; Gustavo Bueno Romero ; hotspot-dev at openjdk.java.net; hotspot-gc-dev at openjdk.java.net; ppc-aix-port-dev at openjdk.java.net; Doerr, Martin > Subject: Re: RFR(M): 8154736: enhancement of cmpxchg and copy_to_survivor for ppc64 > >> On May 28, 2018, at 4:12 AM, Michihiro Horie wrote: >> >> Hi Erik, >> >> Thank you very much for your review. >> >> I understood that implicit consume should not be used in the shared code. 
Also, I believe performance degradation would be negligible even if we use acquire. >> >> New webrev uses memory_order_acq_rel: http://cr.openjdk.java.net/~mhorie/8154736/webrev.10 > This is missing the acquire barrier on the else branch for the initial test, so fails to meet > the previously described minimal requirements for even possibly being sufficient. ?Any > analysis of weakening the CAS barriers must consider that test and successor code. > > In the analysis, it?s not just the lexically nearby debugging / logging code that needs to be > considered; the forwardee is being returned to caller(s) that will presumably do something > with that object. > > Since the whole point of this discussion is performance, any proposed change should come > with performance information. > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From sangheon.kim at oracle.com Fri Jun 1 21:48:43 2018 From: sangheon.kim at oracle.com (sangheon.kim at oracle.com) Date: Fri, 1 Jun 2018 14:48:43 -0700 Subject: RFR(M) 8043575: Dynamically parallelize reference processing work In-Reply-To: <8af7594e-d5f4-ca0d-de73-6547ea131f3e@oracle.com> References: <8af7594e-d5f4-ca0d-de73-6547ea131f3e@oracle.com> Message-ID: Hi all, As webrev.0 is conflicting with webrev.0 of "8203319: JDK-8201487 disabled too much queue balancing"(out for review, but not yet pushed), I'm posting webrev.1. http://cr.openjdk.java.net/~sangheki/8043575/webrev.1 http://cr.openjdk.java.net/~sangheki/8043575/webrev.1_to_0 Thanks, Sangheon On 5/30/18 9:43 PM, sangheon.kim at oracle.com wrote: > Hi all, > > Could I have some reviews for this patch? > > This patch is suggesting ergonomically choosing worker thread count > from given reference count. 
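The ergonomic sizing idea quoted above - deriving the reference-processing worker count from the number of pending references - can be sketched as follows. The constant mirrors the proposed ReferencesPerThread experimental flag and its suggested default of 1000; the clamping details and names are assumptions for illustration, not the actual HotSpot code:

```java
// Sketch of ergonomic reference-processing thread sizing: one worker
// per ReferencesPerThread references, clamped to [1, maxWorkers].
// Mirrors the proposal's "divide reference count by ReferencesPerThread";
// the exact HotSpot logic may differ.
public class RefProcErgo {
    static final int REFERENCES_PER_THREAD = 1000; // proposed default

    static int workerCount(int refCount, int maxWorkers) {
        if (refCount <= 0) {
            return 1; // still need one thread to drain the discovered lists
        }
        // Ceiling division: enough workers to give each at most the budget.
        int wanted = (refCount + REFERENCES_PER_THREAD - 1) / REFERENCES_PER_THREAD;
        return Math.max(1, Math.min(wanted, maxWorkers));
    }

    public static void main(String[] args) {
        System.out.println(workerCount(50, 8));     // 1 -> avoid start-up cost for few refs
        System.out.println(workerCount(4500, 8));   // 5
        System.out.println(workerCount(100000, 8)); // 8 -> capped at the GC's worker pool
    }
}
```

The point of the clamp is the drawback described below: with few references, thread start-up/tear-down would otherwise dominate the actual processing time.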
> We have the ParallelRefProcEnabled command-line option, which enables
> using ALL workers during reference processing; however, this option has a
> drawback when there is a limited number of references, i.e. more time is
> spent on thread start-up/tear-down than on actual processing when there
> are few references. Also, we currently use either all threads or a single
> thread during reference processing, which seems inflexible. This patch
> calculates the worker count by dividing the reference count by
> ReferencesPerThread (a newly added experimental option).
> My suggestion for the default value of ReferencesPerThread is 1000, as
> it showed good results in some benchmarks.
>
> Notes:
> 1. CMS ParNew is excluded from this patch because:
>    a) There is a separate CR for CMS (JDK-6938732).
>    b) It is tricky to manage switching single <-> MT processing
> inside of the ReferenceProcessor class for ParNew. Tony explained the
> reason quite well here (
> https://bugs.openjdk.java.net/browse/JDK-6938732?focusedCommentId=13932462&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13932462
> ).
>    c) CMS will be obsoleted in the future, so there is little motivation
> to fix it within this patch.
> 2. JDK-8203951 is the CR for removing the temporarily added
> flag (ReferenceProcessor::_has_adjustable_queue from webrev.0) used to
> manage ParNew. So the flag should be removed when CMS is obsoleted.
> 3. The current logic of dividing by ReferencesPerThread could be replaced
> with a better implementation, e.g. one based on time measurement.
> But I think the current approach is simple and good enough.
> 4. This patch is based on JDK-8204094 and JDK-8204095, neither of which
> is pushed yet.
>
> CR: https://bugs.openjdk.java.net/browse/JDK-8043575
> Webrev: http://cr.openjdk.java.net/~sangheki/8043575/webrev.0/
> Testing: hs-tier 1~5 with/without ParallelRefProcEnabled
>
> Thanks,
> Sangheon

-------------- next part -------------- An HTML attachment was scrubbed...
From gerard.ziemski at oracle.com Fri Jun 1 21:46:40 2018 From: gerard.ziemski at oracle.com (Gerard Ziemski) Date: Fri, 1 Jun 2018 16:46:40 -0500 Subject: RFR(L): 8195097: Make it possible to process StringTable outside safepoint In-Reply-To: <0d09f2ad-47b9-2fca-e5ec-17547e923cbb@oracle.com> References: <0d09f2ad-47b9-2fca-e5ec-17547e923cbb@oracle.com> Message-ID: <3B1FF3B8-6A10-401A-A11D-3D027DD59702@oracle.com> Hi, Awesome job Robbin! I especially like that we now have a self-resizing StringTable (though I don't like the power-of-2 size constraint, but understand why that needs to be so). I do have some feedback, questions and comments below: #1 The initial table size according to this code: #define START_SIZE 16 _current_size = ((size_t)1) << START_SIZE; Hence, it is 65536, but in the old code we had: const int defaultStringTableSize = NOT_LP64(1009) LP64_ONLY(60013); which on non-64-bit architectures was quite a bit smaller. Is it OK for us not to worry about non-64-bit architectures now? BTW, it will be extremely interesting to see whether we can lower the initial size, now that we can grow the table. Should we file a follow-up issue, so we don't forget? #2 Why do we have: volatile size_t _items; DEFINE_PAD_MINUS_SIZE(1, 64, sizeof(volatile size_t)); volatile size_t _uncleaned_items; DEFINE_PAD_MINUS_SIZE(2, 64, sizeof(volatile size_t)); and not: volatile size_t _items; DEFINE_PAD_MINUS_SIZE(1, DEFAULT_CACHE_LINE_SIZE, sizeof(volatile size_t)); volatile size_t _uncleaned_items; DEFINE_PAD_MINUS_SIZE(2, DEFAULT_CACHE_LINE_SIZE, sizeof(volatile size_t)); #3 Extraneous space here, i.e.
" name": return StringTable::the_table()->do_lookup( name, len, hash); #4 Instead of: double fact = StringTable::get_load_factor(); double dead_fact = StringTable::get_dead_factor(); where "fact" is an actual word on its own, can we consider using full names, ex: double load_factor = StringTable::get_load_factor(); double dead_factor = StringTable::get_dead_factor(); #5 In "static int literal_size(oop obj)": a) Why do we need the "else" clause? Will it ever be taken? } else { return obj->size(); } b) Why isn't "java_lang_String::value(obj)->size()" enough in: } else if (obj->klass() == SystemDictionary::String_klass()) { return (obj->size() + java_lang_String::value(obj)->size()) * HeapWordSize; } #6 Can we rename "StringtableDCmd" to "StringtableDumpCmd?? #7 Isn't "#define PREF_AVG_LIST_LEN 2" a bit too aggressive? Where did the value come from? #8 Should we consider adding runtime flag options to control when resizing/cleanup triggers? (i.e. PREF_AVG_LIST_LEN, CLEAN_DEAD_HIGH_WATER_MARK) #9 Why do we only resize up, if we can also resize down? #10 Do we know the impact of the new table on memory usage (at the default initial size)? #11 You mention various benchmarks were ran with no issues, which I take to mean as no regressions, but are there any statistically significant improvements shown that you can report? #12 I mentioned this to you off the list, but in a case anyone else tries to run the code - the changes don?t build (which you pointed out to me is due to 8191798). Once things build again, I?d like an opportunity to be able to run the code for myself to check it out, and then report back with final review. Cheers > On May 28, 2018, at 8:19 AM, Robbin Ehn wrote: > > Hi all, please review. > > This implements the StringTable with the ConcurrentHashtable for managing the > strings using oopStorage for backing the actual oops via WeakHandles. 
> > The unlinking and freeing of hashtable nodes is moved outside the safepoint, > which means GC only needs to walk the oopStorage, either concurrently or in a > safepoint. Walking oopStorage is also faster, so there is a good effect on all > safepoints visiting the oops. > > The unlinking and freeing happens during inserts when dead weak oops are > encountered in that bucket. In any normal workload the stringtable self-cleans > without needing any additional cleaning. Cleaning/unlinking can also be done > concurrently via the ServiceThread; it is started when we have a high 'dead > factor', e.g. when the application has interned a lot of strings, removed the references, > and never interns again. The ServiceThread also concurrently grows the table if > the 'load factor' is high. Both the cleaning and growing take care not to prolong > time to safepoint, at the cost of some speed. > > Kitchensink24h, multiple tier1-5 with no issue that I can relate to this > changeset, various benchmarks such as JMH, specJBB2015. > > Issue: https://bugs.openjdk.java.net/browse/JDK-8195097 > Webrev: http://cr.openjdk.java.net/~rehn/8195097/v0/webrev/ > > Thanks, Robbin From rkennke at redhat.com Sun Jun 3 12:27:06 2018 From: rkennke at redhat.com (Roman Kennke) Date: Sun, 3 Jun 2018 14:27:06 +0200 Subject: RFR: JDK-8198285: More consistent Access API for arraycopy In-Reply-To: <5B1133F3.4040704@oracle.com> References: <5AE09F74.7050005@oracle.com> <5AE1F062.3020303@oracle.com> <3bb2415e-4d15-fbd3-dde2-73a25c7bd65b@redhat.com> <5B1133F3.4040704@oracle.com> Message-ID: Hi Erik, Thanks for the review. See my comments inline below: > Thanks for doing this. A few comments... > > This: > ArrayAccess<>::arraycopy_from_native(... > > ...won't reliably compile without saying "template", like this: > ArrayAccess<>::template arraycopy_from_native(... > > ...because the template member is qualified on another template class.
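The compile issue Erik describes above is standard C++: when a member function template is named through a dependent type, the call needs the `template` disambiguator. A minimal standalone reproduction, with toy names that are unrelated to the real Access API:

```cpp
#include <cassert>

// Toy stand-in for a template class with a member function template.
template <int D>
struct Access {
  template <typename T>
  static T copy(T v) { return v; }
};

template <int D>
int use(int v) {
  // Inside this template, Access<D> is a dependent type, so naming
  // copy<int> requires the 'template' keyword; without it the '<'
  // would be parsed as a less-than comparison.
  return Access<D>::template copy<int>(v);
}
```

This is why removing the explicit template argument (so it is deduced from the function arguments) sidesteps the problem at the call sites, as the thread goes on to do.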
> My suggestion is to remove the explicit <T> and let that be inferred > from the address instead. Ok, I've done that for all the cases where it's possible. The heap->heap cases cannot do that because there's no address pointer to infer from. I've added the 'template' keyword there. > And things like this: > bool RuntimeDispatch<decorators, T, BARRIER_ARRAYCOPY>::arraycopy_init(arrayOop src_obj, size_t > src_offset_in_bytes, const T* src_raw, arrayOop dst_obj, size_t > dst_offset_in_bytes, T* dst_raw, size_t length) { > > ...as a single line, could do with some newlines to make it reasonably > wide. Ok, I've broken those lines down to have src args in one line, dst args in another. This should make it more readable. > Now that we copy from heap to heap, from heap to native, and from native > to heap, we have to be a bit more careful in modref > oop_arraycopy_in_heap. In the covariant case of arraycopy, you > unconditionally apply the pre and post barriers on the destination. > However, if the destination is native > (ArrayAccess::arraycopy_to_native), that does not feel like it will go > particularly well, unless I have missed something. oop arrays are never copied from/to native. > Also your new ArrayAccess class inherits from HeapAccess<decorators>, which is nice. But its member functions call > HeapAccess<decorators>::arraycopy, which will miss out on the > IN_HEAP_ARRAY decorator, which might lead to not using precise marking > in barrier backends. If I were you, I would probably typedef a > BaseAccess for HeapAccess<decorators | IN_HEAP_ARRAY> inside of > ArrayAccess, and call that in the member functions. I am not sure how to do this typedef trick, but I added | IN_HEAP_ARRAY to all the calls. It also required extending the verify check on the decorators to include IN_HEAP_ARRAY. Differential: http://cr.openjdk.java.net/~rkennke/JDK-8198285/webrev.02.diff/ Full: http://cr.openjdk.java.net/~rkennke/JDK-8198285/webrev.02/ What do you think?
Cheers, Roman > Thanks, > /Erik > > On 2018-06-01 00:25, Roman Kennke wrote: >> Hi Erik, >> >> It took me a while to get back to this. >> >> I wrote a little wrapper around the extended arraycopy API that allows >> to basically write exactly what you suggested: >> >> >>> ArrayAccess<>::arraycopy_from_native(ptr, obj, index, >>> length); >> I agree that this seems the cleanest solution. >> >> The backend still sees both obj+off OR raw-ptr, but I think this is ok. >> >> Differential: >> http://cr.openjdk.java.net/~rkennke/JDK-8198285/webrev.02.diff/ >> Full: >> http://cr.openjdk.java.net/~rkennke/JDK-8198285/webrev.02/ >> >> What do you think? >> >> Roman >> >>> Each required parameter is clear when you read the API. >>> >>> But I am open to suggestions of course. >>> >>> Thanks, >>> /Erik >>> >>> On 2018-04-25 17:59, Roman Kennke wrote: >>>>> 1) Resolve the addresses as we do today, but you think you get better >>>>> Shenandoah performance if we either >>>>> 2) Have 3 different access calls, (from heap to native, from native to >>>>> heap, from heap to heap) for copying, or >>>>> 3) Have 1 access call that can encode the above 3 variants, but looks >>>>> ugly at the call sites. >>>> There's also the idea to pass Address arguments to arraycopy, one for >>>> src, one for dst, and have 2 different subclasses: one for obj+offset >>>> (heap access) and one with direct pointer (raw). Your comment gave me >>>> the idea to also provide arrayOop+idx. This would look clean on the >>>> caller side I think. >>>> >>>> It would also be useful on the GC side: BarrierSets would specialize >>>> only in the variants that they are interested in, for example, in case >>>> of Shenandoah: >>>> 1. arraycopy(HeapAddress,HeapAddress) Java->Java >>>> 2. arraycopy(HeapAddress,RawAddress) Java->native >>>> 3. arraycopy(RawAddress,HeapAddress) native->Java >>>> >>>> other barriersets can ignore the exact type and only call the arg's >>>> address->resolve() or so to get the actual raw address.
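The HeapAddress/RawAddress idea sketched in the quoted paragraph could look roughly as follows. This is a toy model for illustration only (the real, not-yet-working prototype is in the patch linked below in the thread); the type and function names here are assumptions, and a real barrier set would do far more in resolve() than pointer arithmetic:

```cpp
#include <cassert>
#include <cstddef>

// Toy model of the two address kinds from the thread. A backend that
// does not care about the distinction just calls resolve(); a GC like
// Shenandoah could instead specialize per address kind.
struct RawAddress {
  void* _ptr;
  void* resolve() const { return _ptr; }
};

struct HeapAddress {
  void*  _obj;     // stand-in for an oop
  size_t _offset;
  void* resolve() const {
    // A real barrier set could first resolve _obj (e.g. forwarding)
    // before applying the offset.
    return static_cast<char*>(_obj) + _offset;
  }
};

// One arraycopy entry point covers Java->Java, Java->native and
// native->Java simply by the mix of address types passed in.
template <typename Src, typename Dst>
void toy_arraycopy(Src src, Dst dst, size_t bytes) {
  char* s = static_cast<char*>(src.resolve());
  char* d = static_cast<char*>(dst.resolve());
  for (size_t i = 0; i < bytes; i++) {
    d[i] = s[i];
  }
}
```

The design point being debated is exactly this: the generic backend sees one uniform call, while a specializing GC can overload on the concrete address types.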
>>>> >>>> This would *also* be beneficial for the other APIs: instead of having >>>> all the X() and X_at() variants, we can just use one X variant that >>>> either takes RawAddress or HeapAddress. >>>> >>>> I made a little (not yet compiling/working) prototype of this a while >>>> ago: >>>> >>>> http://cr.openjdk.java.net/~rkennke/JDK-8199801-2.patch >>>> >>>> >>>> What do you think? Would it make sense to go further down that road? >>>> >>>> Roman >>>> >>>>> You clearly went for 3, which leaves the call sites looking rather hard >>>>> to read. It is for example not obvious for me what is going on here >>>>> (javaClasses.cpp line 313): >>>>> >>>>> HeapAccess<>::arraycopy(NULL, 0, reinterpret_cast<jbyte*>(utf8_str), value(h_obj()), >>>>> typeArrayOopDesc::element_offset(0), NULL, length); >>>>> >>>>> ...without looking very carefully at the long list of arguments >>>>> encoding >>>>> what is actually going on (copy from native to the heap). What is >>>>> worse >>>>> is that this will probably not compile without adding the template >>>>> keyword to the call (since you have a qualified template member >>>>> function >>>>> behind a template class), like this: >>>>> HeapAccess<>::template arraycopy(NULL, 0, >>>>> reinterpret_cast<jbyte*>(utf8_str), value(h_obj()), >>>>> typeArrayOopDesc::element_offset(0), NULL, length); >>>>> >>>>> ...which as a public API leaves me feeling a bit like this: :C >>>>> >>>>> May I suggest adding an array access helper. The idea is to keep a >>>>> single call through Access, and a single backend point for array copy, >>>>> but let the helper provide the three different types of copying as >>>>> different functions, so that the call sites still look pretty and easy >>>>> to read and follow.
>>>>> >>>>> This helper could additionally have load_at and store_at use array >>>>> indices as opposed to offsets, and hence hide the offset >>>>> calculations we >>>>> perform today (typically involving checking if we are using compressed >>>>> oops or not). >>>>> >>>>> I am thinking something along the lines of >>>>> ArrayAccess<>::arraycopy_to_native(readable_arguments), >>>>> ArrayAccess<>::arraycopy_from_native(readable_arguments), >>>>> ArrayAccess<>::arraycopy(readable_arguments), which translates to some >>>>> form of Access<>::arraycopy(unreadable_arguments). And for example >>>>> ArrayAccess<>::load_at(obj, index) would translate to some kind of >>>>> HeapAccess::load_at(obj, offset_for_index(index)) as a >>>>> bonus, making everyone using the API jump with happiness. >>>>> >>>>> What do you think about this idea? Good or bad? I guess the >>>>> question is >>>>> whether this helper should be in access.hpp, or somewhere else >>>>> (like in >>>>> arrayOop). Thoughts are welcome. >>>>> >>>>> Thanks, >>>>> /Erik >>>>> >>>>> On 2018-04-11 19:54, Roman Kennke wrote: >>>>>> Currently, the arraycopy API in access.hpp gets the src and dst >>>>>> oops, >>>>>> plus the src and dst addresses. In order to be most useful to garbage >>>>>> collectors, it should receive the src and dst oops together with the >>>>>> src >>>>>> and dst offsets instead, and let the Access API / GC calculate the >>>>>> src >>>>>> and dst addresses. >>>>>> >>>>>> For example, Shenandoah needs to resolve the src and dst objects for >>>>>> arraycopy, and then apply the corresponding offsets. With the current >>>>>> API (obj+ptr) it would calculate the ptr-diff from obj to ptr, then >>>>>> resolve obj, then re-add the calculated ptr-diff. This is fragile >>>>>> because >>>>>> we also may resolve obj in the runtime before calculating ptr >>>>>> (e.g. via >>>>>> arrayOop::base()).
If we then pass in the original obj and a ptr >>>>>> calculated from another copy of the same obj, the above resolution >>>>>> logic >>>>>> would not work. This is currently the case for obj-arraycopy. >>>>>> >>>>>> I propose to change the API to accept obj+offset, in addition to ptr >>>>>> for >>>>>> both src and dst. Only one or the other should be used. Heap accesses >>>>>> should use obj+offset and pass NULL for raw-ptr, off-heap accesses >>>>>> (or >>>>>> heap accesses that are already resolved.. use with care) should pass >>>>>> NULL+0 for obj+offset and the raw-ptr. Notice that this also >>>>>> allows the >>>>>> API to be used for Java<->native array bulk transfers. >>>>>> >>>>>> An alternative would be to break the API up into 4 variants: >>>>>> >>>>>> Java->Java transfer: >>>>>> arraycopy(oop src, size_t src_offs, oop dst, size_t dst_offs, size_t >>>>>> len) >>>>>> >>>>>> Java->Native transfer: >>>>>> arraycopy(oop src, size_t src_offs, D* raw_dst, size_t len) >>>>>> >>>>>> Native->Java transfer: >>>>>> arraycopy(S* src_raw, oop dst, size_t dst_offs, size_t len) >>>>>> >>>>>> 'Unsafe' transfer: >>>>>> arraycopy(S* src_raw, D* dst_raw, size_t len) >>>>>> >>>>>> >>>>>> But that seemed to be too much boilerplate copy+pasting for my taste. >>>>>> (See how having this overly complicated template layer hurts us?) >>>>>> >>>>>> Plus, I had a better idea: instead of accepting oop+offset OR T* for >>>>>> almost every Access API, we may want to abstract that and take an >>>>>> Address type argument, which would be either HeapAddress(obj, >>>>>> offset) or >>>>>> RawAddress(T* ptr). GCs may then just call addr->address() to get the >>>>>> actual address, or specialize for HeapAddress variants and resolve >>>>>> the >>>>>> objs and then resolve the address. This would also allow us to get >>>>>> rid >>>>>> of almost half of the API (all the *_at variants would go) and some >>>>>> other simplifications. 
However, this seemed to explode the scope of this >>>>>> RFE, and would be better handled in another RFE. >>>>>> >>>>>> This change makes both typeArrayKlass and objArrayKlass use the >>>>>> changed >>>>>> API, plus I identified all (hopefully) places where we do bulk >>>>>> Java<->native array transfers and made them use the API too. Gets us >>>>>> rid of a bunch of memcpy calls :-) >>>>>> >>>>>> Please review the change: >>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8198285/webrev.00/ >>>>>> >>>>>> Thanks, Roman >>>>>> >> > From rbruno at gsd.inesc-id.pt Sun Jun 3 15:44:18 2018 From: rbruno at gsd.inesc-id.pt (Rodrigo Bruno) Date: Sun, 3 Jun 2018 17:44:18 +0200 Subject: JEP draft: Dynamic Max Memory Limit [Was. Re: Elastic JVM improvements] In-Reply-To: <3d1484befe75b5296589decc1b0df05cdbefac29.camel@oracle.com> References: <3d1484befe75b5296589decc1b0df05cdbefac29.camel@oracle.com> Message-ID: Hi Thomas, further suggestions and rearrangements on this JEP draft follow below: *Goals:* The goal of this JEP is to allow the increase and/or decrease of the amount of memory available to the application. *Non-Goals:* It is not a goal to change current heap sizing algorithms. It is not a goal to dynamically change the maximum heap size limit, as this would require a lot of engineering effort (maybe in the future). *Success Metrics:* The implementation should allow a user to increase and/or reduce the amount of memory that can be used by the application. This must be possible at any point during the execution of an application. If it is not possible to increase or decrease the amount of memory available to the application, the operation should fail and the user must be made aware of the result of the operation. *Motivation:* Elasticity is the key feature of cloud computing.
It enables resources to be scaled according to application workloads in a timely manner. Now we live in the container era. Containers can be scaled vertically on the fly without downtime. This provides much better elasticity and density compared to VMs. However, JVM-based applications are not fully container-ready. One of the current issues is the fact that it is not possible to increase the size of the JVM heap at runtime. If your production application has an unpredictable traffic spike, the only way to increase the heap size is to restart the JVM with a new -Xmx parameter. *Alternatives:* There are two alternatives: 1 - restart the JVM whenever the application needs more or less memory. This will adapt the memory usage of the JVM to the application's needs at the cost of downtime (which can be prohibitive for many applications). 2 - grant a large maximum memory limit. This will eventually lead to resource wastage. *Testing:* Section 5.4 of the paper available at http://www.gsd.inesc-id.pt/~rbruno/publications/rbruno-ismm18.pdf shows that having a very high maximum memory limit (-Xmx) leads to a very small increase in the memory footprint. For example, increasing -Xmx by 32GB increases the footprint by 31MB. Best, rodrigo 2018-05-30 21:44 GMT+02:00 Thomas Schatzl : > Hi, > > fyi, I filed https://bugs.openjdk.java.net/browse/JDK-8204088 and put > the text we have in there. > > Thanks, > Thomas > > On Wed, 2018-05-30 at 11:43 +0100, Rodrigo Bruno wrote: > > Summary > > ------- > > > > // REQUIRED -- Provide a short summary of the proposal, at most one > > or two > > // sentences. This summary will be rolled up into feature lists, > > JSRs, and > > // other documents, so please take the time to make it short and > > sweet. > > > > This JEP allows the JVM to dynamically adjust the maximum available > > memory to > > the application. It provides a dynamic limit for the maximum memory > > limit as opposed to > > the current static limit (-Xmx).
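The dynamic limit described in the summary above reduces to a small invariant: a runtime-adjustable ceiling that must never exceed the launch-time -Xmx. The sketch below is illustrative only; the struct and method names are hypothetical, while CurrentMaxHeapSize/MaxHeapSize are the names used in the JEP draft:

```cpp
#include <cassert>
#include <cstddef>

// Sketch of the invariant: CurrentMaxHeapSize may be changed at
// runtime, but must stay at or below the launch-time MaxHeapSize
// (-Xmx); the heap may only be expanded up to the current dynamic
// limit.
struct HeapLimits {
  size_t max_heap_size;          // -Xmx, fixed at launch
  size_t current_max_heap_size;  // dynamic, adjustable at runtime

  // Returns false (and leaves the limit unchanged) on an invalid
  // request, matching the JEP's "the operation should fail and the
  // user must be aware" success metric.
  bool set_current_max(size_t v) {
    if (v == 0 || v > max_heap_size) {
      return false;
    }
    current_max_heap_size = v;
    return true;
  }

  bool can_expand_to(size_t committed) const {
    return committed <= current_max_heap_size;
  }
};
```

The expected usage from the draft then reads naturally: launch with a generous -Xmx (shown above to cost little footprint) and steer the effective ceiling through the dynamic limit.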
> > Goals > > ----- > > > > // What are the goals of this proposal? Omit this section if you > > have > > // nothing to say beyond what's already in the summary. > > > > Increase and decrease the heap size on demand. > > > > Non-Goals > > --------- > > > > // Describe any goals you wish to identify specifically as being out > > of > > // scope for this proposal. > > > > Success Metrics > > --------------- > > > > // If the success of this work can be gauged by specific numerical > > // metrics and associated goals then describe them here. > > > > Section 5.4 of the paper available at http://www.gsd.inesc-id.pt/~rbruno/publications/rbruno-ismm18.pdf > > shows that having a very high maximum memory limit (-Xmx) leads to a > > very small increase in the > > memory footprint. For example, increasing -Xmx by 32GB increases the > > footprint by 31MB. > > > > > > Motivation > > ---------- > > > > // Why should this work be done? What are its benefits? Who's > > asking > > // for it? How does it compare to the competition, if any? > > > > Currently, it is not possible to increase the size of the JVM heap at > > runtime. > > If your production application has an unpredictable traffic spike, > > the only way to increase the heap size is to restart the JVM with > > a new -Xmx parameter. > > > > Description > > ----------- > > > > // REQUIRED -- Describe the enhancement in detail: Both what it is > > and, > > // to the extent understood, how you intend to implement it. > > Summarize, > > // at a high level, all of the interfaces you expect to modify or > > extend, > > // including Java APIs, command-line switches, library/JVM > > interfaces, > > // and file formats. Explain how failures in applications using this > > // enhancement will be diagnosed, both during development and in > > // production. Describe any open design issues.
> > // > > // This section will evolve over time as the work progresses, > > ultimately > > // becoming the authoritative high-level description of the end > > result. > > // Include hyperlinks to additional documents as required. > > > > To dynamically limit how large the committed memory (i.e. the heap > > size) can grow, a new dynamically user-defined variable is > > introduced: CurrentMaxHeapSize. This variable (defined in bytes) > > limits how large the heap can be expanded. It can be set at launch > > time and changed at runtime. Regardless of when it is defined, it > > must always have a value equal to or below MaxHeapSize (-Xmx, the > > launch-time option that limits how large the heap can grow). Unlike > > MaxHeapSize, CurrentMaxHeapSize can be dynamically changed at > > runtime. > > > > The expected usage is to set up the JVM with a very conservative -Xmx > > value (which is shown to have a very small impact on memory > > footprint) and > > then control how large the heap is using the CurrentMaxHeapSize > > dynamic limit. > > > > > > Alternatives > > ------------ > > > > // Did you consider any alternative approaches or technologies? If > > so > > // then please describe them here and explain why they were not > > chosen. > > > > Testing > > ------- > > > > // What kinds of test development and execution will be required in > > order > > // to validate this enhancement, beyond the usual mandatory unit > > tests? > > // Be sure to list any special platform or hardware requirements. > > > > Risks and Assumptions > > --------------------- > > > > // Describe any risks or assumptions that must be considered along > > with > > // this proposal. Could any plausible events derail this work, or > > even > > // render it unnecessary? If you have mitigation plans for the known > > // risks then please describe them.
> > > > Dependencies > > ------------ > > > > // Describe all dependencies that this JEP has on other JEPs, JBS > > issues, > > // components, products, or anything else. Dependencies upon JEPs or > > JBS > > // issues should also be recorded as links in the JEP issue itself. > > // > > // Describe any JEPs that depend upon this JEP, and likewise make > > sure > > // they are linked to this issue in JBS. > > > > From sunny.chan at clsa.com Mon Jun 4 02:00:23 2018 From: sunny.chan at clsa.com (Sunny Chan, CLSA) Date: Mon, 4 Jun 2018 02:00:23 +0000 Subject: Question regarding JEP 8204089 Timely Reducing Unused Committed Memory In-Reply-To: References: Message-ID: <537f42e0c83a4fc8b75fb683081d451d@clsa.com> Replied inline. From: Ruslan Synytsky [mailto:synytskyy at jelastic.com] Sent: Friday, June 1, 2018 9:22 PM To: Sunny Chan, CLSA Cc: rbruno at gsd.inesc-id.pt; hotspot-gc-dev at openjdk.java.net; synytskyy at jelastic.com Subject: Re: Question regarding JEP 8204089 Timely Reducing Unused Committed Memory Hi Sunny. On 1 June 2018 at 07:36, Sunny Chan, CLSA wrote: Hello, I have a number of questions about the proposed changes for the JEP and I would like to make the following suggestions and comments: 1) Are we planning to make this change the default behavior for G1? Or is this an optional switch for people who run in a container environment? Optional. It will be enough for the beginning. You say "enough for the beginning" - do you want to change it to be the default at some point? Please document this in the JEP and highlight that this is optional. 2) In trading systems, there are sometimes periods of quiescence and then suddenly a burst of activity towards market close or a market event. The loadavg you suggest to "monitor"
the activity in the system only reports up to 14 minutes, and it might not necessarily be a good measure for this type of application, especially as you would trigger a Full GC. I'm sure a certain number of workloads can't afford it. Such projects or other mission-critical environments simply should not use this option. 3) You haven't filled in the details for "The GCFrequency value is ignored and therefore, i.e., no full collection is triggered, if:" Thanks. It's a missing part. Here are the "if" rules: * GCFrequency is zero or below * the average load on the host system is above MaxLoadGC. MaxLoadGC is a dynamically user-defined variable. This check is ignored if MaxLoadGC is zero or below * the committed memory is below MinCommitted bytes. MinCommitted is a dynamically user-defined variable. This check is ignored if MinCommitted is zero or below * the difference between the current heap capacity and the current heap usage is below MaxOverCommitted bytes. MaxOverCommitted is a dynamically user-defined variable. This check is ignored if MaxOverCommitted is zero or below The doc will be updated. 4) If we are triggering a full GC with this, we should make sure the GC reason is populated and logged properly in the GC log so we can track it down. Good point. I would like to see this as a goal in the JEP, and it should be clearly documented how people can track down a GC caused by the flag. 5) I have not heard of J9 Gencon/Shenandoah providing similar functionality. Can you point me to further documentation on which feature you model upon? What we know so far is that OpenJ9 provides * -XX:+IdleTuningCompactOnIdle - this option controls garbage collection processing with compaction when the status of the JVM is set to idle * and -Xsoftmx - this option sets a "soft" maximum limit for the initial size of the Java heap. We have not tested it yet, but the idea looks similar. Do we have anyone in the group involved in OpenJ9 to confirm or refute this statement? Thank you Thanks.
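Taken together, the "if" rules quoted above amount to a guard that suppresses the periodic full collection, where any rule whose setting is zero or below is simply ignored. The sketch below is illustrative only: the option names come from the thread, the function and struct are hypothetical, and the direction of the MinCommitted comparison follows the presumable intent of skipping when little memory is committed in the first place.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical bundle of the dynamically user-defined settings
// discussed in the thread. A value of zero or below disables the
// corresponding rule.
struct IdleGcConfig {
  double gc_frequency;        // periodic trigger; <= 0 disables entirely
  double max_load_gc;         // skip if host loadavg is above this
  size_t min_committed;       // skip if committed memory is below this
  size_t max_over_committed;  // skip if committed - used is below this
};

bool should_trigger_idle_gc(const IdleGcConfig& c,
                            double loadavg,
                            size_t committed,
                            size_t used) {
  if (c.gc_frequency <= 0) return false;                  // feature off
  if (c.max_load_gc > 0 && loadavg > c.max_load_gc) return false;
  if (c.min_committed > 0 && committed < c.min_committed) return false;
  if (c.max_over_committed > 0 &&
      (committed - used) < c.max_over_committed) return false;
  return true;  // enough idle, uncommittable memory to be worth a GC
}
```

This also makes Sunny's point 4 concrete: whenever this guard fires, the resulting collection should carry a distinct GC cause in the log so it can be traced back to the flag.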
Sunny Chan Senior Lead Engineer, Executive Services D +852 2600 8907 | M +852 6386 1835 | T +852 2600 8888 5/F, One Island East, 18 Westlands Road, Island East, Hong Kong clsa.com Insights. Liquidity. Capital. A CITIC Securities Company The content of this communication is intended for the recipient and is subject to CLSA Legal and Regulatory Notices. These can be viewed at https://www.clsa.com/disclaimer.html or sent to you upon request. -- Ruslan CEO @ Jelastic
From per.liden at oracle.com Mon Jun 4 06:51:40 2018 From: per.liden at oracle.com (Per Liden) Date: Mon, 4 Jun 2018 08:51:40 +0200 Subject: RFR (round 4), JEP-318: Epsilon GC In-Reply-To: <0b5c827a-2763-98aa-06af-24df9028aed7@oracle.com> References: <094e1093-5a13-4853-aa34-d4e987a069b0@redhat.com> <0b5c827a-2763-98aa-06af-24df9028aed7@oracle.com> Message-ID: Hi, On 06/01/2018 10:13 AM, Jini George wrote: [...] > ==> share/classes/sun/jvm/hotspot/oops/ObjectHeap.java > > 445     } else { > 446       if (Assert.ASSERTS_ENABLED) { > 447         Assert.that(false, "Expecting GenCollectedHeap, G1CollectedHeap, " + > 448                     "or ParallelScavengeHeap, but got " + > 449                     heap.getClass().getName()); > 450       } > 451     } > > * Please add EpsilonGC also to the assertion statement. > May I suggest that we change this to something like this, to avoid having to update this message when a new collector is added? Assert.that(false, "Unexpected CollectedHeap type: " + heap.getClass().getName()); /Per From jini.george at oracle.com Mon Jun 4 06:58:16 2018 From: jini.george at oracle.com (Jini George) Date: Mon, 4 Jun 2018 12:28:16 +0530 Subject: RFR (round 4), JEP-318: Epsilon GC In-Reply-To: References: <094e1093-5a13-4853-aa34-d4e987a069b0@redhat.com> <0b5c827a-2763-98aa-06af-24df9028aed7@oracle.com> Message-ID: Sounds good to me. Thanks, Jini. On 6/4/2018 12:21 PM, Per Liden wrote: > Hi, > > On 06/01/2018 10:13 AM, Jini George wrote: > [...] >> ==> share/classes/sun/jvm/hotspot/oops/ObjectHeap.java >> >> 445     } else { >> 446       if (Assert.ASSERTS_ENABLED) { >> 447         Assert.that(false, "Expecting GenCollectedHeap, G1CollectedHeap, " + >> 448                     "or ParallelScavengeHeap, but got " + >> 449                     heap.getClass().getName()); >> 450       } >> 451     
} >> >> * Please add EpsilonGC also to the assertion statement. >> > May I suggest that we change this to something like this, to avoid > having to update this message when a new collector is added? > > Assert.that(false, "Unexpected CollectedHeap type: " + > heap.getClass().getName()); > > /Per From thomas.schatzl at oracle.com Mon Jun 4 08:30:24 2018 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Mon, 04 Jun 2018 10:30:24 +0200 Subject: RFR(s): 8204094: assert(worker_i < _length) failed: Worker 15 is greater than max: 11 at ReferenceProcessorPhaseTimes In-Reply-To: <7236ef64-533d-48ad-ecc2-c6bdd12ed4cd@oracle.com> References: <7236ef64-533d-48ad-ecc2-c6bdd12ed4cd@oracle.com> Message-ID: <684dc326e23e8efb4d94e77bfec3f2b1d3248b27.camel@oracle.com> Hi, On Wed, 2018-05-30 at 19:19 -0700, sangheon.kim at oracle.com wrote: > Hi all, > > Can I have reviews for this patch that fixes an assertion failure in > ReferenceProcessorPhaseTimes? > > [..] > > This patch proposes to use the maximum number of queues when creating > ReferenceProcessorPhaseTimes. > > CR: https://bugs.openjdk.java.net/browse/JDK-8204094 > Webrev: http://cr.openjdk.java.net/~sangheki/8204094/webrev.0 > Testing: hs-tier1-5 with/without ParallelRefProcEnabled Looks good. Thomas From thomas.schatzl at oracle.com Mon Jun 4 08:56:39 2018 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Mon, 04 Jun 2018 10:56:39 +0200 Subject: RFR(M) 8043575: Dynamically parallelize reference processing work In-Reply-To: References: <8af7594e-d5f4-ca0d-de73-6547ea131f3e@oracle.com> Message-ID: <1a4986a8d19adc3f25efc707b7d026eb81161438.camel@oracle.com> Hi Sangheon, On Fri, 2018-06-01 at 14:48 -0700, sangheon.kim at oracle.com wrote: > Hi all, > > As webrev.0 is conflicting with webrev.0 of "8203319: JDK-8201487 > disabled too much queue balancing" (out for review, but not yet > pushed), I'm posting webrev.1.
> > http://cr.openjdk.java.net/~sangheki/8043575/webrev.1 > http://cr.openjdk.java.net/~sangheki/8043575/webrev.1_to_0 > Some minor comments: - g1ConcurrentMark.cpp:1521: maybe update the comment or remove it. - g1FullGCReferenceProcessorExecutor.cpp: I think G1FullGCReferenceProcessingExecutor::run_task(AbstractGangTask*) is not used any more. - I think that in line with not supporting this feature in CMS, the assert in concurrentMarkSweepGeneration.cpp:5126 should not check >= but ==? Same in parNewGeneration.cpp:796. - gc_globals.hpp: I would prefer something like the following for the text for ReferencesPerThread: "Ergonomically start one thread for this amount of references for reference processing if ParallelRefProcEnabled is true. Specify 0 to disable and use all threads." - would you mind renaming ReferenceProcessor::has_adjustable_queue? Not the queues are adjusted, but the number of processing threads. - since the feature changes the number of threads, it should also naturally affect the need_balance_queues() method and impact the code in process_discovered_reflist(). I think must_balance must be re- evaluated after every decision to set the number of threads. At the moment the number of threads is adjusted after deciding whether we need balancing. - the patch might actually be much simpler after JDK-8202845: Refactor reference processing for improved parallelism. Since I do not want you to spend time on fixing this change after I mess it up, would you mind me taking over this work/CR? Thanks, Thomas > Thanks, > Sangheon > > > On 5/30/18 9:43 PM, sangheon.kim at oracle.com wrote: > > Hi all, > > > > Could I have some reviews for this patch? > > > > This patch is suggesting ergonomically choosing worker thread count > > from given reference count. > > We have ParallelRefProcEnabled command-line option which enables to > > use ALL workers during reference processing however this option has > > a drawback when there's limited number of references. i.e. 
spends > > more time on thread start-up/tear-down than actual processing time > > if there are less references. And also we use all threads or single > > thread during reference processing which seems less flexible on > > thread counts. This patch calculates the worker counts from > > dividing reference count by ReferencesPerThread(newly added > > experimental option). > > My suggestion for the default value of ReferencePerThread is 1000 > > as it showed good results from some benchmarks. > > > > Notes: > > 1. CMS ParNew is excluded from this patch because: > > a) There is a separate CR for CMS (JDK-6938732). > > b) It is tricky to manage switching single <-> MT processing > > inside of ReferenceProcessor class for ParNew. Tony explained quite > > well about the reason here ( https://bugs.openjdk.java.net/browse/J > > DK- > > 6938732?focusedCommentId=13932462&page=com.atlassian.jira.plugin.sy > > stem.issuetabpanels:comment-tabpanel#comment-13932462 ). > > c) CMS will be obsoleted in the future so not motivated to fix > > within this patch. > > 2. JDK-8203951 is the CR for removing temporarily added > > flag(ReferenceProcessor::_has_adjustable_queue from webrev.0) to > > manage ParNew. So the flag should be removed when CMS is obsoleted. > > 3. Current logic of dividing by ReferencesPerThread would be > > replaced with better implementation. e.g. time measuring > > implementation etc. But I think current approach is simply and good > > enough. > > 4. This patch is based on JDK-8204094 and JDK-8204095, both are not > > yet pushed so far. 
> > > > CR: https://bugs.openjdk.java.net/browse/JDK-8043575 > > Webrev: http://cr.openjdk.java.net/~sangheki/8043575/webrev.0/ > > Testing: hs-tier 1~5 with/without ParallelRefProcEnabled > > > > Thanks, > > Sangheon > From erik.osterlund at oracle.com Mon Jun 4 10:01:07 2018 From: erik.osterlund at oracle.com (=?UTF-8?Q?Erik_=c3=96sterlund?=) Date: Mon, 4 Jun 2018 12:01:07 +0200 Subject: RFR: JDK-8198285: More consistent Access API for arraycopy In-Reply-To: References: <5AE09F74.7050005@oracle.com> <5AE1F062.3020303@oracle.com> <3bb2415e-4d15-fbd3-dde2-73a25c7bd65b@redhat.com> <5B1133F3.4040704@oracle.com> Message-ID: <5B150DE3.4090402@oracle.com> Hi Roman, On 2018-06-03 14:27, Roman Kennke wrote: > Hi Erik > > Thanks for the review. See my comments inline below: > >> Thanks for doing this. A few comments... >> >> This: >> ArrayAccess<>::arraycopy_from_native(... >> >> ...won't reliably compile without saying "template", like this: >> ArrayAccess<>::template arraycopy_from_native(... >> >> ...because the template member is qualified on another template class. >> My suggestion is to remove the explicit and let that be inferred >> from the address instead. > Ok, I've done that for all the cases where it's possible. The heap->heap > cases cannot do that because there's no address pointer to infer from. > I've added the 'template' keyword there. There is one use for heap primitive arraycopy that I found that was not used inside of a template, and hence does not need the template keyword. As for the ones using oop_arraycopy, the type information is never used and is hence unnecessary to pass in at all (because the backends always get the information whether these are oops or narrow oops through my cool template machinery). After removing "template" from those cases, there are no cases with "template" left. 
>> And things like this: >> bool RuntimeDispatch> BARRIER_ARRAYCOPY>::arraycopy_init(arrayOop src_obj, size_t >> src_offset_in_bytes, const T* src_raw, arrayOop dst_obj, size_t >> dst_offset_in_bytes, T* dst_raw, size_t length) { >> >> ...as a single line, could do with some newlines to make it reasonably >> wide. > Ok, I've broken those lines down to have src args in one line, dst args > in another. This should make it more readable. I liked that. Saw that this was not done consistently though. Should probably be done everywhere. >> Now that we copy both from heap to heap, from heap to native, and native >> to heap, we have to be a bit more careful in modref >> oop_arraycopy_in_heap. In the covariant case of arraycopy, you >> unconditionally apply the pre and post barriers on the destination. >> However, if the destination is native >> (ArrayAccess::arraycopy_to_native), that does not feel like it will go >> particularly well, unless I have missed something. > oop arrays are never copied from/to native. Okay, then that makes sense. The general accessor (Access::arraycopy) should probably be protected so that one has to go through the ArrayAccess class for arraycopy that has the legal variants exposed for the public API. >> Also your new ArrayAccess class inherits from HeapAccess> decorators>, which is nice. But its member functions call >> HeapAccess::arraycopy, which will miss out on the >> IN_HEAP_ARRAY decorator, which might lead to not using precise marking >> in barrier backends. If I were you, I would probably typedef a >> BaseAccess for HeapAccess inside of >> ArrayAccess, and call that in the member functions. > I am not sure how to do this typedef trick, but I added | IN_HEAP_ARRAY > to all the calls. It also required to extend the verify check on the > decorators to include IN_HEAP_ARRAY. 
Like this inside of the class declaration: typedef HeapAccess AccessT; > Differential: > http://cr.openjdk.java.net/~rkennke/JDK-8198285/webrev.02.diff/ > Full: > http://cr.openjdk.java.net/~rkennke/JDK-8198285/webrev.02/ > > What do you think? Definitely a lot better. I went ahead and incorporated my last feedback into an incremental webrev instead. If you agree and like my additional template polishing, then feel free to go with that and consider it reviewed: Incremental webrev: http://cr.openjdk.java.net/~eosterlund/8198285/webrev.00_01/ Full webrev: http://cr.openjdk.java.net/~eosterlund/8198285/webrev.01/ Thanks, /Erik > Cheers, > Roman > > >> Thanks, >> /Erik >> >> On 2018-06-01 00:25, Roman Kennke wrote: >>> Hi Erik, >>> >>> It took me a while to get back to this. >>> >>> I wrote a little wrapper around the extended arraycopy API that allows >>> to basically write exactly what you suggested: >>> >>> >>>> ArrayAccess::arraycopy_from_native(ptr, obj, index, >>>> length); >>> I agree that this seems the cleanest solution. >>> >>> The backend still sees both obj+off OR raw-ptr, but I think this is ok. >>> >>> Differential: >>> http://cr.openjdk.java.net/~rkennke/JDK-8198285/webrev.02.diff/ >>> Full: >>> http://cr.openjdk.java.net/~rkennke/JDK-8198285/webrev.02/ >>> >>> What do you think? >>> >>> Roman >>> >>>> Each required parameter is clear when you read the API. >>>> >>>> But I am open to suggestions of course. >>>> >>>> Thanks, >>>> /Erik >>>> >>>> On 2018-04-25 17:59, Roman Kennke wrote: >>>>>> 1) Resolve the addresses as we do today, but you think you get better >>>>>> Shenandoah performance if we either >>>>>> 2) Have 3 different access calls, (from heap to native, from native to >>>>>> heap, from heap to heap) for copying, or >>>>>> 3) Have 1 access call that can encode the above 3 variants, but looks >>>>>> ugly at the call sites. 
>>>>> There's also the idea to pass Address arguments to arraycopy, one for >>>>> src, one for dst, and have 2 different subclasses: one for obj+offset >>>>> (heap access) and one with direct pointer (raw). Your comment gave me >>>>> the idea to also provide arrayOop+idx. This would look clean on the >>>>> caller side I think. >>>>> >>>>> It would also be useful on the GC side: BarrierSets would specialize >>>>> only in the variants that they are interested in, for example, in case >>>>> of Shenandoah: >>>>> 1. arraycopy(HeapAddress,HeapAddress) Java->Java >>>>> 2. arraycopy(HeapAddress,RawAddress) Java->native >>>>> 3. arraycopy(RawAddress,HeapAddress) native->Java >>>>> >>>>> other barriersets can ignore the exact type and only call the args >>>>> address->resolve() or so to get the actual raw address. >>>>> >>>>> This would *also* be beneficial for the other APIs: instead of having >>>>> all the X() and X_at() variants, we can just use one X variant that >>>>> either takes RawAddress or HeapAddress. >>>>> >>>>> I made a little (not yet compiling/working) prototype of this a while >>>>> ago: >>>>> >>>>> http://cr.openjdk.java.net/~rkennke/JDK-8199801-2.patch >>>>> >>>>> >>>>> What do you think? Would it make sense to go further down that road? >>>>> >>>>> Roman >>>>> >>>>>> You clearly went for 3, which leaves the callsites looking rather hard >>>>>> to read. It is for example not obvious for me what is going on here >>>>>> (javaClasses.cpp line 313): >>>>>> >>>>>> HeapAccess<>::arraycopy(NULL, 0, reinterpret_cast>>>>> jbyte*>(utf8_str), value(h_obj()), >>>>>> typeArrayOopDesc::element_offset(0), NULL, length); >>>>>> >>>>>> ...without looking very carefully at the long list of arguments >>>>>> encoding >>>>>> what is actually going on (copy from native to the heap). 
What is >>>>>> worse >>>>>> is that this will probably not compile without adding the template >>>>>> keyword to the call (since you have a qualified template member >>>>>> function >>>>>> behind a template class), like this: >>>>>> HeapAccess<>::template arraycopy(NULL, 0, >>>>>> reinterpret_cast>>>>> jbyte*>(utf8_str), value(h_obj()), >>>>>> typeArrayOopDesc::element_offset(0), NULL, length); >>>>>> >>>>>> ...which as a public API leaves me feeling a bit like this: :C >>>>>> >>>>>> May I suggest adding an array access helper. The idea is to keep a >>>>>> single call through Access, and a single backend point for array copy, >>>>>> but let the helper provide the three different types of copying as >>>>>> different functions, so that the call sites still look pretty and easy >>>>>> to read and follow. >>>>>> >>>>>> This helper could additionally have load_at, and store_at use array >>>>>> indices as opposed to offsets, and hence hide the offset >>>>>> calculations we >>>>>> perform today (typically involving checking if we are using compressed >>>>>> oops or not). >>>>>> >>>>>> I am thinking something along the lines of >>>>>> ArrayAccess<>::arraycopy_to_native(readable_arguments), >>>>>> ArrayAccess<>::arraycopy_from_native(readable_arguments), >>>>>> ArrayAccess<>::arraycopy(readable_arguments), which translates to some >>>>>> form of Access<>::arraycopy(unreadable_arguments). And for example >>>>>> ArrayAccess<>::load_at(obj, index) would translate to some kind of >>>>>> HeapAccess::load_at(obj, offset_for_index(index)) as a >>>>>> bonus, making everyone using the API jump with happiness. >>>>>> >>>>>> What do you think about this idea? Good or bad? I guess the >>>>>> question is >>>>>> whether this helper should be in access.hpp, or somewhere else >>>>>> (like in >>>>>> arrayOop). Thoughts are welcome. 
>>>>>> >>>>>> Thanks, >>>>>> /Erik >>>>>> >>>>>> On 2018-04-11 19:54, Roman Kennke wrote: >>>>>>> Currently, the arraycopy API in access.hpp gets the src and dst >>>>>>> oops, >>>>>>> plus the src and dst addresses. In order to be most useful to garbage >>>>>>> collectors, it should receive the src and dst oops together with the >>>>>>> src >>>>>>> and dst offsets instead, and let the Access API / GC calculate the >>>>>>> src >>>>>>> and dst addresses. >>>>>>> >>>>>>> For example, Shenandoah needs to resolve the src and dst objects for >>>>>>> arraycopy, and then apply the corresponding offsets. With the current >>>>>>> API (obj+ptr) it would calculate the ptr-diff from obj to ptr, then >>>>>>> resolve obj, then re-add the calculate ptr-diff. This is fragile >>>>>>> because >>>>>>> we also may resolve obj in the runtime before calculating ptr >>>>>>> (e.g. via >>>>>>> arrayOop::base()). If we then pass in the original obj and a ptr >>>>>>> calculated from another copy of the same obj, the above resolution >>>>>>> logic >>>>>>> would not work. This is currently the case for obj-arraycopy. >>>>>>> >>>>>>> I propose to change the API to accept obj+offset, in addition to ptr >>>>>>> for >>>>>>> both src and dst. Only one or the other should be used. Heap accesses >>>>>>> should use obj+offset and pass NULL for raw-ptr, off-heap accesses >>>>>>> (or >>>>>>> heap accesses that are already resolved.. use with care) should pass >>>>>>> NULL+0 for obj+offset and the raw-ptr. Notice that this also >>>>>>> allows the >>>>>>> API to be used for Java<->native array bulk transfers. 
>>>>>>> >>>>>>> An alternative would be to break the API up into 4 variants: >>>>>>> >>>>>>> Java->Java transfer: >>>>>>> arraycopy(oop src, size_t src_offs, oop dst, size_t dst_offs, size_t >>>>>>> len) >>>>>>> >>>>>>> Java->Native transfer: >>>>>>> arraycopy(oop src, size_t src_offs, D* raw_dst, size_t len) >>>>>>> >>>>>>> Native->Java transfer: >>>>>>> arraycopy(S* src_raw, oop dst, size_t dst_offs, size_t len) >>>>>>> >>>>>>> 'Unsafe' transfer: >>>>>>> arraycopy(S* src_raw, D* dst_raw, size_t len) >>>>>>> >>>>>>> >>>>>>> But that seemed to be too much boilerplate copy+pasting for my taste. >>>>>>> (See how having this overly complicated template layer hurts us?) >>>>>>> >>>>>>> Plus, I had a better idea: instead of accepting oop+offset OR T* for >>>>>>> almost every Access API, we may want to abstract that and take an >>>>>>> Address type argument, which would be either HeapAddress(obj, >>>>>>> offset) or >>>>>>> RawAddress(T* ptr). GCs may then just call addr->address() to get the >>>>>>> actual address, or specialize for HeapAddress variants and resolve >>>>>>> the >>>>>>> objs and then resolve the address. This would also allow us to get >>>>>>> rid >>>>>>> of almost half of the API (all the *_at variants would go) and some >>>>>>> other simplifications. However, this seemed to explode the scope of >>>>>>> this >>>>>>> RFE, and would be better handled in another RFE. >>>>>>> >>>>>>> This changes makes both typeArrayKlass and objArrayKlass use the >>>>>>> changed >>>>>>> API, plus I identified all (hopefully) places where we do bulk >>>>>>> Java<->native array transfers and make them use the API too. 
Gets us >>>>>>> rid of a bunch of memcpy calls :-) >>>>>>> >>>>>>> Please review the change: >>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8198285/webrev.00/ >>>>>>> >>>>>>> Thanks, Roman >>>>>>> > From martin.doerr at sap.com Mon Jun 4 10:17:09 2018 From: martin.doerr at sap.com (Doerr, Martin) Date: Mon, 4 Jun 2018 10:17:09 +0000 Subject: RFR(M): 8154736: enhancement of cmpxchg and copy_to_survivor for ppc64 In-Reply-To: References: <5625a595-1165-8d48-afbd-8229cdc4ac07@oracle.com> <7e7fc484287e4da4926176e0b5ae1b64@sap.com> <339D50B6-09D5-4342-A687-918A9A096B39@oracle.com> <6D2268EC-C1E7-4C90-BCD3-90D02D21FA08@oracle.com> <3e05cc86c9d4406d8a9875b705fbf1fc@sap.com> <5B0D40F7.5050807@oracle.com> <4dddb18526a745cc83941a0f58af77f5@sap.com> Message-ID: <12a2ffeacafd4577b4c8c5ba7e91c530@sap.com> Hi Michihiro, looks good to me, too. Thanks for adding comments where you have changed memory barriers. Pushed to submission repo and our internal testing. Best regards, Martin From: Michihiro Horie [mailto:HORIE at jp.ibm.com] Sent: Freitag, 1. Juni 2018 17:37 To: Erik Österlund Cc: Andrew Haley (aph at redhat.com) ; david.holmes at oracle.com; hotspot-gc-dev at openjdk.java.net; Kim Barrett ; Doerr, Martin ; ppc-aix-port-dev at openjdk.java.net Subject: Re: RFR(M): 8154736: enhancement of cmpxchg and copy_to_survivor for ppc64 >Hi Michihiro, > >Looks good to me. Thanks a lot, Erik! Best regards, -- Michihiro, IBM Research - Tokyo "Erik Österlund" ---2018/06/02 00:15:15---Hi Michihiro, Looks good to me.
From: "Erik Österlund" To: Michihiro Horie, "Doerr, Martin" Cc: "Andrew Haley (aph at redhat.com)", "david.holmes at oracle.com", "hotspot-gc-dev at openjdk.java.net", Kim Barrett, "ppc-aix-port-dev at openjdk.java.net" Date: 2018/06/02 00:15 Subject: Re: RFR(M): 8154736: enhancement of cmpxchg and copy_to_survivor for ppc64 ________________________________ Hi Michihiro, Looks good to me. Thanks, /Erik On 2018-06-01 17:08, Michihiro Horie wrote: Hi Kim, Erik, and Martin, Thank you very much for reminding me that an acquire barrier in the else-statement for "!test_mark->is_marked()" is necessary under the criteria of not relying on the consume. I uploaded a new webrev : http://cr.openjdk.java.net/~mhorie/8154736/webrev.13/ This change uses forwardee_acquire(), which would generate better code on ARM. Necessary barriers are located in all the paths in copy_to_survivor_space, and the returned new_obj can be safely handled in the caller sites. I measured SPECjbb2015 with the latest webrev. Critical-jOPS improved by 5%. Since my previous measurement with implicit consume showed 6% improvement, adding acquire barriers degraded the performance a little, but 5% is still good enough. Best regards, -- Michihiro, IBM Research - Tokyo "Doerr, Martin" ---2018/05/30 16:18:09---Hi Erik, the current implementation works on PPC because of "MP+sync+addr". From: "Doerr, Martin" To: "Erik Österlund" , Kim Barrett , Michihiro Horie , "Andrew Haley (aph at redhat.com)" Cc: "david.holmes at oracle.com" , "hotspot-gc-dev at openjdk.java.net" , "ppc-aix-port-dev at openjdk.java.net" Date: 2018/05/30 16:18 Subject: RE: RFR(M): 8154736: enhancement of cmpxchg and copy_to_survivor for ppc64 ________________________________ Hi Erik, the current implementation works on PPC because of "MP+sync+addr".
So we already rely on ordering of "load volatile field" + "implicit consume" on the reader's side. We have never seen any issues related to this with the compilers we have been using during the ~10 years the PPC implementation exists. PPC supports "MP+lwsync+addr" the same way, so Michihiro's proposal doesn't make it unreliable for PPC. But I'm ok with evaluating acquire barriers although they are not required by the PPC/ARM memory models. ARM/aarch64 will also be affected when the o->forwardee uses load_acquire. So somebody should check the impact. If it is not acceptable we may need to introduce explicit consume. Implicit consume is also bad in shared code because somebody may want to run it on DEC Alpha. Thanks and best regards, Martin -----Original Message----- From: Erik Österlund [mailto:erik.osterlund at oracle.com] Sent: Dienstag, 29. Mai 2018 14:01 To: Doerr, Martin ; Kim Barrett ; Michihiro Horie Cc: david.holmes at oracle.com; Gustavo Bueno Romero ; hotspot-dev at openjdk.java.net; hotspot-gc-dev at openjdk.java.net; ppc-aix-port-dev at openjdk.java.net Subject: Re: RFR(M): 8154736: enhancement of cmpxchg and copy_to_survivor for ppc64 Hi Martin and Michihiro, On 2018-05-29 12:30, Doerr, Martin wrote: > Hi Kim, > > I'm trying to understand how this is related to Michihiro's change. The else path of the initial test is not affected by it AFAICS. > So it sounds like a request to fix the current implementation in addition to what his original intent was. I think we are just trying to nail down the correct fencing and just go for that. And yes, this is arguably a pre-existing problem, but in a race involving the very same accesses that we are changing the fencing for. So it is not completely unrelated I suppose.
In particular, hotspot has code that assumes that if you, on the writer side, issue a full fence before publishing a pointer to newly initialized data, then the initializing stores and their side effects should be globally "visible" across the system before the pointer to it is published, and hence elide the need for acquire on the loading side, without relying on retained data dependencies on the loader side. I believe this code falls under that category. It is assumed that the leading fence of the CAS publishing the forwarding pointer makes the initializing stores globally observable before publishing a pointer to the initialized data, hence assuming that any loads able to observe the new pointer would not rely on acquire or data dependent loads to correctly read the initialized data. Unfortunately, this is not reliable in the IRIW case, as per the litmus test "MP+sync+ctrl" as described in "Understanding POWER multiprocessors" (https://dl.acm.org/citation.cfm?id=1993520), as opposed to "MP+sync+addr" that gets away with it because of the data dependency (not IRIW). Similarly, an isync does the job too on the reader side as shown in MP+sync+ctrlisync. So while I believe the previous reasoning was that the leading sync of the CAS would elide the necessity for acquire on the reader side without relying on data dependent loads (implicit consume), I think that assumption was wrong in the first place and that we do indeed need explicit acquire (even with the previous conservative CAS fencing) in this context to not rely on implicit consume semantics generating the required data dependent loads on the reader side. In practice though, the leading sync of the CAS has been enough to generate the correct machine code. Now, with the leading sync removed, we are increasing the possible holes in the generated machine code due to this flawed reasoning. So it would be nice to do something more sound instead that does not make such assumptions.
> Anyway, I agree with that implicit consume is not good. And I think it would be good to treat both o->forwardee() the same way. > What about keeping memory_order_release for the CAS and using acquire for both o->forwardee()? > The case in which the CAS succeeds is safe because the current thread has created new_obj so it doesn't need memory barriers to access it. Sure, that sounds good to me. Thanks, /Erik > Thanks and best regards, > Martin > > > -----Original Message----- > From: Kim Barrett [mailto:kim.barrett at oracle.com] > Sent: Dienstag, 29. Mai 2018 01:54 > To: Michihiro Horie > Cc: Erik Osterlund ; david.holmes at oracle.com; Gustavo Bueno Romero ; hotspot-dev at openjdk.java.net; hotspot-gc-dev at openjdk.java.net; ppc-aix-port-dev at openjdk.java.net; Doerr, Martin > Subject: Re: RFR(M): 8154736: enhancement of cmpxchg and copy_to_survivor for ppc64 > >> On May 28, 2018, at 4:12 AM, Michihiro Horie wrote: >> >> Hi Erik, >> >> Thank you very much for your review. >> >> I understood that implicit consume should not be used in the shared code. Also, I believe performance degradation would be negligible even if we use acquire. >> >> New webrev uses memory_order_acq_rel: http://cr.openjdk.java.net/~mhorie/8154736/webrev.10 > This is missing the acquire barrier on the else branch for the initial test, so fails to meet > the previously described minimal requirements for even possibly being sufficient. Any > analysis of weakening the CAS barriers must consider that test and successor code. > > In the analysis, it?s not just the lexically nearby debugging / logging code that needs to be > considered; the forwardee is being returned to caller(s) that will presumably do something > with that object. > > Since the whole point of this discussion is performance, any proposed change should come > with performance information. > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.gif Type: image/gif Size: 105 bytes Desc: image001.gif URL: From thomas.schatzl at oracle.com Mon Jun 4 10:27:30 2018 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Mon, 04 Jun 2018 12:27:30 +0200 Subject: RFR (S): 8204169: Humongous continues region remembered set states do not match the one from the corresponding humongous start region Message-ID: Hi all, can I have reviews for this small change that fixes up remembered set states of humongous continues regions? In particular they are not necessarily consistent with the humongous starts region at the moment. This makes for some surprises when reading the logs. Otherwise there is no impact: all decisions on how to treat the humongous object are made on the humongous start region anyway. As for testing, I added code to the heap verification to check this consistency requirement. This makes for a 100% failure rate in gc/g1/TestEagerReclaimHumongousRegionsClearMarkBits if not fixed. CR: https://bugs.openjdk.java.net/browse/JDK-8204169 Webrev: http://cr.openjdk.java.net/~tschatzl/8204169/webrev/ Testing: hs-tier1-3 Thanks, Thomas From thomas.schatzl at oracle.com Mon Jun 4 10:35:17 2018 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Mon, 04 Jun 2018 12:35:17 +0200 Subject: RFR (XS): 8202049: G1: ReferenceProcessor doesn't handle mark stack overflow Message-ID: <7f43835383d1ebf1f7d3a192c7949ffdb80541d3.camel@oracle.com> Hi all, can I have reviews for this change that makes sure that if we get a mark stack overflow during reference processing in the Remark pause we error out in the VM because we can't recover from that. I.e. recovery would basically mean restarting from scratch, but visiting Reference objects multiple times can have lots of side- effects. So the only current option is to give up with a "nice" error message for now. Note that there is no known actual problem because of this, i.e. 
overflow during remark is (supposedly) really rare. I spent some time trying to create a reproducer, but failed. I will file an extra RFE for this. CR: https://bugs.openjdk.java.net/browse/JDK-8202049 Webrev: http://cr.openjdk.java.net/~tschatzl/8202049/webrev Testing: running this patch for months in my usual testing routine for other patches without any error. Thanks, Thomas From rkennke at redhat.com Mon Jun 4 10:44:53 2018 From: rkennke at redhat.com (Roman Kennke) Date: Mon, 4 Jun 2018 12:44:53 +0200 Subject: RFR: JDK-8198285: More consistent Access API for arraycopy In-Reply-To: <5B150DE3.4090402@oracle.com> References: <5AE09F74.7050005@oracle.com> <5AE1F062.3020303@oracle.com> <3bb2415e-4d15-fbd3-dde2-73a25c7bd65b@redhat.com> <5B1133F3.4040704@oracle.com> <5B150DE3.4090402@oracle.com> Message-ID: Hi Erik your changes look good to me. I'll push it through the submit repo. Do I need another review? Otherwise I'll push it if submit-repo comes back clean. Thanks, Roman >> Thanks for the review. See my comments inline below: >> >>> Thanks for doing this. A few comments... >>> >>> This: >>> ArrayAccess<>::arraycopy_from_native(... >>> >>> ...won't reliably compile without saying "template", like this: >>> ArrayAccess<>::template arraycopy_from_native(... >>> >>> ...because the template member is qualified on another template class. >>> My suggestion is to remove the explicit and let that be inferred >>> from the address instead. >> Ok, I've done that for all the cases where it's possible. The heap->heap >> cases cannot do that because there's no address pointer to infer from. >> I've added the 'template' keyword there. > > There is one use for heap primitive arraycopy that I found that was not > used inside of a template, and hence does not need the template keyword. 
> As for the ones using oop_arraycopy, the type information is never used > and is hence unnecessary to pass in at all (because the backends always > get the information whether these are oops or narrow oops through my > cool template machinery). After removing "template" from those cases, > there are no cases with "template" left. > >>> And things like this: >>> ? bool RuntimeDispatch>> BARRIER_ARRAYCOPY>::arraycopy_init(arrayOop src_obj, size_t >>> src_offset_in_bytes, const T* src_raw, arrayOop dst_obj, size_t >>> dst_offset_in_bytes, T* dst_raw, size_t length) { >>> >>> ...as a single line, could do with some newlines to make it reasonably >>> wide. >> Ok, I've broken those lines down to have src args in one line, dst args >> in another. This should make it more readable. > > I liked that. Saw that this was not done consistently though. Should > probably be done everywhere. > >>> Now that we copy both from heap to heap, from heap to native, and native >>> to heap, we have to be a bit more careful in modref >>> oop_arraycopy_in_heap. In the covariant case of arraycopy, you >>> unconditionally apply the pre and post barriers on the destination. >>> However, if the destination is native >>> (ArrayAccess::arraycopy_to_native), that does not feel like it will go >>> particularly well, unless I have missed something. >> oop arrays are never copied from/to native. > > Okay, then that makes sense. The general accessor (Access::arraycopy) > should probably be protected so that one has to go through the > ArrayAccess class for arraycopy that has the legal variants exposed for > the public API. > >>> Also your new ArrayAccess class inherits from HeapAccess>> decorators>, which is nice. But its member functions call >>> HeapAccess::arraycopy, which will miss out on the >>> IN_HEAP_ARRAY decorator, which might lead to not using precise marking >>> in barrier backends. 
If I were you, I would probably typedef a >>> BaseAccess for HeapAccess inside of >>> ArrayAccess, and call that in the member functions. >> I am not sure how to do this typedef trick, but I added | IN_HEAP_ARRAY >> to all the calls. It also required to extend the verify check on the >> decorators to include IN_HEAP_ARRAY. > > Like this inside of the class declaration: > typedef HeapAccess AccessT; > >> Differential: >> http://cr.openjdk.java.net/~rkennke/JDK-8198285/webrev.02.diff/ >> Full: >> http://cr.openjdk.java.net/~rkennke/JDK-8198285/webrev.02/ >> >> What do you think? > > Definitely a lot better. I went ahead and incorporated my last feedback > into an incremental webrev instead. If you agree and like my additional > template polishing, then feel free to go with that and consider it > reviewed: > > Incremental webrev: > http://cr.openjdk.java.net/~eosterlund/8198285/webrev.00_01/ > > Full webrev: > http://cr.openjdk.java.net/~eosterlund/8198285/webrev.01/ -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: OpenPGP digital signature URL: From synytskyy at jelastic.com Mon Jun 4 11:13:38 2018 From: synytskyy at jelastic.com (Ruslan Synytsky) Date: Mon, 4 Jun 2018 14:13:38 +0300 Subject: Question regarding JEP 8204089 Timely Reducing Unused Committed Memory In-Reply-To: References: Message-ID: On 1 June 2018 at 16:21, Ruslan Synytsky wrote: > Hi Sunny. > > On 1 June 2018 at 07:36, Sunny Chan, CLSA wrote: > >> Hello, >> >> >> >> I have a number of question about the proposed changes for the JEP and I >> would like to make the following suggestions and comments >> >> >> >> 1) Are we planning to make this change the default behavior for G1? >> Or this is an optional switch for people who runs in a container >> environment? >> > Optional. It will be enough for beginning. 
> >> 2) In trading systems, sometimes there are periods of quiescence >> and then suddenly a burst of activity towards market close or a market event. >> The loadavg you suggest to "monitor" the activity in the system only >> reports up to 14 minutes and it might not necessarily be a good measure for this >> type of application, especially if you would trigger a Full GC >> > I'm sure a certain number of workloads can't afford it. Such kinds of > projects or other mission-critical environments simply should not use this > option. > >> 3) You haven't filled in the details for "The GCFrequency value is >> ignored and therefore, i.e., no full collection is triggered, if:" >> > Thanks. It's a missing part. Here are the "if" rules: > > - GCFrequency is zero or below > - the average load on the host system is above MaxLoadGC. The > MaxLoadGC is a dynamically user-defined variable. This check is ignored if > MaxLoadGC is zero or below > - the committed memory is above MinCommitted bytes. MinCommitted is a > dynamically user-defined variable. This check is ignored if MinCommitted is > zero or below > - the difference between the current heap capacity and the current > heap usage is below MaxOverCommitted bytes. The MaxOverCommitted is a > dynamically user-defined variable. This check is ignored if > MaxOverCommitted is zero or below > > > The doc will be updated. > > 4) If we are triggering a full GC with this we should make sure the GC > reason is populated and logged properly in the GC log so we can track it down. > Good point. > >> 5) I have not heard of J9 Gencon/Shenandoah providing similar >> functionality. Can you point me to further documentation on which feature >> you model upon? >> > What we know so far is that OpenJ9 provides > > - -XX:+IdleTuningCompactOnIdle > > - this option controls garbage collection processing with compaction > when the status of the JVM is set to idle > - and -Xsoftmx > - > this option sets a "soft" maximum limit for the initial size of the Java heap.
> > We have not tested it yet, but the idea looks similar. Do we have anyone > in the group involved in OpenJ9 to confirm or refute this statement? > It's getting more attention. A new mentioning of the implementation in OpenJ9 - DZone Java 2018: FEATURES, IMPROVEMENTS, & UPDATES (Page 31). * You can also specify -XX:+IdleTuningGcOnIdle on the command line. When set, OpenJ9 determines whether an application is idle based on CPU utilization and other internal heuristics. When an idle state is recognized, a GC cycle runs if there is significant garbage in the heap and releases unused memory back to the operating system. * > > Thank you > >> >> >> Thanks. >> >> >> >> *Sunny Chan* >> >> *Senior Lead Engineer, Executive Services* >> >> D +852 2600 8907 | M +852 6386 1835 | T +852 2600 8888 >> >> 5/F, One Island East, 18 Westlands Road >> , >> Island East, Hong Kong >> >> >> >> [image: :1. Social Media Icons:CLSA_Social Media Icons_linkedin.png] >> [image: :1. Social Media >> Icons:CLSA_Social Media Icons_twitter.png] >> [image: :1. Social Media >> Icons:CLSA_Social Media Icons_youtube.png] >> [image: :1. >> Social Media Icons:CLSA_Social Media Icons_facebook.png] >> >> >> >> >> *clsa.com* >> >> *Insights. Liquidity. Capital. * >> >> >> >> [image: CLSA_RGB] >> >> >> >> *A CITIC Securities Company* >> >> >> >> The content of this communication is intended for the recipient and is >> subject to CLSA Legal and Regulatory Notices. >> These can be viewed at https://www.clsa.com/disclaimer.html or sent to >> you upon request. >> Please consider before printing. CLSA is ISO14001 certified and committed >> to reducing its impact on the environment. >> > > > > -- > Ruslan > CEO @ Jelastic > -- Ruslan CEO @ Jelastic -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: image006.png Type: image/png Size: 1070 bytes Desc: not available URL: From HORIE at jp.ibm.com Mon Jun 4 11:30:40 2018 From: HORIE at jp.ibm.com (Michihiro Horie) Date: Mon, 4 Jun 2018 20:30:40 +0900 Subject: RFR(M): 8154736: enhancement of cmpxchg and copy_to_survivor for ppc64 In-Reply-To: <12a2ffeacafd4577b4c8c5ba7e91c530@sap.com> References: <7e7fc484287e4da4926176e0b5ae1b64@sap.com> <339D50B6-09D5-4342-A687-918A9A096B39@oracle.com> <6D2268EC-C1E7-4C90-BCD3-90D02D21FA08@oracle.com> <3e05cc86c9d4406d8a9875b705fbf1fc@sap.com> <5B0D40F7.5050807@oracle.com> <4dddb18526a745cc83941a0f58af77f5@sap.com> Message-ID: >Hi Michihiro, > >looks good to me, too. Thanks for adding comments where you have changed memory barriers. >Pushed to submission repo and our internal testing. Thanks a lot, Martin! Best regards, -- Michihiro, IBM Research - Tokyo From: "Doerr, Martin" To: Michihiro Horie , "Erik ?sterlund" Cc: "Andrew Haley (aph at redhat.com)" , "david.holmes at oracle.com" , "hotspot-gc-dev at openjdk.java.net" , "Kim Barrett" , "ppc-aix-port-dev at openjdk.java.net" Date: 2018/06/04 19:17 Subject: RE: RFR(M): 8154736: enhancement of cmpxchg and copy_to_survivor for ppc64 Hi Michihiro, looks good to me, too. Thanks for adding comments where you have changed memory barriers. Pushed to submission repo and our internal testing.
Best regards, Martin From: Michihiro Horie [mailto:HORIE at jp.ibm.com] Sent: Freitag, 1. Juni 2018 17:37 To: Erik ?sterlund Cc: Andrew Haley (aph at redhat.com) ; david.holmes at oracle.com; hotspot-gc-dev at openjdk.java.net; Kim Barrett ; Doerr, Martin ; ppc-aix-port-dev at openjdk.java.net Subject: Re: RFR(M): 8154736: enhancement of cmpxchg and copy_to_survivor for ppc64 >Hi Michihiro, > >Looks good to me. Thanks a lot, Erik! Best regards, -- Michihiro, IBM Research - Tokyo Inactive hide details for "Erik ?sterlund" ---2018/06/02 00:15:15---Hi Michihiro, Looks good to me."Erik ?sterlund" ---2018/06/02 00:15:15---Hi Michihiro, Looks good to me. From: "Erik ?sterlund" To: Michihiro Horie , "Doerr, Martin" < martin.doerr at sap.com> Cc: "Andrew Haley (aph at redhat.com)" , " david.holmes at oracle.com" , " hotspot-gc-dev at openjdk.java.net" , Kim Barrett , "ppc-aix-port-dev at openjdk.java.net" < ppc-aix-port-dev at openjdk.java.net> Date: 2018/06/02 00:15 Subject: Re: RFR(M): 8154736: enhancement of cmpxchg and copy_to_survivor for ppc64 Hi Michihiro, Looks good to me. Thanks, /Erik On 2018-06-01 17:08, Michihiro Horie wrote: Hi Kim, Erik, and Martin, Thank you very much for reminding me that an acquire barrier in the else-statement for ?!test_mark->is_marked()? is necessary under the criteria of not relying on the consume. I uploaded a new webrev : http://cr.openjdk.java.net/~mhorie/8154736/webrev.13/ This change uses forwardee_acquire(), which would generate better code on ARM. Necessary barriers are located in all the paths in copy_to_survivor_space, and the returned new_obj can be safely handled in the caller sites. I measured SPECjbb2015 with the latest webrev. Critical-jOPS improved by 5%. Since my previous measurement with implicit consume showed 6% improvement, adding acquire barriers degraded the performance a little, but 5% is still good enough. 
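The release-publish / acquire-read pattern being reviewed here can be sketched with portable C++11 atomics. This is a hypothetical illustration only — the struct and method names are loosely modeled on the discussion (forwardee, copy_to_survivor), not taken from the HotSpot sources or the webrev:

```cpp
#include <atomic>
#include <cassert>

// Hypothetical object header: _forwardee is published once by the GC thread
// that wins the copy race, and read concurrently by other GC threads.
struct ObjHeader {
  std::atomic<void*> _forwardee{nullptr};

  // Writer side: install the forwarding pointer with release semantics so the
  // copied object's contents become visible before the pointer to them does.
  // Returns the winning forwardee: ours on success, the racer's on failure.
  void* forward_to_atomic(void* new_obj) {
    void* expected = nullptr;
    if (_forwardee.compare_exchange_strong(expected, new_obj,
                                           std::memory_order_release,
                                           std::memory_order_acquire)) {
      return new_obj;   // we won the race; we created new_obj, no barrier needed
    }
    return expected;    // we lost; 'expected' now holds the racer's forwardee
  }

  // Reader side: pair the release store with an explicit acquire load instead
  // of relying on implicit consume (data-dependency) ordering.
  void* forwardee_acquire() {
    return _forwardee.load(std::memory_order_acquire);
  }
};
```

This mirrors the point Kim and Erik make above: every path that goes on to dereference the forwardee — including the CAS-failure path — needs the acquire, rather than relying on the compiler preserving a data dependency.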
Best regards, -- Michihiro, IBM Research - Tokyo Inactive hide details for "Doerr, Martin" ---2018/05/30 16:18:09---Hi Erik, the current implementation works on PPC because of "Doerr, Martin" ---2018/05/30 16:18:09---Hi Erik, the current implementation works on PPC because of "MP+sync+addr". From: "Doerr, Martin" To: "Erik ?sterlund" , Kim Barrett , Michihiro Horie , "Andrew Haley (aph at redhat.com)" Cc: "david.holmes at oracle.com" , "hotspot-gc-dev at openjdk.java.net" , "ppc-aix-port-dev at openjdk.java.net" Date: 2018/05/30 16:18 Subject: RE: RFR(M): 8154736: enhancement of cmpxchg and copy_to_survivor for ppc64 Hi Erik, the current implementation works on PPC because of "MP+sync +addr". So we already rely on ordering of "load volatile field" + "implicit consume" on the reader's side. We have never seen any issues related to this with the compilers we have been using during the ~10 years the PPC implementation exists. PPC supports "MP+lwsync+addr" the same way, so Michihiro's proposal doesn't make it unreliable for PPC. But I'm ok with evaluating acquire barriers although they are not required by the PPC/ARM memory models. ARM/aarch64 will also be affected when the o->forwardee uses load_acquire. So somebody should check the impact. If it is not acceptable we may need to introduce explicit consume. Implicit consume is also bad in shared code because somebody may want to run it on DEC Alpha. Thanks and best regards, Martin -----Original Message----- From: Erik ?sterlund [mailto:erik.osterlund at oracle.com] Sent: Dienstag, 29. 
Mai 2018 14:01 To: Doerr, Martin ; Kim Barrett ; Michihiro Horie Cc: david.holmes at oracle.com; Gustavo Bueno Romero ; hotspot-dev at openjdk.java.net; hotspot-gc-dev at openjdk.java.net; ppc-aix-port-dev at openjdk.java.net Subject: Re: RFR(M): 8154736: enhancement of cmpxchg and copy_to_survivor for ppc64 Hi Martin and Michihiro, On 2018-05-29 12:30, Doerr, Martin wrote: > Hi Kim, > > I'm trying to understand how this is related to Michihiro's change. The else path of the initial test is not affected by it AFAICS. > So it sounds like a request to fix the current implementation in addition to what his original intend was. I think we are just trying to nail down the correct fencing and just go for that. And yes, this is arguably a pre-existing problem, but in a race involving the very same accesses that we are changing the fencing for. So it is not completely unrelated I suppose. In particular, hotspot has code that assumes that if you on the writer side issue a full fence before publishing a pointer to newly initialized data, then the initializing stores and their side effects should be globally "visible" across the system before the pointer to it is published, and hence elide the need for acquire on the loading side, without relying on retained data dependencies on the loader side. I believe this code falls under that category. It is assumed that the leading fence of the CAS publishing the forwarding pointer makes the initializing stores globally observable before publishing a pointer to the initialized data, hence assuming that any loads able to observe the new pointer would not rely on acquire or data dependent loads to correctly read the initialized data. Unfortunately, this is not reliable in the IRIW case, as per the litmus test "MP+sync+ctrl" as described in "Understanding POWER multiprocessors" (https://dl.acm.org/citation.cfm?id=1993520), as opposed to "MP+sync+addr" that gets away with it because of the data dependency (not IRIW). 
Similarly, an isync does the job too on the reader side as shown in MP+sync+ctrlisync. So while what I believe was the previous reasoning that the leading sync of the CAS would elide the necessity for acquire on the reader side without relying on data dependent loads (implicit consume), I think that assumption was wrong in the first place and that we do indeed need explicit acquire (even with the previous conservative CAS fencing) in this context to not rely on implicit consume semantics generating the required data dependent loads on the reader side. In practice though, the leading sync of the CAS has been enough to generate the correct machine code. Now, with the leading sync removed, we are increasing the possible holes in the generated machine code due to this flawed reasoning. So it would be nice to do something more sound instead that does not make such assumptions. > Anyway, I agree that implicit consume is not good. And I think it would be good to treat both o->forwardee() the same way. > What about keeping memory_order_release for the CAS and using acquire for both o->forwardee()? > The case in which the CAS succeeds is safe because the current thread has created new_obj so it doesn't need memory barriers to access it. Sure, that sounds good to me. Thanks, /Erik > Thanks and best regards, > Martin > > > -----Original Message----- > From: Kim Barrett [mailto:kim.barrett at oracle.com] > Sent: Dienstag, 29. Mai 2018 01:54 > To: Michihiro Horie > Cc: Erik Osterlund ; david.holmes at oracle.com; Gustavo Bueno Romero ; hotspot-dev at openjdk.java.net; hotspot-gc-dev at openjdk.java.net; ppc-aix-port-dev at openjdk.java.net; Doerr, Martin > Subject: Re: RFR(M): 8154736: enhancement of cmpxchg and copy_to_survivor for ppc64 > >> On May 28, 2018, at 4:12 AM, Michihiro Horie wrote: >> >> Hi Erik, >> >> Thank you very much for your review. >> >> I understood that implicit consume should not be used in the shared code.
Also, I believe performance degradation would be negligible even if we use acquire. >> >> New webrev uses memory_order_acq_rel: http://cr.openjdk.java.net/~mhorie/8154736/webrev.10 > This is missing the acquire barrier on the else branch for the initial test, so fails to meet > the previously described minimal requirements for even possibly being sufficient. Any > analysis of weakening the CAS barriers must consider that test and successor code. > > In the analysis, it?s not just the lexically nearby debugging / logging code that needs to be > considered; the forwardee is being returned to caller(s) that will presumably do something > with that object. > > Since the whole point of this discussion is performance, any proposed change should come > with performance information. > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From thomas.schatzl at oracle.com Mon Jun 4 11:44:13 2018 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Mon, 04 Jun 2018 13:44:13 +0200 Subject: Question regarding JEP 8204089 Timely Reducing Unused Committed Memory In-Reply-To: <6f9228a19d3c4deca170b2032b887dde@clsa.com> References: <6f9228a19d3c4deca170b2032b887dde@clsa.com> Message-ID: <4d429a22ff1f865c46e48fd56b191364f5e04240.camel@oracle.com> Hi, some comments to your questions; note that the functionality and the JEP are still very much in a flux, so things are going change. Looking at the responses so far, we all appreciate your input. 
Rodrigo and Ruslan were proposing this feature, so they are the source for authoritative answers; apart from being interested in this feature for G1, I am only helping with the process (and missing access rights for them) :) On Fri, 2018-06-01 at 04:41 +0000, Sunny Chan, CLSA wrote: > (resending for hotspot-gc-dev) > > Hello, > > I have a number of questions about the proposed changes for the JEP > and I would like to make the following suggestions and comments > > 1) Are we planning to make this change the default behavior for > G1? Or is this an optional switch for people who run in a container > environment? Optional. Added to the JEP text. > 2) In trading systems, sometimes there are periods of > quiescence and then suddenly a burst of activity towards market close > or a market event. The loadavg you suggest to "monitor" the activity in > the system only reports up to 14 minutes and it might not necessarily be a > good measure for this type of application, especially if you would > trigger a Full GC The idea is to have this feature off by default from what I understand. > 3) You haven't filled in the details for "The GCFrequency value is > ignored and therefore, i.e., no full collection is triggered, if:" I removed that sentence, some copy&paste error. For the details on when the VM is going to detect idleness, see above. Also, Ruslan already detailed them a bit more. > 4) If we are triggering a full GC with this we should make sure the > GC reason is populated and logged properly in the GC log so we can track > it down. Noted. > 5) I have not heard of J9 Gencon/Shenandoah providing similar > functionality. Can you point me to further documentation on which > feature you model upon? This has already been answered in this thread. I added some comments to the JEP to be incorporated soon.
Thanks, Thomas From boris.ulasevich at bell-sw.com Mon Jun 4 11:58:23 2018 From: boris.ulasevich at bell-sw.com (Boris Ulasevich) Date: Mon, 4 Jun 2018 14:58:23 +0300 Subject: RFR (S) 8202705: ARM32 build crashes on long JavaThread offsets Message-ID: <4840be99-ddbb-16f0-a9cb-31d7efcf0d02@bell-sw.com> Hello all, Please review this patch to allow ARM32 MacroAssembler to handle updated JavaThread offsets: http://cr.openjdk.java.net/~bulasevich/8202705/webrev.01/ https://bugs.openjdk.java.net/browse/JDK-8202705 thank you, Boris From shade at redhat.com Mon Jun 4 12:02:32 2018 From: shade at redhat.com (Aleksey Shipilev) Date: Mon, 4 Jun 2018 14:02:32 +0200 Subject: RFR: JDK-8198285: More consistent Access API for arraycopy In-Reply-To: <5B150DE3.4090402@oracle.com> References: <5AE09F74.7050005@oracle.com> <5AE1F062.3020303@oracle.com> <3bb2415e-4d15-fbd3-dde2-73a25c7bd65b@redhat.com> <5B1133F3.4040704@oracle.com> <5B150DE3.4090402@oracle.com> Message-ID: On 06/04/2018 12:01 PM, Erik ?sterlund wrote: >> What do you think? > > Definitely a lot better. I went ahead and incorporated my last feedback into an incremental webrev > instead. If you agree and like my additional template polishing, then feel free to go with that and > consider it reviewed: > > Incremental webrev: > http://cr.openjdk.java.net/~eosterlund/8198285/webrev.00_01/ > > Full webrev: > http://cr.openjdk.java.net/~eosterlund/8198285/webrev.01/ Looks good to me. -Aleksey -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: OpenPGP digital signature URL: From rkennke at redhat.com Mon Jun 4 12:05:26 2018 From: rkennke at redhat.com (Roman Kennke) Date: Mon, 4 Jun 2018 14:05:26 +0200 Subject: RFR: JDK-8198285: More consistent Access API for arraycopy In-Reply-To: References: <5AE09F74.7050005@oracle.com> <5AE1F062.3020303@oracle.com> <3bb2415e-4d15-fbd3-dde2-73a25c7bd65b@redhat.com> <5B1133F3.4040704@oracle.com> <5B150DE3.4090402@oracle.com> Message-ID: <877fe1fb-252e-d849-8591-e2c5a9fd261d@redhat.com> Am 04.06.2018 um 14:02 schrieb Aleksey Shipilev: > On 06/04/2018 12:01 PM, Erik ?sterlund wrote: >>> What do you think? >> >> Definitely a lot better. I went ahead and incorporated my last feedback into an incremental webrev >> instead. If you agree and like my additional template polishing, then feel free to go with that and >> consider it reviewed: >> >> Incremental webrev: >> http://cr.openjdk.java.net/~eosterlund/8198285/webrev.00_01/ >> >> Full webrev: >> http://cr.openjdk.java.net/~eosterlund/8198285/webrev.01/ > > Looks good to me. > > -Aleksey > Thank you for reviewing! Roman -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: OpenPGP digital signature URL: From shade at redhat.com Mon Jun 4 12:10:28 2018 From: shade at redhat.com (Aleksey Shipilev) Date: Mon, 4 Jun 2018 14:10:28 +0200 Subject: RFR (S) 8202705: ARM32 build crashes on long JavaThread offsets In-Reply-To: <4840be99-ddbb-16f0-a9cb-31d7efcf0d02@bell-sw.com> References: <4840be99-ddbb-16f0-a9cb-31d7efcf0d02@bell-sw.com> Message-ID: On 06/04/2018 01:58 PM, Boris Ulasevich wrote: > Hello all, > > Please review this patch to allow ARM32 MacroAssembler to handle updated JavaThread offsets: > ? http://cr.openjdk.java.net/~bulasevich/8202705/webrev.01/ > ? 
https://bugs.openjdk.java.net/browse/JDK-8202705 Looks okay, but Rthread becomes misnomer in the middle of the method. Maybe like this? // Borrow the Rthread for alloc counter Register Ralloc = Rthread; Rthread = NULL; add(Ralloc, Ralloc, in_bytes(JavaThread::allocated_bytes_offset()); ... ... // Unborrow the Rthread sub(Ralloc, Ralloc, in_bytes(JavaThread::allocated_bytes_offset() Rthread = Ralloc; Ralloc = NULL; -Aleksey -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: OpenPGP digital signature URL: From thomas.schatzl at oracle.com Mon Jun 4 12:26:21 2018 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Mon, 04 Jun 2018 14:26:21 +0200 Subject: Question regarding JEP 8204089 Timely Reducing Unused Committed Memory In-Reply-To: References: <6f9228a19d3c4deca170b2032b887dde@clsa.com> <597AC801-7DD0-4E3E-BDBE-8A4FA9860501@gmail.com> Message-ID: <1f933ce2b62f7ba0a1ebcb45c3bd33404958c3e0.camel@oracle.com> Hi all, On Fri, 2018-06-01 at 16:44 +0300, Ruslan Synytsky wrote: > Hi Kirk, thank you for the additional important highlights. > > On 1 June 2018 at 08:21, Kirk Pepperdine > wrote: > > Hi, [...] > > In an era when the GC engineers are working so very hard to make as > > much of the collection process as concurrent as possible, it just > > feels very wrong to rely on a huge STW event to get something done. > > In that spirit, it behoves us to explore how committed memory maybe > > released at the tail end of a normally triggered GC cycle. I > > beleive at the end of a mixed cycle was mentioned. I believe this > > would give those that want/need to minimize their JVM?s footprint > > an even greater cost advantage then attempting to reduce the > > footprint at some (un???)random point in time. > > > > How hard/expensive is to implement in G1 something similar like > Shenandoah does? 
Let me detail a bit how G1 triggers collections and when it releases memory: For the first part, let's make a distinction between the trigger for young collections and the start of old generation space reclamation ("marking"). Young collections are at the moment triggered by space exhaustion in the young generation. Old gen reclamation is currently only triggered by old generation space going over a given occupancy threshold. This can be either caused by young collections copying objects into the old generation ("promotion") or humongous (large) object allocation. In both cases, G1 triggers a so-called initial-mark young collection, i.e. a regular young collection with some additional setup for the concurrent marking. Concurrent marking results in Remark and Cleanup pauses that are scheduled on a time basis. During that time, when the young generation is exhausted, regular young collections are performed. After the Cleanup pause, there is some "last" regular young generation pause where GC and marking kind of sync up. After that one, again caused by exhaustion of the young gen, so called mixed collections are triggered. These are young generation collections with minimal young gen size and some old generation regions added. These mixed collections continue (mostly) until G1 thinks enough memory has been freed. Basically, to release memory back to the operating system, the programming effort would be to call G1CollectedHeap::shrink() at the end of the appropriate pause. If you were looking into effort, there are two caveats (compared to simply triggering a full gc): - full gc currently, in addition to compaction, also throws away SoftReferences (cached objects), and cleans out some references from VM internal data structures to the Java heap, making all of these freeable. 
Assuming you want to have similar impact regarding soft references and internal data structures, you need to start a concurrent cycle, and wait until the Remark pause with your shrinking request (one can still try without!). Only the following mixed GCs compact the Java heap (at the moment G1 will also not start mixed GCs without a previous marking cycle); but the Remark pause will free completely empty regions. - if you also want to compact old gen, the best time for the shrink() call would probably be the end of the mixed gc phase. However garbage collections are currently tied to exhaustion of young gen. So if your application is truly or almost idle, and does not allocate anything, there will be no GC pauses for potentially a long time. One would need to start the mixed gc based on some timout. Note that there are already some timers running for some VM cleanup tasks. To improve the impact of the compaction and shrinking you might also want to tweak some internal variables for the duration of this "idle cycle". Btw, one step in that direction would be to generally just attempt to shrink the heap at the end of mixed gc, e.g. https://bugs.openjdk.java. net/browse/JDK-6490394 . Implementing something like Shenandoah, i.e. uncommit the regions that stayed empty for more than ShenandoahUncommitDelay milliseconds would mean iterating through all regions and check if they are "old" enough to uncommit (and do so) at regular intervals in some STW pause. When allocating, also give them some sort of timestamp. Again, you need to make sure that the regions are looked through in some regular interval (either piggy-backing by some existing regular pauses or force one). This would not be particularly hard to implement either, but only seems extra work: as you might still want to compact the heap in some way (this is optional) by e.g. doing a marking + mixed cycle (and then wait for the ShenandoahUncommitDelay), so you may as well uncommit within these pauses already. 
------------ Some other unanswered question so far has been up to what degree memory will be freed during this time: I guess at most until -Xms? Thanks, Thomas From per.liden at oracle.com Mon Jun 4 13:00:31 2018 From: per.liden at oracle.com (Per Liden) Date: Mon, 4 Jun 2018 15:00:31 +0200 Subject: 8204163: Also detect concurrent GCs in MetaspaceBaseGC.java In-Reply-To: References: Message-ID: <85800dc2-b3a7-9725-67f7-21ccce7c71de@oracle.com> Looks good to me. /Per On 05/31/2018 01:52 PM, Stefan Karlsson wrote: > Hi all, > > Please review this patch to allow MetaspaceBaseGC.java to detect > concurrent GC cycles that are started because of Metaspace allocations. > > http://cr.openjdk.java.net/~stefank/8204163/webrev.01/ > https://bugs.openjdk.java.net/browse/JDK-8204163 > > The old code only checked for Full GCs, and now we also match concurrent > cycles. > > This has been tested with the vmTestbase/metaspace/gc tests with G1, > CMS, and Z. > > Thanks, > StefanK From stefan.karlsson at oracle.com Mon Jun 4 13:00:39 2018 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Mon, 4 Jun 2018 15:00:39 +0200 Subject: 8204163: Also detect concurrent GCs in MetaspaceBaseGC.java In-Reply-To: <85800dc2-b3a7-9725-67f7-21ccce7c71de@oracle.com> References: <85800dc2-b3a7-9725-67f7-21ccce7c71de@oracle.com> Message-ID: <3378f800-0141-70c2-25e1-9868544bd56d@oracle.com> Thanks, Per. StefanK On 2018-06-04 15:00, Per Liden wrote: > Looks good to me. > > /Per > > On 05/31/2018 01:52 PM, Stefan Karlsson wrote: >> Hi all, >> >> Please review this patch to allow MetaspaceBaseGC.java to detect >> concurrent GC cycles that are started because of Metaspace allocations. >> >> http://cr.openjdk.java.net/~stefank/8204163/webrev.01/ >> https://bugs.openjdk.java.net/browse/JDK-8204163 >> >> The old code only checked for Full GCs, and now we also match >> concurrent cycles. >> >> This has been tested with the vmTestbase/metaspace/gc tests with G1, >> CMS, and Z. 
>> >> Thanks, >> StefanK From shade at redhat.com Mon Jun 4 13:13:12 2018 From: shade at redhat.com (Aleksey Shipilev) Date: Mon, 4 Jun 2018 15:13:12 +0200 Subject: Question regarding JEP 8204089 Timely Reducing Unused Committed Memory In-Reply-To: <1f933ce2b62f7ba0a1ebcb45c3bd33404958c3e0.camel@oracle.com> References: <6f9228a19d3c4deca170b2032b887dde@clsa.com> <597AC801-7DD0-4E3E-BDBE-8A4FA9860501@gmail.com> <1f933ce2b62f7ba0a1ebcb45c3bd33404958c3e0.camel@oracle.com> Message-ID: <7c6e9422-cd54-ba96-bbb9-1db0052bd962@redhat.com> On 06/04/2018 02:26 PM, Thomas Schatzl wrote: > Implementing something like Shenandoah, i.e. uncommit the regions that > stayed empty for more than ShenandoahUncommitDelay milliseconds would > mean iterating through all regions and check if they are "old" enough > to uncommit (and do so) at regular intervals in some STW pause. When > allocating, also give them some sort of timestamp. > Again, you need to make sure that the regions are looked through in > some regular interval (either piggy-backing by some existing regular > pauses or force one). I am not sure why STW pause is needed. Nor why timestamps on allocation path are needed. In Shenandoah, region FSM has a few additional states: Empty-Uncommitted, Empty-Committed, Regular, Humongous. Allocation path makes state transitions a la {Empty-*, Regular} -> Regular, etc. Cleanup moves reclaimed regions to Empty-Committed, and records the timestamp when that transition happened. Uncommit does Empty-Committed -> Empty-Uncommitted for all the regions that have a fitting timestamp. All these transitions are sharing the same mechanics as the allocation path, so it does not require pause to work. The uncommit checks are done by the concurrent control thread that normally drives the GC cycle, but also does auxiliary work outside of GC cycle too. 
> This would not be particularly hard to implement either, but only seems > extra work: as you might still want to compact the heap in some way > (this is optional) by e.g. doing a marking + mixed cycle (and then wait > for the ShenandoahUncommitDelay), so you may as well uncommit within > these pauses already. Periodic GC does indeed help to compact things, but that is a second-order concern. Even uncommitting the regions that became free after the cycle provides a very substantial win. The beauty of Shenandoah-style uncommit is that it becomes orthogonal to the GC cycles themselves: you can first implement uncommits, and then figure out in which form and shape to issue periodic GC cycles to knock out the rest of the garbage. We might not even bother with that part, and just instruct users to say -XX:+ExplicitGCInvokesConcurrent, and wait for System.gc() to happen for the most compaction. The caveat with piggybacking on pauses, is that you don't really want to uncommit the memory that would be committed back by allocations in active application. It does make sense to do this on explicit GC though! You can do this under the pause too, but let's be mindful about the allocation costs, especially if we are about to commit memory back while holding the allocation lock :) -Aleksey -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: OpenPGP digital signature URL: From boris.ulasevich at bell-sw.com Mon Jun 4 13:18:41 2018 From: boris.ulasevich at bell-sw.com (Boris Ulasevich) Date: Mon, 4 Jun 2018 16:18:41 +0300 Subject: RFR (S) 8202705: ARM32 build crashes on long JavaThread offsets In-Reply-To: References: <4840be99-ddbb-16f0-a9cb-31d7efcf0d02@bell-sw.com> Message-ID: <2cdb6d48-eedc-f057-32c9-5d4349cbb8cc@bell-sw.com> Hi Alexey, good point! 
But Rthread is not something we can redefine: > register_arm.hpp: > #define Rthread R10 Let us just leave comments and work with Ralloc: // Borrow the Rthread for alloc counter Register Ralloc = Rthread; add(Ralloc, Ralloc, in_bytes(JavaThread::allocated_bytes_offset())); ... (work with Ralloc) // Unborrow the Rthread sub(Rthread, Ralloc, in_bytes(JavaThread::allocated_bytes_offset())); Webrev: http://cr.openjdk.java.net/~bulasevich/8202705/webrev.02/ regards, Boris On 04.06.2018 15:10, Aleksey Shipilev wrote: > On 06/04/2018 01:58 PM, Boris Ulasevich wrote: >> Hello all, >> >> Please review this patch to allow ARM32 MacroAssembler to handle updated JavaThread offsets: >> ? http://cr.openjdk.java.net/~bulasevich/8202705/webrev.01/ >> ? https://bugs.openjdk.java.net/browse/JDK-8202705 > > Looks okay, but Rthread becomes misnomer in the middle of the method. > > Maybe like this? > > // Borrow the Rthread for alloc counter > Register Ralloc = Rthread; > Rthread = NULL; > add(Ralloc, Ralloc, in_bytes(JavaThread::allocated_bytes_offset()); > > ... > > ... > > // Unborrow the Rthread > sub(Ralloc, Ralloc, in_bytes(JavaThread::allocated_bytes_offset() > Rthread = Ralloc; > Ralloc = NULL; > > -Aleksey > From shade at redhat.com Mon Jun 4 13:22:39 2018 From: shade at redhat.com (Aleksey Shipilev) Date: Mon, 4 Jun 2018 15:22:39 +0200 Subject: RFR (S) 8202705: ARM32 build crashes on long JavaThread offsets In-Reply-To: <2cdb6d48-eedc-f057-32c9-5d4349cbb8cc@bell-sw.com> References: <4840be99-ddbb-16f0-a9cb-31d7efcf0d02@bell-sw.com> <2cdb6d48-eedc-f057-32c9-5d4349cbb8cc@bell-sw.com> Message-ID: <8cb788d8-e303-2efe-ce58-3390c845a27f@redhat.com> On 06/04/2018 03:18 PM, Boris Ulasevich wrote: > Let us just leave comments and work with Ralloc: > > ? // Borrow the Rthread for alloc counter > ? Register Ralloc = Rthread; > ? add(Ralloc, Ralloc, in_bytes(JavaThread::allocated_bytes_offset())); > > ? ... (work with Ralloc) > > ? // Unborrow the Rthread > ? 
sub(Rthread, Ralloc, in_bytes(JavaThread::allocated_bytes_offset())); > > Webrev: > http://cr.openjdk.java.net/~bulasevich/8202705/webrev.02/ Looks better, thanks! -Aleksey -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: OpenPGP digital signature URL: From per.liden at oracle.com Mon Jun 4 13:22:48 2018 From: per.liden at oracle.com (Per Liden) Date: Mon, 4 Jun 2018 15:22:48 +0200 Subject: RFR(L): 8195097: Make it possible to process StringTable outside safepoint In-Reply-To: <3B1FF3B8-6A10-401A-A11D-3D027DD59702@oracle.com> References: <0d09f2ad-47b9-2fca-e5ec-17547e923cbb@oracle.com> <3B1FF3B8-6A10-401A-A11D-3D027DD59702@oracle.com> Message-ID: Hi, On 06/01/2018 11:46 PM, Gerard Ziemski wrote: > Hi, > > Awesome job Robbin! > > I especially like that we now have a self-resizing StringTable (though I don't like the power of 2 size constraint, but understand why that needs to be so). > > I do have some feedback, questions and comments below: > > > #1 The initial table size according to this code: > > #define START_SIZE 16 > > _current_size = ((size_t)1) << START_SIZE; > > Hence, it is 65536, but in the old code we had: > > const int defaultStringTableSize = NOT_LP64(1009) LP64_ONLY(60013); > > Which on non-64-bit architectures was quite a bit smaller. Is it OK for us not to worry about non-64-bit architectures now? > > BTW. It will be extremely interesting to see whether we can lower the initial size, now that we can grow the table. Should we file a followup issue, so we don't forget?
> > > #2 Why we have: > > volatile size_t _items; > DEFINE_PAD_MINUS_SIZE(1, 64, sizeof(volatile size_t)); > volatile size_t _uncleaned_items; > DEFINE_PAD_MINUS_SIZE(2, 64, sizeof(volatile size_t)); > > and not > > volatile size_t _items; > DEFINE_PAD_MINUS_SIZE(1, DEFAULT_CACHE_LINE_SIZE, sizeof(volatile size_t)); > volatile size_t _uncleaned_items; > DEFINE_PAD_MINUS_SIZE(2, DEFAULT_CACHE_LINE_SIZE, sizeof(volatile size_t)); > > > #3 Extraneous space here, i.e. " name": > > return StringTable::the_table()->do_lookup( name, len, hash); > > > #4 Instead of: > > double fact = StringTable::get_load_factor(); > double dead_fact = StringTable::get_dead_factor(); > > where "fact" is an actual word on its own, can we consider using full names, ex: > > double load_factor = StringTable::get_load_factor(); > double dead_factor = StringTable::get_dead_factor(); > > > #5 In "static int literal_size(oop obj)": > > a) Why do we need the "else" clause? Will it ever be taken? > > } else { > return obj->size(); > } > > b) Why isn't "java_lang_String::value(obj)->size()" enough in: > > } else if (obj->klass() == SystemDictionary::String_klass()) { > return (obj->size() + java_lang_String::value(obj)->size()) * HeapWordSize; > } > > > #6 Can we rename "StringtableDCmd" to "StringtableDumpCmd"? Note that "D" here stands for "Diagnostic" and not "Dump", and by convention all these types of classes end with "DCmd". cheers, Per > > > #7 Isn't "#define PREF_AVG_LIST_LEN 2" a bit too aggressive? Where did the value come from? > > > #8 Should we consider adding runtime flag options to control when resizing/cleanup triggers? (i.e. PREF_AVG_LIST_LEN, CLEAN_DEAD_HIGH_WATER_MARK) > > > #9 Why do we only resize up, if we can also resize down? > > > #10 Do we know the impact of the new table on memory usage (at the default initial size)?
> > > #11 You mention various benchmarks were run with no issues, which I take to mean no regressions, but are there any statistically significant improvements shown that you can report? > > > #12 I mentioned this to you off the list, but in case anyone else tries to run the code - the changes don't build (which you pointed out to me is due to 8191798). Once things build again, I'd like an opportunity to be able to run the code for myself to check it out, and then report back with final review. > > > Cheers > >> On May 28, 2018, at 8:19 AM, Robbin Ehn wrote: >> >> Hi all, please review. >> >> This implements the StringTable with the ConcurrentHashtable for managing the >> strings using oopStorage for backing the actual oops via WeakHandles. >> >> The unlinking and freeing of hashtable nodes is moved outside the safepoint, >> which means GC only needs to walk the oopStorage, either concurrently or in a >> safepoint. Walking oopStorage is also faster so there is a good effect on all >> safepoints visiting the oops. >> >> The unlinking and freeing happens during inserts when dead weak oops are >> encountered in that bucket. In any normal workload the stringtable self-cleans >> without needing any additional cleaning. Cleaning/unlinking can also be done >> concurrently via the ServiceThread, it is started when we have a high "dead >> factor". E.g. application have a lot of interned string removes the references >> and never interns again. The ServiceThread also concurrently grows the table if >> "load factor" is high. Both the cleaning and growing take care to not prolonging >> time to safepoint, at the cost of some speed. >> >> Kitchensink24h, multiple tier1-5 with no issue that I can relate to this >> changeset, various benchmark such as JMH, specJBB2015.
>> >> Issue: https://bugs.openjdk.java.net/browse/JDK-8195097 >> Webrev: http://cr.openjdk.java.net/~rehn/8195097/v0/webrev/ >> >> Thanks, Robbin > From thomas.schatzl at oracle.com Mon Jun 4 14:37:35 2018 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Mon, 04 Jun 2018 16:37:35 +0200 Subject: Question regarding JEP 8204089 Timely Reducing Unused Committed Memory In-Reply-To: <7c6e9422-cd54-ba96-bbb9-1db0052bd962@redhat.com> References: <6f9228a19d3c4deca170b2032b887dde@clsa.com> <597AC801-7DD0-4E3E-BDBE-8A4FA9860501@gmail.com> <1f933ce2b62f7ba0a1ebcb45c3bd33404958c3e0.camel@oracle.com> <7c6e9422-cd54-ba96-bbb9-1db0052bd962@redhat.com> Message-ID: <05d899600565abd9cf194acb23dc4ef27aa83e9d.camel@oracle.com> Hi Aleksey, On Mon, 2018-06-04 at 15:13 +0200, Aleksey Shipilev wrote: > On 06/04/2018 02:26 PM, Thomas Schatzl wrote: > > Implementing something like Shenandoah, i.e. uncommit the regions > > that stayed empty for more than ShenandoahUncommitDelay > > milliseconds would mean iterating through all regions and checking if > > they are "old" enough to uncommit (and do so) at regular intervals > > in some STW pause. When allocating, also give them some sort of > > timestamp. Again, you need to make sure that the regions are looked > > through in some regular interval (either piggy-backing on some > > existing regular pauses or force one). > > I am not sure why an STW pause is needed. Nor why timestamps on > allocation path are needed. > > In Shenandoah, the region FSM has a few additional states: Empty-Uncommitted, > Empty-Committed, Regular, Humongous. Allocation path > makes state transitions a la {Empty-*, Regular} -> Regular, etc. > Cleanup moves reclaimed regions to Empty-Committed, and records the > timestamp when that transition happened. Uncommit does Empty-Committed > -> Empty-Uncommitted for all the regions that have a > fitting timestamp.
> All these transitions are sharing the same mechanics as the > allocation path, so it does not require a pause to work. I was wrong about the suggestion to take the timestamp in the allocation path; you are right, you need them in the deallocation path. Thanks for pointing out my mistake. There are different interesting uses for a timestamp in the allocation path that are not relevant here. > The uncommit checks are done by the concurrent control thread that > normally drives the GC cycle, but also does auxiliary work outside of > GC cycle too. You are right that you do not necessarily need an STW pause to do the check-for-regions-to-uncommit work. However you do need, at some point, to synchronize with the free region list (this is the list of regions new regions are allocated from in G1) before uncommit. At this point this free region list access uses a global lock, and we really want to get rid of this global lock as this restrains (allocation) throughput already in some applications. So *more* global locking is something we would prefer to avoid. (I am aware that this is a very infrequent use anyway, and only grabbed when there is not much application activity). In G1 we do not mind some small extra work in an STW pause - G1 by definition will always do STW pauses for evacuation work (it wouldn't be G1 at that point), and iterating over all regions is already done there in various places (and is very fast anyway even for many regions, and if needed trivially parallelizable). (I am also talking only about removing these regions from the free region list, and not necessarily the uncommit call which can be deferred; I think however that is very cheap.) The VM already does some (regular) non-GC work in non-GC STW pauses which as mentioned would be candidates for putting that stuff in there.
In the future this non-gc work will probably be moved into concurrent phases step-by-step, but at that point, when hopefully an API usable for all collectors for that has emerged, it may be easiest to move this work there at that time. Also somebody needs to implement that, and putting this into an existing STW pause for now seems to be the easiest way to do it. But that is just my opinion, we can certainly help Rodrigo and Ruslan implementing that if they want it. Note that having an additional "background worker" thread adds (some) overhead in terms of startup and shutdown (time, memory) that some people might want to avoid too. I am not saying that this would be prohibitive, just that there is that concern. This is my current view on that particular problem :) > > This would not be particularly hard to implement either, but only > > seems extra work: as you might still want to compact the heap in > > some way (this is optional) by e.g. doing a marking + mixed cycle > > (and then wait for the ShenandoahUncommitDelay), so you may as well > > uncommit within these pauses already. > > Periodic GC does indeed help to compact things, but that is a second-order > concern. Even uncommitting the regions that became free after > the cycle provides a very substantial win. I did mention that this additional heap compaction is fully optional. However in case of containers you might want to take these extra space savings with you, assuming that memory is the main constraint to put more containers on the same machine, not CPU time. As you mentioned, one could do both at the same time. > The beauty of Shenandoah-style uncommit is that it becomes orthogonal > to the GC cycles themselves: you can first implement uncommits, and > then figure out in which form and shape to issue periodic GC cycles > to knock out the rest of the garbage.
We might not even bother with > that part, and just instruct users to say > -XX:+ExplicitGCInvokesConcurrent, and wait for System.gc() to happen > for the most compaction. > > The caveat with piggybacking on pauses is that you don't really want > to uncommit the memory that would be committed back by allocations > in an active application. It does make sense to do this on > explicit GC though! You can do this under the pause too, but let's be > mindful about the allocation costs, especially if we are about to > commit memory back while holding the allocation lock :) There are already some safeguards in place to prevent overzealous uncommit in all collectors. I am not sure that leaving heap compaction to an explicit (potentially) concurrent GC is what is wanted here. Thanks, Thomas From robbin.ehn at oracle.com Mon Jun 4 15:12:13 2018 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Mon, 4 Jun 2018 17:12:13 +0200 Subject: RFR(L): 8195097: Make it possible to process StringTable outside safepoint In-Reply-To: <79bbe8d7-3ef8-683c-9928-1db6b0230814@oracle.com> References: <0d09f2ad-47b9-2fca-e5ec-17547e923cbb@oracle.com> <700a4c18-06ea-97c4-8be9-59c5f0503f08@oracle.com> <79bbe8d7-3ef8-683c-9928-1db6b0230814@oracle.com> Message-ID: <27954e21-0f17-70a0-6171-bd18842b32e7@oracle.com> Hi Ioi, On 2018-05-30 01:29, Ioi Lam wrote: > Hi Robbin, > > I've looked at the changes related to CDS. It looks OK to me but I have the > following comments: > > One problem I found with reviewing stringTable.cpp is that some of the functions > are moved around, which means it's hard to match the "before" and "after" code > in the diff. For example: StringTable::verify has been moved to below > StringTable::dump(). > > Would it be possible for you to rearrange the new code to minimize the size of > the diffs, so it's easier to see what has been changed? > Tried to make this a bit better.
> Also, (although this was an issue in the original code), could you rename > StringTable::copy_shared_string to StringTable::copy_shared_string_table? This > function copies not just one string, but the entire table. Done, I'll send a new version to the rfr mail. Thanks, Robbin > > Thanks > > - Ioi > > > On 5/29/18 2:13 AM, Robbin Ehn wrote: >> Hi, >> >> For reference here is the ZGC patch: >> http://mail.openjdk.java.net/pipermail/zgc-dev/2018-May/000354.html >> >> I removed two methods I had left in for an older ZGC patch and fixed some >> minor issues in stringTable.hpp. >> Inc: http://cr.openjdk.java.net/~rehn/8195097/v1/inc/webrev/ >> Full: http://cr.openjdk.java.net/~rehn/8195097/v1/full/webrev/ >> >> /Robbin >> >> On 05/28/2018 03:19 PM, Robbin Ehn wrote: >>> Hi all, please review. >>> >>> This implements the StringTable with the ConcurrentHashtable for managing the >>> strings using oopStorage for backing the actual oops via WeakHandles. >>> >>> The unlinking and freeing of hashtable nodes is moved outside the safepoint, >>> which means GC only needs to walk the oopStorage, either concurrently or in a >>> safepoint. Walking oopStorage is also faster so there is a good effect on all >>> safepoints visiting the oops. >>> >>> The unlinking and freeing happens during inserts when dead weak oops are >>> encountered in that bucket. In any normal workload the stringtable self-cleans >>> without needing any additional cleaning. Cleaning/unlinking can also be done >>> concurrently via the ServiceThread, it is started when we have a high "dead >>> factor". E.g. application have a lot of interned string removes the references >>> and never interns again. The ServiceThread also concurrently grows the table if >>> "load factor" is high. Both the cleaning and growing take care to not prolonging >>> time to safepoint, at the cost of some speed.
>>> >>> Kitchensink24h, multiple tier1-5 with no issue that I can relate to this >>> changeset, various benchmark such as JMH, specJBB2015. >>> >>> Issue: https://bugs.openjdk.java.net/browse/JDK-8195097 >>> Webrev: http://cr.openjdk.java.net/~rehn/8195097/v0/webrev/ >>> >>> Thanks, Robbin > From robbin.ehn at oracle.com Mon Jun 4 15:15:49 2018 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Mon, 4 Jun 2018 17:15:49 +0200 Subject: RFR(L): 8195097: Make it possible to process StringTable outside safepoint In-Reply-To: <14230A21-5B54-451C-884C-E9E922967A25@oracle.com> References: <0d09f2ad-47b9-2fca-e5ec-17547e923cbb@oracle.com> <14230A21-5B54-451C-884C-E9E922967A25@oracle.com> Message-ID: Hi Jiangli, On 2018-05-30 21:47, Jiangli Zhou wrote: > Hi Robbin, > > I mainly focused on the archived string part during review. Here are my comments, which are minor issues mostly. > > - stringTable.hpp > > Archived string is only supported with INCLUDE_CDS_JAVA_HEAP. Please add NOT_CDS_JAVA_HEAP_RETURN_(NULL) for lookup_shared() and create_archived_string() below so their call sites are handled properly when java heap object archiving is not supported. > > 153 oop lookup_shared(jchar* name, int len, unsigned int hash); > 154 static oop create_archived_string(oop s, Thread* THREAD); > Fixed > - stringTable.cpp > > How about renaming CopyArchive to CopyToArchive, so it's more descriptive? Fixed > > Looks like the 'bool' return type is not needed since we always return with true and the result is not checked. How about changing it to return 'void'? > > 774 bool operator()(WeakHandle* val) { > The scanning done by the hashtable is stopped if we return false here. So the return value is used by the hashtable to know if it should continue the walk over all items. > > - genCollectedHeap.cpp > > Based on the assert at line 863, looks like 'par_state_string' is not NULL when 'scope->n_threads() > 1'.
Maybe the if condition at line 865 could be simplified to be just 'if (scope->n_threads() > 1)'? > > 862 // Either we should be single threaded or have a ParState > 863 assert((scope->n_threads() <= 1) || par_state_string != NULL, "Parallel but not ParState"); > 864 > 865 if (scope->n_threads() > 1 && par_state_string != NULL) { Fixed, I'll send a new version to the rfr mail. Thanks, Robbin > > > Thanks, > Jiangli > >> On May 28, 2018, at 6:19 AM, Robbin Ehn wrote: >> >> Hi all, please review. >> >> This implements the StringTable with the ConcurrentHashtable for managing the >> strings using oopStorage for backing the actual oops via WeakHandles. >> >> The unlinking and freeing of hashtable nodes is moved outside the safepoint, >> which means GC only needs to walk the oopStorage, either concurrently or in a >> safepoint. Walking oopStorage is also faster so there is a good effect on all >> safepoints visiting the oops. >> >> The unlinking and freeing happens during inserts when dead weak oops are >> encountered in that bucket. In any normal workload the stringtable self-cleans >> without needing any additional cleaning. Cleaning/unlinking can also be done >> concurrently via the ServiceThread, it is started when we have a high "dead >> factor". E.g. application have a lot of interned string removes the references >> and never interns again. The ServiceThread also concurrently grows the table if >> "load factor" is high. Both the cleaning and growing take care to not prolonging >> time to safepoint, at the cost of some speed. >> >> Kitchensink24h, multiple tier1-5 with no issue that I can relate to this >> changeset, various benchmark such as JMH, specJBB2015.
>> >> Issue: https://bugs.openjdk.java.net/browse/JDK-8195097 >> Webrev: http://cr.openjdk.java.net/~rehn/8195097/v0/webrev/ >> >> Thanks, Robbin > From robbin.ehn at oracle.com Mon Jun 4 15:17:08 2018 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Mon, 4 Jun 2018 17:17:08 +0200 Subject: RFR(L): 8195097: Make it possible to process StringTable outside safepoint In-Reply-To: <4D8E57FA-AF13-4246-AE5B-6A310CB36C10@oracle.com> References: <0d09f2ad-47b9-2fca-e5ec-17547e923cbb@oracle.com> <4D8E57FA-AF13-4246-AE5B-6A310CB36C10@oracle.com> Message-ID: Hi Gerard, On 2018-05-31 22:06, Gerard Ziemski wrote: > hi Robbin, > > I just started reviewing and need another day for more in-depth review, but right now I wanted to start a conversation about 2 runtime flags that we seem to no longer use in the new code: > > > #1 VerifyStringTableAtExit > > I think we should be able to add the code back that verifies the table in this new implementation, right? (i.e. StringTable::verify_and_compare_entries) Added it back. > > > #2 StringTableSize > > We either need to deprecate it, or keep, but change the behavior (to take a power of 2). If we change the behavior, then we need to document it in release notes. I personally favor keeping the option, despite the fact that the new table is capable of resizing on its own. Added support for this, thanks for noticing I missed this. Thanks, Robbin > > > cheers > > > >> On May 28, 2018, at 8:19 AM, Robbin Ehn wrote: >> >> Hi all, please review. >> >> This implements the StringTable with the ConcurrentHashtable for managing the >> strings using oopStorage for backing the actual oops via WeakHandles. >> >> The unlinking and freeing of hashtable nodes is moved outside the safepoint, >> which means GC only needs to walk the oopStorage, either concurrently or in a >> safepoint. Walking oopStorage is also faster so there is a good effect on all >> safepoints visiting the oops.
>> >> The unlinking and freeing happens during inserts when dead weak oops are >> encountered in that bucket. In any normal workload the stringtable self-cleans >> without needing any additional cleaning. Cleaning/unlinking can also be done >> concurrently via the ServiceThread, it is started when we have a high "dead >> factor". E.g. application have a lot of interned string removes the references >> and never interns again. The ServiceThread also concurrently grows the table if >> "load factor" is high. Both the cleaning and growing take care to not prolonging >> time to safepoint, at the cost of some speed. >> >> Kitchensink24h, multiple tier1-5 with no issue that I can relate to this >> changeset, various benchmark such as JMH, specJBB2015. >> >> Issue: https://bugs.openjdk.java.net/browse/JDK-8195097 >> Webrev: http://cr.openjdk.java.net/~rehn/8195097/v0/webrev/ >> >> Thanks, Robbin > From robbin.ehn at oracle.com Mon Jun 4 15:57:27 2018 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Mon, 4 Jun 2018 17:57:27 +0200 Subject: RFR(L): 8195097: Make it possible to process StringTable outside safepoint In-Reply-To: <3B1FF3B8-6A10-401A-A11D-3D027DD59702@oracle.com> References: <0d09f2ad-47b9-2fca-e5ec-17547e923cbb@oracle.com> <3B1FF3B8-6A10-401A-A11D-3D027DD59702@oracle.com> Message-ID: <0285aa65-f990-7281-52d7-968b94d4ee1e@oracle.com> Hi Gerard, On 2018-06-01 23:46, Gerard Ziemski wrote: > Hi, > > Awesome job Robbin! > > I especially like that we now have a self-resizing StringTable (though I don't like the power of 2 size constraint, but understand why that needs to be so). Great, thanks! > > I do have some feedback, questions and comments below: > > > #1 The initial table size according to this code: > > #define START_SIZE 16 > > _current_size = ((size_t)1) << START_SIZE; > > Hence, it is 65536, but in the old code we had: > > const int defaultStringTableSize = NOT_LP64(1009) LP64_ONLY(60013); > > Which on non-64bit architecture was quite a bit smaller.
Is it OK for us not to worry about non 64bit architectures now? Fixed! > > BTW. It will be extremely interesting to see whether we can lower the initial size, now that we can grow the table. Should we file a followup issue, so we don't forget? Yes, we should, but this involves some start-up benchmarking to see if 'normal' workloads will get affected. > > > #2 Why we have: > > volatile size_t _items; > DEFINE_PAD_MINUS_SIZE(1, 64, sizeof(volatile size_t)); > volatile size_t _uncleaned_items; > DEFINE_PAD_MINUS_SIZE(2, 64, sizeof(volatile size_t)); > > and not > > volatile size_t _items; > DEFINE_PAD_MINUS_SIZE(1, DEFAULT_CACHE_LINE_SIZE, sizeof(volatile size_t)); > volatile size_t _uncleaned_items; > DEFINE_PAD_MINUS_SIZE(2, DEFAULT_CACHE_LINE_SIZE, sizeof(volatile size_t)); > Fixed (I strongly think DEFAULT_CACHE_LINE_SIZE should be 64 and not 128, that's where 64 comes from) > > #3 Extraneous space here, i.e. " name": > > return StringTable::the_table()->do_lookup( name, len, hash); > Fixed > > #4 Instead of: > > double fact = StringTable::get_load_factor(); > double dead_fact = StringTable::get_dead_factor(); > > where "fact" is an actual word on its own, can we consider using full names, ex: > > double load_factor = StringTable::get_load_factor(); > double dead_factor = StringTable::get_dead_factor(); > Fixed > > #5 In "static int literal_size(oop obj)": > > a) Why do we need the "else" clause? Will it ever be taken? > > } else { > return obj->size(); > } > > b) Why isn't "java_lang_String::value(obj)->size()" enough in: > > } else if (obj->klass() == SystemDictionary::String_klass()) { > return (obj->size() + java_lang_String::value(obj)->size()) * HeapWordSize; > } > This is a copy of the method from the old hashtable. I think you are correct, but I'll let it be for now. > #7 Isn't "#define PREF_AVG_LIST_LEN 2" a bit too aggressive? Where did the value come from? The value comes from testing; cleaning and resizing benefit from short chains.
Since concurrent resizing is right now done with one thread, and some care about safepoints is taken, it's not a particularly fast operation. Therefore it is much preferable to start it before we get long chains. > > > #8 Should we consider adding runtime flag options to control when resizing/cleanup triggers? (i.e. PREF_AVG_LIST_LEN, CLEAN_DEAD_HIGH_WATER_MARK) > I prefer not, we already have too many flags. > > #9 Why do we only resize up, if we can also resize down? Shrinking is a bit tricky heuristics-wise, but we could do it in a follow-up, yes. > > > #10 Do we know the impact of the new table on memory usage (at the default initial size)? It has the same size per bucket/node as the old hashtable. The difference is the ~4000 more buckets at the default size. I think the footprint benchmark showed no difference. > > > #11 You mention various benchmarks were run with no issues, which I take to mean no regressions, but are there any statistically significant improvements shown that you can report? The most important one is a ~2% increase in critical-jOPS in specJBB2015. Regarding the interning itself there are too many parameters to give a simple answer. The indirection to oopStorage has a cost: for every node we need to call equals on, we get a performance penalty vs. the old table, which is also a reason to do aggressive resize. A plain lookup can under optimal conditions thus be 25% faster in the old hashtable. The fence in the global-counter read-side and access API also cost. > > > #12 I mentioned this to you off the list, but in case anyone else tries to run the code - the changes don't build (which you pointed out to me is due to 8191798). Once things build again, I'd like an opportunity to be able to run the code for myself to check it out, and then report back with final review. This has been resolved now, I'll send a new version to the rfr mail. Thanks, Robbin > > > Cheers > >> On May 28, 2018, at 8:19 AM, Robbin Ehn wrote: >> >> Hi all, please review.
>> >> This implements the StringTable with the ConcurrentHashtable for managing the >> strings using oopStorage for backing the actual oops via WeakHandles. >> >> The unlinking and freeing of hashtable nodes is moved outside the safepoint, >> which means GC only needs to walk the oopStorage, either concurrently or in a >> safepoint. Walking oopStorage is also faster so there is a good effect on all >> safepoints visiting the oops. >> >> The unlinking and freeing happens during inserts when dead weak oops are >> encountered in that bucket. In any normal workload the stringtable self-cleans >> without needing any additional cleaning. Cleaning/unlinking can also be done >> concurrently via the ServiceThread, it is started when we have a high "dead >> factor". E.g. application have a lot of interned string removes the references >> and never interns again. The ServiceThread also concurrently grows the table if >> "load factor" is high. Both the cleaning and growing take care to not prolonging >> time to safepoint, at the cost of some speed. >> >> Kitchensink24h, multiple tier1-5 with no issue that I can relate to this >> changeset, various benchmark such as JMH, specJBB2015. >> >> Issue: https://bugs.openjdk.java.net/browse/JDK-8195097 >> Webrev: http://cr.openjdk.java.net/~rehn/8195097/v0/webrev/ >> >> Thanks, Robbin > From robbin.ehn at oracle.com Mon Jun 4 15:59:46 2018 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Mon, 4 Jun 2018 17:59:46 +0200 Subject: RFR(L): 8195097: Make it possible to process StringTable outside safepoint In-Reply-To: <0d09f2ad-47b9-2fca-e5ec-17547e923cbb@oracle.com> References: <0d09f2ad-47b9-2fca-e5ec-17547e923cbb@oracle.com> Message-ID: Hi, Here is an updated version after the reviews: Inc: http://cr.openjdk.java.net/~rehn/8195097/v2/inc/webrev/ Full: http://cr.openjdk.java.net/~rehn/8195097/v2/full/webrev/ Passed tier 1-3. /Robbin On 2018-05-28 15:19, Robbin Ehn wrote: > Hi all, please review.
> > This implements the StringTable with the ConcurrentHashtable for managing the > strings using oopStorage for backing the actual oops via WeakHandles. > > The unlinking and freeing of hashtable nodes is moved outside the safepoint, > which means GC only needs to walk the oopStorage, either concurrently or in a > safepoint. Walking oopStorage is also faster so there is a good effect on all > safepoints visiting the oops. > > The unlinking and freeing happens during inserts when dead weak oops are > encountered in that bucket. In any normal workload the stringtable self-cleans > without needing any additional cleaning. Cleaning/unlinking can also be done > concurrently via the ServiceThread, it is started when we have a high "dead > factor". E.g. application have a lot of interned string removes the references > and never interns again. The ServiceThread also concurrently grows the table if > "load factor" is high. Both the cleaning and growing take care to not prolonging > time to safepoint, at the cost of some speed. > > Kitchensink24h, multiple tier1-5 with no issue that I can relate to this > changeset, various benchmark such as JMH, specJBB2015. > > Issue: https://bugs.openjdk.java.net/browse/JDK-8195097 > Webrev: http://cr.openjdk.java.net/~rehn/8195097/v0/webrev/ > > Thanks, Robbin From gerard.ziemski at oracle.com Mon Jun 4 15:02:15 2018 From: gerard.ziemski at oracle.com (Gerard Ziemski) Date: Mon, 4 Jun 2018 10:02:15 -0500 Subject: RFR(L): 8195097: Make it possible to process StringTable outside safepoint In-Reply-To: References: <0d09f2ad-47b9-2fca-e5ec-17547e923cbb@oracle.com> <3B1FF3B8-6A10-401A-A11D-3D027DD59702@oracle.com> Message-ID: <1E7E0191-7DAB-484E-B275-A09393AFEB0F@oracle.com> > On Jun 4, 2018, at 8:22 AM, Per Liden wrote: > >> } >> #6 Can we rename "StringtableDCmd" to "StringtableDumpCmd"? > > Note that "D" here stands for "Diagnostic" and not "Dump", and by convention all these types of classes end with "DCmd".
Ah, thanks for the note, but in that case, how about "StringTableDiagnosticCmd" to make it more readable? cheers From rwestrel at redhat.com Mon Jun 4 12:04:07 2018 From: rwestrel at redhat.com (Roland Westrelin) Date: Mon, 04 Jun 2018 14:04:07 +0200 Subject: RFR (S) 8202705: ARM32 build crashes on long JavaThread offsets In-Reply-To: <4840be99-ddbb-16f0-a9cb-31d7efcf0d02@bell-sw.com> References: <4840be99-ddbb-16f0-a9cb-31d7efcf0d02@bell-sw.com> Message-ID: Hi Boris, > http://cr.openjdk.java.net/~bulasevich/8202705/webrev.01/ That looks good to me. Roland. From kirk.pepperdine at gmail.com Mon Jun 4 18:27:42 2018 From: kirk.pepperdine at gmail.com (Kirk Pepperdine) Date: Mon, 4 Jun 2018 20:27:42 +0200 Subject: Question regarding JEP 8204089 Timely Reducing Unused Committed Memory In-Reply-To: <1f933ce2b62f7ba0a1ebcb45c3bd33404958c3e0.camel@oracle.com> References: <6f9228a19d3c4deca170b2032b887dde@clsa.com> <597AC801-7DD0-4E3E-BDBE-8A4FA9860501@gmail.com> <1f933ce2b62f7ba0a1ebcb45c3bd33404958c3e0.camel@oracle.com> Message-ID: <756682AA-C008-4874-9168-138277CECB0C@gmail.com> Thank you Thomas for this detailed description. Kind regards, Kirk > On Jun 4, 2018, at 2:26 PM, Thomas Schatzl wrote: > > Hi all, > > On Fri, 2018-06-01 at 16:44 +0300, Ruslan Synytsky wrote: >> Hi Kirk, thank you for the additional important highlights. >> >> On 1 June 2018 at 08:21, Kirk Pepperdine >> wrote: >>> Hi, > > [...] > >>> In an era when the GC engineers are working so very hard to make as >>> much of the collection process as concurrent as possible, it just >>> feels very wrong to rely on a huge STW event to get something done. >>> In that spirit, it behoves us to explore how committed memory may be >>> released at the tail end of a normally triggered GC cycle. I >>> believe at the end of a mixed cycle was mentioned.
I believe this >>> would give those that want/need to minimize their JVM's footprint >>> an even greater cost advantage than attempting to reduce the >>> footprint at some (un?)random point in time. >>> >> >> How hard/expensive is it to implement in G1 something similar to what >> Shenandoah does? > > Let me detail a bit how G1 triggers collections and when it releases > memory: > > For the first part, let's make a distinction between the trigger for > young collections and the start of old generation space reclamation > ("marking"). > > Young collections are at the moment triggered by space exhaustion in > the young generation. > > Old gen reclamation is currently only triggered by old generation space > going over a given occupancy threshold. This can be either caused by > young collections copying objects into the old generation ("promotion") > or humongous (large) object allocation. > > In both cases, G1 triggers a so-called initial-mark young collection, > i.e. a regular young collection with some additional setup for the > concurrent marking. > Concurrent marking results in Remark and Cleanup pauses that are > scheduled on a time basis. During that time, when the young generation > is exhausted, regular young collections are performed. > > After the Cleanup pause, there is some "last" regular young generation > pause where GC and marking kind of sync up. After that one, again > caused by exhaustion of the young gen, so-called mixed collections are > triggered. These are young generation collections with minimal young > gen size and some old generation regions added. > > These mixed collections continue (mostly) until G1 thinks enough memory > has been freed. > > Basically, to release memory back to the operating system, the > programming effort would be to call G1CollectedHeap::shrink() at the > end of the appropriate pause.
> > If you were looking into effort, there are two caveats (compared to > simply triggering a full gc): > > - full gc currently, in addition to compaction, also throws away > SoftReferences (cached objects), and cleans out some references from VM > internal data structures to the Java heap, making all of these > freeable. > Assuming you want to have similar impact regarding soft references and > internal data structures, you need to start a concurrent cycle, and > wait until the Remark pause with your shrinking request (one can still > try without!). > Only the following mixed GCs compact the Java heap (at the moment G1 > will also not start mixed GCs without a previous marking cycle); but > the Remark pause will free completely empty regions. > > - if you also want to compact old gen, the best time for the shrink() > call would probably be the end of the mixed gc phase. However garbage > collections are currently tied to exhaustion of young gen. So if your > application is truly or almost idle, and does not allocate anything, > there will be no GC pauses for potentially a long time. > > One would need to start the mixed gc based on some timeout. Note that > there are already some timers running for some VM cleanup tasks. > > To improve the impact of the compaction and shrinking you might also > want to tweak some internal variables for the duration of this "idle > cycle". > > Btw, one step in that direction would be to generally just attempt to > shrink the heap at the end of mixed gc, e.g. https://bugs.openjdk.java.net/browse/JDK-6490394 . > > Implementing something like Shenandoah, i.e. uncommit the regions that > stayed empty for more than ShenandoahUncommitDelay milliseconds would > mean iterating through all regions and checking if they are "old" enough > to uncommit (and do so) at regular intervals in some STW pause. When > allocating, also give them some sort of timestamp.
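In code, the timestamp-based uncommit scheme Thomas sketches could look roughly like the following; Region, its fields, and the delay parameter are illustrative stand-ins, not HotSpot's actual data structures:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical region bookkeeping, not HotSpot's HeapRegion.
struct Region {
    bool empty;              // no live objects in the region
    bool committed;          // backing memory is committed
    uint64_t emptied_at_ms;  // timestamp recorded when it became empty
};

// Run at some regular pause: uncommit regions that have stayed empty
// longer than the delay (in the spirit of ShenandoahUncommitDelay).
size_t uncommit_stale_regions(std::vector<Region>& regions,
                              uint64_t now_ms,
                              uint64_t uncommit_delay_ms) {
    size_t uncommitted = 0;
    for (Region& r : regions) {
        if (r.committed && r.empty &&
            now_ms - r.emptied_at_ms >= uncommit_delay_ms) {
            r.committed = false;  // stand-in for the actual OS uncommit call
            ++uncommitted;
        }
    }
    return uncommitted;
}
```

As the discussion notes, the interval at which such a scan runs has to be guaranteed separately, either by piggy-backing on existing regular pauses or by forcing one.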
> Again, you need to make sure that the regions are looked through in > some regular interval (either piggy-backing on some existing regular > pauses or forcing one). > > This would not be particularly hard to implement either, but only seems > extra work: as you might still want to compact the heap in some way > (this is optional) by e.g. doing a marking + mixed cycle (and then wait > for the ShenandoahUncommitDelay), so you may as well uncommit within > these pauses already. > > ------------ > > Some other unanswered question so far has been up to what degree memory > will be freed during this time: I guess at most until -Xms? > > Thanks, > Thomas > From kim.barrett at oracle.com Mon Jun 4 18:31:53 2018 From: kim.barrett at oracle.com (Kim Barrett) Date: Mon, 4 Jun 2018 14:31:53 -0400 Subject: RFR: 8203319: JDK-8201487 disabled too much queue balancing In-Reply-To: References: <3CFCA83E-82A5-4CA1-BA07-F34BFD0C689C@oracle.com> <670a9fdf-a719-1aae-aea9-6316a3c025f4@oracle.com> Message-ID: > On May 31, 2018, at 3:34 PM, Kim Barrett wrote: > >> On May 31, 2018, at 4:41 AM, Stefan Johansson wrote: >> >> >> >> On 2018-05-30 19:33, Kim Barrett wrote: >>> Please review this change to the ReferenceProcessor's test for whether >>> to balance a set of queues before processing them with multiple >>> threads. The change to that test made by JDK-8201487 doesn't perform >>> balancing for some (potential, see Testing below) states where not >>> doing so will result in some discovered References not being >>> processed. In particular, there are cases where we must ignore >>> -XX:-ParallelRefProcBalancingEnabled and balance anyway. >>> We also now avoid balancing in some cases where we know the set is >>> already balanced, even with -XX:+ParallelRefProcBalancingEnabled. >>> CR: >>> https://bugs.openjdk.java.net/browse/JDK-8203319 >>> Webrev: >>> http://cr.openjdk.java.net/~kbarrett/8203319/open.00/ >> Looks good, >> Stefan > > Thanks, Stefan.
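For readers following along, the queue balancing being discussed can be illustrated generically; this is a simplified sketch, not ReferenceProcessor's actual implementation (which works on linked discovered-lists rather than copying containers):

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Generic illustration of balancing: collect all items and deal them
// back out round-robin across the first `active` queues, so each of
// the active worker threads gets a roughly equal share.
template <typename T>
void balance_queues(std::vector<std::deque<T>>& queues, size_t active) {
    std::deque<T> all;
    for (std::deque<T>& q : queues) {
        all.insert(all.end(), q.begin(), q.end());
        q.clear();
    }
    for (size_t i = 0; !all.empty(); ++i) {
        queues[i % active].push_back(all.front());
        all.pop_front();
    }
}
```

The correctness concern in this thread is exactly the `active` parameter: if fewer threads will process than discovered into the queues, skipping the balancing step can strand references in queues no thread ever drains.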
The new need_balance_queues function has an optimization to avoid balancing in a case where we already know the queues are balanced. This assumes it's being called for initial balancing after discovery, and not for some re-balancing after a processing phase. But that's problematic for JDK-8043575 (recently RFR'ed). And if we're already balanced, balance_queues doesn't have very much to do. (And eventually the optimization would be rendered moot anyway, by the elimination of balancing; see JDK-8202328.) So I'm taking that bit out. New webrevs: full: http://cr.openjdk.java.net/~kbarrett/8203319/open.01/ incr: http://cr.openjdk.java.net/~kbarrett/8203319/open.01.inc/ From kim.barrett at oracle.com Mon Jun 4 20:08:40 2018 From: kim.barrett at oracle.com (Kim Barrett) Date: Mon, 4 Jun 2018 16:08:40 -0400 Subject: RFR(M): 8154736: enhancement of cmpxchg and copy_to_survivor for ppc64 In-Reply-To: References: <5625a595-1165-8d48-afbd-8229cdc4ac07@oracle.com> <7e7fc484287e4da4926176e0b5ae1b64@sap.com> <339D50B6-09D5-4342-A687-918A9A096B39@oracle.com> <6D2268EC-C1E7-4C90-BCD3-90D02D21FA08@oracle.com> <3e05cc86c9d4406d8a9875b705fbf1fc@sap.com> <5B0D40F7.5050807@oracle.com> <4dddb18526a745cc83941a0f58af77f5@sap.com> Message-ID: > On Jun 1, 2018, at 11:08 AM, Michihiro Horie wrote: > > Hi Kim, Erik, and Martin, > > Thank you very much for reminding me that an acquire barrier in the else-statement for ?!test_mark->is_marked()? is necessary under the criteria of not relying on the consume. > > I uploaded a new webrev : http://cr.openjdk.java.net/~mhorie/8154736/webrev.13/ > This change uses forwardee_acquire(), which would generate better code on ARM. > > Necessary barriers are located in all the paths in copy_to_survivor_space, and the returned new_obj can be safely handled in the caller sites. > > I measured SPECjbb2015 with the latest webrev. Critical-jOPS improved by 5%. 
Since my previous measurement with implicit consume showed 6% improvement, adding acquire barriers degraded the performance a little, but 5% is still good enough. Looks good. > > > Best regards, > -- > Michihiro, > IBM Research - Tokyo > > "Doerr, Martin" ---2018/05/30 16:18:09---Hi Erik, the current implementation works on PPC because of "MP+sync+addr". > > From: "Doerr, Martin" > To: "Erik ?sterlund" , Kim Barrett , Michihiro Horie , "Andrew Haley (aph at redhat.com)" > Cc: "david.holmes at oracle.com" , "hotspot-gc-dev at openjdk.java.net" , "ppc-aix-port-dev at openjdk.java.net" > Date: 2018/05/30 16:18 > Subject: RE: RFR(M): 8154736: enhancement of cmpxchg and copy_to_survivor for ppc64 > > > > > Hi Erik, > > the current implementation works on PPC because of "MP+sync+addr". > So we already rely on ordering of "load volatile field" + "implicit consume" on the reader's side. We have never seen any issues related to this with the compilers we have been using during the ~10 years the PPC implementation exists. > > PPC supports "MP+lwsync+addr" the same way, so Michihiro's proposal doesn't make it unreliable for PPC. > > But I'm ok with evaluating acquire barriers although they are not required by the PPC/ARM memory models. > ARM/aarch64 will also be affected when the o->forwardee uses load_acquire. So somebody should check the impact. If it is not acceptable we may need to introduce explicit consume. > > Implicit consume is also bad in shared code because somebody may want to run it on DEC Alpha. > > Thanks and best regards, > Martin > > > -----Original Message----- > From: Erik ?sterlund [mailto:erik.osterlund at oracle.com] > Sent: Dienstag, 29. 
Mai 2018 14:01 > To: Doerr, Martin ; Kim Barrett ; Michihiro Horie > Cc: david.holmes at oracle.com; Gustavo Bueno Romero ; hotspot-dev at openjdk.java.net; hotspot-gc-dev at openjdk.java.net; ppc-aix-port-dev at openjdk.java.net > Subject: Re: RFR(M): 8154736: enhancement of cmpxchg and copy_to_survivor for ppc64 > > Hi Martin and Michihiro, > > On 2018-05-29 12:30, Doerr, Martin wrote: > > Hi Kim, > > > > I'm trying to understand how this is related to Michihiro's change. The else path of the initial test is not affected by it AFAICS. > > So it sounds like a request to fix the current implementation in addition to what his original intend was. > > I think we are just trying to nail down the correct fencing and just go > for that. And yes, this is arguably a pre-existing problem, but in a > race involving the very same accesses that we are changing the fencing > for. So it is not completely unrelated I suppose. > > In particular, hotspot has code that assumes that if you on the writer > side issue a full fence before publishing a pointer to newly initialized > data, then the initializing stores and their side effects should be > globally "visible" across the system before the pointer to it is > published, and hence elide the need for acquire on the loading side, > without relying on retained data dependencies on the loader side. I > believe this code falls under that category. It is assumed that the > leading fence of the CAS publishing the forwarding pointer makes the > initializing stores globally observable before publishing a pointer to > the initialized data, hence assuming that any loads able to observe the > new pointer would not rely on acquire or data dependent loads to > correctly read the initialized data. 
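The fix the thread converges on (keeping a release ordering on the publishing CAS and using an explicit acquire load on the reader side) can be sketched in portable C++11 atomics; Obj here is an illustrative stand-in, not HotSpot's oop machinery:

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Illustrative object with a forwarding pointer, not a HotSpot oop.
struct Obj {
    int payload = 0;
    std::atomic<Obj*> forwardee{nullptr};
};

// Writer: release ordering makes the initializing stores to *new_obj
// visible before the pointer itself can be observed by other threads.
bool publish_forwardee(Obj& from, Obj* new_obj) {
    Obj* expected = nullptr;
    return from.forwardee.compare_exchange_strong(
        expected, new_obj,
        std::memory_order_release, std::memory_order_relaxed);
}

// Reader: acquire pairs with the release CAS, so the forwardee's
// contents can be read without relying on implicit consume (data
// dependency) ordering.
Obj* read_forwardee(Obj& from) {
    return from.forwardee.load(std::memory_order_acquire);
}
```

This corresponds to the release/acquire pairing proposed below; the `forwardee_acquire()` accessor in the webrev plays the role of `read_forwardee` here.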
> > Unfortunately, this is not reliable in the IRIW case, as per the litmus > test "MP+sync+ctrl" as described in "Understanding POWER > multiprocessors" (https://dl.acm.org/citation.cfm?id=1993520), as > opposed to "MP+sync+addr" that gets away with it because of the data > dependency (not IRIW). Similarly, an isync does the job too on the > reader side as shown in MP+sync+ctrlisync. So while what I believe was > the previous reasoning that the leading sync of the CAS would elide the > necessity for acquire on the reader side without relying on data > dependent loads (implicit consume), I think that assumption was wrong in > the first place and that we do indeed need explicit acquire (even with > the previous conservative CAS fencing) in this context to not rely on > implicit consume semantics generating the required data dependent loads > on the reader side. In practice though, the leading sync of the CAS has > been enough to generate the correct machine code. Now, with the leading > sync removed, we are increasing the possible holes in the generated > machine code due to this flawed reasoning. So it would be nice to do > something more sound instead that does not make such assumptions. > > > Anyway, I agree that implicit consume is not good. And I think it would be good to treat both o->forwardee() the same way. > > What about keeping memory_order_release for the CAS and using acquire for both o->forwardee()? > > The case in which the CAS succeeds is safe because the current thread has created new_obj so it doesn't need memory barriers to access it. > > Sure, that sounds good to me. > > Thanks, > /Erik > > > Thanks and best regards, > > Martin > > > > > > -----Original Message----- > > From: Kim Barrett [mailto:kim.barrett at oracle.com] > > Sent: Dienstag, 29.
Mai 2018 01:54 > > To: Michihiro Horie > > Cc: Erik Osterlund ; david.holmes at oracle.com; Gustavo Bueno Romero ; hotspot-dev at openjdk.java.net; hotspot-gc-dev at openjdk.java.net; ppc-aix-port-dev at openjdk.java.net; Doerr, Martin > > Subject: Re: RFR(M): 8154736: enhancement of cmpxchg and copy_to_survivor for ppc64 > > > >> On May 28, 2018, at 4:12 AM, Michihiro Horie wrote: > >> > >> Hi Erik, > >> > >> Thank you very much for your review. > >> > >> I understood that implicit consume should not be used in the shared code. Also, I believe performance degradation would be negligible even if we use acquire. > >> > >> New webrev uses memory_order_acq_rel: http://cr.openjdk.java.net/~mhorie/8154736/webrev.10 > > This is missing the acquire barrier on the else branch for the initial test, so fails to meet > > the previously described minimal requirements for even possibly being sufficient. Any > > analysis of weakening the CAS barriers must consider that test and successor code. > > > > In the analysis, it?s not just the lexically nearby debugging / logging code that needs to be > > considered; the forwardee is being returned to caller(s) that will presumably do something > > with that object. > > > > Since the whole point of this discussion is performance, any proposed change should come > > with performance information. > > From sangheon.kim at oracle.com Mon Jun 4 21:38:36 2018 From: sangheon.kim at oracle.com (sangheon.kim at oracle.com) Date: Mon, 4 Jun 2018 14:38:36 -0700 Subject: RFR(s): 8204094: assert(worker_i < _length) failed: Worker 15 is greater than max: 11 at ReferenceProcessorPhaseTimes In-Reply-To: <684dc326e23e8efb4d94e77bfec3f2b1d3248b27.camel@oracle.com> References: <7236ef64-533d-48ad-ecc2-c6bdd12ed4cd@oracle.com> <684dc326e23e8efb4d94e77bfec3f2b1d3248b27.camel@oracle.com> Message-ID: <71e6170c-f42a-2e15-21b7-72a715ea7f1d@oracle.com> Thank you for the review, Thomas. 
Sangheon On 6/4/18 1:30 AM, Thomas Schatzl wrote: > Hi, > > On Wed, 2018-05-30 at 19:19 -0700, sangheon.kim at oracle.com wrote: >> Hi all, >> >> Can I have reviews for this patch that fixes assertion failure at >> ReferenceProcessorPhaseTimes? >> >> [..] >> >> This patch is proposing to use maximum queue when create >> ReferenceProcessorPhaseTimes. >> >> CR: https://bugs.openjdk.java.net/browse/JDK-8204094 >> Webrev: http://cr.openjdk.java.net/~sangheki/8204094/webrev.0 >> Testing: hs-tier1-5 with/without ParallelRefProcEnabled > looks good > > Thomas From HORIE at jp.ibm.com Mon Jun 4 22:41:41 2018 From: HORIE at jp.ibm.com (Michihiro Horie) Date: Tue, 5 Jun 2018 07:41:41 +0900 Subject: RFR(M): 8154736: enhancement of cmpxchg and copy_to_survivor for ppc64 In-Reply-To: References: <5625a595-1165-8d48-afbd-8229cdc4ac07@oracle.com> <7e7fc484287e4da4926176e0b5ae1b64@sap.com> <339D50B6-09D5-4342-A687-918A9A096B39@oracle.com> <6D2268EC-C1E7-4C90-BCD3-90D02D21FA08@oracle.com> <3e05cc86c9d4406d8a9875b705fbf1fc@sap.com> <5B0D40F7.5050807@oracle.com> <4dddb18526a745cc83941a0f58af77f5@sap.com> Message-ID: >> On Jun 1, 2018, at 11:08 AM, Michihiro Horie wrote: >> >> Hi Kim, Erik, and Martin, >> >> Thank you very much for reminding me that an acquire barrier in the else-statement for ?!test_mark->is_marked()? is necessary under the criteria of not relying on the consume. >> >> I uploaded a new webrev : http://cr.openjdk.java.net/~mhorie/8154736/webrev.13/ >> This change uses forwardee_acquire(), which would generate better code on ARM. >> >> Necessary barriers are located in all the paths in copy_to_survivor_space, and the returned new_obj can be safely handled in the caller sites. >> >> I measured SPECjbb2015 with the latest webrev. Critical-jOPS improved by 5%. Since my previous measurement with implicit consume showed 6% improvement, adding acquire barriers degraded the performance a little, but 5% is still good enough. > >Looks good. Thanks a lot, Kim! 
Best regards, -- Michihiro, IBM Research - Tokyo From: Kim Barrett To: Michihiro Horie Cc: "Doerr, Martin" , "Andrew Haley (aph at redhat.com)" , "david.holmes at oracle.com" , "Erik ?sterlund" , "hotspot-gc-dev at openjdk.java.net" , "ppc-aix-port-dev at openjdk.java.net" Date: 2018/06/05 05:08 Subject: Re: RFR(M): 8154736: enhancement of cmpxchg and copy_to_survivor for ppc64 > On Jun 1, 2018, at 11:08 AM, Michihiro Horie wrote: > > Hi Kim, Erik, and Martin, > > Thank you very much for reminding me that an acquire barrier in the else-statement for ?!test_mark->is_marked()? is necessary under the criteria of not relying on the consume. > > I uploaded a new webrev : http://cr.openjdk.java.net/~mhorie/8154736/webrev.13/ > This change uses forwardee_acquire(), which would generate better code on ARM. > > Necessary barriers are located in all the paths in copy_to_survivor_space, and the returned new_obj can be safely handled in the caller sites. > > I measured SPECjbb2015 with the latest webrev. Critical-jOPS improved by 5%. Since my previous measurement with implicit consume showed 6% improvement, adding acquire barriers degraded the performance a little, but 5% is still good enough. Looks good. > > > Best regards, > -- > Michihiro, > IBM Research - Tokyo > > "Doerr, Martin" ---2018/05/30 16:18:09---Hi Erik, the current implementation works on PPC because of "MP+sync+addr". > > From: "Doerr, Martin" > To: "Erik ?sterlund" , Kim Barrett , Michihiro Horie , "Andrew Haley (aph at redhat.com)" > Cc: "david.holmes at oracle.com" , "hotspot-gc-dev at openjdk.java.net" , "ppc-aix-port-dev at openjdk.java.net" > Date: 2018/05/30 16:18 > Subject: RE: RFR(M): 8154736: enhancement of cmpxchg and copy_to_survivor for ppc64 > > > > > Hi Erik, > > the current implementation works on PPC because of "MP+sync+addr". > So we already rely on ordering of "load volatile field" + "implicit consume" on the reader's side. 
We have never seen any issues related to this with the compilers we have been using during the ~10 years the PPC implementation exists. > > PPC supports "MP+lwsync+addr" the same way, so Michihiro's proposal doesn't make it unreliable for PPC. > > But I'm ok with evaluating acquire barriers although they are not required by the PPC/ARM memory models. > ARM/aarch64 will also be affected when the o->forwardee uses load_acquire. So somebody should check the impact. If it is not acceptable we may need to introduce explicit consume. > > Implicit consume is also bad in shared code because somebody may want to run it on DEC Alpha. > > Thanks and best regards, > Martin > > > -----Original Message----- > From: Erik ?sterlund [mailto:erik.osterlund at oracle.com] > Sent: Dienstag, 29. Mai 2018 14:01 > To: Doerr, Martin ; Kim Barrett ; Michihiro Horie > Cc: david.holmes at oracle.com; Gustavo Bueno Romero ; hotspot-dev at openjdk.java.net; hotspot-gc-dev at openjdk.java.net; ppc-aix-port-dev at openjdk.java.net > Subject: Re: RFR(M): 8154736: enhancement of cmpxchg and copy_to_survivor for ppc64 > > Hi Martin and Michihiro, > > On 2018-05-29 12:30, Doerr, Martin wrote: > > Hi Kim, > > > > I'm trying to understand how this is related to Michihiro's change. The else path of the initial test is not affected by it AFAICS. > > So it sounds like a request to fix the current implementation in addition to what his original intend was. > > I think we are just trying to nail down the correct fencing and just go > for that. And yes, this is arguably a pre-existing problem, but in a > race involving the very same accesses that we are changing the fencing > for. So it is not completely unrelated I suppose. 
> > In particular, hotspot has code that assumes that if you on the writer > side issue a full fence before publishing a pointer to newly initialized > data, then the initializing stores and their side effects should be > globally "visible" across the system before the pointer to it is > published, and hence elide the need for acquire on the loading side, > without relying on retained data dependencies on the loader side. I > believe this code falls under that category. It is assumed that the > leading fence of the CAS publishing the forwarding pointer makes the > initializing stores globally observable before publishing a pointer to > the initialized data, hence assuming that any loads able to observe the > new pointer would not rely on acquire or data dependent loads to > correctly read the initialized data. > > Unfortunately, this is not reliable in the IRIW case, as per the litmus > test "MP+sync+ctrl" as described in "Understanding POWER > multiprocessors" ( https://dl.acm.org/citation.cfm?id=1993520 ), as > opposed to "MP+sync+addr" that gets away with it because of the data > dependency (not IRIW). Similarly, an isync does the job too on the > reader side as shown in MP+sync+ctrlisync. So while what I believe was > the previous reasoning that the leading sync of the CAS would elide the > necessity for acquire on the reader side without relying on data > dependent loads (implicit consume), I think that assumption was wrong in > the first place and that we do indeed need explicit acquire (even with > the precious conservative CAS fencing) in this context to not rely on > implicit consume semantics generating the required data dependent loads > on the reader side. In practice though, the leading sync of the CAS has > been enough to generate the correct machine code. Now, with the leading > sync removed, we are increasing the possible holes in the generated > machine code due to this flawed reasoning. 
So it would be nice to do > something more sound instead that does not make such assumptions. > > > Anyway, I agree with that implicit consume is not good. And I think it would be good to treat both o->forwardee() the same way. > > What about keeping memory_order_release for the CAS and using acquire for both o->forwardee()? > > The case in which the CAS succeeds is safe because the current thread has created new_obj so it doesn't need memory barriers to access it. > > Sure, that sounds good to me. > > Thanks, > /Erik > > > Thanks and best regards, > > Martin > > > > > > -----Original Message----- > > From: Kim Barrett [mailto:kim.barrett at oracle.com] > > Sent: Dienstag, 29. Mai 2018 01:54 > > To: Michihiro Horie > > Cc: Erik Osterlund ; david.holmes at oracle.com; Gustavo Bueno Romero ; hotspot-dev at openjdk.java.net; hotspot-gc-dev at openjdk.java.net; ppc-aix-port-dev at openjdk.java.net; Doerr, Martin > > Subject: Re: RFR(M): 8154736: enhancement of cmpxchg and copy_to_survivor for ppc64 > > > >> On May 28, 2018, at 4:12 AM, Michihiro Horie wrote: > >> > >> Hi Erik, > >> > >> Thank you very much for your review. > >> > >> I understood that implicit consume should not be used in the shared code. Also, I believe performance degradation would be negligible even if we use acquire. > >> > >> New webrev uses memory_order_acq_rel: http://cr.openjdk.java.net/~mhorie/8154736/webrev.10 > > This is missing the acquire barrier on the else branch for the initial test, so fails to meet > > the previously described minimal requirements for even possibly being sufficient. Any > > analysis of weakening the CAS barriers must consider that test and successor code. > > > > In the analysis, it?s not just the lexically nearby debugging / logging code that needs to be > > considered; the forwardee is being returned to caller(s) that will presumably do something > > with that object. 
> > > > Since the whole point of this discussion is performance, any proposed change should come > > with performance information. > > From kim.barrett at oracle.com Tue Jun 5 00:11:13 2018 From: kim.barrett at oracle.com (Kim Barrett) Date: Mon, 4 Jun 2018 20:11:13 -0400 Subject: RFR(M) 8043575: Dynamically parallelize reference processing work In-Reply-To: References: <8af7594e-d5f4-ca0d-de73-6547ea131f3e@oracle.com> Message-ID: > On Jun 1, 2018, at 5:48 PM, sangheon.kim at oracle.com wrote: > > Hi all, > > As webrev.0 is conflicting with webrev.0 of "8203319: JDK-8201487 disabled too much queue balancing"(out for review, but not yet pushed), I'm posting webrev.1. > > http://cr.openjdk.java.net/~sangheki/8043575/webrev.1 > http://cr.openjdk.java.net/~sangheki/8043575/webrev.1_to_0 The hookup of the changes into the various collectors seems okay. My comments are mostly focused on the ReferenceProcessor changes. ------------------------------------------------------------------------------ src/hotspot/share/gc/shared/referenceProcessorPhaseTimes.hpp 140 void set_phase_number(RefProcPhaseNumbers phase_number) { _phase_number = phase_number; } 141 RefProcPhaseNumbers phase_number() const { return _phase_number; } I think I dislike these and how they are used. And see other discussion below, suggesting they might not be needed. (Though they are similar to other related states. But maybe I dislike those too.) ------------------------------------------------------------------------------ src/hotspot/share/gc/shared/referenceProcessor.hpp 641 class RefProcMTDegreeAdjuster : public StackObj { ...
649 RefProcMTDegreeAdjuster(ReferenceProcessor* rp, ReferenceProcessorPhaseTimes* times, size_t ref_count); The times argument is here to provide information about what phase we're in. That seems really indirect, and I'd really prefer that information to be provided directly as arguments here. I think that might also remove the need for the new phase_number and set_phase_number functions for the phase times class. This also would flow through to ergo_proc_thread_count and use_max_threads. ------------------------------------------------------------------------------ src/hotspot/share/gc/shared/referenceProcessor.cpp 727 bool must_balance = mt_processing && need_balance_queues(refs_lists); need_balance_queues returns the right answer for an initial balancing, but not for a re-balance after some processing. Specifically, further reducing the mt-degree may require rebalancing, even if it wasn't previously needed. (This comment is assuming the updated version of need_balance_queues that I sent out today. The situation is worse without that.) ------------------------------------------------------------------------------ src/hotspot/share/gc/shared/referenceProcessor.cpp 787 // For discovery_is_atomic() is true, phase 3 should be processed to call do_void() from VoidClosure. 788 // For discovery_is_atomic() is false, phase 3 can be skipped if there are no references because do_void() is 789 // already called at phase 2. 790 if (!discovery_is_atomic()) { 791 if (total_count(refs_lists) == 0) { 792 return; 793 } 794 } The discovery_is_atomic stuff is left-over from before JDK-8203028. Phase2 now never calls the complete_gc closure, because it is never needed, regardless of the value of discovery_is_atomic. So checking that can be removed here. At least, that's what's expected by the ReferenceProcessing code. 
The atomic discovery case for phase2 has "always" (since at least the beginning of the mercurial age) ignored the complete_gc argument, and only used it when dealing with the concurrent discovery case. But G1CopyingKeepAliveClosure appears to have always passed the buck to the complete_gc closure, contrary to the expectations of the RP phase2 code. So G1 young/mixed collections only work because phase3 will eventually call the complete_gc closure (by each of the threads). (G1 concurrent and full gc's have a cutoff where no work is left for the complete_gc closure if the object is already marked. I spot checked other collectors, and at a quick glance it looks like the others meet the expectations of the ReferenceProcessor code; it's just G1 young/mixed collections that don't.) I'm not entirely sure what happens if the number of threads in phase3 is less than the number in phase2 in that case. I think any pending phase2 work associated with the missing threads will get stolen by phase3 working threads, but haven't verified that. I think the simplest fix for this is to change phase2 to always call the complete_gc closure after all. For some collections that's a (perhaps somewhat expensive) nop, but in the overall context of reference processing that's probably in the noise. And also update the relevant comment in phase2. The alternative would be to make G1CopyingKeepAliveClosure meet the (not explicitly stated in the API) ReferenceProcessor expectations, perhaps bypassing some of the G1ParScanThreadState machinery. ------------------------------------------------------------------------------ src/hotspot/share/gc/shared/referenceProcessor.cpp 800 RefProcMTDegreeAdjuster a(this, phase_times, ref_count); This is using ref_count, which was last set on line 759, and not updated after phase2. It should have been updated as part of the new code block around line 790.
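The RefProcMTDegreeAdjuster being reviewed is declared as a StackObj, i.e. an RAII-style scope guard constructed per phase. A minimal sketch of that pattern, with a hypothetical RefProc type standing in for ReferenceProcessor and the assumption that the guard restores the old degree on scope exit:

```cpp
#include <cassert>

// Hypothetical stand-in for ReferenceProcessor.
struct RefProc {
    unsigned mt_degree;
};

// RAII guard in the spirit of RefProcMTDegreeAdjuster: set a new MT
// degree for the duration of one scope (one processing phase) and
// restore the previous value when the guard is destroyed.
class MTDegreeAdjuster {
    RefProc& _rp;
    unsigned _saved;
public:
    MTDegreeAdjuster(RefProc& rp, unsigned degree)
        : _rp(rp), _saved(rp.mt_degree) {
        _rp.mt_degree = degree;
    }
    ~MTDegreeAdjuster() {
        _rp.mt_degree = _saved;  // restore on scope exit
    }
};
```

The stale-ref_count comment above is about what feeds the constructor argument, not the guard mechanics themselves: a guard like this only does the right thing if the inputs used to pick the new degree are recomputed before each phase.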
src/hotspot/share/gc/shared/referenceProcessor.cpp 1193 uint ReferenceProcessor::ergo_proc_thread_count(ReferenceProcessorPhaseTimes* times) const { 1194 size_t ref_count = total_count(_discovered_refs); 1195 1196 return ergo_proc_thread_count(ref_count, num_queues(), times); 1197 } I don't think this overload should exist. The ref_count used here is the SoftReference count, which isn't particularly interesting here. (It's not the total number of discovered references, which is also not an interesting number for this purpose.) It seems to only exist as part of the workarounds for making ParNew kind of limp along with these changes. But wouldn't it be simpler to leave ParNewRefProcTaskExecutor::execute alone, using the active_workers as before? (And recall that I suggested above that execute should just use the currently configured mt-processing degree, rather than adding an ergo_workers argument.) It will be called in the context of RefProcMTDegreeAdjuster, which won't do anything because ParNew disables mt-degree adjustment. So I think by leaving things alone here we retain the current behavior, e.g. CMS ParNew uses ParallelRefProcEnabled as an all-or-nothing flag, and doesn't pay attention to the new ReferencesPerThread. And that seems reasonable to me for a deprecated collector. ------------------------------------------------------------------------------ src/hotspot/share/gc/cms/parNewGeneration.cpp 1455 false); // disable adjusting queue size when processing references It's not the queue size that is adjusted (or not), it's the number of queues. And really, it's the MT processing degree. That last probably should also apply to the name of the variable in the reference processor, and related names. Although given that this is all a bit of a kludge to work around (deprecated) CMS deficiencies, maybe I shouldn't be too concerned. And you did file JDK-8203951.
Though I see Thomas also requested a name change... If you change the name, remember to update JDK-8203951. ------------------------------------------------------------------------------ src/hotspot/share/gc/shared/referenceProcessor.cpp 1199 uint ReferenceProcessor::ergo_proc_thread_count(size_t ref_count, 1200 uint max_threads, 1201 ReferenceProcessorPhaseTimes* times) const { ... 1204 if (ReferencesPerThread == 0) { 1205 return _num_queues; 1206 } Why is _num_queues used here, but max_threads used elsewhere in this function? I *think* this ought to also be max_threads. ------------------------------------------------------------------------------ src/hotspot/share/gc/shared/referenceProcessor.cpp 1229 RefProcMTDegreeAdjuster::RefProcMTDegreeAdjuster(ReferenceProcessor* rp, ... 1235 if (!_rp->has_adjustable_queue() || (ReferencesPerThread == 0)) { This checks ReferencesPerThread for 0 to do nothing. And then ergo_proc_thread_count similarly checks for 0 to (effectively) do nothing. If the earlier suggestion to kill the 1-arg overload is taken, then ergo_proc_thread_count is really just a helper for RefProcMTDegreeAdjuster, and we don't need that special case twice. And perhaps it should be a private helper in RefProcMTDegreeAdjuster? Similarly for use_max_threads. Or maybe just inline them into that class's constructor. ------------------------------------------------------------------------------ src/hotspot/share/gc/shared/referenceProcessor.cpp 1199 uint ReferenceProcessor::ergo_proc_thread_count(size_t ref_count, ... 1213 return (uint)MIN3(thread_count, 1214 static_cast<size_t>(max_threads), 1215 (size_t)os::initial_active_processor_count()); I don't see why this should look at initial_active_processor_count?
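For illustration, the ergonomics under review (roughly one worker per ReferencesPerThread discovered references, clamped to the available threads) can be sketched as follows. The name, the exact rounding, and the clamping are assumptions, and this deliberately omits the initial_active_processor_count term that the review questions:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Sketch only: about one worker per references_per_thread references,
// at least one, never more than max_threads. A references_per_thread
// of 0 means "disable the ergonomics and use all threads", matching
// the described meaning of -XX:ReferencesPerThread=0.
size_t ergo_thread_count(size_t ref_count,
                         size_t max_threads,
                         size_t references_per_thread) {
    if (references_per_thread == 0) {
        return max_threads;
    }
    size_t wanted =
        (ref_count + references_per_thread - 1) / references_per_thread;
    return std::min(std::max<size_t>(wanted, 1), max_threads);
}
```

With ReferencesPerThread at a value like 1000, a phase with a few dozen references runs single-threaded, while a phase with tens of thousands fans out to the full gang; that is the cost/benefit trade-off the flag is meant to tune.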
------------------------------------------------------------------------------ From kim.barrett at oracle.com Tue Jun 5 00:26:27 2018 From: kim.barrett at oracle.com (Kim Barrett) Date: Mon, 4 Jun 2018 20:26:27 -0400 Subject: RFR (XS): 8202049: G1: ReferenceProcessor doesn't handle mark stack overflow In-Reply-To: <7f43835383d1ebf1f7d3a192c7949ffdb80541d3.camel@oracle.com> References: <7f43835383d1ebf1f7d3a192c7949ffdb80541d3.camel@oracle.com> Message-ID: <5C078D8C-72A4-41F3-ACF7-620B5918799D@oracle.com> > On Jun 4, 2018, at 6:35 AM, Thomas Schatzl wrote: > CR: > https://bugs.openjdk.java.net/browse/JDK-8202049 > Webrev: > http://cr.openjdk.java.net/~tschatzl/8202049/webrev > Testing: > running this patch for months in my usual testing routine for other > patches without any error. > > Thanks, > Thomas I think this ought to use fatal(...) rather than log_error + ShouldNotReachHere. From kim.barrett at oracle.com Tue Jun 5 00:48:44 2018 From: kim.barrett at oracle.com (Kim Barrett) Date: Mon, 4 Jun 2018 20:48:44 -0400 Subject: RFR(s): 8204094: assert(worker_i < _length) failed: Worker 15 is greater than max: 11 at ReferenceProcessorPhaseTimes In-Reply-To: <7236ef64-533d-48ad-ecc2-c6bdd12ed4cd@oracle.com> References: <7236ef64-533d-48ad-ecc2-c6bdd12ed4cd@oracle.com> Message-ID: <44A646FD-F617-4915-82D5-E7651FD632A1@oracle.com> > On May 30, 2018, at 10:19 PM, sangheon.kim at oracle.com wrote: > > Hi all, > > Can I have reviews for this patch that fixes assertion failure at ReferenceProcessorPhaseTimes? > > This failure only happens when ParallelRefProcEnabled option is set on CMS and Parallel GC. > The problem is that we are using more workers than we created a storage for workers. > When we create ReferenceProcessorPhaseTimes, we set how many workers will be monitored (i.e. prepare an array to save time information). And currently we are setting it with ReferenceProcessor queue size(ReferenceProcessor::_num_queues). 
But this is problematic because the queue size is continuously updated by ReferenceProcessor::set_active_mt_degree() with the number of active workers at every GC. And the queue size is decided later than ReferenceProcessorPhaseTimes is created. So if the number of active workers repeatedly increases and decreases, ReferenceProcessorPhaseTimes can end up sized smaller than the number of active workers. > > This patch proposes to use the maximum number of queues when creating ReferenceProcessorPhaseTimes. > > CR: https://bugs.openjdk.java.net/browse/JDK-8204094 > Webrev: http://cr.openjdk.java.net/~sangheki/8204094/webrev.0 > Testing: hs-tier1-5 with/without ParallelRefProcEnabled > > Thanks, > Sangheon > ------------------------ > For example: > 1) Set 2 when creating ReferenceProcessorPhaseTimes(xx, 2): ReferenceProcessor::num_queues() = 2, because previously we had only 2 active workers. > 2) Set 23 via ReferenceProcessor::set_active_mt_degree(23), as the current number of active workers has increased. > 3) Assertion failure, because we are using more workers than were set in #1. Looks good. From sangheon.kim at oracle.com Tue Jun 5 04:14:20 2018 From: sangheon.kim at oracle.com (sangheon.kim at oracle.com) Date: Mon, 4 Jun 2018 21:14:20 -0700 Subject: RFR(M) 8043575: Dynamically parallelize reference processing work In-Reply-To: <1a4986a8d19adc3f25efc707b7d026eb81161438.camel@oracle.com> References: <8af7594e-d5f4-ca0d-de73-6547ea131f3e@oracle.com> <1a4986a8d19adc3f25efc707b7d026eb81161438.camel@oracle.com> Message-ID: Hi Thomas, Thank you for reviewing this. On 6/4/18 1:56 AM, Thomas Schatzl wrote: > Hi Sangheon, > > On Fri, 2018-06-01 at 14:48 -0700, sangheon.kim at oracle.com wrote: >> Hi all, >> >> As webrev.0 is conflicting with webrev.0 of "8203319: JDK-8201487 >> disabled too much queue balancing" (out for review, but not yet >> pushed), I'm posting webrev.1.
>> >> http://cr.openjdk.java.net/~sangheki/8043575/webrev.1 >> http://cr.openjdk.java.net/~sangheki/8043575/webrev.1_to_0 >> > Some minor comments: > > - g1ConcurrentMark.cpp:1521: maybe update the comment or remove it. Removed the comment. > > - g1FullGCReferenceProcessorExecutor.cpp: I think > G1FullGCReferenceProcessingExecutor::run_task(AbstractGangTask*) is not > used any more. Right, removed it. > > - I think that in line with not supporting this feature in CMS, the > assert in concurrentMarkSweepGeneration.cpp:5126 should not check >= > but ==? Same in parNewGeneration.cpp:796. Right, they should be the same. > > - gc_globals.hpp: I would prefer something like the following for the > text for ReferencesPerThread: > > "Ergonomically start one thread for this amount of references for > reference processing if ParallelRefProcEnabled is true. Specify 0 to > disable and use all threads." Updated as you suggested above. > > - would you mind renaming ReferenceProcessor::has_adjustable_queue? It is > not the queues that are adjusted, but the number of processing threads. How about ReferenceProcessor::has_adjustable_num_of_proc_threads? > > - since the feature changes the number of threads, it should also > naturally affect the need_balance_queues() method and impact the code > in process_discovered_reflist(). I think must_balance must be > re-evaluated after every decision to set the number of threads. > > At the moment the number of threads is adjusted after deciding whether > we need balancing. Changed to always check. This is a webrev.1 bug :) When I started working on this, that condition was fixed, but it no longer is. > > - the patch might actually be much simpler after JDK-8202845: Refactor > reference processing for improved parallelism. Since I do not want you > to spend time on fixing this change after I mess it up, would you mind > me taking over this work/CR? Sure, as we discussed separately, I will reassign it to you at some point. Let me post webrev.2 after addressing Kim's comment.
Thanks, Sangheon > > Thanks, > Thomas > >> Thanks, >> Sangheon >> >> >> On 5/30/18 9:43 PM, sangheon.kim at oracle.com wrote: >>> Hi all, >>> >>> Could I have some reviews for this patch? >>> >>> This patch is suggesting ergonomically choosing the worker thread count >>> from the given reference count. >>> We have the ParallelRefProcEnabled command-line option which enables >>> using ALL workers during reference processing; however this option has >>> a drawback when there's only a limited number of references, i.e. it spends >>> more time on thread start-up/tear-down than on actual processing >>> if there are fewer references. And also we use either all threads or a single >>> thread during reference processing, which seems less flexible on >>> thread counts. This patch calculates the worker count by >>> dividing the reference count by ReferencesPerThread (a newly added >>> experimental option). >>> My suggestion for the default value of ReferencesPerThread is 1000 >>> as it showed good results in some benchmarks. >>> >>> Notes: >>> 1. CMS ParNew is excluded from this patch because: >>> a) There is a separate CR for CMS (JDK-6938732). >>> b) It is tricky to manage switching single <-> MT processing >>> inside of the ReferenceProcessor class for ParNew. Tony explained quite >>> well about the reason here ( https://bugs.openjdk.java.net/browse/JDK-6938732?focusedCommentId=13932462&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13932462 ). >>> c) CMS will be obsoleted in the future so it is not worth fixing >>> within this patch. >>> 2. JDK-8203951 is the CR for removing the temporarily added >>> flag (ReferenceProcessor::_has_adjustable_queue from webrev.0) used to >>> manage ParNew. So the flag should be removed when CMS is obsoleted. >>> 3. The current logic of dividing by ReferencesPerThread would be >>> replaced with a better implementation, e.g. a time-measuring >>> implementation etc. But I think the current approach is simple and good >>> enough. >>> 4.
This patch is based on JDK-8204094 and JDK-8204095, both of which are >>> not yet pushed. >>> >>> CR: https://bugs.openjdk.java.net/browse/JDK-8043575 >>> Webrev: http://cr.openjdk.java.net/~sangheki/8043575/webrev.0/ >>> Testing: hs-tier 1~5 with/without ParallelRefProcEnabled >>> >>> Thanks, >>> Sangheon >> From sangheon.kim at oracle.com Tue Jun 5 04:14:53 2018 From: sangheon.kim at oracle.com (sangheon.kim at oracle.com) Date: Mon, 4 Jun 2018 21:14:53 -0700 Subject: RFR(s): 8204094: assert(worker_i < _length) failed: Worker 15 is greater than max: 11 at ReferenceProcessorPhaseTimes In-Reply-To: <44A646FD-F617-4915-82D5-E7651FD632A1@oracle.com> References: <7236ef64-533d-48ad-ecc2-c6bdd12ed4cd@oracle.com> <44A646FD-F617-4915-82D5-E7651FD632A1@oracle.com> Message-ID: Thank you for reviewing this, Kim. Sangheon On 6/4/18 5:48 PM, Kim Barrett wrote: >> On May 30, 2018, at 10:19 PM, sangheon.kim at oracle.com wrote: >> >> Hi all, >> >> Can I have reviews for this patch that fixes an assertion failure at ReferenceProcessorPhaseTimes? >> >> This failure only happens when the ParallelRefProcEnabled option is set on CMS and Parallel GC. >> The problem is that we are using more workers than we created storage for. >> When we create ReferenceProcessorPhaseTimes, we set how many workers will be monitored (i.e. prepare an array to save time information). And currently we are setting it to the ReferenceProcessor queue size (ReferenceProcessor::_num_queues). But this is problematic because the queue size is continuously updated by ReferenceProcessor::set_active_mt_degree() with the number of active workers at every GC. And the queue size is decided later than ReferenceProcessorPhaseTimes is created. So if the number of active workers repeatedly increases and decreases, ReferenceProcessorPhaseTimes can end up sized smaller than the number of active workers. >> >> This patch proposes to use the maximum number of queues when creating ReferenceProcessorPhaseTimes.
>> >> CR: https://bugs.openjdk.java.net/browse/JDK-8204094 >> Webrev: http://cr.openjdk.java.net/~sangheki/8204094/webrev.0 >> Testing: hs-tier1-5 with/without ParallelRefProcEnabled >> >> Thanks, >> Sangheon >> ------------------------ >> For example: >> 1) Set 2 when creating ReferenceProcessorPhaseTimes(xx, 2): ReferenceProcessor::num_queues() = 2, because previously we had only 2 active workers. >> 2) Set 23 via ReferenceProcessor::set_active_mt_degree(23), as the current number of active workers has increased. >> 3) Assertion failure, because we are using more workers than were set in #1. > Looks good. > From martin.doerr at sap.com Tue Jun 5 07:42:53 2018 From: martin.doerr at sap.com (Doerr, Martin) Date: Tue, 5 Jun 2018 07:42:53 +0000 Subject: RFR(M): 8154736: enhancement of cmpxchg and copy_to_survivor for ppc64 In-Reply-To: References: <5625a595-1165-8d48-afbd-8229cdc4ac07@oracle.com> <7e7fc484287e4da4926176e0b5ae1b64@sap.com> <339D50B6-09D5-4342-A687-918A9A096B39@oracle.com> <6D2268EC-C1E7-4C90-BCD3-90D02D21FA08@oracle.com> <3e05cc86c9d4406d8a9875b705fbf1fc@sap.com> <5B0D40F7.5050807@oracle.com> <4dddb18526a745cc83941a0f58af77f5@sap.com> Message-ID: <02f1fa11496746d6a5e9111259c869e9@sap.com> Thanks for the contribution, and thanks everybody for discussing and reviewing. I've pushed it. Best regards, Martin From: Michihiro Horie [mailto:HORIE at jp.ibm.com] Sent: Dienstag, 5. Juni 2018 00:42 To: Kim Barrett Cc: Andrew Haley (aph at redhat.com) ; david.holmes at oracle.com; Erik Österlund ; hotspot-gc-dev at openjdk.java.net; Doerr, Martin ; ppc-aix-port-dev at openjdk.java.net Subject: Re: RFR(M): 8154736: enhancement of cmpxchg and copy_to_survivor for ppc64 >> On Jun 1, 2018, at 11:08 AM, Michihiro Horie > wrote: >> >> Hi Kim, Erik, and Martin, >> >> Thank you very much for reminding me that an acquire barrier in the else-statement for "!test_mark->is_marked()" is necessary under the criteria of not relying on the consume.
>> >> I uploaded a new webrev : http://cr.openjdk.java.net/~mhorie/8154736/webrev.13/ >> This change uses forwardee_acquire(), which would generate better code on ARM. >> >> Necessary barriers are located in all the paths in copy_to_survivor_space, and the returned new_obj can be safely handled in the caller sites. >> >> I measured SPECjbb2015 with the latest webrev. Critical-jOPS improved by 5%. Since my previous measurement with implicit consume showed 6% improvement, adding acquire barriers degraded the performance a little, but 5% is still good enough. > >Looks good. Thanks a lot, Kim! Best regards, -- Michihiro, IBM Research - Tokyo [Inactive hide details for Kim Barrett ---2018/06/05 05:08:48---> On Jun 1, 2018, at 11:08 AM, Michihiro Horie On Jun 1, 2018, at 11:08 AM, Michihiro Horie > wrote: > From: Kim Barrett > To: Michihiro Horie > Cc: "Doerr, Martin" >, "Andrew Haley (aph at redhat.com)" >, "david.holmes at oracle.com" >, "Erik ?sterlund" >, "hotspot-gc-dev at openjdk.java.net" >, "ppc-aix-port-dev at openjdk.java.net" > Date: 2018/06/05 05:08 Subject: Re: RFR(M): 8154736: enhancement of cmpxchg and copy_to_survivor for ppc64 ________________________________ > On Jun 1, 2018, at 11:08 AM, Michihiro Horie > wrote: > > Hi Kim, Erik, and Martin, > > Thank you very much for reminding me that an acquire barrier in the else-statement for ?!test_mark->is_marked()? is necessary under the criteria of not relying on the consume. > > I uploaded a new webrev : http://cr.openjdk.java.net/~mhorie/8154736/webrev.13/ > This change uses forwardee_acquire(), which would generate better code on ARM. > > Necessary barriers are located in all the paths in copy_to_survivor_space, and the returned new_obj can be safely handled in the caller sites. > > I measured SPECjbb2015 with the latest webrev. Critical-jOPS improved by 5%. 
Since my previous measurement with implicit consume showed 6% improvement, adding acquire barriers degraded the performance a little, but 5% is still good enough. Looks good. > > > Best regards, > -- > Michihiro, > IBM Research - Tokyo > > "Doerr, Martin" ---2018/05/30 16:18:09---Hi Erik, the current implementation works on PPC because of "MP+sync+addr". > > From: "Doerr, Martin" > > To: "Erik ?sterlund" >, Kim Barrett >, Michihiro Horie >, "Andrew Haley (aph at redhat.com)" > > Cc: "david.holmes at oracle.com" >, "hotspot-gc-dev at openjdk.java.net" >, "ppc-aix-port-dev at openjdk.java.net" > > Date: 2018/05/30 16:18 > Subject: RE: RFR(M): 8154736: enhancement of cmpxchg and copy_to_survivor for ppc64 > > > > > Hi Erik, > > the current implementation works on PPC because of "MP+sync+addr". > So we already rely on ordering of "load volatile field" + "implicit consume" on the reader's side. We have never seen any issues related to this with the compilers we have been using during the ~10 years the PPC implementation exists. > > PPC supports "MP+lwsync+addr" the same way, so Michihiro's proposal doesn't make it unreliable for PPC. > > But I'm ok with evaluating acquire barriers although they are not required by the PPC/ARM memory models. > ARM/aarch64 will also be affected when the o->forwardee uses load_acquire. So somebody should check the impact. If it is not acceptable we may need to introduce explicit consume. > > Implicit consume is also bad in shared code because somebody may want to run it on DEC Alpha. > > Thanks and best regards, > Martin > > > -----Original Message----- > From: Erik ?sterlund [mailto:erik.osterlund at oracle.com] > Sent: Dienstag, 29. 
Mai 2018 14:01 > To: Doerr, Martin >; Kim Barrett >; Michihiro Horie > > Cc: david.holmes at oracle.com; Gustavo Bueno Romero >; hotspot-dev at openjdk.java.net; hotspot-gc-dev at openjdk.java.net; ppc-aix-port-dev at openjdk.java.net > Subject: Re: RFR(M): 8154736: enhancement of cmpxchg and copy_to_survivor for ppc64 > > Hi Martin and Michihiro, > > On 2018-05-29 12:30, Doerr, Martin wrote: > > Hi Kim, > > > > I'm trying to understand how this is related to Michihiro's change. The else path of the initial test is not affected by it AFAICS. > > So it sounds like a request to fix the current implementation in addition to what his original intend was. > > I think we are just trying to nail down the correct fencing and just go > for that. And yes, this is arguably a pre-existing problem, but in a > race involving the very same accesses that we are changing the fencing > for. So it is not completely unrelated I suppose. > > In particular, hotspot has code that assumes that if you on the writer > side issue a full fence before publishing a pointer to newly initialized > data, then the initializing stores and their side effects should be > globally "visible" across the system before the pointer to it is > published, and hence elide the need for acquire on the loading side, > without relying on retained data dependencies on the loader side. I > believe this code falls under that category. It is assumed that the > leading fence of the CAS publishing the forwarding pointer makes the > initializing stores globally observable before publishing a pointer to > the initialized data, hence assuming that any loads able to observe the > new pointer would not rely on acquire or data dependent loads to > correctly read the initialized data. 
> > Unfortunately, this is not reliable in the IRIW case, as per the litmus > test "MP+sync+ctrl" as described in "Understanding POWER > multiprocessors" (https://dl.acm.org/citation.cfm?id=1993520), as > opposed to "MP+sync+addr" that gets away with it because of the data > dependency (not IRIW). Similarly, an isync does the job too on the > reader side as shown in MP+sync+ctrlisync. So while what I believe was > the previous reasoning that the leading sync of the CAS would elide the > necessity for acquire on the reader side without relying on data > dependent loads (implicit consume), I think that assumption was wrong in > the first place and that we do indeed need explicit acquire (even with > the precious conservative CAS fencing) in this context to not rely on > implicit consume semantics generating the required data dependent loads > on the reader side. In practice though, the leading sync of the CAS has > been enough to generate the correct machine code. Now, with the leading > sync removed, we are increasing the possible holes in the generated > machine code due to this flawed reasoning. So it would be nice to do > something more sound instead that does not make such assumptions. > > > Anyway, I agree with that implicit consume is not good. And I think it would be good to treat both o->forwardee() the same way. > > What about keeping memory_order_release for the CAS and using acquire for both o->forwardee()? > > The case in which the CAS succeeds is safe because the current thread has created new_obj so it doesn't need memory barriers to access it. > > Sure, that sounds good to me. > > Thanks, > /Erik > > > Thanks and best regards, > > Martin > > > > > > -----Original Message----- > > From: Kim Barrett [mailto:kim.barrett at oracle.com] > > Sent: Dienstag, 29. 
Mai 2018 01:54 > > To: Michihiro Horie > > > Cc: Erik Osterlund >; david.holmes at oracle.com; Gustavo Bueno Romero >; hotspot-dev at openjdk.java.net; hotspot-gc-dev at openjdk.java.net; ppc-aix-port-dev at openjdk.java.net; Doerr, Martin > > > Subject: Re: RFR(M): 8154736: enhancement of cmpxchg and copy_to_survivor for ppc64 > > > >> On May 28, 2018, at 4:12 AM, Michihiro Horie > wrote: > >> > >> Hi Erik, > >> > >> Thank you very much for your review. > >> > >> I understood that implicit consume should not be used in the shared code. Also, I believe performance degradation would be negligible even if we use acquire. > >> > >> New webrev uses memory_order_acq_rel: http://cr.openjdk.java.net/~mhorie/8154736/webrev.10 > > This is missing the acquire barrier on the else branch for the initial test, so fails to meet > > the previously described minimal requirements for even possibly being sufficient. Any > > analysis of weakening the CAS barriers must consider that test and successor code. > > > > In the analysis, it?s not just the lexically nearby debugging / logging code that needs to be > > considered; the forwardee is being returned to caller(s) that will presumably do something > > with that object. > > > > Since the whole point of this discussion is performance, any proposed change should come > > with performance information. > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: image001.gif Type: image/gif Size: 105 bytes Desc: image001.gif URL: From thomas.schatzl at oracle.com Tue Jun 5 07:56:53 2018 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Tue, 05 Jun 2018 09:56:53 +0200 Subject: RFR (XS): 8202049: G1: ReferenceProcessor doesn't handle mark stack overflow In-Reply-To: <5C078D8C-72A4-41F3-ACF7-620B5918799D@oracle.com> References: <7f43835383d1ebf1f7d3a192c7949ffdb80541d3.camel@oracle.com> <5C078D8C-72A4-41F3-ACF7-620B5918799D@oracle.com> Message-ID: Hi, On Mon, 2018-06-04 at 20:26 -0400, Kim Barrett wrote: > > On Jun 4, 2018, at 6:35 AM, Thomas Schatzl > om> wrote: > > CR: > > https://bugs.openjdk.java.net/browse/JDK-8202049 > > Webrev: > > http://cr.openjdk.java.net/~tschatzl/8202049/webrev > > Testing: > > running this patch for months in my usual testing routine for other > > patches without any error. > > > > Thanks, > > Thomas > > I think this ought to use fatal(...) rather than log_error + > ShouldNotReachHere. > fixed in place. Due to precedent in other code I was swaying between both, apparently to the wrong side. Thanks, Thomas From stefan.johansson at oracle.com Tue Jun 5 10:14:03 2018 From: stefan.johansson at oracle.com (Stefan Johansson) Date: Tue, 5 Jun 2018 12:14:03 +0200 Subject: RFR: 8203319: JDK-8201487 disabled too much queue balancing In-Reply-To: References: <3CFCA83E-82A5-4CA1-BA07-F34BFD0C689C@oracle.com> <670a9fdf-a719-1aae-aea9-6316a3c025f4@oracle.com> Message-ID: On 2018-06-04 20:31, Kim Barrett wrote: >> On May 31, 2018, at 3:34 PM, Kim Barrett wrote: >> >>> On May 31, 2018, at 4:41 AM, Stefan Johansson wrote: >>> >>> >>> >>> On 2018-05-30 19:33, Kim Barrett wrote: >>>> Please review this change to the ReferenceProcessor's test for whether >>>> to balance a set of queues before processing them with multiple >>>> threads. 
The change to that test made by JDK-8201487 doesn't perform >>>> balancing for some (potential, see Testing below) states where not >>>> doing so will result in some discovered References not being >>>> processed. In particular, there are cases where we must ignore >>>> -XX:-ParallelRefProcBalancingEnabled and balance anyway. >>>> We also now avoid balancing in some cases where we know the set is >>>> already balanced, even with -XX:+ParallelRefProcBalancingEnabled. >>>> CR: >>>> https://bugs.openjdk.java.net/browse/JDK-8203319 >>>> Webrev: >>>> http://cr.openjdk.java.net/~kbarrett/8203319/open.00/ >>> Looks good, >>> Stefan >> >> Thanks, Stefan. > > The new need_balance_queues function has an optimization to avoid > balancing in a case where we already know the queues are balanced. > This assumes it's being called for initial balancing after discovery, > and not for some re-balancing after a processing phase. But that's > problematic for JDK-8043575 (recently RFR'ed). And if we're already > balanced, balance_queues doesn't have very much to do. (And > eventually the optimization would be rendered moot anyway, by the > elimination of balancing; see JDK-8202328.) So I'm taking that bit > out. > > New webrevs: > full: http://cr.openjdk.java.net/~kbarrett/8203319/open.01/ > incr: http://cr.openjdk.java.net/~kbarrett/8203319/open.01.inc/ Still good.
From thomas.schatzl at oracle.com Tue Jun 5 10:16:25 2018 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Tue, 05 Jun 2018 12:16:25 +0200 Subject: RFR: 8203319: JDK-8201487 disabled too much queue balancing In-Reply-To: References: <3CFCA83E-82A5-4CA1-BA07-F34BFD0C689C@oracle.com> <670a9fdf-a719-1aae-aea9-6316a3c025f4@oracle.com> Message-ID: <9a03ed17291417618689b91042846acd084e57b4.camel@oracle.com> Hi, On Mon, 2018-06-04 at 14:31 -0400, Kim Barrett wrote: > > On May 31, 2018, at 3:34 PM, Kim Barrett > > wrote: > > > > > On May 31, 2018, at 4:41 AM, Stefan Johansson > > racle.com> wrote: > > > > > > > > > > > > On 2018-05-30 19:33, Kim Barrett wrote: > > > > Please review this change to the ReferenceProcessor's test for > > > > whether > > > > to balance a set of queues before processing them with multiple > > > > threads. The change to that test made by JDK-8201487 doesn't > > > > peform > > > > balancing for some (potential, see Testing below) states where > > > > not > > > > doing so will result in some discovered References not being > > > > processed. In particular, there are cases where we must ignore > > > > -XX:-ParallelRefProcBalancingEnabled and balance anyway. > > > > We also now avoid balancing in some cases where we know the set > > > > is > > > > already balanced, even with > > > > -XX:+ParallelRefProcBalancingEnabled. > > > > CR: > > > > https://bugs.openjdk.java.net/browse/JDK-8203319 > > > > Webrev: > > > > http://cr.openjdk.java.net/~kbarrett/8203319/open.00/ > > > > > > Looks good, > > > Stefan > > > > Thanks, Stefan. > > The new need_balance_queues function has an optimization to avoid > balancing in a case where we already know the queues are balanced. > This assumes it's being called for initial balancing after discovery, > and not for some re-balancing after a processing phase. But that's > problematic for JDK-8043575 (recently RFR'ed). And if we're already > balanced, balance_queues doesn't have very much to do. 
(And > eventually the optimization would be rendered moot anyway, by the > elimination of balancing; see JDK-8202328.) So I'm taking that bit > out. > > New webrevs: > full: http://cr.openjdk.java.net/~kbarrett/8203319/open.01/ > incr: http://cr.openjdk.java.net/~kbarrett/8203319/open.01.inc/ still good. Thomas From stefan.johansson at oracle.com Tue Jun 5 11:41:06 2018 From: stefan.johansson at oracle.com (Stefan Johansson) Date: Tue, 5 Jun 2018 13:41:06 +0200 Subject: RFR: 8204287: Phase timings not updated correctly after JDK-6672778 Message-ID: Hi, Please review this fix to do correct timings in G1. Issue: https://bugs.openjdk.java.net/browse/JDK-8204287 Webrev: http://cr.openjdk.java.net/~sjohanss/8204287/00/ Summary: When the partial trimming was introduced a while back the timing code in G1 became a bit more complicated. In one of the timer helpers a member structure is used before it is destructed and updated. This leads to the times being accounted to the wrong phases and there for the G1 ergonomics doing some less than optimal decisions. Testing: * Local testing with jtreg * Mach5 testing tier 1-3 looks good * Sanity performance testing is running Thanks, Stefan From thomas.schatzl at oracle.com Tue Jun 5 11:47:05 2018 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Tue, 05 Jun 2018 13:47:05 +0200 Subject: RFR: 8204287: Phase timings not updated correctly after JDK-6672778 In-Reply-To: References: Message-ID: Hi, On Tue, 2018-06-05 at 13:41 +0200, Stefan Johansson wrote: > Hi, > > Please review this fix to do correct timings in G1. > > Issue: https://bugs.openjdk.java.net/browse/JDK-8204287 > Webrev: http://cr.openjdk.java.net/~sjohanss/8204287/00/ > > Summary: > When the partial trimming was introduced a while back the timing code > in G1 became a bit more complicated. In one of the timer helpers a > member structure is used before it is destructed and updated. 
This > leads to the times being accounted to the wrong phases and therefore > the G1 ergonomics making some less than optimal decisions. > > Testing: > * Local testing with jtreg > * Mach5 testing tier 1-3 looks good > * Sanity performance testing is running looks good. Thanks for finding this! Thomas From thomas.schatzl at oracle.com Tue Jun 5 12:27:04 2018 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Tue, 05 Jun 2018 14:27:04 +0200 Subject: RFR (XL): 8202845: Refactor reference processing for improved parallelism Message-ID: Hi all, can I have reviews for this change that improves the efficiency of existing j.l.ref processing in all garbage collectors by squashing together phases for the different types of References as much as possible. Kim outlined in the CR in detail what can be done when, so please look there for the idea. This improves efficiency in terms of the time parallel Reference Processing takes: instead of starting up reference processors 9 times, you actually only need to do that 4 times. That can save a significant amount of gc pause time too :) One side effect of this is unfortunately another change in the log output: as we do not have separate timings for every phase of every Reference type any more, the idea was to turn the logging inside-out. Instead of this structure (at trace level):

Reference Processing: 0.0ms
  SoftReference: 0.5ms
    Balance queues: 0.0ms
    Phase1: 0.2ms
      Process lists (ms) Min: 0.0, Avg: ... Workers: 28
    Phase2: 0.1ms
      Process lists (ms) Min: 0.0, Avg: ... Workers: 28
    Phase3: 0.2ms
      Process lists (ms) Min: 0.0, Avg: ... Workers: 28
    Discovered: 0
    Cleared: 0
  WeakReference: 0.3ms
    Balance queues: 0.0ms
    Phase2: 0.1ms
      Process lists (ms) Min: 0.0, Avg: ... Workers: 28
    Phase3: 0.2ms
      Process lists (ms) Min: 0.0, Avg: ... Workers: 28
    Discovered: 40
    Cleared: 28
  FinalReference: 0.3ms
    Balance queues: 0.0ms
    Phase2: 0.1ms
      Process lists (ms) Min: 0.0, Avg: ... Workers: 28
    Phase3: 0.2ms
      Process lists (ms) Min: 0.0, Avg: ... Workers: 28
    Discovered: 0
    Cleared: 0
  PhantomReference: 0.3ms
    Balance queues: 0.0ms
    Phase2: 0.1ms
      Process lists (ms) Min: 0.0, Avg: ... Workers: 28
    Phase3: 0.2ms
      Process lists (ms) Min: 0.0, Avg: ... Workers: 28
    Discovered: 6
    Cleared: 5

it will look like this:

Reference Processing: 0.1ms
  Phase1: 0.0ms
    Balance queues: 0.0ms
    SoftRef (ms): Min: 0.0, Avg: ... Workers: 3
      0.0 0.0 0.0 - ... - - - - - - (T)
  Phase2: 0.0ms
    Balance queues: 0.0ms
    SoftRef (ms): Min: 0.0, Avg: ... Workers: 3
      0.0 0.0 0.0 - ... - - - - - - (T)
    WeakRef (ms): Min: 0.0, Avg: ... Workers: 3
      0.0 0.0 0.0 - ... - - - - - - (T)
    FinalRef (ms): Min: 0.0, Avg: ... Workers: 3
      0.0 0.0 0.0 - ... - - - - - - (T)
  Phase3: 0.0ms
    Balance queues: 0.0ms
    FinalRef (ms): Min: 0.0, Avg: ... Workers: 3
      0.0 0.0 0.0 - ... - - - - - - (T)
  Phase4: 0.0ms
    Balance queues: 0.0ms
    PhantomRef (ms): Min: 0.0, Avg: ... Workers: 3
      0.0 0.0 0.0 - ... - - - - - - (T)
  SoftReference: Discovered: 0 Cleared: 0
  WeakReference: Discovered: 30 Cleared: 30
  FinalReference: Discovered: 0 Cleared: 0
  PhantomReference: Discovered: 5 Cleared: 5

I.e. instead of showing, for every reference type, its phases with timings, the four phases are now shown with the timings of the reference types processed within each phase. Also (that has been a bug), the lines with a "(T)" are added at trace level. At debug level the timing output has been updated to be similar to other gc+phases output, i.e. it is now:

Reference Processing: 0.2ms
  Phase1: 0.0ms
    Balance queues: 0.0ms
    SoftRef (ms): Min: 0.0, Avg: ... Workers: 3
  Phase2: 0.1ms
    Balance queues: 0.0ms
    SoftRef (ms): Min: 0.0, Avg: ... Workers: 3
    WeakRef (ms): Min: 0.0, Avg: ... Workers: 3
    FinalRef (ms): Min: 0.0, Avg: ... Workers: 3
  Phase3: 0.0ms
    Balance queues: 0.0ms
    FinalRef (ms): Min: 0.0, Avg: ... Workers: 3
  Phase4: 0.0ms
    Balance queues: 0.0ms
    PhantomRef (ms): Min: 0.0, Avg: ... Workers: 3
  SoftReference: Discovered: 0 Cleared: 0
  WeakReference: Discovered: 38 Cleared: 30
  FinalReference: Discovered: 0 Cleared: 0
  PhantomReference: Discovered: 7 Cleared: 6

instead of the old output as follows:

Reference Processing: 0.0ms
  SoftReference: 0.5ms
    Balance queues: 0.0ms
    Phase1: 0.2ms
    Phase2: 0.1ms
    Phase3: 0.2ms
    Discovered: 0
    Cleared: 0
  WeakReference: 0.3ms
    Balance queues: 0.0ms
    Phase2: 0.1ms
    Phase3: 0.2ms
    Discovered: 0
    Cleared: 0
  FinalReference: 0.3ms
    Balance queues: 0.0ms
    Phase2: 0.1ms
    Phase3: 0.2ms
    Discovered: 0
    Cleared: 0
  PhantomReference: 0.3ms
    Balance queues: 0.0ms
    Phase2: 0.1ms
    Phase3: 0.2ms
    Discovered: 5
    Cleared: 5

which did not show any per-thread information. In case of using serial ref processing, this is the proposed output:

Reference Processing: 0.1ms
  Phase1: 0.0ms
    SoftRef: 0.0ms
  Phase2: 0.0ms
    SoftRef: 0.0ms
    WeakRef: 0.0ms
    FinalRef: 0.0ms
  Phase3: 0.0ms
    FinalRef: 0.0ms
  Phase4: 0.0ms
    PhantomRef: 0.0ms
  SoftReference: Discovered: 0 Cleared: 0
  WeakReference: Discovered: 30 Cleared: 30
  FinalReference: Discovered: 0 Cleared: 0
  PhantomReference: Discovered: 5 Cleared: 5

Note that in the previous version, the output of parallel/debug was the same as serial/debug. Discovered/Cleared counts per reference type are shown afterwards instead of inline in all cases. Note that I think we should improve and probably streamline the output of reference counts, but that is imho a different issue (and this one is big enough as is). This change builds on "JDK-8203319: JDK-8201487 disabled too much queue balancing". There is also a review of "JDK-8043575: Dynamically parallelize reference processing work" out for review. I also talked to Sangheon and agreed that I will do the merging, as this change messes up the same code (in the referenceProcessor* files) so much.
CR: https://bugs.openjdk.java.net/browse/JDK-8202845 Webrev: http://cr.openjdk.java.net/~tschatzl/8202845/webrev/ Testing: hs-tier1-5,jdk-tier1-3 Thanks, Thomas From kim.barrett at oracle.com Tue Jun 5 12:57:03 2018 From: kim.barrett at oracle.com (Kim Barrett) Date: Tue, 5 Jun 2018 08:57:03 -0400 Subject: RFR: 8203319: JDK-8201487 disabled too much queue balancing In-Reply-To: <9a03ed17291417618689b91042846acd084e57b4.camel@oracle.com> References: <3CFCA83E-82A5-4CA1-BA07-F34BFD0C689C@oracle.com> <670a9fdf-a719-1aae-aea9-6316a3c025f4@oracle.com> <9a03ed17291417618689b91042846acd084e57b4.camel@oracle.com> Message-ID: <0FAD9901-7E25-4D9F-9716-7D28A0D8182D@oracle.com> > On Jun 5, 2018, at 6:16 AM, Thomas Schatzl wrote: > > Hi, > > On Mon, 2018-06-04 at 14:31 -0400, Kim Barrett wrote: >>> On May 31, 2018, at 3:34 PM, Kim Barrett >>> wrote: >>> >>>> On May 31, 2018, at 4:41 AM, Stefan Johansson >>> racle.com> wrote: >>>> >>>> >>>> >>>> On 2018-05-30 19:33, Kim Barrett wrote: >>>>> Please review this change to the ReferenceProcessor's test for >>>>> whether >>>>> to balance a set of queues before processing them with multiple >>>>> threads. The change to that test made by JDK-8201487 doesn't >>>>> peform >>>>> balancing for some (potential, see Testing below) states where >>>>> not >>>>> doing so will result in some discovered References not being >>>>> processed. In particular, there are cases where we must ignore >>>>> -XX:-ParallelRefProcBalancingEnabled and balance anyway. >>>>> We also now avoid balancing in some cases where we know the set >>>>> is >>>>> already balanced, even with >>>>> -XX:+ParallelRefProcBalancingEnabled. >>>>> CR: >>>>> https://bugs.openjdk.java.net/browse/JDK-8203319 >>>>> Webrev: >>>>> http://cr.openjdk.java.net/~kbarrett/8203319/open.00/ >>>> >>>> Looks good, >>>> Stefan >>> >>> Thanks, Stefan. >> >> The new need_balance_queues function has an optimization to avoid >> balancing in a case where we already know the queues are balanced. 
>> This assumes it's being called for initial balancing after discovery, >> and not for some re-balancing after a processing phase. But that's >> problematic for JDK-8043575 (recently RFR'ed). And if we're already >> balanced, balance_queues doesn't have very much to do. (And >> eventually the optimization would be rendered moot anyway, by the >> elimination of balancing; see JDK-8202328.) So I'm taking that bit >> out. >> >> New webrevs: >> full: http://cr.openjdk.java.net/~kbarrett/8203319/open.01/ >> incr: http://cr.openjdk.java.net/~kbarrett/8203319/open.01.inc/ > > still good. > > Thomas Thanks Stefan and Thomas. From kim.barrett at oracle.com Tue Jun 5 13:08:33 2018 From: kim.barrett at oracle.com (Kim Barrett) Date: Tue, 5 Jun 2018 09:08:33 -0400 Subject: RFR (XS): 8202049: G1: ReferenceProcessor doesn't handle mark stack overflow In-Reply-To: References: <7f43835383d1ebf1f7d3a192c7949ffdb80541d3.camel@oracle.com> <5C078D8C-72A4-41F3-ACF7-620B5918799D@oracle.com> Message-ID: > On Jun 5, 2018, at 3:56 AM, Thomas Schatzl wrote: > > Hi, > > On Mon, 2018-06-04 at 20:26 -0400, Kim Barrett wrote: >>> On Jun 4, 2018, at 6:35 AM, Thomas Schatzl >> om> wrote: >>> CR: >>> https://bugs.openjdk.java.net/browse/JDK-8202049 >>> Webrev: >>> http://cr.openjdk.java.net/~tschatzl/8202049/webrev >>> Testing: >>> running this patch for months in my usual testing routine for other >>> patches without any error. >>> >>> Thanks, >>> Thomas >> >> I think this ought to use fatal(...) rather than log_error + >> ShouldNotReachHere. >> > > fixed in place. > > Due to precedent in other code I was swaying between both, apparently > to the wrong side. > > Thanks, > Thomas Looks good. 
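As an aside for readers following the balancing discussion in JDK-8203319 above: the point of "balancing" is to redistribute the discovered-reference queues so that each active worker gets roughly equal work, and to balance unconditionally whenever queues beyond the active worker count hold entries (those would otherwise never be processed). A minimal Java sketch of that idea — purely illustrative, with made-up names; the real code lives in referenceProcessor.cpp and operates on linked DiscoveredLists, not ArrayLists:

```java
import java.util.ArrayList;
import java.util.List;

public class BalanceSketch {
    // If any queue at index >= workers is non-empty, those references
    // would never be processed, so balancing is mandatory regardless
    // of a flag like -XX:-ParallelRefProcBalancingEnabled.
    static <T> boolean mustBalance(List<List<T>> queues, int workers) {
        for (int i = workers; i < queues.size(); i++) {
            if (!queues.get(i).isEmpty()) {
                return true;
            }
        }
        return false;
    }

    // Collect everything and deal it back out round-robin over the
    // first 'workers' queues, so each worker gets about the same load.
    static <T> void balanceQueues(List<List<T>> queues, int workers) {
        List<T> all = new ArrayList<>();
        for (List<T> q : queues) {
            all.addAll(q);
            q.clear();
        }
        for (int i = 0; i < all.size(); i++) {
            queues.get(i % workers).add(all.get(i));
        }
    }
}
```

This also illustrates Kim's optimization question: an already-balanced set makes balanceQueues a no-op in effect, so skipping it is only a shortcut, not a correctness requirement.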
From thomas.schatzl at oracle.com Tue Jun 5 13:28:19 2018 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Tue, 05 Jun 2018 15:28:19 +0200 Subject: RFR (XS): 8202049: G1: ReferenceProcessor doesn't handle mark stack overflow In-Reply-To: References: <7f43835383d1ebf1f7d3a192c7949ffdb80541d3.camel@oracle.com> <5C078D8C-72A4-41F3-ACF7-620B5918799D@oracle.com> Message-ID: <092945e99e9bd62f8032c243d0492c1a8530f4f9.camel@oracle.com> Hi Kim, On Tue, 2018-06-05 at 09:08 -0400, Kim Barrett wrote: > > On Jun 5, 2018, at 3:56 AM, Thomas Schatzl > om> wrote: > > > > Hi, > > > > On Mon, 2018-06-04 at 20:26 -0400, Kim Barrett wrote: > > > > On Jun 4, 2018, at 6:35 AM, Thomas Schatzl > > > le.c > > > > om> wrote: > > > > CR: > > > > https://bugs.openjdk.java.net/browse/JDK-8202049 > > > > Webrev: > > > > http://cr.openjdk.java.net/~tschatzl/8202049/webrev > > > > Testing:[...] > > Thanks, > > Thomas > > Looks good. > Thanks for your review. Btw I filed JDK-8204337 to look into creating a test. Thomas From kim.barrett at oracle.com Tue Jun 5 14:54:01 2018 From: kim.barrett at oracle.com (Kim Barrett) Date: Tue, 5 Jun 2018 10:54:01 -0400 Subject: RFR: 8204287: Phase timings not updated correctly after JDK-6672778 In-Reply-To: References: Message-ID: <5A17448D-7B59-4A25-A1B7-8F4B18C7C2CF@oracle.com> > On Jun 5, 2018, at 7:41 AM, Stefan Johansson wrote: > > Hi, > > Please review this fix to do correct timings in G1. > > Issue: https://bugs.openjdk.java.net/browse/JDK-8204287 > Webrev: http://cr.openjdk.java.net/~sjohanss/8204287/00/ > > Summary: > When the partial trimming was introduced a while back, the timing code in G1 became a bit more complicated. In one of the timer helpers a member structure is used before it is destructed and updated. This leads to the times being attributed to the wrong phases, and therefore to the G1 ergonomics making some less than optimal decisions.
> > Testing: > * Local testing with jtreg > * Mach5 testing tier 1-3 looks good > * Sanity performance testing is running > > Thanks, > Stefan Looks good. From shade at redhat.com Tue Jun 5 15:46:27 2018 From: shade at redhat.com (Aleksey Shipilev) Date: Tue, 5 Jun 2018 17:46:27 +0200 Subject: RFR (round 4), JEP-318: Epsilon GC In-Reply-To: <0b5c827a-2763-98aa-06af-24df9028aed7@oracle.com> References: <094e1093-5a13-4853-aa34-d4e987a069b0@redhat.com> <0b5c827a-2763-98aa-06af-24df9028aed7@oracle.com> Message-ID: Hi Jini, Thanks for taking a look, see comments inline. On 06/01/2018 10:13 AM, Jini George wrote: > ==> share/classes/sun/jvm/hotspot/oops/ObjectHeap.java > > 444 liveRegions.add(eh.space()); > > We would need to add an object of type 'Address' to the liveRegions list, instead of type > VirtualSpace. Not doing so results in exceptions of the following form from the compare() method for > various clhsdb commands like 'jhisto': > > Exception in thread "main" java.lang.ClassCastException: > jdk.hotspot.agent/sun.jvm.hotspot.memory.VirtualSpace cannot be cast to > jdk.hotspot.agent/sun.jvm.hotspot.debugger.Address Oh, I see! Fixed: http://hg.openjdk.java.net/jdk/sandbox/rev/d999bdb8173c > ==> share/classes/sun/jvm/hotspot/oops/ObjectHeap.java > > 445 } else { > 446 if (Assert.ASSERTS_ENABLED) { > 447 Assert.that(false, "Expecting GenCollectedHeap, G1CollectedHeap, " + > 448 "or ParallelScavengeHeap, but got " + > 449 heap.getClass().getName()); > 450 } > 451 } > > * Please add EpsilonGC also to the assertion statement. I prefer to change it to non-GC-specific message as Per suggested: http://hg.openjdk.java.net/jdk/sandbox/rev/5cd17d3d3f83 > ==> share/classes/sun/jvm/hotspot/tools/HeapSummary.java > > The run() method would need to handle Epsilon GC here to avoid the Unknown CollectedHeap type error > with jhsdb jmap --heap.
Right, implemented in both places, run() and detectGCAlgo(): http://hg.openjdk.java.net/jdk/sandbox/rev/b20a56352d78 > ==> share/classes/sun/jvm/hotspot/HSDB.java > > In showThreadStackMemory(), we have: > > 1101 } > 1102 } else if (collHeap instanceof ParallelScavengeHeap) { > 1103 ParallelScavengeHeap heap = (ParallelScavengeHeap) collHeap; > 1104 if (heap.youngGen().isIn(handle)) { > 1105 anno = "PSYoungGen "; > 1106 bad = false; > 1107 } else if (heap.oldGen().isIn(handle)) { > 1108 anno = "PSOldGen "; > 1109 bad = false; > 1110 } > 1111 } else { > 1112 // Optimistically assume the oop isn't bad > 1113 anno = "[Unknown generation] "; > 1114 bad = false; > 1115 } > 1116 > > We would need to add the case of collHeap being an instanceof EpsilonHeap too. It would display > "Unknown generation" while viewing the stack memory for the Java threads otherwise. Right, fixed: http://hg.openjdk.java.net/jdk/sandbox/rev/f26c4a196a15 > ==> It would be great if test/hotspot/jtreg/serviceability/sa/TestUniverse.java is enhanced to add > the minimalistic test for EpsilonGC. Right: http://hg.openjdk.java.net/jdk/sandbox/rev/c559de946c7d This still passes gc/epsilon and serviceability/sa tests. -Aleksey
From gerard.ziemski at oracle.com Tue Jun 5 18:27:48 2018 From: gerard.ziemski at oracle.com (Gerard Ziemski) Date: Tue, 5 Jun 2018 13:27:48 -0500 Subject: RFR(L): 8195097: Make it possible to process StringTable outside safepoint In-Reply-To: References: <0d09f2ad-47b9-2fca-e5ec-17547e923cbb@oracle.com> Message-ID: <1852755B-5CC6-4499-BCDD-B672FA4F6CA0@oracle.com> hi Robbin, A few minor issues for which I don't need to see a webrev: #1 Change: log_trace(stringtable)("Started to growed"); to log_trace(stringtable)("Started to grow"); #2 Add white space after "while" from (there is one more instance of this): while(gt.doTask(jt)) { to while (gt.doTask(jt)) { #3 Change: log_debug(stringtable)("Growed to size:" SIZE_FORMAT, _current_size); to log_debug(stringtable)("Grown to size:" SIZE_FORMAT, _current_size); #4 "nearest_pow_2" is inaptly named, if the user passes "129" we select "256", not "128", so maybe rename it to "ceil_pow_2"? PS. I would have liked to see whether Netbeans or Eclipse trigger the table resizing during the startup, but I was unable to trivially run either of them using jdk11, oh well. cheers > On Jun 4, 2018, at 10:59 AM, Robbin Ehn wrote: > > Hi, > > Here is an updated after reviews: > Inc : http://cr.openjdk.java.net/~rehn/8195097/v2/inc/webrev/ > Full: http://cr.openjdk.java.net/~rehn/8195097/v2/full/webrev/ > > Passed tier 1-3. > > /Robbin > > > On 2018-05-28 15:19, Robbin Ehn wrote: >> Hi all, please review. >> This implements the StringTable with the ConcurrentHashtable for managing the >> strings using oopStorage for backing the actual oops via WeakHandles. >> The unlinking and freeing of hashtable nodes is moved outside the safepoint, >> which means GC only needs to walk the oopStorage, either concurrently or in a >> safepoint. Walking oopStorage is also faster so there is a good effect on all >> safepoints visiting the oops.
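Gerard's comment #4 above is about round-up-to-a-power-of-two semantics: passing 129 should select 256, while 128 stays 128, hence "ceiling". A small illustrative Java version of such a helper — hypothetical, not the HotSpot helper under review:

```java
public class PowerOfTwo {
    // Round n up to the nearest power of two ("ceiling" semantics,
    // matching the ceil_pow_2 rename suggested in comment #4):
    // 129 -> 256, but 128 stays 128. Illustrative only; valid for
    // 1 <= n <= 2^30 in 32-bit int arithmetic.
    static int ceilPow2(int n) {
        if (n <= 1) {
            return 1;
        }
        // highestOneBit(n - 1) is the largest power of two <= n - 1;
        // doubling it gives the smallest power of two >= n.
        return Integer.highestOneBit(n - 1) << 1;
    }
}
```

The `n - 1` trick is what keeps exact powers of two fixed points instead of being doubled.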
>> The unlinking and freeing happens during inserts when dead weak oops are >> encountered in that bucket. In any normal workload the stringtable self-cleans >> without needing any additional cleaning. Cleaning/unlinking can also be done >> concurrently via the ServiceThread; it is started when we have a high "dead >> factor", e.g. when an application has a lot of interned strings, removes the references, >> and never interns again. The ServiceThread also concurrently grows the table if >> the "load factor" is high. Both the cleaning and growing take care not to prolong >> time to safepoint, at the cost of some speed. >> Kitchensink24h, multiple tier1-5 with no issue that I can relate to this >> changeset, various benchmark such as JMH, specJBB2015. >> Issue: https://bugs.openjdk.java.net/browse/JDK-8195097 >> Webrev: http://cr.openjdk.java.net/~rehn/8195097/v0/webrev/ >> Thanks, Robbin From jiangli.zhou at oracle.com Tue Jun 5 20:20:51 2018 From: jiangli.zhou at oracle.com (Jiangli Zhou) Date: Tue, 5 Jun 2018 13:20:51 -0700 Subject: RFR(L): 8195097: Make it possible to process StringTable outside safepoint In-Reply-To: References: <0d09f2ad-47b9-2fca-e5ec-17547e923cbb@oracle.com> Message-ID: <6249C3D2-41DA-4784-A120-A07FF45A004D@oracle.com> Hi Robbin, The latest version looks good to me too! Thanks, Jiangli > On Jun 4, 2018, at 8:59 AM, Robbin Ehn wrote: > > Hi, > > Here is an updated after reviews: > Inc : http://cr.openjdk.java.net/~rehn/8195097/v2/inc/webrev/ > Full: http://cr.openjdk.java.net/~rehn/8195097/v2/full/webrev/ > > Passed tier 1-3. > > /Robbin > > > On 2018-05-28 15:19, Robbin Ehn wrote: >> Hi all, please review. >> This implements the StringTable with the ConcurrentHashtable for managing the >> strings using oopStorage for backing the actual oops via WeakHandles. >> The unlinking and freeing of hashtable nodes is moved outside the safepoint, >> which means GC only needs to walk the oopStorage, either concurrently or in a >> safepoint.
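Robbin's "self-cleaning during inserts" scheme quoted above can be illustrated at the Java level: an interning table holds its entries weakly, and a dead or missing entry is repaired as a side effect of the next insert that touches it. This is a hypothetical sketch with WeakReference, not the HotSpot C++ ConcurrentHashTable/oopStorage code:

```java
import java.lang.ref.WeakReference;
import java.util.concurrent.ConcurrentHashMap;

public class SelfCleaningInterner {
    private final ConcurrentHashMap<String, WeakReference<String>> table =
            new ConcurrentHashMap<>();

    public String intern(String s) {
        WeakReference<String> ref = table.get(s);
        String existing = (ref != null) ? ref.get() : null;
        if (existing != null) {
            return existing;   // live entry: reuse the canonical instance
        }
        // Dead or missing entry: replace it as part of the insert,
        // mirroring the "self-cleans during inserts" idea above.
        table.put(s, new WeakReference<>(s));
        return s;
    }

    public int size() {
        return table.size();
    }
}
```

As in the real change, no separate stop-the-world cleanup pass is needed for normal workloads; stale entries are reclaimed opportunistically on the insert path.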
Walking oopStorage is also faster so there is a good effect on all >> safepoints visiting the oops. >> The unlinking and freeing happens during inserts when dead weak oops are >> encountered in that bucket. In any normal workload the stringtable self-cleans >> without needing any additional cleaning. Cleaning/unlinking can also be done >> concurrently via the ServiceThread; it is started when we have a high "dead >> factor", e.g. when an application has a lot of interned strings, removes the references, >> and never interns again. The ServiceThread also concurrently grows the table if >> the "load factor" is high. Both the cleaning and growing take care not to prolong >> time to safepoint, at the cost of some speed. >> Kitchensink24h, multiple tier1-5 with no issue that I can relate to this >> changeset, various benchmark such as JMH, specJBB2015. >> Issue: https://bugs.openjdk.java.net/browse/JDK-8195097 >> Webrev: http://cr.openjdk.java.net/~rehn/8195097/v0/webrev/ >> Thanks, Robbin From thomas.schatzl at oracle.com Tue Jun 5 20:26:56 2018 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Tue, 05 Jun 2018 22:26:56 +0200 Subject: RFR (S): 8204082: Add indication that this is the "Last Young GC before Mixed" to logs In-Reply-To: <360d423eb720e93a1921e66dae71ee2e794439ec.camel@oracle.com> References: <360d423eb720e93a1921e66dae71ee2e794439ec.camel@oracle.com> Message-ID: Hi, On Wed, 2018-05-30 at 14:50 +0200, Thomas Schatzl wrote: > Hi all, > > can I have reviews for this change that adds an indication about > the "last young" gc before a mixed phase to the logs? > > I.e. instead of "GC Pause (Young)", that GC is called "GC Pause (Last > Young)". > > The reason for this potentially intrusive change to log parsers is > that in quite a few situations it is useful to be really sure what > kind of GC is being started without needing to either add gc+ergo > logging or restarting.
This came up when I was trying to figure > out whether the given young GC I was looking at was the young gc > before the mixed gcs or just a young-gc because the Cleanup pause > figured out that we do not need a mixed phase. > > I am not too hung up about naming it "Last Young" in particular, but > I really would like to have this or a similar indication that is > generally available with minimal logging. Alternatively we could just call it a "Mixed" gc that happens not to collect any old gen regions at this time. Later we could then start such a gc as soon as the pause time goal permits and add a few old gen regions. This would probably be a little confusing to users, but it would at least help when diagnosing logs. There has also been the suggestion to call it "Final Young" instead of "Last Young". Any comments? Thomas > > CR: > https://bugs.openjdk.java.net/browse/JDK-8204082 > Webrev: > http://cr.openjdk.java.net/~tschatzl/8204084/webrev/ > Testing: > hs-tier 1-3 From kim.barrett at oracle.com Tue Jun 5 21:41:14 2018 From: kim.barrett at oracle.com (Kim Barrett) Date: Tue, 5 Jun 2018 17:41:14 -0400 Subject: RFR (XL): 8202845: Refactor reference processing for improved parallelism In-Reply-To: References: Message-ID: <61154A78-083C-456C-97F9-2D2986B67168@oracle.com> > On Jun 5, 2018, at 8:27 AM, Thomas Schatzl wrote: > CR: > https://bugs.openjdk.java.net/browse/JDK-8202845 > Webrev: > http://cr.openjdk.java.net/~tschatzl/8202845/webrev/ > Testing: > hs-tier1-5,jdk-tier1-3 > > Thanks, > Thomas ------------------------------------------------------------------------------ Logging: I'm not sure using Phase{1,2,3,4} is the right approach. I think it might be better to use more informative descriptions here, e.g. something like Phase1 => Reconsider SoftReferences Phase2 => Notify Soft/WeakReferences Phase3 => Notify and keep alive finalizable objects Phase4 => Notify PhantomReferences That gives more information to the reader.
It also doesn't suffer from renumbering when we make further planned changes, like moving Phase1 elsewhere, and (someday) elimination of finalization. ------------------------------------------------------------------------------ Logging: Phase2 timing information is broken down by reference type, which may be useful. But timing information about the phase as a whole only contains the total time, without any per-thread breakdown for the phase as a whole. That information seems equally useful. Also, the number of workers is being repeated, even though it doesn't change over the phase. ------------------------------------------------------------------------------ src/hotspot/share/gc/shared/referenceProcessor.cpp Throughout, need_balance_queues is being applied to _discovered_refs, which is incorrect. (That's effectively a synonym for _discoveredSoftRefs.) It needs to be applied to each specific refs_lists[]. ------------------------------------------------------------------------------ src/hotspot/share/gc/shared/referenceProcessor.cpp process_soft_weak_final_drop_live_null seems like an excessively long name, and isn't really more accurate or informative than something like process_soft_weak_final_refs. ------------------------------------------------------------------------------ src/hotspot/share/gc/shared/referenceProcessor.hpp Please ensure blank lines between the various process_xxx declarations and the comments that follow them. Circa lines 285 and 293. ------------------------------------------------------------------------------ src/hotspot/share/gc/shared/referenceProcessor.cpp 425 complete_gc->do_void(); 426 iter.complete_enqueue(); ... 456 complete_gc->do_void(); 457 iter.complete_enqueue(); These are changes to the ordering of those two operations; they were done in the other order in the old process_phase3. I'm not sure it matters, but I wasn't expecting such a change.
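For readers less familiar with the reference types behind the phase names suggested above: "notification" roughly means clearing a Reference and putting it on its ReferenceQueue so the application can react. A small self-contained Java illustration of that machinery (using an explicit enqueue() to stay deterministic — in a real collector it is the GC, not application code, that clears and enqueues):

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;

public class NotifyDemo {
    // Simulates what "Notify Soft/WeakReferences" does to one
    // reference whose referent was found dead: clear it, then
    // enqueue it on its registered ReferenceQueue.
    public static Reference<?> notifyOne() {
        Object referent = new Object();
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        WeakReference<Object> ref = new WeakReference<>(referent, queue);

        ref.clear();    // referent is gone as far as the ref is concerned
        ref.enqueue();  // notification: make it visible on the queue

        return queue.poll(); // the notified reference, or null if none
    }
}
```

FinalReference (Phase3) differs in that the referent must be kept alive until its finalizer has run, which is why that phase both notifies and keeps objects alive.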
------------------------------------------------------------------------------ src/hotspot/share/gc/shared/referenceProcessor.cpp 327 size_t ReferenceProcessor::process_soft_ref_reevaluate_work(DiscoveredList& refs_list, ... 340 log_develop_trace(gc, ref)("Dropping reference (" INTPTR_FORMAT ": %s" ") by policy", 341 p2i(iter.obj()), iter.obj()->klass()->internal_name()); [pre-existing] Klass::internal_name() may perform resource allocation, but there's no obvious ResourceMark near here, unlike in process_final_keep_alive_work and process_phantom_refs. Same issue in process_soft_weak_final_drop_live_null_work. I wonder if some of the ResourceMarks should be moved into log_(dropped|enqueued)_ref, rather than being outside a loop with an unknown number of resource allocations. And similarly in process_soft_ref_reevaluate_work. ------------------------------------------------------------------------------ src/hotspot/share/gc/shared/referenceProcessor.cpp 792 removed += process_soft_ref_reevaluate_work(_discoveredSoftRefs[i], _current_soft_ref_policy, 793 is_alive, keep_alive, complete_gc); Abnormal indentation. ------------------------------------------------------------------------------ src/hotspot/share/gc/shared/referenceProcessor.hpp Changed to remove marks_oops_alive from ProcessTask. I'm not sure why this change was made. It would seem to make some collectors terminate Phase2 more slowly than necessary, since every thread will end with an unnecessary stealing segment that should never find anything to do, but needs to rendezvous with all the other threads. Note that JDK-8203028 missed changing the construction of the RefProcPhase2Task; it should have changed the marks_oops_alive argument unconditionally false. This is another piece of the mostly implicit / lightly documented contract for the is_alive closure that G1 is violating. 
------------------------------------------------------------------------------ From shade at redhat.com Wed Jun 6 12:41:54 2018 From: shade at redhat.com (Aleksey Shipilev) Date: Wed, 6 Jun 2018 14:41:54 +0200 Subject: RFR (S/M): 8202017: Merge Reference Enqueuing phase with phase 3 of Reference processing In-Reply-To: <1524165246.2765.14.camel@oracle.com> References: <1524165246.2765.14.camel@oracle.com> Message-ID: <44777e67-4cc2-c0e0-e374-55981f492b96@redhat.com> On 04/19/2018 09:14 PM, Thomas Schatzl wrote: > This change affects all collectors using the reference processing > framework - I assume that Shenandoah, if it is using it, does not need > a split reference enqueuing phase either. Yup, we merged it to Shenandoah without problems, thanks! -Aleksey From thomas.schatzl at oracle.com Wed Jun 6 12:51:42 2018 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Wed, 06 Jun 2018 14:51:42 +0200 Subject: RFR (S/M): 8202017: Merge Reference Enqueuing phase with phase 3 of Reference processing In-Reply-To: <44777e67-4cc2-c0e0-e374-55981f492b96@redhat.com> References: <1524165246.2765.14.camel@oracle.com> <44777e67-4cc2-c0e0-e374-55981f492b96@redhat.com> Message-ID: Hi, On Wed, 2018-06-06 at 14:41 +0200, Aleksey Shipilev wrote: > On 04/19/2018 09:14 PM, Thomas Schatzl wrote: > > This change affects all collectors using the reference processing > > framework - I assume that Shenandoah, if it is using it, does not > > need a split reference enqueuing phase either. > > Yup, we merged it to Shenandoah without problems, thanks! good to hear. Please also follow progress on JDK-8202845 (out for review), JDK-8043575 (out for review, but needs some merging work with JDK-8202845) and JDK-8202328 (out soon).
All mentioned changes should also positively impact Shenandoah reference processing, as they do for the other collectors. Thanks, Thomas From shade at redhat.com Wed Jun 6 16:30:54 2018 From: shade at redhat.com (Aleksey Shipilev) Date: Wed, 6 Jun 2018 18:30:54 +0200 Subject: RFR: 8204180: Implementation: JEP 318: Epsilon GC (round 5) Message-ID: <2ee03550-2b1e-b873-5ef3-26312e2a760e@redhat.com> Hi, This is the fifth (and hopefully final) round of code review for Epsilon GC changes. It includes the fixes done as the result of the fourth round of reviews, mostly in Serviceability. The build parts are the same since the last few reviews, so this is not posted to build-dev at . Webrev: http://cr.openjdk.java.net/~shade/epsilon/webrev.09/ If we are good -- should somebody, e.g. Project Lead ack this? -- I am going to push this with the following changeset metadata: 8204180: Implementation: JEP 318: Epsilon, A No-Op Garbage Collector Summary: Introduce Epsilon GC Reviewed-by: rkennke, ihse, pliden, eosterlund, lmesnik, jgeorge Builds: server X {x86_64, x86_32, aarch64, arm32, ppc64le, s390x} minimal X {x86, x86_64} zero X {x86_64} Testing: gc/epsilon on x86_64 Thanks, -Aleksey
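As background on what Epsilon does: it is a no-op collector that only allocates (bump-the-pointer) and never reclaims, throwing OutOfMemoryError once the heap is exhausted. A toy Java model of that allocation policy — purely illustrative, not the C++ implementation in the webrev above:

```java
public class BumpArena {
    private final byte[] heap; // fixed arena, never collected
    private int top;           // next free offset; only moves forward

    public BumpArena(int capacity) {
        heap = new byte[capacity];
    }

    // Allocate 'size' bytes or fail. There is no reclamation path,
    // which is exactly the no-op GC policy: allocation is a single
    // bounds check plus a pointer bump.
    public int allocate(int size) {
        if (top + size > heap.length) {
            throw new OutOfMemoryError("arena exhausted");
        }
        int offset = top;
        top += size;
        return offset;
    }

    public int used() {
        return top;
    }
}
```

The appeal for testing and latency experiments is that allocation cost is constant and there are never any GC pauses; the trade-off is that the heap is strictly a one-way resource.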
From thomas.schatzl at oracle.com Wed Jun 6 17:13:04 2018 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Wed, 06 Jun 2018 19:13:04 +0200 Subject: RFR (XL): 8202845: Refactor reference processing for improved parallelism In-Reply-To: <61154A78-083C-456C-97F9-2D2986B67168@oracle.com> References: <61154A78-083C-456C-97F9-2D2986B67168@oracle.com> Message-ID: Hi, On Tue, 2018-06-05 at 17:41 -0400, Kim Barrett wrote: > > On Jun 5, 2018, at 8:27 AM, Thomas Schatzl > om> wrote: > > CR: > > https://bugs.openjdk.java.net/browse/JDK-8202845 > > Webrev: > > http://cr.openjdk.java.net/~tschatzl/8202845/webrev/ > > Testing: > > hs-tier1-5,jdk-tier1-3 > > > > Thanks, > > Thomas > > > ------------------------------------------------------------------- > ----------- > Logging: > > I'm not sure using Phase{1,2,3,4} is the right approach. I think it > might be better to use more informative descriptions here, e.g. > something like > > Phase1 => Reconsider SoftReferences > Phase2 => Notify Soft/WeakReferences > Phase3 => Notify and keep alive finalizable objects > Phase4 => Notify PhantomReferences > > That gives more information to the reader. It also doesn't suffer > from renumbering when we make further planned changes, like moving > > Phase1 elsewhere, and (someday) elimination of finalization. Okay. The reason I used "PhaseX" was because any identifiers I could come up with were very long, particularly for phase 3. I changed it to your naming (I allowed myself to drop the "object", please tell me if you think it is required), but I am hoping that somebody has some more succinct suggestions. > ----------- > Logging: > > Phase2 timing information is broken down by reference type, which may > be useful. But timing information about the phase as a whole only > contains the total time, without any per-thread breakdown for the > phase as a whole.
> That information seems equally useful. Okay, added a "Total" line. > Also, the number of workers is being repeated, even though it doesn't > change over the phase. Since this would make the output inconsistent with the evacuation phase, where thread numbers are always printed, I left that out. Also these are the numbers of threads contributing work, not the number of work items the task has been started with. This is only the same because at the moment the time tracker scoped objects are encapsulating the whole work methods regardless of whether there is work to do (i.e. their list is empty), so they always get a "0.0" entry. That information would be interesting, but I would prefer to do that in an extra CR. > ------------------------------------------------------------------- > ----------- > src/hotspot/share/gc/shared/referenceProcessor.cpp > > Throughout, need_balance_queues is being applied to _discovered_refs, > which is incorrect. (That's effectively a synonym for > _discoveredSoftRefs.) It needs to be applied to each specific > refs_lists[]. Fixed. > ------------------------------------------------------------------- > ----------- > src/hotspot/share/gc/shared/referenceProcessor.cpp > > process_soft_weak_final_drop_live_null seems like an excessively long > name, and isn't really more accurate or informative than something > like process_soft_weak_final_refs. :) Fixed. > > ------------------------------------------------------------------- > ----------- > src/hotspot/share/gc/shared/referenceProcessor.hpp > > Please ensure blank lines between the various process_xxx > declarations > and the comments that follow them. > > Circa lines 285 and 293. Fixed. > > ------------------------------------------------------------------- > ----------- > src/hotspot/share/gc/shared/referenceProcessor.cpp > 425 complete_gc->do_void(); > 426 iter.complete_enqueue(); > ...
> 456 complete_gc->do_void(); > 457 iter.complete_enqueue(); > > These are changes to the ordering of those two operations; they were > done in the other order in the old process_phase3. I'm not sure it > matters, but I wasn't expecting such a change. Fixed. > > ------------------------------------------------------------------- > ----------- > src/hotspot/share/gc/shared/referenceProcessor.cpp > 327 size_t > ReferenceProcessor::process_soft_ref_reevaluate_work(DiscoveredList& > refs_list, > ... > 340 log_develop_trace(gc, ref)("Dropping reference (" > INTPTR_FORMAT ": %s" ") by policy", > 341 p2i(iter.obj()), iter.obj()- > >klass()->internal_name()); > > [pre-existing] > > Klass::internal_name() may perform resource allocation, but there's > no > obvious ResourceMark near here, unlike in > process_final_keep_alive_work and process_phantom_refs. > > Same issue in process_soft_weak_final_drop_live_null_work. > > I wonder if some of the ResourceMarks should be moved into > log_(dropped|enqueued)_ref, rather than being outside a loop with an > unknown number of resource allocations. And similarly in > process_soft_ref_reevaluate_work. Fixed, since this is about trace messages. > ------------------------------------------------------------------- > ----------- > src/hotspot/share/gc/shared/referenceProcessor.cpp > 792 removed += > process_soft_ref_reevaluate_work(_discoveredSoftRefs[i], > _current_soft_ref_policy, > 793 is_alive, keep_alive, > complete_gc); > > Abnormal indentation. Fixed. > > ------------------------------------------------------------------- > ----------- > src/hotspot/share/gc/shared/referenceProcessor.hpp > > Changed to remove marks_oops_alive from ProcessTask. > > I'm not sure why this change was made. It would seem to make some > collectors terminate Phase2 more slowly than necessary, since every > thread will end with an unnecessary stealing segment that should > never > find anything to do, but needs to rendezvous with all the other > threads.
> > Note that JDK-8203028 missed changing the construction of the > RefProcPhase2Task; it should have changed the marks_oops_alive > argument unconditionally false. > > This is another piece of the mostly implicit / lightly documented > contract for the is_alive closure that G1 is violating. > > ------------------------------------------------------------------- > I thought marks_oops_alive is an unnecessary quirk of the collectors that found its way into the reference processing interface; its purpose is to indicate whether the gc should perform stealing at the end of a phase, as the complete closures drain the stacks completely by themselves anyway. Stealing or not should be part of the complete closure (i.e. the gc, and attempted by default by the gc) imho (it "completes" the phase), like G1 does. I fixed it though. New webrevs: http://cr.openjdk.java.net/~tschatzl/8202845/webrev.0_to_1 (diff) http://cr.openjdk.java.net/~tschatzl/8202845/webrev.1 (full) Testing: hs-tier1-5,jdk-tier-1-3 (with +/-ParallelRefProcEnabled) Thanks for your quick review, Thomas From jini.george at oracle.com Thu Jun 7 05:19:28 2018 From: jini.george at oracle.com (Jini George) Date: Thu, 7 Jun 2018 10:49:28 +0530 Subject: RFR (round 4), JEP-318: Epsilon GC In-Reply-To: References: <094e1093-5a13-4853-aa34-d4e987a069b0@redhat.com> <0b5c827a-2763-98aa-06af-24df9028aed7@oracle.com> Message-ID: <5375c956-089b-2bba-a39c-3f051b895ba1@oracle.com> Thanks for making the changes, Aleksey. The changes look good. Thanks, Jini. On 6/5/2018 9:16 PM, Aleksey Shipilev wrote: > Hi Jini, > > Thanks for taking a look, see comments inline. > > On 06/01/2018 10:13 AM, Jini George wrote: >> ==> share/classes/sun/jvm/hotspot/oops/ObjectHeap.java >> >> 444 liveRegions.add(eh.space()); >> >> We would need to add an object of type 'Address' to the liveRegions list, instead of type >> VirtualSpace.
Not doing so results in exceptions of the following form from the compare() method for >> various clhsdb commands like 'jhisto': >> >> Exception in thread "main" java.lang.ClassCastException: >> jdk.hotspot.agent/sun.jvm.hotspot.memory.VirtualSpace cannot be cast to >> jdk.hotspot.agent/sun.jvm.hotspot.debugger.Address > > Oh, I see! Fixed: > http://hg.openjdk.java.net/jdk/sandbox/rev/d999bdb8173c > > >> ==> share/classes/sun/jvm/hotspot/oops/ObjectHeap.java >> >> 445 } else { >> 446 if (Assert.ASSERTS_ENABLED) { >> 447 Assert.that(false, "Expecting GenCollectedHeap, G1CollectedHeap, " + >> 448 "or ParallelScavengeHeap, but got " + >> 449 heap.getClass().getName()); >> 450 } >> 451 } >> >> * Please add EpsilonGC also to the assertion statement. > > I prefer to change it to non-GC-specific message as Per suggested: > http://hg.openjdk.java.net/jdk/sandbox/rev/5cd17d3d3f83 > > >> ==> share/classes/sun/jvm/hotspot/tools/HeapSummary.java >> >> The run() method would need to handle Epsilon GC here to avoid the Unknown CollectedHeap type error >> with jhsdb jmap --heap. > > Right, implemented in both places, run() and detectGCAlgo(): > http://hg.openjdk.java.net/jdk/sandbox/rev/b20a56352d78 > > >> ==> share/classes/sun/jvm/hotspot/HSDB.java >> >> In showThreadStackMemory(), we have: >> >> 1101 } >> 1102 } else if (collHeap instanceof ParallelScavengeHeap) { >> 1103 ParallelScavengeHeap heap = (ParallelScavengeHeap) collHeap; >> 1104 if (heap.youngGen().isIn(handle)) { >> 1105 anno = "PSYoungGen "; >> 1106 bad = false; >> 1107 } else if (heap.oldGen().isIn(handle)) { >> 1108 anno = "PSOldGen "; >> 1109 bad = false; >> 1110
} >> 1111 } else { >> 1112 // Optimistically assume the oop isn't bad >> 1113 anno = "[Unknown generation] "; >> 1114 bad = false; >> 1115 } >> 1116 >> >> We would need to add the case of collHeap being an instanceof EpsilonHeap too. It would display >> "Unknown generation" while viewing the stack memory for the Java threads otherwise. > > Right, fixed: > http://hg.openjdk.java.net/jdk/sandbox/rev/f26c4a196a15 > > >> ==> It would be great if test/hotspot/jtreg/serviceability/sa/TestUniverse.java is enhanced to add >> the minimalistic test for EpsilonGC. > > Right: > http://hg.openjdk.java.net/jdk/sandbox/rev/c559de946c7d > > This still passes gc/epsilon and serviceability/sa tests. > > -Aleksey > From HORIE at jp.ibm.com Thu Jun 7 06:01:25 2018 From: HORIE at jp.ibm.com (Michihiro Horie) Date: Thu, 7 Jun 2018 15:01:25 +0900 Subject: RFR(M): 8204524: Unnecessary memory barriers in G1ParScanThreadState::copy_to_survivor_space Message-ID: Dear all, Would you please review the following change? Bug: https://bugs.openjdk.java.net/browse/JDK-8204524 Webrev: http://cr.openjdk.java.net/~mhorie/8204524/webrev.00 G1ParScanThreadState::copy_to_survivor_space tries to move live objects to a different location. It uses a forwarding technique and allows multiple threads to compete for performing the copy step. A copy is performed after a thread succeeds in the CAS. CAS-failed threads are not allowed to dereference the forwardee concurrently. Current code is already written so that CAS-failed threads do not dereference the forwardee. Also, this constraint is documented in a caller function mark_forwarded_object as "the object might be in the process of being copied by another worker so we cannot trust that its to-space image is well-formed". There is no copy that must finish before the CAS. Threads that failed in the CAS must not dereference the forwardee.
Therefore, no fence is necessary before and after the CAS. I measured SPECjbb2015 with this change. As a result, critical-jOPS performance improved by 27% on POWER8. Best regards, -- Michihiro, IBM Research - Tokyo -------------- next part -------------- An HTML attachment was scrubbed... URL: From sangheon.kim at oracle.com Thu Jun 7 06:38:57 2018 From: sangheon.kim at oracle.com (sangheon.kim at oracle.com) Date: Wed, 6 Jun 2018 23:38:57 -0700 Subject: RFR(M) 8043575: Dynamically parallelize reference processing work In-Reply-To: References: <8af7594e-d5f4-ca0d-de73-6547ea131f3e@oracle.com> Message-ID: Hi Kim, Thanks for the review and sorry for replying late. I had to spend some time to figure out complete_gc closure stuff and re-running all tests again. On 6/4/18 5:11 PM, Kim Barrett wrote: >> On Jun 1, 2018, at 5:48 PM,sangheon.kim at oracle.com wrote: >> >> Hi all, >> >> As webrev.0 is conflicting with webrev.0 of "8203319: JDK-8201487 disabled too much queue balancing"(out for review, but not yet pushed), I'm posting webrev.1. >> >> http://cr.openjdk.java.net/~sangheki/8043575/webrev.1 >> http://cr.openjdk.java.net/~sangheki/8043575/webrev.1_to_0 > The hookup of the changes into the various collectors seems okay. My > comments are mostly focused on the ReferenceProcessor changes. > > ------------------------------------------------------------------------------ > src/hotspot/share/gc/shared/referenceProcessorPhaseTimes.hpp > 140 void set_phase_number(RefProcPhaseNumbers phase_number) { _phase_number = phase_number; } > 141 RefProcPhaseNumbers phase_number() const { return _phase_number; } > > I think I dislike these and how they are used. And see other > discussion below, suggesting they might not be needed. > > (Though they are similar to other related states. But maybe I dislike > those too.) I can avoid using it as you suggested below. But it is still needed at other locations in referenceProcessorPhaseTimes.cpp. 
If you are saying we need to avoid using this stuff, I think it is out of scope for this patch. So, please file a CR. > ------------------------------------------------------------------------------ > src/hotspot/share/gc/shared/referenceProcessor.hpp > 641 class RefProcMTDegreeAdjuster : public StackObj { > ... > 649 RefProcMTDegreeAdjuster(ReferenceProcessor* rp, ReferenceProcessorPhaseTimes* times, size_t ref_count); > > The times argument is here to provide information about what phase > we're in. That seems really indirect, and I'd really prefer that > information to be provided directly as arguments here. Webrev.2 uses ReferenceType and RefProcPhaseNumbers directly; please take a look and compare with the previous version. > I think that might also remove the need for the new phase_number and > set_phase_number functions for the phase times class. As I noted above, those are still used in other locations. > This also would flow through to ergo_proc_thread_count and use_max_threads. Done. > ------------------------------------------------------------------------------ > src/hotspot/share/gc/shared/referenceProcessor.cpp > 727 bool must_balance = mt_processing && need_balance_queues(refs_lists); > > need_balance_queues returns the right answer for an initial balancing, > but not for a re-balance after some processing. Specifically, further > reducing the mt-degree may require rebalancing, even if it wasn't > previously needed. > > (This comment is assuming the updated version of need_balance_queues > that I sent out today. The situation is worse without that.) Done. > ------------------------------------------------------------------------------ > src/hotspot/share/gc/shared/referenceProcessor.cpp > 787 // For discovery_is_atomic() is true, phase 3 should be processed to call do_void() from VoidClosure. > 788 // For discovery_is_atomic() is false, phase 3 can be skipped if there are no references because do_void() is
> 790 if (!discovery_is_atomic()) { > 791 if (total_count(refs_lists) == 0) { > 792 return; > 793 } > 794 } > > The discovery_is_atomic stuff is left-over from before JDK-8203028. > Phase2 now never calls the complete_gc closure, because it is never > needed, regardless of the value of discovery_is_atomic. So checking > that can be removed here. Here are 2 things. 1) If we call the complete_gc closure before exit, is it okay? - The short answer is no, as we face an assertion failure at CMS ParNew. The long answer is: for the ParallelRefProcEnabled=true case, CMS ParNew relies on the AbstractExecutor to do some extra work. So eventually ParScanThreadState::_young_old_boundary is set and then ParEvacuateFollowersClosure is used as a complete_gc closure. Just calling the complete_gc closure that is passed as a parameter doesn't help, as it is EvacuateFollowersClosureGeneral. I.e. the complete_gc closure is different at process_discovered_references() vs. phase3, ProcessTask::work(). 2) Why is 'ref_count' not updated, so that line 800 is just using the old 'ref_count'? - If we use a different number of workers (not exiting at zero-ref cases), we face a few types of assertion failures. IIRC, I faced SIGSEGV too at copy_to_survivor_space(). - Each collector has its own per-worker state information, so I think it will not be properly updated if we use a different number of workers between phase2 and phase3. - This could be applied between phase1 and phase2, but I couldn't see any problem so far. So I left the exit path between phase1 and phase2 as is. An alternative would be excluding the exit among phases and addressing it in a CR. Previously there was another CR, JDK-8181214. * My previous version allowed only !discovery_is_atomic() cases to exit. With that condition, most cases are filtered out, so CMS old gc and G1 CM would get the benefit. I never faced any crashes or assertion failures with it.
But for the reason given in 2) I'm removing the exit path between phase2 and phase3, with some explanation of why we can't exit. Please give me a better wording. > At least, that's what's expected by the ReferenceProcessing code. The > atomic discovery case for phase2 has "always" (since at least the > beginning of the mercurial age) ignored the complete_gc argument, and > only used it when dealing with the concurrent discovery case. > > But G1CopyingKeepAliveClosure appears to have always passed the buck > to the complete_gc closure, contrary to the expectations of the RP > phase2 code. So G1 young/mixed collections only work because phase3 > will eventually call the complete_gc closure (by each of the threads). > (G1 concurrent and full gc's have a cutoff where no work is left for > the complete_gc closure if the object is already marked. I spot > checked other collectors, and at a quick glance it looks like the > others meet the expectations of the ReferenceProcessor code; it's just > G1 young/mixed collections that don't.) > > I'm not entirely sure what happens if the number of threads in phase3 > is less than the number in phase2 in that case. I think any pending > phase2 work associated with the missing threads will get stolen by > phase3 working threads, but haven't verified that. > > I think the simplest fix for this is to change phase2 to always call > the complete_gc closure after all. For some collections that's a > (perhaps somewhat expensive) nop, but in the overall context of > reference processing that's probably in the noise. And also update > the relevant comment in phase2. > > The alternative would be to make G1CopyingKeepAliveClosure meet the > (not explicitly stated in the API) ReferenceProcessor expectations, > perhaps bypassing some of the G1ParScanThreadState machinery. I think this is already covered above.
> ------------------------------------------------------------------------------ > src/hotspot/share/gc/shared/referenceProcessor.cpp > 800 RefProcMTDegreeAdjuster a(this, phase_times, ref_count); > > This is using ref_count, which was last set on line 759, and not > updated after phase2. It should have been updated as part of the new > code block around line 790. I think this is already answered above. > ------------------------------------------------------------------------------ > src/hotspot/share/gc/shared/referenceProcessor.cpp > 1193 uint ReferenceProcessor::ergo_proc_thread_count(ReferenceProcessorPhaseTimes* times) const { > 1194 size_t ref_count = total_count(_discovered_refs); > 1195 > 1196 return ergo_proc_thread_count(ref_count, num_queues(), times); > 1197 } > > I don't think this overload should exist. The ref_count used here is > the SoftReference count, which isn't particularly interesting > here. (It's not the total number of discovered references, which is > also not an interesting number for this purpose.) > > It seems to only exist as part of the workarounds for making ParNew > kind of limp along with these changes. But wouldn't it be simpler to > leave ParNewRefProcTaskExecutor::execute alone, using the > active_workers as before? (And recall that I suggested above that > execute should just use the currently configured mt-processing degree, > rather than adding an ergo_workers argument.) It will be called in the > context of RefProcMTDegreeAdjuster, which won't do anything because > ParNew disables mt-degree adjustment. So I think by leaving things > alone here we retain the current behavior, e.g. CMS ParNew uses > ParallelRefProcEnabled as an all-or-nothing flag, and doesn't pay > attention to the new ReferencesPerThread. And that seems reasonable > to me for a deprecated collector. Done. I agree with not supporting CMS, so we don't need to add the overload version for CMS.
> ------------------------------------------------------------------------------ > src/hotspot/share/gc/cms/parNewGeneration.cpp > 1455 false); // disable adjusting queue size when processing references > > It's not the queue size that is adjusted (or not), it's the number of > queues. And really, it's the MT processing degree. > > That last probably should also apply to the name of the variable in > the reference processor, and related names. Although given that this > is all a bit of a kludge to work around (deprecated) CMS deficiencies, > maybe I shouldn't be too concerned. And you did file JDK-8203951. > > Though I see Thomas also requested a name change... > > If you change the name, remember to update JDK-8203951. Sure, I will update the name at JDK-8203951. webrev.2 is proposing "bool _adjust_no_of_processing_threads". > ------------------------------------------------------------------------------ > src/hotspot/share/gc/shared/referenceProcessor.cpp > 1199 uint ReferenceProcessor::ergo_proc_thread_count(size_t ref_count, > 1200 uint max_threads, > 1201 ReferenceProcessorPhaseTimes* times) const { > ... > 1204 if (ReferencesPerThread == 0) { > 1205 return _num_queues; > 1206 } > > Why is _num_queues used here, but max_threads used elsewhere in this > function? I *think* this ought to also be max_threads. Agreed, so changed to return max_threads. Basically _num_queues and max_threads are the same value here, but for consistency, max_threads is better. :) As you may know, the intent of exiting here is to avoid SIGFPE at size_t thread_count = 1 + (ref_count / ReferencesPerThread); Basically this method will not be called if ReferencesPerThread == 0 because it will be filtered out at line 1235, i.e. the ctor of RefProcMTDegreeAdjuster. But I wanted to have something to avoid SIGFPE within this method too.
> ------------------------------------------------------------------------------ > src/hotspot/share/gc/shared/referenceProcessor.cpp > 1229 RefProcMTDegreeAdjuster::RefProcMTDegreeAdjuster(ReferenceProcessor* rp, > ... > 1235 if (!_rp->has_adjustable_queue() || (ReferencesPerThread == 0)) { > > This checks ReferencesPerThread for 0 to do nothing. And then > ergo_proc_thread_count similarly checks for 0 to (effectively) do > nothing. If the earlier suggestion to kill the 1-arg overload is > taken, then ergo_proc_thread_count is really just a helper for > RefProcMTDegreeAdjuster, and we don't need that special case twice. The check for ReferencesPerThread is to disable this feature. > And perhaps it should be a private helper in RefProcMTDegreeAdjuster? > Similarly for use_max_threads. Or maybe just inline them into that > class's constructor. Good idea, and it was part of my original patch before considering CMS support. Done. > ------------------------------------------------------------------------------ > src/hotspot/share/gc/shared/referenceProcessor.cpp > 1199 uint ReferenceProcessor::ergo_proc_thread_count(size_t ref_count, > ... > 1213 return (uint)MIN3(thread_count, > 1214 static_cast<size_t>(max_threads), > 1215 (size_t)os::initial_active_processor_count()); > > I don't see why this should look at initial_active_processor_count? > > ------------------------------------------------------------------------------ As I noted in the pre-review email thread, it showed better results if I avoid using more than the active CPUs. This is a kind of enhancement for better performance, so even though the command-line option was set to use more threads than CPUs (via ParallelGCThreads), I think it is good to limit. This is different from command-line option processing, which lets users decide even though the given options would result in worse performance. Or are you saying I should use "active_processor_count()"?
If that is the case, I updated to use 'active_processor_count()' instead of 'initial_active_processor_count()'. I would prefer to have this unless there's any reason not to. ---- In addition: 1) I removed the condition check of (num_queues() > 1) because it would affect caller-side expectations for per-worker state information, etc. I never faced any problem though. 2) Minor changes such as renaming, e.g. RefProcMTDegreeAdjuster::_saved_num_queue to _saved_num_queues. Webrev: http://cr.openjdk.java.net/~sangheki/8043575/webrev.2 http://cr.openjdk.java.net/~sangheki/8043575/webrev.2_to_1 Testing: hs-tier1~5 with/without ParallelRefProcEnabled Thanks, Sangheon -------------- next part -------------- An HTML attachment was scrubbed... URL: From thomas.schatzl at oracle.com Thu Jun 7 09:23:12 2018 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Thu, 07 Jun 2018 11:23:12 +0200 Subject: RFR (S): 8204081: Mismatch in rebuild policy and collection set chooser causes remembered sets to be kept erroneously In-Reply-To: <0d60a410-3a1e-e693-428c-14c3359baff6@oracle.com> References: <50252eb56d357b11e9a16cf132de52d12a63839b.camel@oracle.com> <0d60a410-3a1e-e693-428c-14c3359baff6@oracle.com> Message-ID: <351094d9fdcdd53d36eb2ad92778b5db9be2075c.camel@oracle.com> Hi Stefan, Kim, On Thu, 2018-05-31 at 10:35 +0200, Stefan Johansson wrote: > > On 2018-05-30 14:58, Thomas Schatzl wrote: > > Hi all, > > > > [...] > > CR: > > https://bugs.openjdk.java.net/browse/JDK-8204081 > > Webrev: > > http://cr.openjdk.java.net/~tschatzl/8204081/webrev/ > > Nice fix, looks good! > Cheers, > Stefan > thanks for your reviews.
Thomas From thomas.schatzl at oracle.com Thu Jun 7 09:47:17 2018 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Thu, 07 Jun 2018 11:47:17 +0200 Subject: RFC: Patch for 8203848: Missing remembered set entry in j.l.ref.references after JDK-8203028 Message-ID: <9d7be491c8d2445d92b4ff00aeaf735c405ff9cb.camel@oracle.com> Hi all, I would like to ask for comments on the fix for the issue handled in JDK-8203848. In particular, the problem is that currently the "discovered" field of j.l.ref.References is managed completely opaquely to the rest of the VM, which causes the described error: remembered set entries are missing for that reference when doing *concurrent* reference discovery. There is no problem with the liveness of objects referenced by it because a) a j.l.ref.Reference object found by reference discovery will be automatically kept alive and b) no GC in Hotspot at this time evacuates old gen objects during marking (and Z does not use the reference processing framework at all), so that reference in the "discovered" field will never be outdated. However in the future, G1 might want to move objects in old gen at any time for e.g. defragmentation purposes, and I am a bit unclear about Shenandoah tbh :) I see two solutions for this issue: - improve the access modifier so that at least the post-barrier that is responsible for adding remembered set entries is invoked on this field. E.g. in ReferenceProcessor::add_to_discovered_list_mt(), instead of oop retest = RawAccess<>::oop_atomic_cmpxchg(next_discovered, discovered_addr, oop(NULL)); do a oop retest = HeapAccess::oop_atomic_cmpxchg(next_discovered, discovered_addr, oop(NULL)); Note that I am almost confident that this only makes G1 work as far as I understand the access framework; since the previous value is NULL when we cmpxchg, G1 can skip the pre-barrier; maybe more is needed for Shenandoah, but I hope that Shenandoah devs can chime in here.
I tested this, and with this change the problem does not occur after 2000 iterations of the test. (see the preliminary webrev at http://cr.openjdk.java.net/~tschatzl/8203848/webrev/ ; only the change to referenceProcessor.cpp is relevant here). - the other "solution" is to fix the remembered set verification to ignore this field, and try to fix this again in the future when/if G1 evacuates old regions during marking. Any comments? Thanks, Thomas From robbin.ehn at oracle.com Thu Jun 7 09:57:46 2018 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Thu, 7 Jun 2018 11:57:46 +0200 Subject: RFR(L): 8195097: Make it possible to process StringTable outside safepoint In-Reply-To: <1852755B-5CC6-4499-BCDD-B672FA4F6CA0@oracle.com> References: <0d09f2ad-47b9-2fca-e5ec-17547e923cbb@oracle.com> <1852755B-5CC6-4499-BCDD-B672FA4F6CA0@oracle.com> Message-ID: <6719d050-f4d3-22db-848b-7121574df709@oracle.com> Thanks, Gerard! Fixed below. /Robbin On 2018-06-05 20:27, Gerard Ziemski wrote: > hi Robbin, > > A few minor issues for which I don't need to see a webrev: > > > #1 Change: > > log_trace(stringtable)("Started to growed"); > > to > > log_trace(stringtable)("Started to grow"); > > > #2 Add white space after "while" from (there is one more instance of this): > > while(gt.doTask(jt)) { > > to > > while (gt.doTask(jt)) { > > > #3 Change: > > log_debug(stringtable)("Growed to size:" SIZE_FORMAT, _current_size); > > to > > log_debug(stringtable)("Grown to size:" SIZE_FORMAT, _current_size); > > > #4 "nearest_pow_2" is inaptly named; if the user passes "129" we select "256", not "128", so maybe rename it to "ceil_pow_2"? > > > PS. I would have liked to see whether Netbeans or Eclipse trigger the table resizing during the startup, but I was unable to trivially run either of them using jdk11, oh well.
> > > cheers > >> On Jun 4, 2018, at 10:59 AM, Robbin Ehn wrote: >> >> Hi, >> >> Here is an update after reviews: >> Inc : http://cr.openjdk.java.net/~rehn/8195097/v2/inc/webrev/ >> Full: http://cr.openjdk.java.net/~rehn/8195097/v2/full/webrev/ >> >> Passed tier 1-3. >> >> /Robbin >> >> >> On 2018-05-28 15:19, Robbin Ehn wrote: >>> Hi all, please review. >>> This implements the StringTable with the ConcurrentHashtable for managing the >>> strings using oopStorage for backing the actual oops via WeakHandles. >>> The unlinking and freeing of hashtable nodes is moved outside the safepoint, >>> which means GC only needs to walk the oopStorage, either concurrently or in a >>> safepoint. Walking oopStorage is also faster so there is a good effect on all >>> safepoints visiting the oops. >>> The unlinking and freeing happens during inserts when dead weak oops are >>> encountered in that bucket. In any normal workload the stringtable self-cleans >>> without needing any additional cleaning. Cleaning/unlinking can also be done >>> concurrently via the ServiceThread; it is started when we have a high "dead >>> factor". E.g. an application with a lot of interned strings removes the references >>> and never interns again. The ServiceThread also concurrently grows the table if >>> the "load factor" is high. Both the cleaning and growing take care not to prolong >>> time to safepoint, at the cost of some speed. >>> Kitchensink24h, multiple tier1-5 with no issue that I can relate to this >>> changeset, various benchmarks such as JMH, specJBB2015.
>>> Issue: https://bugs.openjdk.java.net/browse/JDK-8195097 >>> Webrev: http://cr.openjdk.java.net/~rehn/8195097/v0/webrev/ >>> Thanks, Robbin > From robbin.ehn at oracle.com Thu Jun 7 09:58:17 2018 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Thu, 7 Jun 2018 11:58:17 +0200 Subject: RFR(L): 8195097: Make it possible to process StringTable outside safepoint In-Reply-To: <6249C3D2-41DA-4784-A120-A07FF45A004D@oracle.com> References: <0d09f2ad-47b9-2fca-e5ec-17547e923cbb@oracle.com> <6249C3D2-41DA-4784-A120-A07FF45A004D@oracle.com> Message-ID: <0be0d1cd-7f23-81d2-f206-a3be9443875c@oracle.com> Thanks Jiangli! /Robbin On 2018-06-05 22:20, Jiangli Zhou wrote: > Hi Robbin, > > The latest version looks good to me too! > > Thanks, > Jiangli > >> On Jun 4, 2018, at 8:59 AM, Robbin Ehn wrote: >> >> Hi, >> >> Here is an update after reviews: >> Inc : http://cr.openjdk.java.net/~rehn/8195097/v2/inc/webrev/ >> Full: http://cr.openjdk.java.net/~rehn/8195097/v2/full/webrev/ >> >> Passed tier 1-3. >> >> /Robbin >> >> >> On 2018-05-28 15:19, Robbin Ehn wrote: >>> Hi all, please review. >>> This implements the StringTable with the ConcurrentHashtable for managing the >>> strings using oopStorage for backing the actual oops via WeakHandles. >>> The unlinking and freeing of hashtable nodes is moved outside the safepoint, >>> which means GC only needs to walk the oopStorage, either concurrently or in a >>> safepoint. Walking oopStorage is also faster so there is a good effect on all >>> safepoints visiting the oops. >>> The unlinking and freeing happens during inserts when dead weak oops are >>> encountered in that bucket. In any normal workload the stringtable self-cleans >>> without needing any additional cleaning. Cleaning/unlinking can also be done >>> concurrently via the ServiceThread; it is started when we have a high "dead >>> factor". E.g. an application with a lot of interned strings removes the references >>> and never interns again.
The ServiceThread also concurrently grows the table if >>> the "load factor" is high. Both the cleaning and growing take care not to prolong >>> time to safepoint, at the cost of some speed. >>> Kitchensink24h, multiple tier1-5 with no issue that I can relate to this >>> changeset, various benchmarks such as JMH, specJBB2015. >>> Issue: https://bugs.openjdk.java.net/browse/JDK-8195097 >>> Webrev: http://cr.openjdk.java.net/~rehn/8195097/v0/webrev/ >>> Thanks, Robbin > From per.liden at oracle.com Thu Jun 7 10:13:56 2018 From: per.liden at oracle.com (Per Liden) Date: Thu, 7 Jun 2018 12:13:56 +0200 Subject: RFR: 8204180: Implementation: JEP 318: Epsilon GC (round 5) In-Reply-To: <2ee03550-2b1e-b873-5ef3-26312e2a760e@redhat.com> References: <2ee03550-2b1e-b873-5ef3-26312e2a760e@redhat.com> Message-ID: <7211f820-4999-2988-e06c-1067b1fb138f@oracle.com> On 06/06/2018 06:30 PM, Aleksey Shipilev wrote: > Hi, > > This is the fifth (and hopefully final) round of code review for Epsilon GC changes. It includes the > fixes done as the result of the fourth round of reviews, mostly in Serviceability. The build parts are > the same since the last few reviews, so this is not posted to build-dev at . > > Webrev: > http://cr.openjdk.java.net/~shade/epsilon/webrev.09/ Still looks good to me! > > If we are good -- should somebody, e.g. Project Lead ack this? -- I am going to push this with the > following changeset metadata: You should be good to go, if the JEP is targeted and reviewers are happy.
cheers, Per > > 8204180: Implementation: JEP 318: Epsilon, A No-Op Garbage Collector > Summary: Introduce Epsilon GC > Reviewed-by: rkennke, ihse, pliden, eosterlund, lmesnik, jgeorge > > Builds: > server X {x86_64, x86_32, aarch64, arm32, ppc64le, s390x} > minimal X {x86, x86_64} > zero X {x86_64} > > Testing: gc/epsilon on x86_64 > > Thanks, > -Aleksey > From stefan.karlsson at oracle.com Thu Jun 7 10:16:35 2018 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Thu, 7 Jun 2018 12:16:35 +0200 Subject: RFR: 8204538: Split ScanClosure and ScanClosureWithParBarrier Message-ID: Hi all, Please review this patch to change ScanClosureWithParBarrier to not inherit from ScanClosure. http://cr.openjdk.java.net/~stefank/8204538/webrev.01/ https://bugs.openjdk.java.net/browse/JDK-8204538 ScanClosureWithParBarrier is the only ExtendedOopClosure sub class that overrides the do_oop functions of a super class. This patch cleans this up to pave the way for: https://bugs.openjdk.java.net/browse/JDK-8204540 - Automatic oop closure devirtualization thanks, StefanK From stefan.johansson at oracle.com Thu Jun 7 11:13:36 2018 From: stefan.johansson at oracle.com (Stefan Johansson) Date: Thu, 7 Jun 2018 13:13:36 +0200 Subject: RFR (S): 8204082: Add indication that this is the "Last Young GC before Mixed" to logs In-Reply-To: References: <360d423eb720e93a1921e66dae71ee2e794439ec.camel@oracle.com> Message-ID: Hi Thomas, On 2018-06-05 22:26, Thomas Schatzl wrote: > Hi, > > On Wed, 2018-05-30 at 14:50 +0200, Thomas Schatzl wrote: >> Hi all, >> >> can I have reviews for this change that adds an indication about >> the "last young" gc before a mixed phase to the logs? >> >> I.e. instead of "GC Pause (Young)", that GC is called "GC Pause (Last >> Young)". 
>> >> The reason for this potentially intrusive change to log parsers is >> that in quite a few situations it is useful to be really sure what >> kind of GC is being started without needing to either add gc+ergo >> logging or restarting. This situation where I was trying to figure >> out whether the given young GC I was looking at was the young gc >> before the mixed gcs or just a young-gc because the Cleanup pause >> figured out that we do not need a mixed phase. >> >> I am not too hung up about naming it "Last Young" in particular, but >> I really would like to have this or a similar indication that is >> generally available with minimal logging. > Alternatively we could just call it "Mixed" that does not collect any > old gen regions at this time. Later we could start gc as soon as the > pause time goal permits and add a few old gen regions. > I think I like this better than "Last Young" or "Final Young". > This would probably be a little confusing to users, but at least help > diagnosing logs. > > There has also been the suggestion to call it "Final Young" instead of > "Last Young". > > Any comments? Another possible solution, not sure it's better, would be to add an extra tag to all Young GCs, something like: Pause Young (Initial Mark) ... Pause Young (Normal) ... Pause Young (Finished Mark) ... Pause Young (Mixed) ... But I guess it's a bit harsh to call the Mixed GCs Young =) A third solution would be to somehow use the GC cause to mark this state.
As usual, naming is hard. I think I'm leaning towards just calling it Mixed, and then we know that the first mixed GC is always young only =) Thanks, Stefan > > Thomas > >> >> CR: >> https://bugs.openjdk.java.net/browse/JDK-8204082 >> Webrev: >> http://cr.openjdk.java.net/~tschatzl/8204084/webrev/ >> Testing: >> hs-tier 1-3 From erik.osterlund at oracle.com Thu Jun 7 12:34:48 2018 From: erik.osterlund at oracle.com (=?UTF-8?Q?Erik_=c3=96sterlund?=) Date: Thu, 7 Jun 2018 14:34:48 +0200 Subject: RFR: 8204538: Split ScanClosure and ScanClosureWithParBarrier In-Reply-To: References: Message-ID: <5B192668.5050508@oracle.com> Hi Stefan, Looks good. Thanks, /Erik On 2018-06-07 12:16, Stefan Karlsson wrote: > Hi all, > > Please review this patch to change ScanClosureWithParBarrier to not > inherit from ScanClosure. > > http://cr.openjdk.java.net/~stefank/8204538/webrev.01/ > https://bugs.openjdk.java.net/browse/JDK-8204538 > > ScanClosureWithParBarrier is the only ExtendedOopClosure sub class > that overrides the do_oop functions of a super class. This patch > cleans this up to pave the way for: > > https://bugs.openjdk.java.net/browse/JDK-8204540 - Automatic oop > closure devirtualization > > thanks, > StefanK From stefan.karlsson at oracle.com Thu Jun 7 12:29:48 2018 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Thu, 7 Jun 2018 14:29:48 +0200 Subject: RFR: 8204538: Split ScanClosure and ScanClosureWithParBarrier In-Reply-To: <5B192668.5050508@oracle.com> References: <5B192668.5050508@oracle.com> Message-ID: <338ce6c4-c9c1-1148-20dd-5a070bde202a@oracle.com> Thanks, Erik. StefanK On 2018-06-07 14:34, Erik Österlund wrote: > Hi Stefan, > > Looks good. > > Thanks, > /Erik > > On 2018-06-07 12:16, Stefan Karlsson wrote: >> Hi all, >> >> Please review this patch to change ScanClosureWithParBarrier to not >> inherit from ScanClosure.
>> >> http://cr.openjdk.java.net/~stefank/8204538/webrev.01/ >> https://bugs.openjdk.java.net/browse/JDK-8204538 >> >> ScanClosureWithParBarrier is the only ExtendedOopClosure sub class >> that overrides the do_oop functions of a super class. This patch >> cleans this up to pave the way for: >> >> https://bugs.openjdk.java.net/browse/JDK-8204540 - Automatic oop >> closure devirtualization >> >> thanks, >> StefanK > From shade at redhat.com Thu Jun 7 12:38:08 2018 From: shade at redhat.com (Aleksey Shipilev) Date: Thu, 7 Jun 2018 14:38:08 +0200 Subject: RFR: 8204180: Implementation: JEP 318: Epsilon GC (round 5) In-Reply-To: <7211f820-4999-2988-e06c-1067b1fb138f@oracle.com> References: <2ee03550-2b1e-b873-5ef3-26312e2a760e@redhat.com> <7211f820-4999-2988-e06c-1067b1fb138f@oracle.com> Message-ID: On 06/07/2018 12:13 PM, Per Liden wrote: > On 06/06/2018 06:30 PM, Aleksey Shipilev wrote: >> Hi, >> >> This is the fifth (and hopefully final) round of code review for Epsilon GC changes. It includes the >> fixes done as the result of the fourth round of reviews, mostly in Serviceability. The build parts are >> the same since the last few reviews, so this is not posted to build-dev at . >> >> Webrev: >> http://cr.openjdk.java.net/~shade/epsilon/webrev.09/ > > Still looks good to me! > >> >> If we are good -- should somebody, e.g. Project Lead ack this? -- I am going to push this with the >> following changeset metadata: > > You should be good to go, if the JEP is targeted and reviewers are happy. Thanks Per, pending no other reviews, I am going to push this on Monday, June 11, after I get back from vacation. The ZGC webrev is quite probably clashing with Epsilon in shared parts (e.g. CollectedHeap enums), so trivial rebases would be needed. -Aleksey -------------- next part -------------- A non-text attachment was scrubbed...
Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: OpenPGP digital signature URL: From per.liden at oracle.com Thu Jun 7 12:42:45 2018 From: per.liden at oracle.com (Per Liden) Date: Thu, 7 Jun 2018 14:42:45 +0200 Subject: RFR: 8204180: Implementation: JEP 318: Epsilon GC (round 5) In-Reply-To: References: <2ee03550-2b1e-b873-5ef3-26312e2a760e@redhat.com> <7211f820-4999-2988-e06c-1067b1fb138f@oracle.com> Message-ID: <0e63a401-d28d-2bdb-d024-3b48f3fba0cb@oracle.com> On 06/07/2018 02:38 PM, Aleksey Shipilev wrote: > On 06/07/2018 12:13 PM, Per Liden wrote: >> On 06/06/2018 06:30 PM, Aleksey Shipilev wrote: >>> Hi, >>> >>> This is the fifth (and hopefully final) round of code review for Epsilon GC changes. It includes the >>> fixes done as the result of the fourth round of reviews, mostly in Serviceability. The build parts are >>> the same since the last few reviews, so this is not posted to build-dev at . >>> >>> Webrev: >>> http://cr.openjdk.java.net/~shade/epsilon/webrev.09/ >> >> Still looks good to me! >> >>> >>> If we are good -- should somebody, e.g. Project Lead ack this? -- I am going to push this with the >>> following changeset metadata: >> >> You should be good to go, if the JEP is targeted and reviewers are happy. > > Thanks Per, pending no other reviews, I am going to push this on Monday, June 11, after I get back > from vacation. The ZGC webrev is quite probably clashing with Epsilon in shared parts (e.g. > CollectedHeap enums), so trivial rebases would be needed. Yep, any conflicts there should be trivial to resolve.
/Per From thomas.schatzl at oracle.com Thu Jun 7 13:07:40 2018 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Thu, 07 Jun 2018 15:07:40 +0200 Subject: RFR: 8204538: Split ScanClosure and ScanClosureWithParBarrier In-Reply-To: References: Message-ID: <97fc6099d801dfb0e7f10fde5210b7a0e3ce9810.camel@oracle.com> Hi, On Thu, 2018-06-07 at 12:16 +0200, Stefan Karlsson wrote: > Hi all, > > Please review this patch to change ScanClosureWithParBarrier to not > inherit from ScanClosure. > > http://cr.openjdk.java.net/~stefank/8204538/webrev.01/ > https://bugs.openjdk.java.net/browse/JDK-8204538 > > ScanClosureWithParBarrier is the only ExtendedOopClosure sub class > that overrides the do_oop functions of a super class. This patch > cleans this up to pave the way for: > > https://bugs.openjdk.java.net/browse/JDK-8204540 - Automatic oop > closure devirtualization looks good. Is there a way to automatically determine such issues? Thanks, Thomas From stefan.karlsson at oracle.com Thu Jun 7 13:13:20 2018 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Thu, 7 Jun 2018 15:13:20 +0200 Subject: RFR: 8204538: Split ScanClosure and ScanClosureWithParBarrier In-Reply-To: <97fc6099d801dfb0e7f10fde5210b7a0e3ce9810.camel@oracle.com> References: <97fc6099d801dfb0e7f10fde5210b7a0e3ce9810.camel@oracle.com> Message-ID: <90b2b5c7-f012-703f-d9b4-3e3da31a04f9@oracle.com> Hi Thomas, On 2018-06-07 15:07, Thomas Schatzl wrote: > Hi, > > On Thu, 2018-06-07 at 12:16 +0200, Stefan Karlsson wrote: >> Hi all, >> >> Please review this patch to change ScanClosureWithParBarrier to not >> inherit from ScanClosure. >> >> http://cr.openjdk.java.net/~stefank/8204538/webrev.01/ >> https://bugs.openjdk.java.net/browse/JDK-8204538 >> >> ScanClosureWithParBarrier is the only ExtendedOopClosure sub class >> that overrides the do_oop functions of a super class. 
This patch >> cleans this up to pave the way for: >> >> https://bugs.openjdk.java.net/browse/JDK-8204540 - Automatic oop >> closure devirtualization > > looks good. Thanks! > > Is there a way to automatically determine such issues? I mused a bit about that in the RFE linked above. StefanK > > Thanks, > Thomas > From thomas.schatzl at oracle.com Thu Jun 7 14:12:09 2018 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Thu, 07 Jun 2018 16:12:09 +0200 Subject: RFR(M) 8043575: Dynamically parallelize reference processing work In-Reply-To: References: <8af7594e-d5f4-ca0d-de73-6547ea131f3e@oracle.com> Message-ID: <42bc259b93355a98d66a8a14b1ba8d04ed33277c.camel@oracle.com> Hi Sangheon, On Wed, 2018-06-06 at 23:38 -0700, sangheon.kim at oracle.com wrote: > Hi Kim, > > Thanks for the review and sorry for replying late. > I had to spend some time to figure out complete_gc closure stuff and > re-running all tests again. I took these .2 changes and started merging it with the JDK-8202845 changes. As I suspected, the change gets a lot smaller and nicer with that. More about that below. > > On 6/4/18 5:11 PM, Kim Barrett wrote: > > > On Jun 1, 2018, at 5:48 PM, sangheon.kim at oracle.com wrote: > > > > > > Hi all, > > > > > > As webrev.0 is conflicting with webrev.0 of "8203319: JDK-8201487 > > > disabled too much queue balancing"(out for review, but not yet > > > pushed), I'm posting webrev.1. > > > > > > http://cr.openjdk.java.net/~sangheki/8043575/webrev.1 > > > http://cr.openjdk.java.net/~sangheki/8043575/webrev.1_to_0 > > > > The hookup of the changes into the various collectors seems > > okay. My comments are mostly focused on the ReferenceProcessor > > changes. > > CMS is okay with latest webrev, G1 looks it too. However there are issues with parallel. From what I understand for all workers that are "idle", you need to put "idle" tasks in the queue. Parallel always seems to start all non-idle ("working" on an IdleTask) worker threads all the time. 
So in case with this change you want select a lower number of threads, you first need to start all threads with some amount of IdleTask in the queue (there are some helper methods in GCTaskManager). If you want a higher number of threads, you need to release all idle workers, and then prepopulate the task queue with a new amount of IdleTasks and the actual worker tasks you want. Overall I think we should defer this feature for parallel gc too. > > ----------------------------------------------------------------- > > ------------- > > src/hotspot/share/gc/shared/referenceProcessorPhaseTimes.hpp > > 140 void set_phase_number(RefProcPhaseNumbers phase_number) { > > _phase_number = phase_number; } > > 141 RefProcPhaseNumbers phase_number() const { return > > _phase_number; } > > > > I think I dislike these and how they are used. And see other > > discussion below, suggesting they might not be needed. > > > > (Though they are similar to other related states. But maybe I > > dislike those too.) > I can avoid using it as you suggested below. > > But it is still needed at other locations in > referenceProcessorPhaseTimes.cpp. If you are saying we need to avoid > using this stuffs, I think it is out of scope for this patch. So, > please file a CR. > I think most of these issues are kind of resolved with the JDK-8202845 changes; it contains a lot of refactoring of that code. > > ----------------------------------------------------------------- > > ------------- > > src/hotspot/share/gc/shared/referenceProcessor.hpp > > 641 class RefProcMTDegreeAdjuster : public StackObj { > > ... > > 649 RefProcMTDegreeAdjuster(ReferenceProcessor* rp, > > ReferenceProcessorPhaseTimes* times, size_t ref_count); > > > > The times argument is here to provide information about what phase > > we're in. That seems really indirect, and I'd really prefer that > > information to be provided directly as arguments here. 
> Webrev.2 uses ReferenceType and RefProcPhaseNumbers directly, please > have a chance to compare with previous version. > > > I think that might also remove the need for the new phase_number > > and set_phase_number functions for the phase times class. > As I noted above, those are still used in other locations. > > > This also would flow through to ergo_proc_thread_count and > > use_max_threads. > Done. > > > ----------------------------------------------------------------- > > ------------- > > src/hotspot/share/gc/shared/referenceProcessor.cpp > > 727 bool must_balance = mt_processing && > > need_balance_queues(refs_lists); > > > > need_balance_queues returns the right answer for an initial > > balancing, but not for a re-balance after some > > processing. Specifically, further reducing the mt-degree may > > require rebalancing, even if it wasn't previously needed. > > > > (This comment is assuming the updated version of > > need_balance_queues that I sent out today. The situation is worse > > without that.) > Done. > > > ----------------------------------------------------------------- > > ------------- > > src/hotspot/share/gc/shared/referenceProcessor.cpp > > 787 // For discovery_is_atomic() is true, phase 3 should be > > processed to call do_void() from VoidClosure. > > 788 // For discovery_is_atomic() is false, phase 3 can be > > skipped if there are no references because do_void() is > > 789 // already called at phase 2. > > 790 if (!discovery_is_atomic()) { > > 791 if (total_count(refs_lists) == 0) { > > 792 return; > > 793 } > > 794 } > > > > The discovery_is_atomic stuff is left-over from before JDK-8203028. > > Phase2 now never calls the complete_gc closure, because it is never > > needed, regardless of the value of discovery_is_atomic. So > > checking that can be removed here. > Here are 2 things. > 1) If we call complete_gc closure before exit, is it okay? > - The short answer is no, as we face an assertion failure at CMS > ParNew. 
The long answer is: For ParallelRefProcEnabled=true case, CMS > ParNew is relying on to do some extra work by AbstractExecutor. So :) > eventually ParScanThreadState::_young_old_boundary is set and then > ParEvacuateFollowersClosure is used as a complete_gc closure. Just > calling complete_gc closure that passed as a parameter doesn't help > as it is EvacuateFollowersClosureGeneral<ScanClosure, > ScanClosureWithParBarrier>. i.e. complete_gc closure is different at > process_discovered_references() vs. phase3, ProcessTask::work(). Let's just look at the JDK-8202845 changes which seem to make this issue non-existent. Also there is no CMS support, so there should be no changes. > 2) Why 'ref_count' is not updated and then line 800 is just using old > 'ref_count'? > - If we use different number of workers(not exit at zero ref > cases), we face a few types of assertion failures. IIRC, faced > SIGSEGV too at copy_to_survivor_space(). > - Each collector has own per worker state information, so I think > it will not properly updated if we use different number of workers > between phase2 and phase3. > - This could be applied between phase1 and phase2, but I couldn't > see any problem so far. So I left as is the exit path between phase1 > and phase2. An alternative would be excluding exit among phases and > address it from the CR. Previously there were another CR of JDK- > 8181214. ... and this one too. > * My previous version allowed only !discovery_is_atomic() cases would > exit. With that condition, most cases are filtered out so CMS old gc > and G1 CM would get the benefit. I never faced any crashes or > assertion failures with it. But for the reason of 2) I'm removing > exit path between phase2 and phase3 with some explanation why we > can't exit. Please give me a better wording. ... and that one. > > At least, that's what's expected by the ReferenceProcessing > > code.
The atomic discovery case for phase2 has "always" (since at > > least the beginning of the mercurial age) ignored the complete_gc > > argument, and only used it when dealing with the concurrent > > discovery case. > > > > But G1CopyingKeepAliveClosure appears to have always passed the > > buck to the complete_gc closure, contrary to the expectations of > > the RP phase2 code. So G1 young/mixed collections only work > > because phase3 will eventually call the complete_gc closure (by > > each of the threads). > > (G1 concurrent and full gc's have a cutoff where no work is left > > for the complete_gc closure if the object is already marked. I spot > > checked other collectors, and at a quick glance it looks like the > > others meet the expectations of the ReferenceProcessor code; it's > > just G1 young/mixed collections that don't.) > > > > I'm not entirely sure what happens if the number of threads in > > phase3 is less than the number in phase2 in that case. I think any > > pending phase2 work associated with the missing threads will get > > stolen by phase3 working threads, but haven't verified that. > > > > I think the simplest fix for this is to change phase2 to always > > call the complete_gc closure after all. For some collections > > that's a (perhaps somewhat expensive) nop, but in the overall > > context of reference processing that's probably in the noise. And > > also update the relevant comments. > > The alternative would be to make G1CopyingKeepAliveClosure meet the > > (not explicitly stated in the API) ReferenceProcessor expectations, > > perhaps bypassing some of the G1ParScanThreadState machinery. > I think this is already covered above. > > > ----------------------------------------------------------------- > > ------------- > > src/hotspot/share/gc/shared/referenceProcessor.cpp > > 800 RefProcMTDegreeAdjuster a(this, phase_times, ref_count); > > > > This is using ref_count, which was last set on line 759, and not > > updated after phase2.
It should have been updated as part of the > > new > > code block around line 790. > I think this is already answered above. > > > ----------------------------------------------------------------- > > ------------- > > src/hotspot/share/gc/shared/referenceProcessor.cpp > > 1193 uint > > ReferenceProcessor::ergo_proc_thread_count(ReferenceProcessorPhaseT > > imes* times) const { > > 1194 size_t ref_count = total_count(_discovered_refs); > > 1195 > > 1196 return ergo_proc_thread_count(ref_count, num_queues(), > > times); > > 1197 } > > > > I don't think this overload should exist. The ref_count used here > > is the SoftReference count, which isn't particularly interesting > > here. (It's not the total number of discovered references, which is > > also not an interesting number for this purpose.) > > > > It seems to only exist as part of the workarounds for making ParNew > > kind of limp along with these changes. But wouldn't it be simpler > > to leave ParNewRefProceTaskExecutor::execute alone, using the > > active_workers as before? (And recall that I suggested above that > > execute should just use the currently configured mt-processing > > degree, rather than adding an ergo_workers argument.) It will be > > called in the context of RefProcMTDegreeAdjuster, that won't do > > anything because of ParNew disables mt-degree adjustment. So I > > think by leaving things alone here we retain the current behavior, > > e.g. CMS ParNew uses ParallelRefProcEnabled as an all-or-nothing > > flag, and doesn't pay attention to the new > > ReferencesPerThread. And that seems reasonable to me for a > > deprecated collector. > Done. > I agree for not supporting CMS so we don't need to add the overload > version for CMS. 
> > > ----------------------------------------------------------------- > > ------------- > > src/hotspot/share/gc/cms/parNewGeneration.cpp > > 1455 false); // > > disable adjusting queue size when processing references > > > > It's not the queue size that adjusted (or not), it's the number of > > queues. And really, it's the MT processing degree. > > > > That last probably should also apply to the name of the variable in > > the reference processor, and related names. Although given that > > this is all a bit of a kludge to work around (deprecated) CMS > > deficiencies, maybe I shouldn't be too concerned. And you did file > > JDK-8203951. > > > > Though I see Thomas also requested a name change... > > > > If you change the name, remember to update JDK-8203951. > Sure, I will update the name at JDK-8203951. > webrev.2 is proposing "bool _adjust_no_of_processing_threads". > > > ----------------------------------------------------------------- > > ------------- > > src/hotspot/share/gc/shared/referenceProcessor.cpp > > 1199 uint ReferenceProcessor::ergo_proc_thread_count(size_t > > ref_count, > > 1200 uint > > max_threads, > > 1201 ReferenceProce > > ssorPhaseTimes* times) const { > > ... > > 1204 if (ReferencesPerThread == 0) { > > 1205 return _num_queues; > > 1206 } > > > > Why is _num_queues used here, but max_threads used elsewhere in > > this function. I *think* this ought to also be max_threads. > Agreed, so changed to return max_threads. > Basically _num_queues and max_threads are same value here but for > consistency, max_threads is better. :) > > As you may know, the intent of exiting here is to avoid SIGFPE at > size_t thread_count = 1 + (ref_count / ReferencesPerThread); > > Basically this method will not be called if ReferencesPerThread == 0 > because it will be filtered out at line 1235. i.e. ctor of > RefProcMTDegreeAdjuster. > But I wanted to have something to avoid SIGFPE within this method > too. > too. 
> > > ----------------------------------------------------------------- > > ------------- > > src/hotspot/share/gc/shared/referenceProcessor.cpp > > 1229 > > RefProcMTDegreeAdjuster::RefProcMTDegreeAdjuster(ReferenceProcessor > > * rp, > > ... > > 1235 if (!_rp->has_adjustable_queue() || (ReferencesPerThread == > > 0)) { > > > > This checks ReferencesPerThread for 0 to do nothing. And then > > ergo_proc_thread_count similarly checks for 0 to (effectively) do > > nothing. If the earlier suggestion to kill the 1-arg overload is > > taken, then ergo_proc_thread_count is really just a helper for > > RefProcMTDegreeAdjuster, and we don't need that special case twice. > The check for ReferencesPerThread is to disable this feature. > > > And perhaps it should be a private helper in > > RefProcMTDegreeAdjuster? > > Similarly for use_max_threads. Or maybe just inline them into that > > class's constructor. Good idea and it was one of my original patch > > before considering to support CMS. > Done. > > > ----------------------------------------------------------------- > > ------------- > > src/hotspot/share/gc/shared/referenceProcessor.cpp > > 1199 uint ReferenceProcessor::ergo_proc_thread_count(size_t > > ref_count, > > ... > > 1213 return (uint)MIN3(thread_count, > > 1214 static_cast(max_threads), > > 1215 (size_t)os::initial_active_processor_count > > ()); > > > > I don't see why this should look at initial_active_processor_count? > > > > ----------------------------------------------------------------- > > ------------- > As I noted in pre-review email thread, it showed better results if I > avoid using more than active cpus. > This is a kind of enhancement for better performance, so even though > the command-line option was set to use more than cpus(via > ParalelGCThreads), I think it is good to limit. This is different > from command-line option processing which lets users decide even > though the given options would results in worse performance. 
> > Or if you are saying I should use "active_processor_count()"? If it > is the case, I updated to use 'active_processor_count()' instead of > 'initial_active_processor_count()'. > > I would prefer to have this unless there's any reason of not having > it. First, always use active_processor_count() for current thread decisions. Initial_active_processor_count() is only here to get a consistent value of the processor count during initialization. What has been your test to determine this? GCOld? I think this may be outdated as with recent changes scalability (with G1) should have improved a lot. > ---- > In addition: > 1) I removed condition check of (num_queues() >1) because it would > affect to a caller side expectation for per worker state information > etc.. I never faced any problem though. > 2) Minor changes such as renaming. e.g. > RefProcMTDegreeAdjuster::_saved_num_queue to _saved_num_queues. > > Webrev: > http://cr.openjdk.java.net/~sangheki/8043575/webrev.2 > http://cr.openjdk.java.net/~sangheki/8043575/webrev.2_to_1 > Testing: hs-tier1~5 with/without ParallelRefProcEnabled There is a WIP webrev at http://cr.openjdk.java.net/~tschatzl/8043575/webrev.3/ for the curious (I think accomodated the review comments so far too). Still needs at least some more testing. And I need to remove the parallel gc support. I am also not sure whether there has been a decision on how this feature should be enabled, i.e. which switches need to be active to use it. I made it so that if ParallelRefProcEnabled is true, and ReferencesPerThread > 0, we enable the dynamic number of reference processing threads. I.e. if ParallelRefProcEnabled is false, dynamic number of reference processing threads is also turned off (because the user requested to not do parallel ref processing at all). 
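[Editorial note for readers following the thread: the ergonomics being debated compute roughly 1 + ref_count / ReferencesPerThread worker threads and clamp that with MIN3 against the configured maximum and a processor count. The following is an illustrative sketch only, with hypothetical names; it is not the HotSpot code, and whether the third bound should be the active or the initial processor count is exactly the open question above.]

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical sketch of the thread-count ergonomics discussed in this
// thread: one reference-processing thread per refs_per_thread discovered
// references, clamped to the configured maximum and the processor count.
unsigned ergo_ref_proc_thread_count(std::size_t ref_count,
                                    unsigned max_threads,
                                    unsigned active_processors,
                                    std::size_t refs_per_thread) {
  // refs_per_thread == 0 disables the ergonomics; returning early here
  // also avoids the division by zero (SIGFPE) mentioned in the review.
  if (refs_per_thread == 0) {
    return max_threads;
  }
  std::size_t thread_count = 1 + (ref_count / refs_per_thread);
  // The MIN3-style clamp from the webrev, with the processor count as
  // the third bound.
  return static_cast<unsigned>(
      std::min({thread_count,
                static_cast<std::size_t>(max_threads),
                static_cast<std::size_t>(active_processors)}));
}
```

With refs_per_thread set to 1000, for example, 2500 discovered references would request 3 threads, subject to both clamps.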
Thanks, Thomas From kim.barrett at oracle.com Thu Jun 7 17:20:38 2018 From: kim.barrett at oracle.com (Kim Barrett) Date: Thu, 7 Jun 2018 13:20:38 -0400 Subject: RFR: 8204538: Split ScanClosure and ScanClosureWithParBarrier In-Reply-To: References: Message-ID: > On Jun 7, 2018, at 6:16 AM, Stefan Karlsson wrote: > > Hi all, > > Please review this patch to change ScanClosureWithParBarrier to not inherit from ScanClosure. > > http://cr.openjdk.java.net/~stefank/8204538/webrev.01/ > https://bugs.openjdk.java.net/browse/JDK-8204538 > > ScanClosureWithParBarrier is the only ExtendedOopClosure sub class that overrides the do_oop functions of a super class. This patch cleans this up to pave the way for: > > https://bugs.openjdk.java.net/browse/JDK-8204540 - Automatic oop closure devirtualization > > thanks, > StefanK I think the data members of ScanClosure can now be made private rather than protected. Otherwise, looks good. I don?t need a new webrev for that access change. From sangheon.kim at oracle.com Thu Jun 7 17:50:22 2018 From: sangheon.kim at oracle.com (sangheon.kim at oracle.com) Date: Thu, 7 Jun 2018 10:50:22 -0700 Subject: RFR(M) 8043575: Dynamically parallelize reference processing work In-Reply-To: <42bc259b93355a98d66a8a14b1ba8d04ed33277c.camel@oracle.com> References: <8af7594e-d5f4-ca0d-de73-6547ea131f3e@oracle.com> <42bc259b93355a98d66a8a14b1ba8d04ed33277c.camel@oracle.com> Message-ID: <25c462ac-4fef-6661-2280-635f280ad7a7@oracle.com> Hi Thomas, On 6/7/18 7:12 AM, Thomas Schatzl wrote: > Hi Sangheon, > > On Wed, 2018-06-06 at 23:38 -0700, sangheon.kim at oracle.com wrote: >> Hi Kim, >> >> Thanks for the review and sorry for replying late. >> I had to spend some time to figure out complete_gc closure stuff and >> re-running all tests again. > I took these .2 changes and started merging it with the JDK-8202845 > changes. > > As I suspected, the change gets a lot smaller and nicer with that. More > about that below. 
I briefly reviewed JDK-8202845, it is a nice enhancement! Probably I can't review it before my vacation though. :) > >> On 6/4/18 5:11 PM, Kim Barrett wrote: >>>> On Jun 1, 2018, at 5:48 PM, sangheon.kim at oracle.com wrote: >>>> >>>> Hi all, >>>> >>>> As webrev.0 is conflicting with webrev.0 of "8203319: JDK-8201487 >>>> disabled too much queue balancing"(out for review, but not yet >>>> pushed), I'm posting webrev.1. >>>> >>>> http://cr.openjdk.java.net/~sangheki/8043575/webrev.1 >>>> http://cr.openjdk.java.net/~sangheki/8043575/webrev.1_to_0 >>> The hookup of the changes into the various collectors seems >>> okay. My comments are mostly focused on the ReferenceProcessor >>> changes. >>> > CMS is okay with latest webrev, G1 looks it too. However there are > issues with parallel. From what I understand for all workers that are > "idle", you need to put "idle" tasks in the queue. > Parallel always seems to start all non-idle ("working" on an IdleTask) > worker threads all the time. > So in case with this change you want select a lower number of threads, > you first need to start all threads with some amount of IdleTask in the > queue (there are some helper methods in GCTaskManager). If you want a > higher number of threads, you need to release all idle workers, and > then prepopulate the task queue with a new amount of IdleTasks and the > actual worker tasks you want. > > Overall I think we should defer this feature for parallel gc too. Okay! > > >>> ----------------------------------------------------------------- >>> ------------- >>> src/hotspot/share/gc/shared/referenceProcessorPhaseTimes.hpp >>> 140 void set_phase_number(RefProcPhaseNumbers phase_number) { >>> _phase_number = phase_number; } >>> 141 RefProcPhaseNumbers phase_number() const { return >>> _phase_number; } >>> >>> I think I dislike these and how they are used. And see other >>> discussion below, suggesting they might not be needed. >>> >>> (Though they are similar to other related states. 
But maybe I >>> dislike those too.) >> I can avoid using it as you suggested below. >> >> But it is still needed at other locations in >> referenceProcessorPhaseTimes.cpp. If you are saying we need to avoid >> using this stuffs, I think it is out of scope for this patch. So, >> please file a CR. >> > I think most of these issues are kind of resolved with the JDK-8202845 > changes; it contains a lot of refactoring of that code. > >>> ----------------------------------------------------------------- >>> ------------- >>> src/hotspot/share/gc/shared/referenceProcessor.hpp >>> 641 class RefProcMTDegreeAdjuster : public StackObj { >>> ... >>> 649 RefProcMTDegreeAdjuster(ReferenceProcessor* rp, >>> ReferenceProcessorPhaseTimes* times, size_t ref_count); >>> >>> The times argument is here to provide information about what phase >>> we're in. That seems really indirect, and I'd really prefer that >>> information to be provided directly as arguments here. >> Webrev.2 uses ReferenceType and RefProcPhaseNumbers directly, please >> have a chance to compare with previous version. >> >>> I think that might also remove the need for the new phase_number >>> and set_phase_number functions for the phase times class. >> As I noted above, those are still used in other locations. >> >>> This also would flow through to ergo_proc_thread_count and >>> use_max_threads. >> Done. >> >>> ----------------------------------------------------------------- >>> ------------- >>> src/hotspot/share/gc/shared/referenceProcessor.cpp >>> 727 bool must_balance = mt_processing && >>> need_balance_queues(refs_lists); >>> >>> need_balance_queues returns the right answer for an initial >>> balancing, but not for a re-balance after some >>> processing. Specifically, further reducing the mt-degree may >>> require rebalancing, even if it wasn't previously needed. >>> >>> (This comment is assuming the updated version of >>> need_balance_queues that I sent out today. The situation is worse >>> without that.) 
>> Done. >> >>> ----------------------------------------------------------------- >>> ------------- >>> src/hotspot/share/gc/shared/referenceProcessor.cpp >>> 787 // For discovery_is_atomic() is true, phase 3 should be >>> processed to call do_void() from VoidClosure. >>> 788 // For discovery_is_atomic() is false, phase 3 can be >>> skipped if there are no references because do_void() is >>> 789 // already called at phase 2. >>> 790 if (!discovery_is_atomic()) { >>> 791 if (total_count(refs_lists) == 0) { >>> 792 return; >>> 793 } >>> 794 } >>> >>> The discovery_is_atomic stuff is left-over from before JDK-8203028. >>> Phase2 now never calls the complete_gc closure, because it is never >>> needed, regardless of the value of discovery_is_atomic. So >>> checking that can be removed here. >> Here are 2 things. >> 1) If we call complete_gc closure before exit, is it okay? >> - The short answer is no, as we face an assertion failure at CMS >> ParNew. The long answer is: For ParallelRefProcEnabled=true case, CMS >> ParNew is relying on to do some extra work by AbstractExecutor. So > :) > >> eventually ParScanThreadState::_young_old_boundary is set and then >> ParEvacuateFollowersClosure is used as a complete_gc closure. Just >> calling complete_gc closure that passed as a parameter doesn't help >> as it is EvacuateFollowersClosureGeneral> ScanClosureWithParBarrier>. i.e. complete_gc closure is different at >> process_discovered_references() vs. phase3, ProcessTask::work(). > Let's just look at the JDK-8202845 changes which seem to make this > issue non-existent. Also there is no CMS support, so there should be no > changes. > >> 2) Why 'ref_count' is not updated and then line 800 is just using old >> 'ref_count'? >> - If we use different number of workers(not exit at zero ref >> cases), we face a few types of assertion failures. IIRC, faced >> SIGSEGV too at copy_to_survivor_space(). 
>> - Each collector has own per worker state information, so I think >> it will not properly updated if we use different number of workers >> between phase2 and phase3. >> - This could be applied between phase1 and phase2, but I couldn't >> see any problem so far. So I left as is the exit path between phase1 >> and phase2. An alternative would be excluding exit among phases and >> address it from the CR. Previously there were another CR of JDK- >> 8181214. > ... and this one too. > >> * My previous version allowed only !discovery_is_atomic() cases would >> exit. With that condition, most cases are filtered out so CMS old gc >> and G1 CM would get the benefit. I never faced any crashes or >> assertion failures with it. But for the reason of 2) I'm removing >> exit path between phase2 and phase3 with some explanation why we >> can't exit. Please give me a better wording. > ... and that one. > > >>> At least, that's what's expected by the ReferenceProcessing >>> code. The atomic discovery case for phase2 has "always" (since at >>> least the beginning of the mercurial age) ignored the complete_gc >>> argument, and only used it when dealing with the concurrent >>> discovery case. >>> >>> But G1CopyingKeepAliveClosure appears to have always passed the >>> buck to the complete_gc closure, contrary to the expectations of >>> the RP phase2 code. So G1 young/mixed collections only work >>> because phase3 will eventually call the complete_gc closure (by >>> each of the threads). >>> (G1 concurrent and full gc's have a cuttoff where no work is left >>> for the complete_gc closure if the object is already marked. I spot >>> checked other collectors, and at a quick glance it looks like the >>> others meet the expectations of the ReferenceProcessor code; it's >>> just G1 young/mixed collections that don't.) >>> >>> I'm not entirely sure what happens if the number of threads in >>> phase3 is less than the number in phase2 in that case. 
I think any >>> pending phase2 work associated with the missing threads will get >>> stolen by phase3 working threads, but haven't verified that. >>> >>> I think the simplest fix for this is to change phase2 to always >>> call the complete_gc closure after all. For some collections >>> that's a (perhaps somewhat expensive) nop, but in the overall >>> context of reference processing that's probably in the noise. And >>> also update the relevant comme >>> The alternative would be to make G1CopyingKeepAliveClosure meet the >>> (not explicitly stated in the API) ReferenceProcessor expectations, >>> perhaps bypassing some of the G1ParScanThreadState machinery. >> I think this is already covered above. >> >>> ----------------------------------------------------------------- >>> ------------- >>> src/hotspot/share/gc/shared/referenceProcessor.cpp >>> 800 RefProcMTDegreeAdjuster a(this, phase_times, ref_count); >>> >>> This is using ref_count, which was last set on line 759, and not >>> updated after phase2. It should have been updated as part of the >>> new >>> code block around line 790. >> I think this is already answered above. >> >>> ----------------------------------------------------------------- >>> ------------- >>> src/hotspot/share/gc/shared/referenceProcessor.cpp >>> 1193 uint >>> ReferenceProcessor::ergo_proc_thread_count(ReferenceProcessorPhaseT >>> imes* times) const { >>> 1194 size_t ref_count = total_count(_discovered_refs); >>> 1195 >>> 1196 return ergo_proc_thread_count(ref_count, num_queues(), >>> times); >>> 1197 } >>> >>> I don't think this overload should exist. The ref_count used here >>> is the SoftReference count, which isn't particularly interesting >>> here. (It's not the total number of discovered references, which is >>> also not an interesting number for this purpose.) >>> >>> It seems to only exist as part of the workarounds for making ParNew >>> kind of limp along with these changes. 
But wouldn't it be simpler >>> to leave ParNewRefProceTaskExecutor::execute alone, using the >>> active_workers as before? (And recall that I suggested above that >>> execute should just use the currently configured mt-processing >>> degree, rather than adding an ergo_workers argument.) It will be >>> called in the context of RefProcMTDegreeAdjuster, that won't do >>> anything because of ParNew disables mt-degree adjustment. So I >>> think by leaving things alone here we retain the current behavior, >>> e.g. CMS ParNew uses ParallelRefProcEnabled as an all-or-nothing >>> flag, and doesn't pay attention to the new >>> ReferencesPerThread. And that seems reasonable to me for a >>> deprecated collector. >> Done. >> I agree for not supporting CMS so we don't need to add the overload >> version for CMS. >> >>> ----------------------------------------------------------------- >>> ------------- >>> src/hotspot/share/gc/cms/parNewGeneration.cpp >>> 1455 false); // >>> disable adjusting queue size when processing references >>> >>> It's not the queue size that adjusted (or not), it's the number of >>> queues. And really, it's the MT processing degree. >>> >>> That last probably should also apply to the name of the variable in >>> the reference processor, and related names. Although given that >>> this is all a bit of a kludge to work around (deprecated) CMS >>> deficiencies, maybe I shouldn't be too concerned. And you did file >>> JDK-8203951. >>> >>> Though I see Thomas also requested a name change... >>> >>> If you change the name, remember to update JDK-8203951. >> Sure, I will update the name at JDK-8203951. >> webrev.2 is proposing "bool _adjust_no_of_processing_threads". 
>> >>> ----------------------------------------------------------------- >>> ------------- >>> src/hotspot/share/gc/shared/referenceProcessor.cpp >>> 1199 uint ReferenceProcessor::ergo_proc_thread_count(size_t >>> ref_count, >>> 1200 uint >>> max_threads, >>> 1201 ReferenceProcessorPhaseTimes* times) const { >>> ... >>> 1204 if (ReferencesPerThread == 0) { >>> 1205 return _num_queues; >>> 1206 } >>> >>> Why is _num_queues used here, but max_threads used elsewhere in >>> this function? I *think* this ought to also be max_threads. >> Agreed, so changed to return max_threads. >> Basically _num_queues and max_threads are the same value here, but for >> consistency, max_threads is better. :) >> >> As you may know, the intent of exiting here is to avoid a SIGFPE at >> size_t thread_count = 1 + (ref_count / ReferencesPerThread); >> >> Basically this method will not be called if ReferencesPerThread == 0 >> because it will be filtered out at line 1235, i.e. the ctor of >> RefProcMTDegreeAdjuster. >> But I wanted to have something to avoid the SIGFPE within this method >> too. >> >>> ----------------------------------------------------------------- >>> ------------- >>> src/hotspot/share/gc/shared/referenceProcessor.cpp >>> 1229 >>> RefProcMTDegreeAdjuster::RefProcMTDegreeAdjuster(ReferenceProcessor* rp, >>> ... >>> 1235 if (!_rp->has_adjustable_queue() || (ReferencesPerThread == >>> 0)) { >>> >>> This checks ReferencesPerThread for 0 to do nothing. And then >>> ergo_proc_thread_count similarly checks for 0 to (effectively) do >>> nothing. If the earlier suggestion to kill the 1-arg overload is >>> taken, then ergo_proc_thread_count is really just a helper for >>> RefProcMTDegreeAdjuster, and we don't need that special case twice. >> The check for ReferencesPerThread is to disable this feature. >> >>> And perhaps it should be a private helper in >>> RefProcMTDegreeAdjuster? >>> Similarly for use_max_threads. Or maybe just inline them into that >>> class's constructor. 
Good idea, and it was part of my original patch >>> before considering to support CMS. >> Done. >> >>> ----------------------------------------------------------------- >>> ------------- >>> src/hotspot/share/gc/shared/referenceProcessor.cpp >>> 1199 uint ReferenceProcessor::ergo_proc_thread_count(size_t >>> ref_count, >>> ... >>> 1213 return (uint)MIN3(thread_count, >>> 1214 static_cast<size_t>(max_threads), >>> 1215 (size_t)os::initial_active_processor_count()); >>> >>> I don't see why this should look at initial_active_processor_count? >>> >>> ----------------------------------------------------------------- >>> ------------- >> As I noted in the pre-review email thread, it showed better results if I >> avoid using more than the active cpus. >> This is a kind of enhancement for better performance, so even though >> the command-line option was set to use more threads than cpus (via >> ParallelGCThreads), I think it is good to limit. This is different >> from command-line option processing, which lets users decide even >> though the given options would result in worse performance. >> >> Or are you saying I should use "active_processor_count()"? If that >> is the case, I updated to use 'active_processor_count()' instead of >> 'initial_active_processor_count()'. >> >> I would prefer to have this unless there's any reason of not having >> it. > First, always use active_processor_count() for current thread > decisions. Initial_active_processor_count() is only here to get a > consistent value of the processor count during initialization. > > What has been your test to determine this? GCOld? Micro-benchmarks which can create 1-3 digit K of references (mostly 8k, 32k, 128k), and Specjvm2008.Derby, as it generates a maximum of 12k references. GCOld is my favorite to play with, but it doesn't stress references much. > > I think this may be outdated as with recent changes scalability (with > G1) should have improved a lot. I tested before your rebuildRset patch, so a bit old. 
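A standalone sketch may help readers follow the clamping being discussed in the quoted snippets: one worker per `ReferencesPerThread` discovered references, clamped by both the configured maximum and the processor count. The free-function form and parameter names below are illustrative only — the real logic is a `ReferenceProcessor` member in the webrev under review.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Illustrative model of the quoted ergo_proc_thread_count() logic:
// one worker per refs_per_thread discovered references, clamped by the
// configured maximum and the (active) processor count. The check for
// refs_per_thread == 0 both disables the feature and avoids the
// division by zero (the SIGFPE mentioned in the discussion).
unsigned ergo_proc_thread_count(size_t ref_count,
                                unsigned max_threads,
                                unsigned active_processors,
                                size_t refs_per_thread) {
  if (refs_per_thread == 0) {
    return max_threads;  // feature disabled: keep the configured degree
  }
  size_t thread_count = 1 + (ref_count / refs_per_thread);
  return (unsigned)std::min({thread_count,
                             (size_t)max_threads,
                             (size_t)active_processors});
}
```

With, say, 12000 references, 1000 references per thread, 23 configured threads and 8 active processors, this yields 8 threads — the processor count wins, matching the argument above for not using more threads than cpus.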
But I'm not sure how recent changes would affect this case, i.e. if a user sets a much bigger number than the actual cpus, and then uses those threads on ref. proc. etc. This patch is suggesting not to rely on the current setting in the worst case. But I will not argue here, as you will be the person to push! :) > >> ---- >> In addition: >> 1) I removed the condition check of (num_queues() > 1) because it would >> affect a caller-side expectation for per-worker state information >> etc. I never faced any problem though. >> 2) Minor changes such as renaming, e.g. >> RefProcMTDegreeAdjuster::_saved_num_queue to _saved_num_queues. >> >> Webrev: >> http://cr.openjdk.java.net/~sangheki/8043575/webrev.2 >> http://cr.openjdk.java.net/~sangheki/8043575/webrev.2_to_1 >> Testing: hs-tier1~5 with/without ParallelRefProcEnabled > There is a WIP webrev at > > http://cr.openjdk.java.net/~tschatzl/8043575/webrev.3/ > > for the curious (I think it accommodates the review comments so far too). > Still needs at least some more testing. > > And I need to remove the parallel gc support. Okay. > > I am also not sure whether there has been a decision on how this > feature should be enabled, i.e. which switches need to be active to use > it. > I made it so that if ParallelRefProcEnabled is true, and > ReferencesPerThread > 0, we enable the dynamic number of reference > processing threads. > I.e. if ParallelRefProcEnabled is false, the dynamic number of reference > processing threads is also turned off (because the user requested to > not do parallel ref processing at all). That is how it works on my patch, too. I should have mentioned it in my review request; thank you for bringing up this topic! Previously we discussed changing the ParallelRefProcEnabled option to on/off/auto (auto enabling ReferencesPerThread). Of course, the name should also be changed. It would be good to discuss that under a separate CR. thomas.webrev.3 looks good to me. 
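The enabling rule described above (dynamic degree only when both switches allow it) boils down to a small predicate. In the sketch below, only the two flag names come from the thread; the struct and function are hypothetical stand-ins for how HotSpot consults its globals.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical model of the enabling rule from the discussion: the
// dynamic number of reference processing threads is used only if
// -XX:+ParallelRefProcEnabled is set AND ReferencesPerThread > 0.
struct RefProcSettings {
  bool parallel_ref_proc_enabled;  // -XX:+/-ParallelRefProcEnabled
  size_t references_per_thread;    // -XX:ReferencesPerThread=N (0 = off)
};

bool use_dynamic_ref_proc_threads(const RefProcSettings& s) {
  return s.parallel_ref_proc_enabled && s.references_per_thread > 0;
}
```

Note that turning off ParallelRefProcEnabled disables the dynamic behavior entirely, matching the rationale above: the user asked for no parallel reference processing at all.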
Some minor comments: -------------------------- - I guess you didn't merge TestPrintReferences.java yet? So you intentionally did not include the file in webrev.3? -------------------------- src/hotspot/share/gc/shared/referenceProcessor.hpp 215 bool _adjust_no_of_processing_threads; 556 // True, to allow ergonomically changing the number of processing threads based on references. 557 bool _adjust_no_of_processing_threads; - There are 2 declarations of _adjust_no_of_processing_threads. The latter one exists on the other class. -------------------------- (pre-existing from my webrev) src/hotspot/share/gc/cms/concurrentMarkSweepGeneration.cpp 5128 assert(workers->active_workers() == ergo_workers, 5129 "Ergonomically chosen workers(%u) should be less than or equal to active workers(%u)", 5130 ergo_workers, workers->active_workers()); - We can remove the assert or change the statement to remove 'less than or'. -------------------------- src/hotspot/share/gc/cms/parNewGeneration.cpp 797 assert(workers->active_workers() == ergo_workers, 798 "Ergonomically chosen workers(%u) should be less than or equal to active workers(%u)", 799 ergo_workers, workers->active_workers()); - Same as above. -------------------------- - Same for pcTasks.cpp and psScavenge.cpp too, if we decide not to include parallel gc in this patch. -------------------------- src/hotspot/share/gc/shared/referenceProcessor.cpp 786 RefProcMTDegreeAdjuster a(this, 787 RefPhase1, 788 num_soft_refs); - We can put these on the same line. Same for the other cases. Again thank you for taking this CR, Thomas! 
Sangheon > > Thanks, > Thomas > From stefan.karlsson at oracle.com Thu Jun 7 19:41:54 2018 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Thu, 7 Jun 2018 21:41:54 +0200 Subject: RFR: 8204538: Split ScanClosure and ScanClosureWithParBarrier In-Reply-To: References: Message-ID: On 2018-06-07 19:20, Kim Barrett wrote: >> On Jun 7, 2018, at 6:16 AM, Stefan Karlsson wrote: >> >> Hi all, >> >> Please review this patch to change ScanClosureWithParBarrier to not inherit from ScanClosure. >> >> http://cr.openjdk.java.net/~stefank/8204538/webrev.01/ >> https://bugs.openjdk.java.net/browse/JDK-8204538 >> >> ScanClosureWithParBarrier is the only ExtendedOopClosure subclass that overrides the do_oop functions of a super class. This patch cleans this up to pave the way for: >> >> https://bugs.openjdk.java.net/browse/JDK-8204540 - Automatic oop closure devirtualization >> >> thanks, >> StefanK > I think the data members of ScanClosure can now be made private rather than protected. Sure. I'll update the patch. > Otherwise, looks good. I don't need a new webrev for that access change. > Thanks for reviewing. StefanK From thomas.schatzl at oracle.com Fri Jun 8 07:35:27 2018 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Fri, 08 Jun 2018 09:35:27 +0200 Subject: Potential optimization to the GC termination protocol in JDK8 In-Reply-To: <4aa5e12a-94b9-6ac8-f0e0-e2d48f6593f7@redhat.com> References: <9b20437c-86e5-f413-e5e4-7f2089fc4182@oracle.com> <4aa5e12a-94b9-6ac8-f0e0-e2d48f6593f7@redhat.com> Message-ID: <3ad6900186a0f630d5fed599c543ce885e431127.camel@oracle.com> Hi, sorry for being somewhat late in that thread... busy lately... On Thu, 2018-04-26 at 08:53 +0200, Roman Kennke wrote: > > > [Problem] > > > The work stealing during the GC takes lots of time to terminate. > > > The Parallel Scavenge in JDK 8 uses a distributed termination > > > protocol to synchronize GC threads. After 2 × 
N consecutive > > > unsuccessful steal attempts (steal_best_of_2 function), a GC > > > thread enters the termination procedure, where N is the number of > > > GC threads. > > > > > > Suppose there are N GC threads: it takes 2 * N * N failed > > > attempts before a GC stops. It is inefficient and takes too much > > > time, especially when there are very few GC threads alive. If > > > hundreds or thousands of GCs happen during the app > > > execution, that is a big waste of time. > > > > > > [Solution] > > > Is it possible to reduce the number of steal attempts during the > > > end of GC? My idea is to record the number of active GC threads > > > (i.e., N_live) that are not yet in the termination protocol. A > > > thread only steals from the pool of active GC threads because the > > > inactive GC threads have no task to be stolen. Accordingly, the > > > criterion for thread termination becomes 2 × N_live consecutive > > > failed steal attempts. It can reduce the steal attempts by half. > > > > > > Looking forward to others' feedback. Sure we are interested in such an enhancement and would appreciate a contribution. However new work should be conducted in the latest version only, i.e. at this time JDK 11 (and probably JDK 12 by the time such a change would be pushed given the FC date of end of June). Maybe by coincidence you might be one of the authors of a recent paper [1] demonstrating this? If you are interested in contributing, please have a look at the corresponding wiki page [4]. > > > Thanks. > > > Tony > > Hi Tony, > in Shenandoah we have implemented an improved termination protocol. > See the comment here: > > http://hg.openjdk.java.net/shenandoah/jdk/file/b4a3595f7c56/src/hotsp > ot/share/gc/shenandoah/shenandoahTaskqueue.hpp#l93 > > We intend to upstream this stuff very soon, at which point it can be > used by all other GCs if they wish. It's a drop-in replacement for > the existing task queue code. 
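The difference between the two criteria can be modeled outside HotSpot. The sketch below is not the actual ParallelTaskTerminator (or Shenandoah) code — it only captures the proposed rule: compare each thread's consecutive failed steals against 2 × N_live instead of the fixed 2 × N.

```cpp
#include <atomic>
#include <cassert>

// Simplified model of the proposed termination criterion: a thread
// gives up after 2 * N_live consecutive failed steal attempts, where
// N_live counts only the threads that have not yet entered the
// termination protocol (inactive threads have nothing to steal).
class TerminationSketch {
  std::atomic<int> _live;  // threads not yet in the termination protocol

public:
  explicit TerminationSketch(int n_threads) : _live(n_threads) {}

  // Called when a thread runs out of local work and starts terminating.
  void enter_termination() { _live.fetch_sub(1, std::memory_order_relaxed); }
  // Called when a terminating thread finds work again and resumes.
  void leave_termination() { _live.fetch_add(1, std::memory_order_relaxed); }

  // Original protocol: the threshold is 2 * N with N fixed at the total
  // number of GC threads; the proposal replaces N with N_live.
  bool should_terminate(int consecutive_failed_steals) const {
    return consecutive_failed_steals >=
           2 * _live.load(std::memory_order_relaxed);
  }
};
```

With 8 threads the fixed protocol always requires 16 failed steals per thread; in this model, once 6 threads have entered termination, the remaining two give up after only 4.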
> > It sounds like it's similar to what you have in mind. Right? The ShenandoahTaskTerminator is different in that it optimizes the number of woken-up threads and the detection of whether there is work to steal. The suggestion presented here improves the work stealing itself, i.e. how the woken-up threads search for work to steal. Note that I think that with the ShenandoahTaskTerminator such an implementation would be easier to implement than with the existing ParallelTaskTerminator. There are other ideas and information about known non-optimal implementations floating around in the literature (e.g. [1][2][3]; when I recently looked there were lots) to improve this even further. Thanks, Thomas [1] Kun Suo, Jia Rao, Hong Jiang, and Witawas Srisa-an. 2018. Characterizing and optimizing hotspot parallel garbage collection on multicore systems. In Proceedings of the Thirteenth EuroSys Conference (EuroSys '18). ACM, New York, NY, USA, Article 35, 15 pages. DOI: https://doi.org/10.1145/3190508.3190512 [2] Junjie Qian, Witawas Srisa-an, Du Li, Hong Jiang, Sharad Seth, and Yaodong Yang. 2015. SmartStealing: Analysis and Optimization of Work Stealing in Parallel Garbage Collection for Java VM. In Proceedings of the Principles and Practices of Programming on The Java Platform (PPPJ '15). ACM, New York, NY, USA, 170-181. DOI: http://dx.doi.org/10.1145/2807426.2807441 [3] Lokesh Gidra, Gaël Thomas, Julien Sopena, Marc Shapiro, and Nhan Nguyen. 2015. NumaGiC: a Garbage Collector for Big Data on Big NUMA Machines. In Proceedings of the Twentieth International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS '15). ACM, New York, NY, USA, 661-673. DOI: https://doi.org/10. 
1145/2694344.2694361 [4] http://openjdk.java.net/contribute/ From thomas.schatzl at oracle.com Fri Jun 8 08:52:57 2018 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Fri, 08 Jun 2018 10:52:57 +0200 Subject: RFR (S): 8204082: Add indication that this is the "Last Young GC before Mixed" to logs In-Reply-To: References: <360d423eb720e93a1921e66dae71ee2e794439ec.camel@oracle.com> Message-ID: Hi Stefan, On Thu, 2018-06-07 at 13:13 +0200, Stefan Johansson wrote: > Hi Thomas, > > On 2018-06-05 22:26, Thomas Schatzl wrote: > > Hi, > > > > On Wed, 2018-05-30 at 14:50 +0200, Thomas Schatzl wrote: > > > Hi all, > > > > > > can I have reviews for this change that adds an indication > > > about the "last young" gc before a mixed phase to the logs? > > > [...] > > > I am not too hung up about naming it "Last Young" in particular, > > > but I really would like to have this or a similar indication that > > > is generally available with minimal logging. > > > > Alternatively we could just call it "Mixed" that does not collect > > any old gen regions at this time. Later we could start gc as soon > > as the pause time goal permits and add a few old gen regions. > > > > I think I like this better than "Last Young" or "Final Young". > > > This would probably be a little confusing to users, but at least > > help diagnosing logs. > > > > There also has been the suggestion to call it "Final Young" instead > > of "Last Young". > > > > Any comments? > > Another possible solution, not sure it's better would be to an extra > tag to all Young GCs, something like: > Pause Young (Initial Mark) ... > Pause Young (Normal) ... > Pause Young (Finished Mark) ... > Pause Young (Mixed) ... > > But I guess it's a bit harsh to call the Mixed GCs Young =) The more I look at it the more I like this variant. > > A third solution would be to somehow use the GC cause to mark this > state. 
> > As usual naming is hard, I think I'm leaning towards just calling it > Mixed, and then we know that the first mixed GC is always young only > =) Any other opinions? Thanks, Thomas From thomas.schatzl at oracle.com Fri Jun 8 12:46:41 2018 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Fri, 08 Jun 2018 14:46:41 +0200 Subject: JEP draft: Dynamic Max Memory Limit [Was. Re: Elastic JVM improvements] In-Reply-To: References: <3d1484befe75b5296589decc1b0df05cdbefac29.camel@oracle.com> Message-ID: Hi Rodrigo, sorry for the late reply, this email somehow went unanswered... On Sun, 2018-06-03 at 17:44 +0200, Rodrigo Bruno wrote: > Hi Thomas, > > further suggestions and rearrangements on this JEP draft follow > below: > > Goals: > > The goal of this JEP is to allow the increase and/or decrease of the > amount of memory available to the application. > > Non-Goals: > > It is not a goal to change current heap sizing algorithms. > It is not a goal to dynamically change the maximum heap size limit, as > this would require a lot of engineering > effort (maybe in the future). > > Success Metrics: > > The implementation should allow a user to increase and/or reduce the > amount of memory that can be used by the > application. This must be possible at any point during the execution > of an application. If it is not possible to > increase or decrease the amount of memory available to the > application, the operation should fail and the user > must be made aware of the result of the operation. > > Motivation: > > Elasticity is a key feature of cloud computing. It enables resources > to be scaled in a timely manner according to application workloads. > Now we live in the container era. Containers can be scaled > vertically on the fly without downtime. This provides > much better elasticity and density compared to VMs. However, JVM-based > applications are not fully container-ready. 
> One of the current issues is the fact that it is not possible to > increase the size of the JVM heap at runtime. If your production > application has an unpredictable traffic spike, the only way to > increase the heap size is to restart the JVM with a > new Xmx parameter. > > Alternatives: > > There are two alternatives: > 1 - restart a JVM whenever the application needs more or less memory. > This will adapt the memory usage of the JVM > to the application needs at the cost of downtime (which can be > prohibitive for many applications). > 2 - grant a large maximum memory limit. This will eventually lead to > resource wastage. > > Testing: > > Section 5.4 of the paper available at http://www.gsd.inesc-id.pt/~rbruno/publications/rbruno-ismm18.pdf shows that having > a very high maximum memory limit (-Xmx) leads to a very small increase > in the memory footprint. For example, increasing > -Xmx by 32GB increases the footprint by 31MB. I updated the text accordingly; I did some minor changes so that "memory" was always called out as Java heap memory as opposed to memory used for VM internal data structures. Thanks, Thomas From thomas.schatzl at oracle.com Fri Jun 8 14:52:47 2018 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Fri, 08 Jun 2018 16:52:47 +0200 Subject: RFR (XS): 8204618: The parallel GC reference processing task executor enqueues a wrong number of tasks into the queue Message-ID: Hi all, can I have reviews for this change that fixes an issue with the parallel gc reference processing task executor putting too many (always the max number of threads) reference processing tasks into the work queue? The effects are benign (at least we have not observed any bad effects so far since enabling -XX:UseDynamicNumberOfGCThreads) in that reference processing will be called multiple times by the same active threads, basically doing nothing (because the corresponding ref proc queues are empty after the first time they are processed). 
CR: https://bugs.openjdk.java.net/browse/JDK-8204618 Webrev: http://cr.openjdk.java.net/~tschatzl/8204618/webrev/ Testing: hs-tier1-4,jdk-tier1-3 with +/-ParallelRefProcEnabled Thanks, Thomas From thomas.schatzl at oracle.com Fri Jun 8 14:52:48 2018 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Fri, 08 Jun 2018 16:52:48 +0200 Subject: RFR(M) 8043575: Dynamically parallelize reference processing work In-Reply-To: <25c462ac-4fef-6661-2280-635f280ad7a7@oracle.com> References: <8af7594e-d5f4-ca0d-de73-6547ea131f3e@oracle.com> <42bc259b93355a98d66a8a14b1ba8d04ed33277c.camel@oracle.com> <25c462ac-4fef-6661-2280-635f280ad7a7@oracle.com> Message-ID: Hi Sangheon, On Thu, 2018-06-07 at 10:50 -0700, sangheon.kim at oracle.com wrote: > Hi Thomas, > > On 6/7/18 7:12 AM, Thomas Schatzl wrote: > > Hi Sangheon, > > > > > [...] > > > As I noted in pre-review email thread, it showed better results > > > if I avoid using more than active cpus. > > > This is a kind of enhancement for better performance, so even > > > though the command-line option was set to use more than cpus(via > > > ParalelGCThreads), I think it is good to limit. This is different > > > from command-line option processing which lets users decide even > > > though the given options would results in worse performance. > > > > > > Or if you are saying I should use "active_processor_count()"? If > > > it is the case, I updated to use 'active_processor_count()' > > > instead of 'initial_active_processor_count()'. > > > > > > I would prefer to have this unless there's any reason of not > > > having it. > > > > First, always use active_processor_count() for current thread > > decisions. Initial_active_processor_count() is only here to get a > > consistent value of the processor count during initialization. > > > > What has been your test to determine this? GCOld? > > Micro-benchmark which can create 1~3digit K of references. 
(mostly > 8k, 32k, 128k) Specjvm2008.Derby as it generates a maximum of 12k > references. > > GCOld is my favorite to play with but it doesn't stress much > references. I will look into this some more. > > > > I think this may be outdated as with recent changes scalability > > (with G1) should have improved a lot. > > I tested before your rebuildRset patch, so a bit old. > But I'm not sure how recent changes would affect this case, i.e. > if a user sets a much bigger number than the actual cpus, and then uses those > threads on ref. proc. etc. This patch is suggesting not to rely on > the current setting in the worst case. > > But I will not argue here, as you will be the person to push! :) There have been improvements to actual reference processing in G1 that improved throughput for reference processing significantly. [...] > > > > There is a WIP webrev at > > > > http://cr.openjdk.java.net/~tschatzl/8043575/webrev.3/ > > > > for the curious (I think it accommodates the review comments so far > > too). > > Still needs at least some more testing. > > > > And I need to remove the parallel gc support. > > Okay. > > > > > I am also not sure whether there has been a decision on how this > > feature should be enabled, i.e. which switches need to be active to > > use it. > > I made it so that if ParallelRefProcEnabled is true, and > > ReferencesPerThread > 0, we enable the dynamic number of reference > > processing threads. > > I.e. if ParallelRefProcEnabled is false, the dynamic number of > > reference processing threads is also turned off (because the user > > requested to not do parallel ref processing at all). > > That is how it works on my patch, too. > I should have mentioned it in my review request; thank you for bringing up > this topic! I filed a preliminary CSR at https://bugs.openjdk.java.net/browse/JDK-8204612 for this change. I.e. - make ParallelRefProcEnabled with dynamic selection of thread numbers the default for G1. 
- (I do not think we need the CSR for the ReferencesPerThread flag, but I did anyway). > Previously we discussed changing the ParallelRefProcEnabled option > to on/off/auto (auto enabling ReferencesPerThread). Of course, the name > should also be changed. It would be good to discuss that under a > separate CR. > > thomas.webrev.3 looks good to me. I wasn't expecting a review already :) > > Some minor comments: > -------------------------- > - I guess you didn't merge TestPrintReferences.java yet? So you > intentionally did not include the file in webrev.3? I had to fix it a bit for the latest version of webrev.3, which I think is ready for review. > -------------------------- > src/hotspot/share/gc/shared/referenceProcessor.hpp > 215 bool _adjust_no_of_processing_threads; > > 556 // True, to allow ergonomically changing the number of > processing > threads based on references. > 557 bool _adjust_no_of_processing_threads; > - There are 2 declarations of _adjust_no_of_processing_threads. The > latter one exists on the other class. > > -------------------------- > (pre-existing from my webrev) > src/hotspot/share/gc/cms/concurrentMarkSweepGeneration.cpp > 5128 assert(workers->active_workers() == ergo_workers, > 5129 "Ergonomically chosen workers(%u) should be less than > or > equal to active workers(%u)", > 5130 ergo_workers, workers->active_workers()); > - We can remove the assert or change the statement to remove 'less > than or'. > > -------------------------- > src/hotspot/share/gc/cms/parNewGeneration.cpp > 797 assert(workers->active_workers() == ergo_workers, > 798 "Ergonomically chosen workers(%u) should be less than > or > equal to active workers(%u)", > 799 ergo_workers, workers->active_workers()); > - Same as above. > > -------------------------- > - Same for pcTasks.cpp and psScavenge.cpp too, if we decide not to > include parallel gc in this patch. 
> > -------------------------- > src/hotspot/share/gc/shared/referenceProcessor.cpp > 786 RefProcMTDegreeAdjuster a(this, > 787 RefPhase1, > 788 num_soft_refs); > - We can put at the same line. Same for other cases. > > Again thank you for taking this CR, Thomas! I think I fixed all this. Webrev is at http://cr.openjdk.java.net/~tschatzl/8043575/webrev.3/ . This webrev is based on top of latest jdk/jdk and https://bugs.openjdk. java.net/browse/JDK-8202845 . If you want to test parallel gc, you also need the fixes for JDK- 8204617 and JDK-8204618 currently out for review. Testing: hs-tier1-4,jdk-tier1-3 with +/-ParallelRefProcEnabled Thanks, Thomas From thomas.schatzl at oracle.com Fri Jun 8 14:52:51 2018 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Fri, 08 Jun 2018 16:52:51 +0200 Subject: RFR (XXS): 8204617: ParallelGC parallel reference processing does not set MT degree in reference processor Message-ID: <46467c2aadce0f02bcfdcbcf202caae3446bffbb.camel@oracle.com> Hi all, can I have reviews for this small patch that properly sets the MT degree in the parallel gc full gc? Otherwise some assert to be introduced in 8043575: Dynamically parallelize reference processing work will fire for any application using parallel gc that does not use the full amount of available gc threads (by UseDynamicNumberOfGCThreads). 
CR: https://bugs.openjdk.java.net/browse/JDK-8204617 Webrev: http://cr.openjdk.java.net/~tschatzl/8204617/webrev/ Testing: hs-tier1-4,jdk-tier1-3 with +/-ParallelRefProcEnabled Thanks, Thomas From thomas.schatzl at oracle.com Fri Jun 8 17:08:05 2018 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Fri, 08 Jun 2018 19:08:05 +0200 Subject: Potential optimization to the GC termination protocol in JDK8 In-Reply-To: References: <9b20437c-86e5-f413-e5e4-7f2089fc4182@oracle.com> <4aa5e12a-94b9-6ac8-f0e0-e2d48f6593f7@redhat.com> <3ad6900186a0f630d5fed599c543ce885e431127.camel@oracle.com> Message-ID: <0be639b50c6b47970abb674acac3f58de5156e86.camel@oracle.com> Hi, On Fri, 2018-06-08 at 11:54 -0500, T S wrote: > On Fri, Jun 8, 2018 at 2:35 AM Thomas Schatzl com> wrote: > > > > Hi, > > > > sorry for being somewhat late in that thread... busy lately... > > > > On Thu, 2018-04-26 at 08:53 +0200, Roman Kennke wrote: > > > > > [Problem] > > > > The work stealing during the GC takes lots of time to > > > > terminate. > > > > > [...] > > Sure we are interested in such an enhancement and would appreciate > > a contribution. However new work should be conducted in the latest > > version only, i.e. at this time JDK 11 (and probably JDK 12 by the > > time such a change would be pushed given the FC date of end of > > June). > > > > Hi Thomas, > > Could you please send me a link for the JDK 11 or 12? I cannot find > an official link on the openjdk website. > > http://hg.openjdk.java.net/shenandoah/jdk/ > Is it the latest JDK under developed? The always latest development happens in http://hg.openjdk.java.net/jdk /jdk . From this, at regular intervals, we branch off to make releases. The next branch will be jdk11, forked off around end of June, that's why we are talking about "we are working on jdk11" (and jdk11-ea releases are being made from jdk/jdk already all the time). As soon as the fork happened, everything that gets into jdk/jdk will get into jdk12. 
http://hg.openjdk.java.net/shenandoah/jdk/ is a fork of jdk/jdk done at some point, and kept up to date by maintainers. We typically do such parallel development for larger features only, like in this case the Shenandoah GC. When all changes from shenandoah/jdk are merged (when the feature is "done"), that branch may be closed for further development and all Shenandoah-related development happens in jdk/jdk. People are of course free to continue working in shenandoah/jdk :) About your questions: the task terminator/stealing code did not change for a looooong time. Most likely, apart from path changes of the files, it applies cleanly. Please, if you intend to contribute, make sure that step 0 on the "How to contribute" [1] page has been done. Otherwise we (at Oracle at least) are not allowed to look at your changes due to legal reasons. This process typically takes a week or so. Otherwise, when you are eager to contribute and want others to look at your patches, the first thing that happens is that everyone needs to wait instead of work... 
Thanks, Thomas [1] http://openjdk.java.net/contribute/ From hohensee at amazon.com Fri Jun 8 17:13:35 2018 From: hohensee at amazon.com (Hohensee, Paul) Date: Fri, 8 Jun 2018 17:13:35 +0000 Subject: RFR (S): 8195115: G1 Old Gen MemoryPool CollectionUsage.used values don't reflect mixed GC results In-Reply-To: <9f972398-115f-06ad-1e0c-513abceb097a@oracle.com> References: <3A67BF1B-CE35-4759-B5B4-959C24020A45@amazon.com> <878D6678-84CB-471E-B126-AF14B1BC8EB9@amazon.com> <727f9bcb-9404-df32-917f-70aca6e31cb6@oracle.com> <4BA32239-D645-4B1A-A58E-502258C8FFB7@amazon.com> <997a766f-9f4b-f192-ae87-03f4c8d2666d@oracle.com> <64793ff3-4805-79f9-4e72-6e469cd511dc@oracle.com> <15754d34-f3c7-6431-7abf-05214bd6b9d2@oracle.com> <93417506-8e12-6ad3-690a-439e36719652@oracle.com> <99C937B6-F67B-4F4A-ACC8-444B5F83932B@amazon.com> <3922628B-81DE-4904-AEFE-F4163EB1E655@amazon.com> <8444d766-8adb-013f-f75b-b5d4420635df@oracle.com> <5768E4BF-5556-455B-899B-94A67BF940A0@amazon.com> <3E15AAD8-5017-4CCB-B927-6B31FD0D7809@amazon.com> <22216AD4-9A78-4427-9AB8-629EC685C296@amazon.com> <9f972398-115f-06ad-1e0c-513abceb097a@oracle.com> Message-ID: <48344C2E-2423-4CB1-A5C3-09223CA1ED78@amazon.com> Back after a long hiatus... Thanks, Eric, for your review. Here's a new webrev incorporating your recommendations. Bug: https://bugs.openjdk.java.net/browse/JDK-8195115 Webrev: http://cr.openjdk.java.net/~phh/8195115/webrev.02/ TIA for your re-review. Plus, may I have another reviewer look at it please? Paul ?On 2/26/18, 8:47 AM, "Erik Helin" wrote: Hi Paul, a couple of comments on the patch: - memoryService.hpp: + 150 bool countCollection, + 151 bool allMemoryPoolsAffected = true); There is no need to use a default value for the parameter allMemoryPoolsAffected here. Skipping the default value also allows you to put allMemoryPoolsAffected to TraceMemoryManager::initialize in the same relative position as for the constructor parameter (this will make the code more uniform and easier to follow). 
- memoryManager.cpp Instead of adding a default parameter, maybe add a new method? Something like GCMemoryManager::add_not_always_affected_pool() (I couldn't come up with a shorter name at the moment). - TestMixedOldGenCollectionUsage.java The test is too strict about how and when collections should occur. Tests written this way often become very brittle; they might e.g. fail to finish a concurrent mark on time on a very slow, single core machine. It is better to either force collections by using the WhiteBox API or make the test more lenient. Thanks, Erik On 02/26/2018 09:54 PM, Hohensee, Paul wrote: > Ping for a review please. > > Thanks, > > Paul > > On 2/16/18, 12:26 PM, "serviceability-dev on behalf of Hohensee, Paul" wrote: > > The CSR https://bugs.openjdk.java.net/browse/JDK-8196719 for the original fix has been approved, so I'm back to requesting a code review, please. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8195115 > Webrev: http://cr.openjdk.java.net/~phh/8195115/webrev.hs.01/ > > Passed a submit repo run, passes its jtreg test, and a JDK8 version is in production use at Amazon. > > From the original RR: > > > The bug is that from the JMX point of view, G1's incremental collector > > (misnamed as the "G1 Young Generation" collector) only affects G1's > > survivor and eden spaces. In fact, mixed collections run by this > > collector also affect the G1 old generation. > > > > This proposed fix is to record, for each of a JMX garbage collector's > > memory pools, whether that memory pool is affected by all collections > > using that collector. And, for each collection, record whether or not > > all the collector's memory pools are affected. After each collection, > > for each memory pool, if either all the collector's memory pools were > > affected or the memory pool is affected for all collections, record > > CollectionUsage for that pool. 
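The recording rule in that last sentence can be captured in a few lines. The types below are illustrative stand-ins, not the actual MemoryService/GCMemoryManager code from the webrev:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Illustrative model of the proposed fix: after a collection,
// CollectionUsage is recorded for a pool if either the collection
// affected all of the collector's pools (e.g. a mixed GC), or the pool
// is marked as affected by every collection (eden, survivor).
struct PoolSketch {
  std::string name;
  bool affected_by_all_collections;
  bool usage_recorded = false;
};

void end_collection(std::vector<PoolSketch>& pools,
                    bool all_pools_affected) {
  for (PoolSketch& p : pools) {
    if (all_pools_affected || p.affected_by_all_collections) {
      p.usage_recorded = true;  // stands in for updating CollectionUsage
    }
  }
}
```

Under this rule a young-only collection updates only the eden and survivor pools, while a mixed collection (all pools affected) also updates the old-gen pool.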
> > > > For collectors other than G1 Young Generation, all pools are recorded as > > affected by all collections and every collection is recorded as > > affecting all the collector's memory pools. For the G1 Young Generation > > collector, the G1 Old Gen pool is recorded as not being affected by all > > collections, and non-mixed collections are recorded as not affecting all > > memory pools. The result is that for non-mixed collections, > > CollectionUsage is recorded after a collection only for the G1 Eden Space > > and G1 Survivor Space pools, while for mixed collections CollectionUsage > > is recorded for G1 Old Gen as well. > > > > Other than the effect of the fix on G1 Old Gen > > MemoryPool.CollectionUsage, the only external behavior change is that > > GarbageCollectorMXBean.getMemoryPoolNames will now return 3 pool names > > rather than 2. > > > > With this fix, a collector's memory pools can be divided into two > > disjoint subsets, one that participates in all collections and one that > > doesn't. This is a bit more general than the minimum necessary to fix > > G1, but not by much. Because I expect it to apply to other incremental > > region-based collectors, I went with the more general solution. I > > minimized the amount of code I had to touch by using default parameters > > for GCMemoryManager::add_pool and the TraceMemoryManagerStats constructors. > > > From suokunstar at gmail.com Fri Jun 8 16:16:43 2018 From: suokunstar at gmail.com (T S) Date: Fri, 8 Jun 2018 11:16:43 -0500 Subject: Potential optimization to the GC termination protocol in JDK8 In-Reply-To: <3ad6900186a0f630d5fed599c543ce885e431127.camel@oracle.com> References: <9b20437c-86e5-f413-e5e4-7f2089fc4182@oracle.com> <4aa5e12a-94b9-6ac8-f0e0-e2d48f6593f7@redhat.com> <3ad6900186a0f630d5fed599c543ce885e431127.camel@oracle.com> Message-ID: On Fri, Jun 8, 2018 at 2:35 AM Thomas Schatzl wrote: > > Hi, > Hi Thomas, Thanks for your feedback. > sorry for being somewhat late in that thread...
busy lately... > > On Thu, 2018-04-26 at 08:53 +0200, Roman Kennke wrote: > > > > [Problem] > > > > The work stealing during the GC takes lots of time to terminate. > > > > The Parallel Scavenge in JDK 8 uses a distributed termination > > > > protocol to synchronize GC threads. After 2 × N consecutive > > > > unsuccessful steal attempts (steal_best_of_2 function), a GC > > > > thread enters the termination procedure, where N is the number of > > > > GC threads. > > > > > > > > Suppose there are N GC threads, it takes 2 * N * N failed > > > > attempts before a GC stops. It is inefficient and takes too much > > > > time, especially when there are very few GC threads alive. If > > > > there are hundreds or thousands of GC happen during the app > > > > execution, that is a big waste of time. > > > > > > > > [Solution] > > > > Is it possible to reduce the number of steal attempts during the > > > > end of GC? My idea is to record the number of active GC threads > > > > (i.e., N_live) that are not yet in the termination protocol. A > > > > thread only steals from the pool of active GC threads because the > > > > inactive GC threads have no task to be stolen. Accordingly, the > > > > criteria for thread termination becomes 2 × N_live consecutive > > > > failed steal attempts. It can reduce the steal attempts to half. > > > > > > > > Looking forward to others' feedback. > > Sure we are interested in such an enhancement and would appreciate a > contribution. However new work should be conducted in the latest > version only, i.e. at this time JDK 11 (and probably JDK 12 by the time > such a change would be pushed given the FC date of end of June). > Currently, I worked on the JDK 8. I am not sure whether there exist lots of code changes in the latest JDK 11 or 12. > Maybe by coincidence you might be one of the authors of a recent paper > [1] demonstrating this? If you are interested in contributing, please > have a look at the corresponding wiki page [4].
> Yes, I am one of the authors of the paper. > > > > Thanks. > > > > Tony > > > > Hi Tony, > > in Shenandoah we have implemented an improved termination protocol. > > See the comment here: > > > > http://hg.openjdk.java.net/shenandoah/jdk/file/b4a3595f7c56/src/hotsp > > ot/share/gc/shenandoah/shenandoahTaskqueue.hpp#l93 > > > > We intend to upstream this stuff very soon, at which point it can be > > used by all other GCs if they wish. It's a drop-in replacement for > > the existing task queue code. > > > > It sounds like it's similar to what you have in mind. Right? > > The ShenandoahTaskTerminator is different in that it optimizes the > amount of woken up threads and the detection that there is work to > steal. > Thanks for the introduction of ShenandoahTaskTerminator. I will read the code later. Actually, our optimization of the GC terminator is similar to what you described. We also optimize stealing in two ways: 1, the amount of live GC threads to control the steal total number; 2, which thread to steal from. Our modification of steal on JDK 8 can be found at: https://github.com/tonys24/optimized-steal The basic idea of our work is simple: To realize target 1, we replace _n with live_number for live GC threads. https://github.com/tonys24/optimized-steal/blob/master/file6.diff To realize target 2, we replace steal_best_of_2 with steal_best_of_2_kun. We pick two threads: the one that was stolen from successfully last time and another picked randomly. Then we select the one with the longer queue size. I will read the code of ShenandoahTaskTerminator to see how we can contribute to the latest code of OpenJDK. Thank you so much for your email. Best Tony > The suggestion presented here improves work stealing itself, i.e. how > the woken up threads search for threads. > > Note that I think that with the ShenandoahTaskTerminator such an > implementation would be easier to implement than with the existing > ParallelTaskTerminator.
> > There are other ideas and information about known not-optimal- > implementation floating around in literature (e.g. [1][2][3], but > recently I looked there are a lot) to improve this even further. > > Thanks, > Thomas > > [1] Kun Suo, Jia Rao, Hong Jiang, and Witawas Srisa-an. 2018. > Characterizing and optimizing hotspot parallel garbage collection on > multicore systems. In Proceedings of the Thirteenth EuroSys Conference > (EuroSys '18). ACM, New York, NY, USA, Article 35, 15 pages. DOI: https > ://doi.org/10.1145/3190508.3190512 > [2] Junjie Qian, Witawas Srisa-an, Du Li, Hong Jiang, Sharad Seth, and > Yaodong Yang. 2015. SmartStealing: Analysis and Optimization of Work > Stealing in Parallel Garbage Collection for Java VM. In Proceedings of > the Principles and Practices of Programming on The Java Platform (PPPJ > '15). ACM, New York, NY, USA, 170-181. DOI: http://dx.doi.org/10.1145/2 > 807426.2807441 > [3] Lokesh Gidra, Gaël Thomas, Julien Sopena, Marc Shapiro, and Nhan > Nguyen. 2015. NumaGiC: a Garbage Collector for Big Data on Big NUMA > Machines. In Proceedings of the Twentieth International Conference on > Architectural Support for Programming Languages and Operating Systems > (ASPLOS '15). ACM, New York, NY, USA, 661-673. DOI: https://doi.org/10.
> 1145/2694344.2694361 > [4] http://openjdk.java.net/contribute/ -- ********************************** > Tony Suo > Computer Science, University of Texas at Arlington ********************************** From suokunstar at gmail.com Fri Jun 8 16:54:37 2018 From: suokunstar at gmail.com (T S) Date: Fri, 8 Jun 2018 11:54:37 -0500 Subject: Potential optimization to the GC termination protocol in JDK8 In-Reply-To: <3ad6900186a0f630d5fed599c543ce885e431127.camel@oracle.com> References: <9b20437c-86e5-f413-e5e4-7f2089fc4182@oracle.com> <4aa5e12a-94b9-6ac8-f0e0-e2d48f6593f7@redhat.com> <3ad6900186a0f630d5fed599c543ce885e431127.camel@oracle.com> Message-ID: On Fri, Jun 8, 2018 at 2:35 AM Thomas Schatzl wrote: > > Hi, > > sorry for being somewhat late in that thread... busy lately... > > On Thu, 2018-04-26 at 08:53 +0200, Roman Kennke wrote: > > > > [Problem] > > > > The work stealing during the GC takes lots of time to terminate. > > > > The Parallel Scavenge in JDK 8 uses a distributed termination > > > > protocol to synchronize GC threads. After 2 × N consecutive > > > > unsuccessful steal attempts (steal_best_of_2 function), a GC > > > > thread enters the termination procedure, where N is the number of > > > > GC threads. > > > > > > > > Suppose there are N GC threads, it takes 2 * N * N failed > > > > attempts before a GC stops. It is inefficient and takes too much > > > > time, especially when there are very few GC threads alive. If > > > > there are hundreds or thousands of GC happen during the app > > > > execution, that is a big waste of time. > > > > > > > > [Solution] > > > > Is it possible to reduce the number of steal attempts during the > > > > end of GC? My idea is to record the number of active GC threads > > > > (i.e., N_live) that are not yet in the termination protocol. A > > > > thread only steals from the pool of active GC threads because the > > > > inactive GC threads have no task to be stolen.
Accordingly, the > > > > criteria for thread termination becomes 2 × N_live consecutive > > > > failed steal attempts. It can reduce the steal attempts to half. > > > > > > > > Looking forward to others' feedback. > > Sure we are interested in such an enhancement and would appreciate a > contribution. However new work should be conducted in the latest > version only, i.e. at this time JDK 11 (and probably JDK 12 by the time > such a change would be pushed given the FC date of end of June). > Hi Thomas, Could you please send me a link for the JDK 11 or 12? I cannot find an official link on the openjdk website. http://hg.openjdk.java.net/shenandoah/jdk/ Is it the latest JDK under development? Thanks. Tony > Maybe by coincidence you might be one of the authors of a recent paper > [1] demonstrating this? If you are interested in contributing, please > have a look at the corresponding wiki page [4]. > > > > > Thanks. > > > > Tony > > > > Hi Tony, > > in Shenandoah we have implemented an improved termination protocol. > > See the comment here: > > > > http://hg.openjdk.java.net/shenandoah/jdk/file/b4a3595f7c56/src/hotsp > > ot/share/gc/shenandoah/shenandoahTaskqueue.hpp#l93 > > > > We intend to upstream this stuff very soon, at which point it can be > > used by all other GCs if they wish. It's a drop-in replacement for > > the existing task queue code. > > > > It sounds like it's similar to what you have in mind. Right? > > The ShenandoahTaskTerminator is different in that it optimizes the > amount of woken up threads and the detection that there is work to > steal. > > The suggestion presented here improves work stealing itself, i.e. how > the woken up threads search for threads. > > Note that I think that with the ShenandoahTaskTerminator such an > implementation would be easier to implement than with the existing > ParallelTaskTerminator. > > There are other ideas and information about known not-optimal- > implementation floating around in literature (e.g.
[1][2][3], but > recently I looked there are a lot) to improve this even further. > > Thanks, > Thomas > > [1] Kun Suo, Jia Rao, Hong Jiang, and Witawas Srisa-an. 2018. > Characterizing and optimizing hotspot parallel garbage collection on > multicore systems. In Proceedings of the Thirteenth EuroSys Conference > (EuroSys '18). ACM, New York, NY, USA, Article 35, 15 pages. DOI: https > ://doi.org/10.1145/3190508.3190512 > [2] Junjie Qian, Witawas Srisa-an, Du Li, Hong Jiang, Sharad Seth, and > Yaodong Yang. 2015. SmartStealing: Analysis and Optimization of Work > Stealing in Parallel Garbage Collection for Java VM. In Proceedings of > the Principles and Practices of Programming on The Java Platform (PPPJ > '15). ACM, New York, NY, USA, 170-181. DOI: http://dx.doi.org/10.1145/2 > 807426.2807441 > [3] Lokesh Gidra, Gaël Thomas, Julien Sopena, Marc Shapiro, and Nhan > Nguyen. 2015. NumaGiC: a Garbage Collector for Big Data on Big NUMA > Machines. In Proceedings of the Twentieth International Conference on > Architectural Support for Programming Languages and Operating Systems > (ASPLOS '15). ACM, New York, NY, USA, 661-673. DOI: https://doi.org/10. > 1145/2694344.2694361 > [4] http://openjdk.java.net/contribute/ From tom.rodriguez at oracle.com Fri Jun 8 20:46:35 2018 From: tom.rodriguez at oracle.com (Tom Rodriguez) Date: Fri, 8 Jun 2018 13:46:35 -0700 Subject: RFR (S): 8198909: [Graal] compiler/codecache/stress/UnexpectedDeoptimizationTest.java crashed with SIGSEGV Message-ID: <64c722cd-14e3-e0db-6a16-08ca69605205@oracle.com> The JVMCI API may read Klass* and java.lang.Class instances from locations which G1 would consider to be weakly referenced. This can result in HotSpotResolvedObjectTypeImpl instances with references to Classes that have been unloaded. In this crash, JVMCI was reading a Klass* from the profile in an MDO and building a wrapper around it.
The MDO reference is weak and was the only remaining reference to the type so it could be dropped resulting in an eventual crash. I've added an explicit G1 enqueue before we call out to create the wrapper object but is there a more recommended way of doing this? Dean had pointed out the oddly named InstanceKlass::holder_phantom which is used by the CI. Should I be using that? The G1 barrier is only really needed when reading from non-Java heap memory but since the get_jvmci_type method is the main entry point for this logic it is safest to always perform it in this path. https://bugs.openjdk.java.net/browse/JDK-8198909 http://cr.openjdk.java.net/~never/8198909/webrev From kim.barrett at oracle.com Fri Jun 8 21:36:39 2018 From: kim.barrett at oracle.com (Kim Barrett) Date: Fri, 8 Jun 2018 17:36:39 -0400 Subject: RFR (XXS): 8204617: ParallelGC parallel reference processing does not set MT degree in reference processor In-Reply-To: <46467c2aadce0f02bcfdcbcf202caae3446bffbb.camel@oracle.com> References: <46467c2aadce0f02bcfdcbcf202caae3446bffbb.camel@oracle.com> Message-ID: <88731143-11E9-48CA-99C5-5BCFD4E1DF6A@oracle.com> > On Jun 8, 2018, at 10:52 AM, Thomas Schatzl wrote: > > Hi all, > > can I have reviews for this small patch that properly sets the MT > degree in the parallel gc full gc? > > Otherwise some assert to be introduced in 8043575: Dynamically > parallelize reference processing work will fire for any application > using parallel gc that does not use the full amount of available gc > threads (by UseDynamicNumberOfGCThreads). > > CR: > https://bugs.openjdk.java.net/browse/JDK-8204617 > Webrev: > http://cr.openjdk.java.net/~tschatzl/8204617/webrev/ > Testing: > hs-tier1-4,jdk-tier1-3 with +/-ParallelRefProcEnabled > > Thanks, > Thomas Looks good.
From kim.barrett at oracle.com Fri Jun 8 21:39:40 2018 From: kim.barrett at oracle.com (Kim Barrett) Date: Fri, 8 Jun 2018 17:39:40 -0400 Subject: RFR (XS): 8204618: The parallel GC reference processing task executor enqueues a wrong number of tasks into the queue In-Reply-To: References: Message-ID: <345EF0CC-0646-44B2-A155-AF762D21A505@oracle.com> > On Jun 8, 2018, at 10:52 AM, Thomas Schatzl wrote: > > Hi all, > > can I have reviews for this change that fixes some issue with > parallel gc reference processing task executor putting too many (always > max number of threads) reference processing tasks into the work queue? > > The effects are benign (at least we have not observed any bad effects > so far since enabling -XX:UseDynamicNumberOfGCThreads) in that > reference processing will be called multiple times by the same active > threads, basically doing nothing (because the corresponding ref proc > queues are empty after the first time they are processed). > > CR: > https://bugs.openjdk.java.net/browse/JDK-8204618 > Webrev: > http://cr.openjdk.java.net/~tschatzl/8204618/webrev/ > Testing: > hs-tier1-4,jdk-tier1-3 with +/-ParallelRefProcEnabled > > Thanks, > Thomas Looks good. From sangheon.kim at oracle.com Sat Jun 9 05:09:09 2018 From: sangheon.kim at oracle.com (sangheon.kim at oracle.com) Date: Fri, 8 Jun 2018 22:09:09 -0700 Subject: RFR(M) : 8202946 : [TESTBUG] Open source VM testbase OOM tests In-Reply-To: <2DD8F9C6-8471-4BF6-8573-0DA3F2B6C66B@oracle.com> References: <2DD8F9C6-8471-4BF6-8573-0DA3F2B6C66B@oracle.com> Message-ID: Hi Igor, On 5/15/18 4:16 PM, Igor Ignatyev wrote: > http://cr.openjdk.java.net/~iignatyev//8202946/webrev.00/index.html >> 1619 lines changed: 1619 ins; 0 del; 0 mod; > Hi all, > > could you please review this patch which open sources OOM tests from VM testbase? these tests test OutOfMemoryError throwing in different scenarios. 
> > As usually w/ VM testbase code, these tests are old, they have been run in hotspot testing for a long period of time. Originally, these tests were run by a test harness different from jtreg and had different build and execution schemes, some parts couldn't be easily translated to jtreg, so tests might have actions or pieces of code which look weird. In the long term, we are planning to rework them. > > JBS: https://bugs.openjdk.java.net/browse/JDK-8202946 > webrev: http://cr.openjdk.java.net/~iignatyev//8202946/webrev.00/index.html > testing: :vmTestbase_vm_oom test group Webrev.00 looks good to me but I have minor nits. ------------------- test/hotspot/jtreg/TEST.groups 1164 # Test for OOME re-throwing after Java Heap exchausting - Typo: exchausting -> exhausting ------------------- test/hotspot/jtreg/vmTestbase/vm/oom/OOMTraceTest.java 68 protected boolean isAlwaysOOM() { 69 return expectOOM; 70 } - (optional) It is returning the variable "expectOOM" but the name is "isAlwaysOOM", which makes me confused. If you prefer the "isXXX" form of name, how about "isExpectingOOM()" etc..? Or you can defer this renaming, as you are planning to rework those tests. I don't need a new webrev for these. ------------------- Just a random comment. - It would be better to use a small fixed Java Heap size to trigger OOME for a short test running time. Thanks, Sangheon > > Thanks, > -- Igor From sangheon.kim at oracle.com Sat Jun 9 05:11:39 2018 From: sangheon.kim at oracle.com (sangheon.kim at oracle.com) Date: Fri, 8 Jun 2018 22:11:39 -0700 Subject: RFR (XXS): 8204617: ParallelGC parallel reference processing does not set MT degree in reference processor In-Reply-To: <46467c2aadce0f02bcfdcbcf202caae3446bffbb.camel@oracle.com> References: <46467c2aadce0f02bcfdcbcf202caae3446bffbb.camel@oracle.com> Message-ID: Hi Thomas, Looks good, thanks for fixing this! I also noticed this and JDK-8204618, but accidentally did not include them in the JDK-8043575 patch.
Thanks, Sangheon On 6/8/18 7:52 AM, Thomas Schatzl wrote: > Hi all, > > can I have reviews for this small patch that properly sets the MT > degree in the parallel gc full gc? > > Otherwise some assert to be introduced in 8043575: Dynamically > parallelize reference processing work will fire for any application > using parallel gc that does not use the full amount of available gc > threads (by UseDynamicNumberOfGCThreads). > > CR: > https://bugs.openjdk.java.net/browse/JDK-8204617 > Webrev: > http://cr.openjdk.java.net/~tschatzl/8204617/webrev/ > Testing: > hs-tier1-4,jdk-tier1-3 with +/-ParallelRefProcEnabled > > Thanks, > Thomas From sangheon.kim at oracle.com Sat Jun 9 05:12:08 2018 From: sangheon.kim at oracle.com (sangheon.kim at oracle.com) Date: Fri, 8 Jun 2018 22:12:08 -0700 Subject: RFR (XS): 8204618: The parallel GC reference processing task executor enqueues a wrong number of tasks into the queue In-Reply-To: References: Message-ID: <9dcb93d3-7488-8dd1-7993-034657d5f094@oracle.com> Hi Thomas, Looks good. Thanks, Sangheon On 6/8/18 7:52 AM, Thomas Schatzl wrote: > Hi all, > > can I have reviews for this change that fixes some issue with > parallel gc reference processing task executor putting too many (always > max number of threads) reference processing tasks into the work queue? > > The effects are benign (at least we have not observed any bad effects > so far since enabling -XX:UseDynamicNumberOfGCThreads) in that > reference processing will be called multiple times by the same active > threads, basically doing nothing (because the corresponding ref proc > queues are empty after the first time they are processed). 
> > CR: > https://bugs.openjdk.java.net/browse/JDK-8204618 > Webrev: > http://cr.openjdk.java.net/~tschatzl/8204618/webrev/ > Testing: > hs-tier1-4,jdk-tier1-3 with +/-ParallelRefProcEnabled > > Thanks, > Thomas From sangheon.kim at oracle.com Sat Jun 9 06:13:42 2018 From: sangheon.kim at oracle.com (sangheon.kim at oracle.com) Date: Fri, 8 Jun 2018 23:13:42 -0700 Subject: RFR(M) 8043575: Dynamically parallelize reference processing work In-Reply-To: References: <8af7594e-d5f4-ca0d-de73-6547ea131f3e@oracle.com> <42bc259b93355a98d66a8a14b1ba8d04ed33277c.camel@oracle.com> <25c462ac-4fef-6661-2280-635f280ad7a7@oracle.com> Message-ID: <14c43d61-f258-cf91-3dd5-5fc7afe3c424@oracle.com> Hi Thomas, On 6/8/18 7:52 AM, Thomas Schatzl wrote: > Hi Sangheon, > > On Thu, 2018-06-07 at 10:50 -0700, sangheon.kim at oracle.com wrote: >> Hi Thomas, >> >> On 6/7/18 7:12 AM, Thomas Schatzl wrote: >>> Hi Sangheon, >>> >>>> [...] >>>> As I noted in the pre-review email thread, it showed better results >>>> if I avoid using more than the active cpus. >>>> This is a kind of enhancement for better performance, so even >>>> though the command-line option was set to use more than the cpus (via >>>> ParallelGCThreads), I think it is good to limit. This is different >>>> from command-line option processing which lets users decide even >>>> though the given options would result in worse performance. >>>> >>>> Or if you are saying I should use "active_processor_count()"? If >>>> it is the case, I updated to use 'active_processor_count()' >>>> instead of 'initial_active_processor_count()'. >>>> >>>> I would prefer to have this unless there's any reason of not >>>> having it. >>> First, always use active_processor_count() for current thread >>> decisions. Initial_active_processor_count() is only here to get a >>> consistent value of the processor count during initialization. >>> >>> What has been your test to determine this? GCOld? >> Micro-benchmark which can create 1~3 digit K of references.
(mostly >> 8k, 32k, 128k) Specjvm2008.Derby as it generates a maximum of 12k >> references. >> >> GCOld is my favorite to play with but it doesn't stress references much. > I will look into this some more. Okay, thanks. > >>> I think this may be outdated as with recent changes scalability >>> (with G1) should have improved a lot. >> I tested before your rebuildRset patch, so a bit old. >> But I'm not sure how recent changes would affect this case. i.e. >> if a user sets a bigger number than the actual cpus, and then uses those >> threads on ref.proc. etc.. This patch is suggesting not to rely on >> the current setting in the worst case. >> >> But I will not argue here, as you will be the person to push! :) > There have been improvements to actual reference processing in G1 that > improved throughput for reference processing significantly. Yes, I'm aware of these improvements. Probably I'm too paranoid, but what I'm unsure about is that if a user sets too large a number of ParallelGCThreads, and G1 used a lot more threads than the host actually has, we would have less performance than limiting to the number of cpus. Probably we would need a follow-up CR to check, if you want. :) > > [...] >>> There is a WIP webrev at >>> >>> http://cr.openjdk.java.net/~tschatzl/8043575/webrev.3/ >>> >>> for the curious (I think I accommodated the review comments so far >>> too). >>> Still needs at least some more testing. >>> >>> And I need to remove the parallel gc support. >> Okay. >> >>> I am also not sure whether there has been a decision on how this >>> feature should be enabled, i.e. which switches need to be active to >>> use it. >>> I made it so that if ParallelRefProcEnabled is true, and >>> ReferencesPerThread > 0, we enable the dynamic number of reference >>> processing threads. >>> I.e. if ParallelRefProcEnabled is false, dynamic number of >>> reference processing threads is also turned off (because the user >>> requested to not do parallel ref processing at all). >> That is how it works on my patch, too.
>> I should have mentioned it in my review request, thank you for bringing up >> this topic!! > I filed a preliminary CSR at https://bugs.openjdk.java.net/browse/JDK-8 > 204612 for this change. I can review it when you are ready and before my vacation. Looks good to me as is though. > > I.e. > - make ParallelRefProcEnabled with dynamic selection of thread numbers > the default for G1. > - (I do not think we need the CSR for the ReferencesPerThread flag, but > I did anyway). Okay, changing the default for G1 sounds a good idea. > >> Previously we discussed changing the ParallelRefProcEnabled option >> to on/off/auto (enabling ReferencesPerThread). Of course, the name >> also should be changed too. It would be good to discuss this under a >> separate CR. >> >> thomas.webrev.3 looks good to me. > I wasn't expecting a review already :) :-) > >> Some minor comments: >> -------------------------- >> - I guess you didn't merge TestPrintReferences.java yet? So >> intentionally not included the file on webrev.3? > I had to fix it a bit for the latest version of webrev.3, which I think > is ready for review. > >> -------------------------- >> src/hotspot/share/gc/shared/referenceProcessor.hpp >> 215 bool _adjust_no_of_processing_threads; >> >> 556 // True, to allow ergonomically changing a number of >> processing >> threads based on references. >> 557 bool _adjust_no_of_processing_threads; >> - There are 2 declarations of _adjust_no_of_processing_threads. The >> latter one exists in another class. >> >> -------------------------- >> (pre-existing from my webrev) >> src/hotspot/share/gc/cms/concurrentMarkSweepGeneration.cpp >> 5128 assert(workers->active_workers() == ergo_workers, >> 5129 "Ergonomically chosen workers(%u) should be less than >> or >> equal to active workers(%u)", >> 5130 ergo_workers, workers->active_workers()); >> - We can remove the assert or change the statement to remove 'less >> than or'.
>> >> -------------------------- >> src/hotspot/share/gc/cms/parNewGeneration.cpp >> 797 assert(workers->active_workers() == ergo_workers, >> 798 "Ergonomically chosen workers(%u) should be less than >> or >> equal to active workers(%u)", >> 799 ergo_workers, workers->active_workers()); >> - Same as above. >> >> -------------------------- >> - Same for pcTasks.cpp and psScavenge.cpp too if we decide not to >> include parallel gc in this patch. >> >> -------------------------- >> src/hotspot/share/gc/shared/referenceProcessor.cpp >> 786 RefProcMTDegreeAdjuster a(this, >> 787 RefPhase1, >> 788 num_soft_refs); >> - We can put these on the same line. Same for other cases. >> >> Again thank you for taking this CR, Thomas! > I think I fixed all this. > > Webrev is at http://cr.openjdk.java.net/~tschatzl/8043575/webrev.3/ . > > This webrev is based on top of latest jdk/jdk and https://bugs.openjdk. > java.net/browse/JDK-8202845 . > > If you want to test parallel gc, you also need the fixes for JDK- > 8204617 and JDK-8204618 currently out for review. > > Testing: > hs-tier1-4,jdk-tier1-3 with +/-ParallelRefProcEnabled Webrev.3 looks good to me.
Thanks, Sangheon > > Thanks, > Thomas > From hohensee at amazon.com Sat Jun 9 13:29:45 2018 From: hohensee at amazon.com (Hohensee, Paul) Date: Sat, 9 Jun 2018 13:29:45 +0000 Subject: FW: RFR (S): 8195115: G1 Old Gen MemoryPool CollectionUsage.used values don't reflect mixed GC results In-Reply-To: <48344C2E-2423-4CB1-A5C3-09223CA1ED78@amazon.com> References: <3A67BF1B-CE35-4759-B5B4-959C24020A45@amazon.com> <878D6678-84CB-471E-B126-AF14B1BC8EB9@amazon.com> <727f9bcb-9404-df32-917f-70aca6e31cb6@oracle.com> <4BA32239-D645-4B1A-A58E-502258C8FFB7@amazon.com> <997a766f-9f4b-f192-ae87-03f4c8d2666d@oracle.com> <64793ff3-4805-79f9-4e72-6e469cd511dc@oracle.com> <15754d34-f3c7-6431-7abf-05214bd6b9d2@oracle.com> <93417506-8e12-6ad3-690a-439e36719652@oracle.com> <99C937B6-F67B-4F4A-ACC8-444B5F83932B@amazon.com> <3922628B-81DE-4904-AEFE-F4163EB1E655@amazon.com> <8444d766-8adb-013f-f75b-b5d4420635df@oracle.com> <5768E4BF-5556-455B-899B-94A67BF940A0@amazon.com> <3E15AAD8-5017-4CCB-B927-6B31FD0D7809@amazon.com> <22216AD4-9A78-4427-9AB8-629EC685C296@amazon.com> <9f972398-115f-06ad-1e0c-513abceb097a@oracle.com> <48344C2E-2423-4CB1-A5C3-09223CA1ED78@amazon.com> Message-ID: <815C8166-AE5B-40EC-B7B6-1DF9946C715D@amazon.com> Didn't seem to make it to hotspot-gc-dev... ?On 6/8/18, 10:14 AM, "serviceability-dev on behalf of Hohensee, Paul" wrote: Back after a long hiatus... Thanks, Eric, for your review. Here's a new webrev incorporating your recommendations. Bug: https://bugs.openjdk.java.net/browse/JDK-8195115 Webrev: http://cr.openjdk.java.net/~phh/8195115/webrev.02/ TIA for your re-review. Plus, may I have another reviewer look at it please? Paul On 2/26/18, 8:47 AM, "Erik Helin" wrote: Hi Paul, a couple of comments on the patch: - memoryService.hpp: + 150 bool countCollection, + 151 bool allMemoryPoolsAffected = true); There is no need to use a default value for the parameter allMemoryPoolsAffected here. 
Skipping the default value also allows you to put allMemoryPoolsAffected to TraceMemoryManager::initialize in the same relative position as for the constructor parameter (this will make the code more uniform and easier to follow). - memoryManager.cpp Instead of adding a default parameter, maybe add a new method? Something like GCMemoryManager::add_not_always_affected_pool() (I couldn't come up with a shorter name at the moment). - TestMixedOldGenCollectionUsage.java The test is too strict about how and when collections should occur. Tests written this way often become very brittle, they might e.g. fail to finish a concurrent mark on time on a very slow, single core, machine. It is better to either force collections by using the WhiteBox API or make the test more lenient. Thanks, Erik On 02/22/2018 09:54 PM, Hohensee, Paul wrote: > Ping for a review please. > > Thanks, > > Paul > > On 2/16/18, 12:26 PM, "serviceability-dev on behalf of Hohensee, Paul" wrote: > > The CSR https://bugs.openjdk.java.net/browse/JDK-8196719 for the original fix has been approved, so I'm back to requesting a code review, please. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8195115 > Webrev: http://cr.openjdk.java.net/~phh/8195115/webrev.hs.01/ > > Passed a submit repo run, passes its jtreg test, and a JDK8 version is in production use at Amazon. > > From the original RR: > > > The bug is that from the JMX point of view, G1's incremental collector > > (misnamed as the "G1 Young Generation" collector) only affects G1's > > survivor and eden spaces. In fact, mixed collections run by this > > collector also affect the G1 old generation. > > > > This proposed fix is to record, for each of a JMX garbage collector's > > memory pools, whether that memory pool is affected by all collections > > using that collector. And, for each collection, record whether or not > > all the collector's memory pools are affected.
After each collection, > > for each memory pool, if either all the collector's memory pools were > > affected or the memory pool is affected for all collections, record > > CollectionUsage for that pool. > > > > For collectors other than G1 Young Generation, all pools are recorded as > > affected by all collections and every collection is recorded as > > affecting all the collector's memory pools. For the G1 Young Generation > > collector, the G1 Old Gen pool is recorded as not being affected by all > > collections, and non-mixed collections are recorded as not affecting all > > memory pools. The result is that for non-mixed collections, > > CollectionUsage is recorded after a collection only for the G1 Eden Space > > and G1 Survivor Space pools, while for mixed collections CollectionUsage > > is recorded for G1 Old Gen as well. > > > > Other than the effect of the fix on G1 Old Gen > > MemoryPool.CollectionUsage, the only external behavior change is that > > GarbageCollectorMXBean.getMemoryPoolNames will now return 3 pool names > > rather than 2. > > > > With this fix, a collector's memory pools can be divided into two > > disjoint subsets, one that participates in all collections and one that > > doesn't. This is a bit more general than the minimum necessary to fix > > G1, but not by much. Because I expect it to apply to other incremental > > region-based collectors, I went with the more general solution. I > > minimized the amount of code I had to touch by using default parameters > > for GCMemoryManager::add_pool and the TraceMemoryManagerStats constructors.
> > > From erik.osterlund at oracle.com Mon Jun 11 09:09:09 2018 From: erik.osterlund at oracle.com (Erik Österlund) Date: Mon, 11 Jun 2018 11:09:09 +0200 Subject: RFR (S): 8198909: [Graal] compiler/codecache/stress/UnexpectedDeoptimizationTest.java crashed with SIGSEGV In-Reply-To: <64c722cd-14e3-e0db-6a16-08ca69605205@oracle.com> References: <64c722cd-14e3-e0db-6a16-08ca69605205@oracle.com> Message-ID: <5B1E3C35.3090506@oracle.com> Hi Tom, Could you please call InstanceKlass::holder_phantom() instead to keep the class alive? That is the more general mechanism that is also used by ciInstanceKlass. We don't want to use explicit G1 enqueue calls anymore. Also, you must not perform any thread transition between loading the weak klass from the MDO and calling holder_phantom(), otherwise it might have been unloaded before you get to call holder_phantom(). Is this guaranteed somehow in this scenario? I looked through all callsites and could not find where the Klass pointer is read in the MDO and subsequently passed into the CompilerToVM::get_jvmci_type API, and therefore I do not know if this is guaranteed. Thanks, /Erik On 2018-06-08 22:46, Tom Rodriguez wrote: > The JVMCI API may read Klass* and java.lang.Class instances from > locations which G1 would consider to be weakly referenced. This can > result in HotSpotResolvedObjectTypeImpl instances with references to > Classes that have been unloaded. In this crash, JVMCI was reading > a Klass* from the profile in an MDO and building a wrapper around it. > The MDO reference is weak and was the only remaining reference to the > type, so it could be dropped, resulting in an eventual crash. > > I've added an explicit G1 enqueue before we call out to create the > wrapper object but is there a more recommended way of doing this? Dean > had pointed out the oddly named InstanceKlass::holder_phantom which is > used by the CI. Should I be using that?
The G1 barrier is only > really needed when reading from non-Java heap memory, but since the > get_jvmci_type method is the main entry point for this logic it is safest > to always perform it in this path. > > https://bugs.openjdk.java.net/browse/JDK-8198909 > http://cr.openjdk.java.net/~never/8198909/webrev From thomas.schatzl at oracle.com Mon Jun 11 09:29:29 2018 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Mon, 11 Jun 2018 11:29:29 +0200 Subject: RFR (XS): 8204618: The parallel GC reference processing task executor enqueues a wrong number of tasks into the queue In-Reply-To: <9dcb93d3-7488-8dd1-7993-034657d5f094@oracle.com> References: <9dcb93d3-7488-8dd1-7993-034657d5f094@oracle.com> Message-ID: <2dac905d2494a61714080e5248dc6724253fc6bc.camel@oracle.com> Hi Sangheon, Kim, On Fri, 2018-06-08 at 22:12 -0700, sangheon.kim at oracle.com wrote: > Hi Thomas, > > Looks good. Thanks for your reviews. Thomas From thomas.schatzl at oracle.com Mon Jun 11 09:30:01 2018 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Mon, 11 Jun 2018 11:30:01 +0200 Subject: RFR (XXS): 8204617: ParallelGC parallel reference processing does not set MT degree in reference processor In-Reply-To: <88731143-11E9-48CA-99C5-5BCFD4E1DF6A@oracle.com> References: <46467c2aadce0f02bcfdcbcf202caae3446bffbb.camel@oracle.com> <88731143-11E9-48CA-99C5-5BCFD4E1DF6A@oracle.com> Message-ID: Hi Kim, Sangheon, On Fri, 2018-06-08 at 17:36 -0400, Kim Barrett wrote: > > On Jun 8, 2018, at 10:52 AM, Thomas Schatzl > com> wrote: > > > > Hi all, > > > > can I have reviews for this small patch that properly sets the MT > > degree in the parallel gc full gc? > > > > Otherwise, an assert to be introduced in 8043575: Dynamically > > parallelize reference processing work will fire for any application > > using parallel gc that does not use the full amount of available gc > > threads (by UseDynamicNumberOfGCThreads).
> > > > CR: > > https://bugs.openjdk.java.net/browse/JDK-8204617 > > Webrev: > > http://cr.openjdk.java.net/~tschatzl/8204617/webrev/ > > Testing: > > hs-tier1-4,jdk-tier1-3 with +/-ParallelRefProcEnabled > > > > Thanks, > > Thomas > > Looks good. thanks for your reviews. Thomas From thomas.schatzl at oracle.com Mon Jun 11 09:39:30 2018 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Mon, 11 Jun 2018 11:39:30 +0200 Subject: RFR (S): 8204169: Humongous continues region remembered set states do not match the one from the corresponding humongous start region In-Reply-To: References: Message-ID: <4b4db0a471796285ccbb0b7416c4f9e564809482.camel@oracle.com> Hi all, ping! :-) Thanks, Thomas On Mon, 2018-06-04 at 12:27 +0200, Thomas Schatzl wrote: > Hi all, > > can I have reviews for this small change that fixes up remembered > set states of humongous continues regions? In particular they are not > necessarily consistent with the humongous starts region at the > moment. > > This makes for some surprises when reading the logs. Otherwise there > is no impact: all decisions on how to treat the humongous object are > made on the humongous start region anyway. > > As for testing, I added code to the heap verification to check this > consistency requirement. This makes for a 100% failure rate in > gc/g1/TestEagerReclaimHumongousRegionsClearMarkBits if not fixed. 
> > CR: > https://bugs.openjdk.java.net/browse/JDK-8204169 > Webrev: > http://cr.openjdk.java.net/~tschatzl/8204169/webrev/ > Testing: > hs-tier1-3 > > Thanks, > Thomas From thomas.schatzl at oracle.com Mon Jun 11 11:04:31 2018 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Mon, 11 Jun 2018 13:04:31 +0200 Subject: Potential optimization to the GC termination protocol in JDK8 In-Reply-To: References: <9b20437c-86e5-f413-e5e4-7f2089fc4182@oracle.com> <4aa5e12a-94b9-6ac8-f0e0-e2d48f6593f7@redhat.com> <3ad6900186a0f630d5fed599c543ce885e431127.camel@oracle.com> Message-ID: <8b5f605c6f8ecc6c89e2187ab8cec14bb9ddcf11.camel@oracle.com> Hi Tony, On Fri, 2018-06-08 at 11:16 -0500, T S wrote: > On Fri, Jun 8, 2018 at 2:35 AM Thomas Schatzl com> wrote: > > > > Hi, > > > > Hi Thomas, > > Thanks for your feedback. > > > sorry for being somewhat late in that thread... busy lately... > > > > On Thu, 2018-04-26 at 08:53 +0200, Roman Kennke wrote: > > > > > [Problem] > > > > > The work stealing during the GC takes lots of time to > > > > > terminate. > > > > > [...] > > > > Hi Tony, > > > in Shenandoah we have implemented an improved termination > > > protocol. > > > See the comment here: > > > > > > http://hg.openjdk.java.net/shenandoah/jdk/file/b4a3595f7c56/src/h > > > otsp > > > ot/share/gc/shenandoah/shenandoahTaskqueue.hpp#l93 > > > > > > We intend to upstream this stuff very soon, at which point it can > > > be used by all other GCs if they wish. It's a drop-in replacement > > > for the existing task queue code. > > > > > > It sounds like it's similar to what you have in mind. Right? > > > > The ShenandoahTaskTerminator is different in that it optimizes the > > amount of woken up threads and the detection of that there is work > > to steal. > > > > Thanks for the introduction of ShenandoahTaskTerminator. I will read > the code later. > Actually, our optimization on GC terminator is similar as you > described. 
We also optimize stealing The ShenandoahTaskTerminator does not change the actual stealing itself after non-busy threads are woken up. It improves the process of waking up threads by (as far as I understand ;) of course): a) only waking up as many threads as there are work items b) only one (changing) master thread looks through the queues for work This is independent and so complementary to the work provided by you. > in two ways: > 1, the amount of live GC threads to control the steal total number; > 2, who or which thread to steal. > Our modification of steal on JDK 8 can be found on: > https://github.com/tonys24/optimized-steal > > The basic idea of our work is simple: > To realize target 1, we replace _n with live_number for live GC > threads. I have one question about that, that I think the paper also did not answer specifically (maybe some interpretation error from me). So steal_best_of_2, instead of doing 2*N*N (N = number of threads) steal attempts, does 2*N_live*N_live steal attempts. Is there any provision to do these N_live steal attempts only on threads that actually have work? I can see that just by reducing the number of steal attempts we improve performance (I have done these experiments maybe a year or more ago, like only doing 2*N attempts or other random iteration counts), but just reducing the steal attempts seems to make it often fall back to trying to terminate. Consider the situation when N_live is much smaller than N - then the probability that a given steal attempt finds work is small, and so there may be a lot of unsuccessful steal attempts. It seems to be much better to not only do N_live attempts, but do these attempts on only the queues of threads that are live. The second optimization shown by the paper, trying to keep stealing from the same thread as before, probably mitigates this issue of unsuccessful stealing attempts quite a bit, but still... Property b) of the shenandoah task terminator might help here.
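The victim-selection scheme under discussion can be sketched roughly as follows. All names here (StealSketch, stealBestOf2Live) are invented for illustration; this is not the HotSpot TaskQueue code, just a model of best-of-2 selection restricted to live workers plus the "remember the last successful victim" optimization:

```java
import java.util.Deque;
import java.util.List;
import java.util.Random;

// Sketch of "best of 2" stealing restricted to live workers. Candidate a is
// the last successful victim (the paper's second optimization), candidate b
// is a random live worker; the longer queue wins, as in steal_best_of_2.
class StealSketch {
    final List<Deque<Integer>> queues; // one task queue per worker
    final Random rnd = new Random();
    int lastVictim = -1;

    StealSketch(List<Deque<Integer>> queues) { this.queues = queues; }

    Integer stealBestOf2Live(int self, int[] live) {
        int a = (lastVictim >= 0) ? lastVictim : live[rnd.nextInt(live.length)];
        int b = live[rnd.nextInt(live.length)];
        int victim = queues.get(a).size() >= queues.get(b).size() ? a : b;
        if (victim == self || queues.get(victim).isEmpty()) {
            lastVictim = -1;      // forget a victim that no longer pays off
            return null;          // unsuccessful attempt
        }
        lastVictim = victim;      // remember for the next attempt
        return queues.get(victim).pollLast(); // steal from the victim's tail
    }
}
```

Restricting both candidates to the live set is what avoids the wasted attempts on idle workers' empty queues when N_live is much smaller than N.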
> https://github.com/tonys24/optimized-steal/blob/master/file6.diff > To realize target 2, we replace steal_best_of_2 with > steal_best_of_2_kun. We picked two > threads, one is stolen successfully last time and the other is picked > randomly. Then we select the one which has longer queue size. Yeah, that makes sense. Thanks, Thomas From zgu at redhat.com Mon Jun 11 13:10:05 2018 From: zgu at redhat.com (Zhengyu Gu) Date: Mon, 11 Jun 2018 09:10:05 -0400 Subject: Potential optimization to the GC termination protocol in JDK8 In-Reply-To: <8b5f605c6f8ecc6c89e2187ab8cec14bb9ddcf11.camel@oracle.com> References: <9b20437c-86e5-f413-e5e4-7f2089fc4182@oracle.com> <4aa5e12a-94b9-6ac8-f0e0-e2d48f6593f7@redhat.com> <3ad6900186a0f630d5fed599c543ce885e431127.camel@oracle.com> <8b5f605c6f8ecc6c89e2187ab8cec14bb9ddcf11.camel@oracle.com> Message-ID: On 06/11/2018 07:04 AM, Thomas Schatzl wrote: > Hi Tony, > > On Fri, 2018-06-08 at 11:16 -0500, T S wrote: >> On Fri, Jun 8, 2018 at 2:35 AM Thomas Schatzl > com> wrote: >>> >>> Hi, >>> >> >> Hi Thomas, >> >> Thanks for your feedback. >> >>> sorry for being somewhat late in that thread... busy lately... >>> >>> On Thu, 2018-04-26 at 08:53 +0200, Roman Kennke wrote: >>>>>> [Problem] >>>>>> The work stealing during the GC takes lots of time to >>>>>> terminate. >>>>>> [...] >> >>>> Hi Tony, >>>> in Shenandoah we have implemented an improved termination >>>> protocol. >>>> See the comment here: >>>> >>>> http://hg.openjdk.java.net/shenandoah/jdk/file/b4a3595f7c56/src/h >>>> otsp >>>> ot/share/gc/shenandoah/shenandoahTaskqueue.hpp#l93 >>>> >>>> We intend to upstream this stuff very soon, at which point it can >>>> be used by all other GCs if they wish. It's a drop-in replacement >>>> for the existing task queue code. >>>> >>>> It sounds like it's similar to what you have in mind. Right? 
>>> >>> The ShenandoahTaskTerminator is different in that it optimizes the >>> amount of woken up threads and the detection of that there is work >>> to steal. >>> >> >> Thanks for the introduction of ShenandoahTaskTerminator. I will read >> the code later. >> Actually, our optimization on GC terminator is similar as you >> described. We also optimize stealing > > The ShenandoahTaskTerminator does not change the actual stealing itself > after non-busy threads are woken up. > > It improves the process of waking up threads by (as far as I understand > ;) of course): > a) only waking up as many threads as there are work items > b) only one (changing) master thread looks through the queues for work > > This is independent and so complementary to the work provided by you. > >> in two ways: >> 1, the amount of live GC threads to control the steal total number; >> 2, who or which thread to steal. >> Our modification of steal on JDK 8 can be found on: >> https://github.com/tonys24/optimized-steal >> >> The basic idea of our work is simple: >> To realize target 1, we replace _n with live_number for live GC >> threads. > > I have one question about that, that I think the paper also did not > answer specifically (maybe some interpretation error from me). > > So steal_best_of_2, instead of doing 2*N*N (N = number of threads) > steal attempts, it does 2*N_live*N_live steal attempts. > > Is there any provision to do these N_live steal attempts only on > threads that actually have work? > > I can see that just by reducing the number of steal attempts we improve > performance (I have done these experiments maybe a year or more ago, > like only doing 2*N attempts or other random iteration counts), but > just reducing the steal attempts seems to make it often fall back to > trying to terminate. 
> > Consider the situation when N_live is much smaller than N - then the > probability that a given thread trying to find work is small, and so > there may be a lot of unsuccessful steal attempts. > > It seems to be much better to not only do N_live attempts, but do these > attempts on only the queues of threads that are live. > > The second optimization shown by the paper, trying to keep stealing > from the same thread as before probably mitigates this issue of > unsuccessful stealing attempts quite a bit, but still... I also believe this optimization can be very beneficial. My observation while working on ShenandoahTaskTerminator is that usually only one or two queues have remaining work near termination, so the successful stealing ratio can be quite low. Thanks, -Zhengyu > > Property b) of the shenandoah task terminator might help here. > >> https://github.com/tonys24/optimized-steal/blob/master/file6.diff >> To realize target 2, we replace steal_best_of_2 with >> steal_best_of_2_kun. We picked two >> threads, one stolen from successfully last time and the other picked >> randomly. Then we select the one which has the longer queue size. > > Yeah, that makes sense. > > Thanks, > Thomas > From rkennke at redhat.com Mon Jun 11 16:02:35 2018 From: rkennke at redhat.com (Roman Kennke) Date: Mon, 11 Jun 2018 18:02:35 +0200 Subject: RFR: JDK-8204685: Abstraction for TLAB dummy object Message-ID: <0da62d1d-02ea-e10d-3b4d-3d7c6aa48a80@redhat.com> Just as object allocation should be owned by the GC, so should TLAB dummy object 'allocation'. TLABs and PLABs fill their remaining blocks with dummy objects. GCs like Shenandoah might need a slightly different layout for this, and therefore we need an abstraction. The proposed change adds a new virtual method CollectedHeap::fill_with_dummy_object(); the default implementation calls the existing CH::fill_with_object() from TLAB and PLAB like it was done before.
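The shape of the proposed hook can be sketched as follows. This is illustrative Java standing in for the actual C++ virtual method on CollectedHeap, and the Shenandoah-style override (a reserved forwarding word) is an assumed layout detail, not taken from the webrev:

```java
// Java stand-in for the proposed C++ hook. fillWithObject models the
// existing CollectedHeap::fill_with_object; fillWithDummyObject is the new
// virtual method that TLAB/PLAB retirement would call instead.
class CollectedHeapSketch {
    // Returns {start, words} as a stand-in for writing a filler object there.
    long[] fillWithObject(long start, long words) {
        return new long[] { start, words };
    }
    // Default behavior: exactly what TLAB/PLAB retirement did before.
    long[] fillWithDummyObject(long start, long words) {
        return fillWithObject(start, words);
    }
}

// A GC needing a different dummy-object layout overrides only the hook.
// The reserved per-object forwarding word is an assumption for illustration.
class ShenandoahLikeHeap extends CollectedHeapSketch {
    static final long FWD_PTR_WORDS = 1;
    @Override
    long[] fillWithDummyObject(long start, long words) {
        return fillWithObject(start + FWD_PTR_WORDS, words - FWD_PTR_WORDS);
    }
}
```

The point of the abstraction is that TLAB/PLAB code keeps calling one method, and only the heap subclass decides what the filler looks like.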
Tested: hotspot-tier1, Shenandoah tests with appropriate impl http://cr.openjdk.java.net/~rkennke/JDK-8204685/webrev.00/ Ok? Thanks, Roman -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: OpenPGP digital signature URL: From robbin.ehn at oracle.com Mon Jun 11 16:09:00 2018 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Mon, 11 Jun 2018 18:09:00 +0200 Subject: RFR(M): 8204613: StringTable: Calculates wrong number of uncleaned items. Message-ID: <8f27ac9e-8201-4cda-4bca-152c7ab992d3@oracle.com> Hi all, please review. The StringTable lazily evicts dead strings; until a dead string is evicted it is counted as a dead string. If it is not evicted before the next GC cycle it is counted again, skewing the count of uncleaned strings. Also, ZGC walks the strings without using the stringtable GC API, but it needs to be able to feed back the number of dead strings to get the cleaning functionality. There is a big probability that ZGC makes it in before this change-set, so I included ZGC changes. There was a compile issue on slowdebug on windows for create_archived_string(), so I added NOT_CDS_JAVA_HEAP_RETURN_(NULL) for it. Change-set: http://cr.openjdk.java.net/~rehn/8204613/webrev/index.html Bug: https://bugs.openjdk.java.net/browse/JDK-8204613 T1-3 with ZGC testing on, no related issues, and manual JMH testing. Thanks, Robbin From tom.rodriguez at oracle.com Mon Jun 11 17:04:39 2018 From: tom.rodriguez at oracle.com (Tom Rodriguez) Date: Mon, 11 Jun 2018 10:04:39 -0700 Subject: RFR (S): 8198909: [Graal] compiler/codecache/stress/UnexpectedDeoptimizationTest.java crashed with SIGSEGV In-Reply-To: <5B1E3C35.3090506@oracle.com> References: <64c722cd-14e3-e0db-6a16-08ca69605205@oracle.com> <5B1E3C35.3090506@oracle.com> Message-ID: Erik Österlund wrote on 6/11/18 2:09 AM: > Hi Tom, > > Could you please call InstanceKlass::holder_phantom() instead to keep
That is the more general mechanism that is also used by > ciInstanceKlass. We don't want to use explicit G1 enqueue calls anymore. Ok. I guess the same fix in JDK8 will have to use the explicit enqueue, though, or is it not required in JDK8? > Also, you must not perform any thread transition between loading the > weak klass from the MDO until you call holder_phantom, otherwise it > might have been unloaded before you get to call holder_phantom(). Is > this guaranteed somehow in this scenario? I looked through all callsites > and could not find where the Klass pointer is read in the MDO and > subsequently passed into the CompilerToVM::get_jvmci_type API, and > therefore I do not know if this is guaranteed. The obviously problematic path is at http://hg.openjdk.java.net/jdk/jdk/file/50469fb301c4/src/hotspot/share/jvmci/jvmciCompilerToVM.cpp#l334 when either base_address is a Klass* or base_object is NULL, which is where we are reading from non-heap memory. There are other paths which are reading Klasses through more standard APIs, from the ConstantPool for instance. There isn't an easy way to ensure no safepoint occurs in between, so maybe we require the caller of get_jvmci_type to pass in the phantom_holder() as a way of forcing the caller to call holder_phantom() at the appropriate places? Or is it the case that getResolvedType is the only place where special effort is required? All the other paths are fairly normal HotSpot code, but the place that uses klass->implementor(), for instance, seems like it could be considered to be weak by G1. http://hg.openjdk.java.net/jdk/jdk/file/50469fb301c4/src/hotspot/share/jvmci/jvmciCompilerToVM.cpp#l368 The lack of a properly working KlassHandle seems like an oversight in the API to me. tom
This can >> result in HotSpotResolvedObjectTypeImpl instances with references to >> Classes that have been unloaded. In the this crash, JVMCI was reading >> a Klass* from the profile in an MDO and building a wrapper around it. >> The MDO reference is weak and was the only remaining reference to the >> type so it could be dropped resulting in an eventual crash. >> >> I've added an explicit G1 enqueue before we call out to create the >> wrapper object but is there a more recommended way of doing this? Dean >> had pointed out the oddly named InstanceKlass::holder_phantom which is >> used by the CI. Should I be using that? The G1 barrier is only >> really need when reading from non-Java heap memory but since the >> get_jvmci_type method is the main entry point for this logic it safest >> to always perform it in this path. >> >> https://bugs.openjdk.java.net/browse/JDK-8198909 >> http://cr.openjdk.java.net/~never/8198909/webrev > From igor.ignatyev at oracle.com Mon Jun 11 20:44:06 2018 From: igor.ignatyev at oracle.com (Igor Ignatyev) Date: Mon, 11 Jun 2018 13:44:06 -0700 Subject: RFR(M) : 8202946 : [TESTBUG] Open source VM testbase OOM tests In-Reply-To: References: <2DD8F9C6-8471-4BF6-8573-0DA3F2B6C66B@oracle.com> Message-ID: <1D6C49A7-C5D8-402D-B559-7E3B7E8D3AAB@oracle.com> Hi Sangheon, thanks for your review, please see my answers inline. Cheers, -- Igor > On Jun 8, 2018, at 10:09 PM, sangheon.kim at oracle.com wrote: > > Hi Igor, > > On 5/15/18 4:16 PM, Igor Ignatyev wrote: >> http://cr.openjdk.java.net/~iignatyev//8202946/webrev.00/index.html >>> 1619 lines changed: 1619 ins; 0 del; 0 mod; >> Hi all, >> >> could you please review this patch which open sources OOM tests from VM testbase? these tests test OutOfMemoryError throwing in different scenarios. >> >> As usually w/ VM testbase code, these tests are old, they have been run in hotspot testing for a long period of time. 
Originally, these tests were run by a test harness different from jtreg and had different build and execution schemes, some parts couldn't be easily translated to jtreg, so tests might have actions or pieces of code which look weird. In a long term, we are planning to rework them. >> >> JBS: https://bugs.openjdk.java.net/browse/JDK-8202946 >> webrev: http://cr.openjdk.java.net/~iignatyev//8202946/webrev.00/index.html >> testing: :vmTestbase_vm_oom test group > Webrev.00 looks good to me but have minor nits. > > ------------------- > test/hotspot/jtreg/TEST.groups > 1164 # Test for OOME re-throwing after Java Heap exchausting > - Typo: exchausting -> exhausting will fix before pushing, thanks for spotting. > > ------------------- > test/hotspot/jtreg/vmTestbase/vm/oom/OOMTraceTest.java > 68 protected boolean isAlwaysOOM() { > 69 return expectOOM; > 70 } > - (optional) It is returning the variable of "expectOOM" but the name is "isAlwaysOOM" which makes me confused. If you prefer "isXXX" form of name, how about "isExpectingOOM()" etc..? Or you can defer this renaming, as you are planning to rework those tests. created JDK-8204697 for that. > > I don't need a new webrev for these. > > ------------------- > Just random comment. > - It would be better to use small fixed Java Heap size to trigger OOME for short test running time. thanks for suggestion, created JDK-8204698. > > Thanks, > Sangheon > > >> >> Thanks, >> -- Igor -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From shade at redhat.com Tue Jun 12 08:48:24 2018 From: shade at redhat.com (Aleksey Shipilev) Date: Tue, 12 Jun 2018 10:48:24 +0200 Subject: RFR: 8204180: Implementation: JEP 318: Epsilon GC (round 5) In-Reply-To: <0e63a401-d28d-2bdb-d024-3b48f3fba0cb@oracle.com> References: <2ee03550-2b1e-b873-5ef3-26312e2a760e@redhat.com> <7211f820-4999-2988-e06c-1067b1fb138f@oracle.com> <0e63a401-d28d-2bdb-d024-3b48f3fba0cb@oracle.com> Message-ID: <2eab1a10-121f-884d-0155-5a422f2a3268@redhat.com> On 06/07/2018 02:42 PM, Per Liden wrote: > On 06/07/2018 02:38 PM, Aleksey Shipilev wrote: >> On 06/07/2018 12:13 PM, Per Liden wrote: >>> On 06/06/2018 06:30 PM, Aleksey Shipilev wrote: >>>> Hi, >>>> >>>> This is fifth (and hopefully final) round of code review for Epsilon GC changes. It includes the >>>> fixes done as the result of fourth round of reviews, mostly in Serviceability. The build parts are >>>> the same since last few reviews, so this is not posted to build-dev at . >>>> >>>> Webrev: >>>> ??? http://cr.openjdk.java.net/~shade/epsilon/webrev.09/ >>> >>> Still looks good to me! >>> I have two follow ups after jdk-submit Windows and Solaris build/test failures: http://hg.openjdk.java.net/jdk/sandbox/rev/24339b23a56b http://hg.openjdk.java.net/jdk/sandbox/rev/f21420b61fd2 Full webrev that passes jdk-submit: http://cr.openjdk.java.net/~shade/epsilon/webrev.11/ Thanks, -Aleksey -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: OpenPGP digital signature URL: From per.liden at oracle.com Tue Jun 12 09:17:33 2018 From: per.liden at oracle.com (Per Liden) Date: Tue, 12 Jun 2018 11:17:33 +0200 Subject: RFR: 8204180: Implementation: JEP 318: Epsilon GC (round 5) In-Reply-To: <2eab1a10-121f-884d-0155-5a422f2a3268@redhat.com> References: <2ee03550-2b1e-b873-5ef3-26312e2a760e@redhat.com> <7211f820-4999-2988-e06c-1067b1fb138f@oracle.com> <0e63a401-d28d-2bdb-d024-3b48f3fba0cb@oracle.com> <2eab1a10-121f-884d-0155-5a422f2a3268@redhat.com> Message-ID: <5e4c692e-c638-1ec2-e9d3-faf613bcbf18@oracle.com> On 06/12/2018 10:48 AM, Aleksey Shipilev wrote: > On 06/07/2018 02:42 PM, Per Liden wrote: >> On 06/07/2018 02:38 PM, Aleksey Shipilev wrote: >>> On 06/07/2018 12:13 PM, Per Liden wrote: >>>> On 06/06/2018 06:30 PM, Aleksey Shipilev wrote: >>>>> Hi, >>>>> >>>>> This is fifth (and hopefully final) round of code review for Epsilon GC changes. It includes the >>>>> fixes done as the result of fourth round of reviews, mostly in Serviceability. The build parts are >>>>> the same since last few reviews, so this is not posted to build-dev at . >>>>> >>>>> Webrev: >>>>> ??? http://cr.openjdk.java.net/~shade/epsilon/webrev.09/ >>>> >>>> Still looks good to me! >>>> > > I have two follow ups after jdk-submit Windows and Solaris build/test failures: > http://hg.openjdk.java.net/jdk/sandbox/rev/24339b23a56b > http://hg.openjdk.java.net/jdk/sandbox/rev/f21420b61fd2 Looks good to me! 
/Per > > Full webrev that passes jdk-submit: > http://cr.openjdk.java.net/~shade/epsilon/webrev.11/ > > Thanks, > -Aleksey > From martin.doerr at sap.com Tue Jun 12 09:47:53 2018 From: martin.doerr at sap.com (Doerr, Martin) Date: Tue, 12 Jun 2018 09:47:53 +0000 Subject: RFR(M): 8204524: Unnecessary memory barriers in G1ParScanThreadState::copy_to_survivor_space In-Reply-To: References: Message-ID: <9ba49130e8394e41beec0a9841504dc0@sap.com> Hi Michihiro, I have removed the mailing lists except hotspot-gc-dev, which is the one for this review. Thank you for taking care of this PPC64 performance problem. I think we shouldn't ship jdk11 on PPC64 without addressing it. I guess handle_evacuation_failure_par is not performance critical, so I wonder if it needs to be part of this change. I haven't checked if it's correct. Your description and change of copy_to_survivor_space fit the comments and how the algorithm works. So it looks good to me. I couldn't find any requirement for memory barriers regarding the CAS. But we should have a G1 expert double-check that we haven't missed anything. Best regards, Martin From: Michihiro Horie [mailto:HORIE at jp.ibm.com] Sent: Donnerstag, 7. Juni 2018 08:01 To: hotspot-dev at openjdk.java.net; ppc-aix-port-dev at openjdk.java.net; hotspot-gc-dev at openjdk.java.net Cc: Kim Barrett ; Gustavo Bueno Romero ; david.holmes at oracle.com; Erik Osterlund ; Doerr, Martin Subject: RFR(M): 8204524: Unnecessary memory barriers in G1ParScanThreadState::copy_to_survivor_space Dear all, Would you please review the following change? Bug: https://bugs.openjdk.java.net/browse/JDK-8204524 Webrev: http://cr.openjdk.java.net/~mhorie/8204524/webrev.00 G1ParScanThreadState::copy_to_survivor_space tries to move live objects to a different location. It uses a forwarding technique and allows multiple threads to compete for performing the copy step. A copy is performed after a thread succeeds in the CAS.
CAS-failed threads are not allowed to dereference the forwardee concurrently. Current code is already written so that CAS-failed threads do not dereference the forwardee. Also, this constraint is documented in a caller function mark_forwarded_object as "the object might be in the process of being copied by another worker so we cannot trust that its to-space image is well-formed". * There is no copy that must finish before the CAS. * Threads that failed in the CAS must not dereference the forwardee. Therefore, no fence is necessary before and after the CAS. I measured SPECjbb2015 with this change. As a result, critical-jOPS performance improved by 27% on POWER8. Best regards, -- Michihiro, IBM Research - Tokyo -------------- next part -------------- An HTML attachment was scrubbed... URL: From rkennke at redhat.com Tue Jun 12 09:58:40 2018 From: rkennke at redhat.com (Roman Kennke) Date: Tue, 12 Jun 2018 11:58:40 +0200 Subject: RFR: 8204180: Implementation: JEP 318: Epsilon GC (round 5) In-Reply-To: <2eab1a10-121f-884d-0155-5a422f2a3268@redhat.com> References: <2ee03550-2b1e-b873-5ef3-26312e2a760e@redhat.com> <7211f820-4999-2988-e06c-1067b1fb138f@oracle.com> <0e63a401-d28d-2bdb-d024-3b48f3fba0cb@oracle.com> <2eab1a10-121f-884d-0155-5a422f2a3268@redhat.com> Message-ID: > On 06/07/2018 02:42 PM, Per Liden wrote: >> On 06/07/2018 02:38 PM, Aleksey Shipilev wrote: >>> On 06/07/2018 12:13 PM, Per Liden wrote: >>>> On 06/06/2018 06:30 PM, Aleksey Shipilev wrote: >>>>> Hi, >>>>> >>>>> This is fifth (and hopefully final) round of code review for Epsilon GC changes. It includes the >>>>> fixes done as the result of fourth round of reviews, mostly in Serviceability. The build parts are >>>>> the same since last few reviews, so this is not posted to build-dev at . >>>>> >>>>> Webrev: >>>>> http://cr.openjdk.java.net/~shade/epsilon/webrev.09/ >>>> >>>> Still looks good to me!
>>>> > > I have two follow ups after jdk-submit Windows and Solaris build/test failures: > http://hg.openjdk.java.net/jdk/sandbox/rev/24339b23a56b > http://hg.openjdk.java.net/jdk/sandbox/rev/f21420b61fd2 > > Full webrev that passes jdk-submit: > http://cr.openjdk.java.net/~shade/epsilon/webrev.11/ > > Thanks, > -Aleksey > Looks good to me! Thanks, Roman -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: OpenPGP digital signature URL: From thomas.schatzl at oracle.com Tue Jun 12 10:12:45 2018 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Tue, 12 Jun 2018 12:12:45 +0200 Subject: RFR: JDK-8204685: Abstraction for TLAB dummy object In-Reply-To: <0da62d1d-02ea-e10d-3b4d-3d7c6aa48a80@redhat.com> References: <0da62d1d-02ea-e10d-3b4d-3d7c6aa48a80@redhat.com> Message-ID: <791288d7453c99da05451e23068278b04800b024.camel@oracle.com> Hi, On Mon, 2018-06-11 at 18:02 +0200, Roman Kennke wrote: > Similar to how object allocations should be owned by the GC, so does > TLAB dummy object 'allocation'. TLABs and PLABs are filling their > remaining blocks with dummy objects. GCs like Shenandoah might need a > slightly differet layout for this, and therefore we need an > abstraction. > > The proposed change adds a new virtual method > CollectedHeap::fill_with_dummy_object(), the default implementation > calls the existing CH::fill_with_object() from TLAB and PLAB like it > was > done before. > > Tested: hotspot-tier1, Shenandoah tests with appropriate impl > > http://cr.openjdk.java.net/~rkennke/JDK-8204685/webrev.00/ > > Ok? > looks good. 
Thomas From HORIE at jp.ibm.com Tue Jun 12 11:18:30 2018 From: HORIE at jp.ibm.com (Michihiro Horie) Date: Tue, 12 Jun 2018 20:18:30 +0900 Subject: RFR(M): 8204524: Unnecessary memory barriers in G1ParScanThreadState::copy_to_survivor_space In-Reply-To: <9ba49130e8394e41beec0a9841504dc0@sap.com> References: <9ba49130e8394e41beec0a9841504dc0@sap.com> Message-ID: Hi Martin, Thank you for your comments. Yes, this change is significant on PPC64 as I showed a big improvement in SPECjbb2015 (27% better critical-jOPS). Changing the handle_evacuation_failure_par is not necessary. I could not observe the performance bottleneck in handle_evacuation_failure_par from the profiles, New webrev: http://cr.openjdk.java.net/~mhorie/8204524/webrev.01/ Best regards, -- Michihiro, IBM Research - Tokyo From: "Doerr, Martin" To: Michihiro Horie , "hotspot-gc-dev at openjdk.java.net" Cc: Kim Barrett , "david.holmes at oracle.com" , Erik Osterlund Date: 2018/06/12 18:47 Subject: RE: RFR(M): 8204524: Unnecessary memory barriers in G1ParScanThreadState::copy_to_survivor_space Hi Michihiro, I have removed the mailing lists except hotspot-gc-dev which is the one for this review. Thank you for taking care of this PPC64 performance problem. I think we shouldn?t ship jdk11 on PPC64 without addressing it. I guess handle_evacuation_failure_par is not performance critical, so I wonder if it needs to be part of this change. I haven?t checked if it?s correct. Your description and change of copy_to_survivor_space fit to the comments and how the algorithm works. So it looks good to me. I couldn?t find any requirement for memory barriers regarding the CAS. But we should have a G1 expert double-check that we haven?t missed anything. Best regards, Martin From: Michihiro Horie [mailto:HORIE at jp.ibm.com] Sent: Donnerstag, 7. 
Juni 2018 08:01 To: hotspot-dev at openjdk.java.net; ppc-aix-port-dev at openjdk.java.net; hotspot-gc-dev at openjdk.java.net Cc: Kim Barrett ; Gustavo Bueno Romero ; david.holmes at oracle.com; Erik Osterlund ; Doerr, Martin Subject: RFR(M): 8204524: Unnecessary memory barriers in G1ParScanThreadState::copy_to_survivor_space Dear all, Would you please review the following change? Bug: https://bugs.openjdk.java.net/browse/JDK-8204524 Webrev: http://cr.openjdk.java.net/~mhorie/8204524/webrev.00 G1ParScanThreadState::copy_to_survivor_space tries to move live objects to a different location. It uses a forwarding technique and allows multiple threads to compete for performing the copy step. A copy is performed after a thread succeeds in the CAS. CAS-failed threads are not allowed to dereference the forwardee concurrently. The current code is already written so that CAS-failed threads do not dereference the forwardee. Also, this constraint is documented in the caller function mark_forwarded_object as "the object might be in the process of being copied by another worker so we cannot trust that its to-space image is well-formed". There is no copy that must finish before the CAS. Threads that failed in the CAS must not dereference the forwardee. Therefore, no fence is necessary before or after the CAS. I measured SPECjbb2015 with this change. As a result, critical-jOPS performance improved by 27% on POWER8. Best regards, -- Michihiro, IBM Research - Tokyo
From thomas.schatzl at oracle.com Tue Jun 12 11:26:39 2018 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Tue, 12 Jun 2018 13:26:39 +0200 Subject: RFR(M): 8204524: Unnecessary memory barriers in G1ParScanThreadState::copy_to_survivor_space In-Reply-To: References: <9ba49130e8394e41beec0a9841504dc0@sap.com> Message-ID: <1b11d4beb061c88a657393e7bffc2e26682191f1.camel@oracle.com> Hi, On Tue, 2018-06-12 at 20:18 +0900, Michihiro Horie wrote: > Hi Martin, > > Thank you for your comments. Yes, this change is significant on PPC64 > as I showed a big improvement in SPECjbb2015 (27% better critical- > jOPS). > > Changing the handle_evacuation_failure_par is not necessary. I could > not observe the performance bottleneck in > handle_evacuation_failure_par from the profiles, > > New webrev: http://cr.openjdk.java.net/~mhorie/8204524/webrev.01/ > of course handle_evacuation_failure_par() will not show up in the profiles if there is no evacuation failure. However during evacuation failure, this method will be stressed a lot. While from the outside it might not look like it is worth optimizing, lack of performance in evacuation failure has been a rather often voiced complaint (for the users where this occurs). This is only a side remark: I have not looked through the code for issues due to relaxing memory order recently, but I remember having done so when the similar change for parallel gc had been proposed last year. I remember that Michihiro's reasoning why this works for G1 the way it does is sound, but as mentioned please wait for a proper review.
Thanks, Thomas From HORIE at jp.ibm.com Tue Jun 12 12:17:12 2018 From: HORIE at jp.ibm.com (Michihiro Horie) Date: Tue, 12 Jun 2018 21:17:12 +0900 Subject: RFR(M): 8204524: Unnecessary memory barriers in G1ParScanThreadState::copy_to_survivor_space In-Reply-To: <1b11d4beb061c88a657393e7bffc2e26682191f1.camel@oracle.com> References: <9ba49130e8394e41beec0a9841504dc0@sap.com> <1b11d4beb061c88a657393e7bffc2e26682191f1.camel@oracle.com> Message-ID: Hi Thomas, Thank you for pointing out that some people do hit problems in handle_evacuation_failure. handle_evacuation_failure is invoked from copy_to_survivor_space and also preserves the following two invariants: There is no copy that must finish before the CAS. Threads that failed in the CAS must not dereference the forwardee. Best regards, -- Michihiro, IBM Research - Tokyo From: Thomas Schatzl To: Michihiro Horie , "Doerr, Martin" Cc: "david.holmes at oracle.com" , "hotspot-gc-dev at openjdk.java.net" Date: 2018/06/12 20:26 Subject: Re: RFR(M): 8204524: Unnecessary memory barriers in G1ParScanThreadState::copy_to_survivor_space Hi, On Tue, 2018-06-12 at 20:18 +0900, Michihiro Horie wrote: > Hi Martin, > > Thank you for your comments. Yes, this change is significant on PPC64 > as I showed a big improvement in SPECjbb2015 (27% better critical- > jOPS). > > Changing the handle_evacuation_failure_par is not necessary. I could > not observe the performance bottleneck in > handle_evacuation_failure_par from the profiles, > > New webrev: http://cr.openjdk.java.net/~mhorie/8204524/webrev.01/ > of course handle_evacuation_failure_par() will not show up in the profiles if there is no evacuation failure. However during evacuation failure, this method will be stressed a lot. While from the outside it might not look like it is worth optimizing, lack of performance in evacuation failure has been a rather often voiced complaint (for the users where this occurs).
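The two invariants Michihiro restates above are exactly what make a fence-free CAS safe here. A toy model of that claiming pattern, using std::atomic rather than HotSpot's real oop/markWord machinery (all names below are illustrative, not the actual G1 code):

```cpp
#include <atomic>
#include <cassert>

// Toy stand-in for a heap object with a forwarding slot.
struct Obj {
  std::atomic<Obj*> forwardee{nullptr};
  int payload{0};
};

// Each GC worker prepares its own destination 'copy' and races to
// install it. Only the CAS winner performs the copy; losers get the
// winner's pointer back and must not read through it (it may still be
// being filled in). Under those rules relaxed ordering suffices: no
// store has to be published before the CAS, and losers perform no
// dependent loads through the forwardee.
inline Obj* claim_and_copy(Obj* obj, Obj* copy) {
  Obj* expected = nullptr;
  if (obj->forwardee.compare_exchange_strong(expected, copy,
                                             std::memory_order_relaxed)) {
    copy->payload = obj->payload;  // winner copies after winning the CAS
    return copy;
  }
  return expected;  // loser: use the winner's forwardee, opaquely
}
```

A single-threaded walk-through already shows the contract: the second claimer gets the first claimer's copy back and its own speculative copy is abandoned untouched.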
This is only a side remark: I have not looked through the code for issues due to relaxing memory order recently, but I remember having done so when the similar change for parallel gc had been proposed last year. I remember that Michihiro's reasoning why this works for G1 the way it does is sound, but as mentioned please wait for a proper review. Thanks, Thomas From thomas.schatzl at oracle.com Tue Jun 12 13:16:18 2018 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Tue, 12 Jun 2018 15:16:18 +0200 Subject: RFR (S): 8204082: Add indication that this is the "Last Young GC before Mixed" to logs In-Reply-To: References: <360d423eb720e93a1921e66dae71ee2e794439ec.camel@oracle.com> Message-ID: <01cf559a1c1762d66d26c1ac9ad7e473bb8cb1ef.camel@oracle.com> Hi all, On Fri, 2018-06-08 at 10:52 +0200, Thomas Schatzl wrote: > Hi Stefan, > > [...] > > > > Another possible solution, not sure it's better, would be to add an > > extra > > tag to all Young GCs, something like: > > Pause Young (Initial Mark) ... > > Pause Young (Normal) ... > > Pause Young (Finished Mark) ... > > Pause Young (Mixed) ... > > > > But I guess it's a bit harsh to call the Mixed GCs Young =) > > The more I look at it the more I like this variant. I created a prototype of this last variant, webrevs available at: http://cr.openjdk.java.net/~tschatzl/8204084/webrev.0_to_1 (diff) http://cr.openjdk.java.net/~tschatzl/8204084/webrev.1 (full) Ran through hs-tier1-4. Note that these changes required a few more fixes in the tests that nicely cover all these messages.
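For reference, the scheme being prototyped above amounts to tagging every young pause with its role in the marking cycle. A sketch of how such log prefixes could be assembled; the enum and helper are invented for illustration and do not exist in the G1 sources:

```cpp
#include <cassert>
#include <string>

// Hypothetical classification of a young pause, mirroring the
// proposed log messages quoted in the mail above.
enum class YoungPauseKind { Normal, InitialMark, FinishedMark, Mixed };

// Build the "Pause Young (...)" prefix for a given pause kind.
std::string pause_message(YoungPauseKind kind) {
  const char* tag = "Normal";
  switch (kind) {
    case YoungPauseKind::Normal:       tag = "Normal";        break;
    case YoungPauseKind::InitialMark:  tag = "Initial Mark";  break;
    case YoungPauseKind::FinishedMark: tag = "Finished Mark"; break;
    case YoungPauseKind::Mixed:        tag = "Mixed";         break;
  }
  return std::string("Pause Young (") + tag + ")";
}
```

The upside of one uniform prefix is that log-parsing tools can match "Pause Young" once and dispatch on the parenthesized tag, which is presumably why the extra test fixes mentioned above were needed.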
Thanks, Thomas From shade at redhat.com Tue Jun 12 13:20:11 2018 From: shade at redhat.com (Aleksey Shipilev) Date: Tue, 12 Jun 2018 15:20:11 +0200 Subject: RFR: 8204180: Implementation: JEP 318: Epsilon GC (round 5) In-Reply-To: <5e4c692e-c638-1ec2-e9d3-faf613bcbf18@oracle.com> References: <2ee03550-2b1e-b873-5ef3-26312e2a760e@redhat.com> <7211f820-4999-2988-e06c-1067b1fb138f@oracle.com> <0e63a401-d28d-2bdb-d024-3b48f3fba0cb@oracle.com> <2eab1a10-121f-884d-0155-5a422f2a3268@redhat.com> <5e4c692e-c638-1ec2-e9d3-faf613bcbf18@oracle.com> Message-ID: <9ebadbff-2461-a77c-3649-de8426e21adf@redhat.com> On 06/12/2018 11:17 AM, Per Liden wrote: > On 06/12/2018 10:48 AM, Aleksey Shipilev wrote: >> On 06/07/2018 02:42 PM, Per Liden wrote: >>> On 06/07/2018 02:38 PM, Aleksey Shipilev wrote: >>>> On 06/07/2018 12:13 PM, Per Liden wrote: >>>>> On 06/06/2018 06:30 PM, Aleksey Shipilev wrote: >>>>>> Hi, >>>>>> >>>>>> This is fifth (and hopefully final) round of code review for Epsilon GC changes. It includes the >>>>>> fixes done as the result of fourth round of reviews, mostly in Serviceability. The build parts >>>>>> are >>>>>> the same since last few reviews, so this is not posted to build-dev at . >>>>>> >>>>>> Webrev: >>>>>> http://cr.openjdk.java.net/~shade/epsilon/webrev.09/ >>>>> >>>>> Still looks good to me! >>>>> >> >> I have two follow ups after jdk-submit Windows and Solaris build/test failures: >> http://hg.openjdk.java.net/jdk/sandbox/rev/24339b23a56b >> http://hg.openjdk.java.net/jdk/sandbox/rev/f21420b61fd2 > > Looks good to me! Integrated: https://bugs.openjdk.java.net/browse/JDK-8204180 http://hg.openjdk.java.net/jdk/jdk/rev/7b7c75d87f9b Let's see if it breaks anything not captured by our pre-integration testing. -Aleksey
From erik.helin at oracle.com Tue Jun 12 13:48:44 2018 From: erik.helin at oracle.com (Erik Helin) Date: Tue, 12 Jun 2018 15:48:44 +0200 Subject: FW: RFR (S): 8195115: G1 Old Gen MemoryPool CollectionUsage.used values don't reflect mixed GC results In-Reply-To: <815C8166-AE5B-40EC-B7B6-1DF9946C715D@amazon.com> References: <3A67BF1B-CE35-4759-B5B4-959C24020A45@amazon.com> <997a766f-9f4b-f192-ae87-03f4c8d2666d@oracle.com> <64793ff3-4805-79f9-4e72-6e469cd511dc@oracle.com> <15754d34-f3c7-6431-7abf-05214bd6b9d2@oracle.com> <93417506-8e12-6ad3-690a-439e36719652@oracle.com> <99C937B6-F67B-4F4A-ACC8-444B5F83932B@amazon.com> <3922628B-81DE-4904-AEFE-F4163EB1E655@amazon.com> <8444d766-8adb-013f-f75b-b5d4420635df@oracle.com> <5768E4BF-5556-455B-899B-94A67BF940A0@amazon.com> <3E15AAD8-5017-4CCB-B927-6B31FD0D7809@amazon.com> <22216AD4-9A78-4427-9AB8-629EC685C296@amazon.com> <9f972398-115f-06ad-1e0c-513abceb097a@oracle.com> <48344C2E-2423-4CB1-A5C3-09223CA1ED78@amazon.com> <815C8166-AE5B-40EC-B7B6-1DF9946C715D@amazon.com> Message-ID: (adding back serviceability-dev, please keep both hotspot-gc-dev and serviceability-dev) Hi Paul, before I start re-reviewing, did you test the new version of the patch via the jdk/submit repository [0]? Thanks, Erik [0]: http://hg.openjdk.java.net/jdk/submit On 06/09/2018 03:29 PM, Hohensee, Paul wrote: > Didn't seem to make it to hotspot-gc-dev... > > On 6/8/18, 10:14 AM, "serviceability-dev on behalf of Hohensee, Paul" wrote: > > Back after a long hiatus... > > Thanks, Erik, for your review. Here's a new webrev incorporating your recommendations. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8195115 > Webrev: http://cr.openjdk.java.net/~phh/8195115/webrev.02/ > > TIA for your re-review. Plus, may I have another reviewer look at it please?
> > Paul > > On 2/26/18, 8:47 AM, "Erik Helin" wrote: > > Hi Paul, > > a couple of comments on the patch: > > - memoryService.hpp: > + 150 bool countCollection, > + 151 bool allMemoryPoolsAffected = true); > > There is no need to use a default value for the parameter > allMemoryPoolsAffected here. Skipping the default value also allows > you to put allMemoryPoolsAffected to TraceMemoryManager::initialize > in the same relative position as for the constructor parameter (this > will make the code more uniform and easier to follow). > > - memoryManager.cpp > > Instead of adding a default parameter, maybe add a new method? > Something like GCMemoryManager::add_not_always_affected_pool() > (I couldn't come up with a shorter name at the moment). > > - TestMixedOldGenCollectionUsage.java > > The test is too strict about how and when collections should > occur. Tests written this way often become very brittle, they might > e.g. fail to finish a concurrent mark on time on a very slow, single > core, machine. It is better to either force collections by using the > WhiteBox API or make the test more lenient. > > Thanks, > Erik > > On 02/22/2018 09:54 PM, Hohensee, Paul wrote: > > Ping for a review please. > > > > Thanks, > > > > Paul > > > > On 2/16/18, 12:26 PM, "serviceability-dev on behalf of Hohensee, Paul" wrote: > > > > The CSR https://bugs.openjdk.java.net/browse/JDK-8196719 for the original fix has been approved, so I'm back to requesting a code review, please. > > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8195115 > > Webrev: http://cr.openjdk.java.net/~phh/8195115/webrev.hs.01/ > > > > Passed a submit repo run, passes its jtreg test, and a JDK8 version is in production use at Amazon. > > > > From the original RR: > > > > > The bug is that from the JMX point of view, G1's incremental collector > > > (misnamed as the "G1 Young Generation" collector) only affects G1's > > > survivor and eden spaces.
In fact, mixed collections run by this > > > collector also affect the G1 old generation. > > > > > > This proposed fix is to record, for each of a JMX garbage collector's > > > memory pools, whether that memory pool is affected by all collections > > > using that collector. And, for each collection, record whether or not > > > all the collector's memory pools are affected. After each collection, > > > for each memory pool, if either all the collector's memory pools were > > > affected or the memory pool is affected for all collections, record > > > CollectionUsage for that pool. > > > > > > For collectors other than G1 Young Generation, all pools are recorded as > > > affected by all collections and every collection is recorded as > > > affecting all the collector's memory pools. For the G1 Young Generation > > > collector, the G1 Old Gen pool is recorded as not being affected by all > > > collections, and non-mixed collections are recorded as not affecting all > > > memory pools. The result is that for non-mixed collections, > > > CollectionUsage is recorded after a collection only for the G1 Eden Space > > > and G1 Survivor Space pools, while for mixed collections CollectionUsage > > > is recorded for G1 Old Gen as well. > > > > > > Other than the effect of the fix on G1 Old Gen MemoryPool. > > > CollectionUsage, the only external behavior change is that > > > GarbageCollectorMXBean.getMemoryPoolNames will now return 3 pool names > > > rather than 2. > > > > > > With this fix, a collector's memory pools can be divided into two > > > disjoint subsets, one that participates in all collections and one that > > > doesn't. This is a bit more general than the minimum necessary to fix > > > G1, but not by much. Because I expect it to apply to other incremental > > > region-based collectors, I went with the more general solution.
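The division Paul describes — pools that participate in every collection versus pools that only sometimes do — reduces to a simple per-pool flag checked after each collection. A compact sketch under invented, simplified types (the real change touches GCMemoryManager and TraceMemoryManagerStats, not these structs):

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Simplified stand-in for a JMX memory pool tracked by a collector's
// memory manager; the names here are illustrative only.
struct PoolRecord {
  std::string name;
  bool affected_by_all_collections;  // false for a pool like "G1 Old Gen"
  std::size_t collection_usage;      // last recorded CollectionUsage
};

// After a collection, record CollectionUsage for a pool if either the
// collection affected all pools (a mixed collection in the G1 case)
// or the pool participates in every collection (eden/survivor).
void record_collection_usage(std::vector<PoolRecord>& pools,
                             bool all_pools_affected,
                             std::size_t used_now) {
  for (PoolRecord& p : pools) {
    if (all_pools_affected || p.affected_by_all_collections) {
      p.collection_usage = used_now;
    }
  }
}
```

With this shape, a young-only pause leaves the old-gen pool's CollectionUsage untouched, while a mixed pause updates all three pools, which is exactly the externally visible behavior change described above.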
I > > > minimized the amount of code I had to touch by using default parameters > > > for GCMemoryManager::add_pool and the TraceMemoryManagerStats constructors. > > > > > > > > > > From shade at redhat.com Tue Jun 12 13:58:03 2018 From: shade at redhat.com (Aleksey Shipilev) Date: Tue, 12 Jun 2018 15:58:03 +0200 Subject: RFR: JDK-8204685: Abstraction for TLAB dummy object In-Reply-To: <0da62d1d-02ea-e10d-3b4d-3d7c6aa48a80@redhat.com> References: <0da62d1d-02ea-e10d-3b4d-3d7c6aa48a80@redhat.com> Message-ID: <32a53350-bdc5-7ab1-6ef3-a6a2ef187f62@redhat.com> On 06/11/2018 06:02 PM, Roman Kennke wrote: > Similar to how object allocations should be owned by the GC, so should > TLAB dummy object 'allocation'. TLABs and PLABs are filling their > remaining blocks with dummy objects. GCs like Shenandoah might need a > slightly different layout for this, and therefore we need an abstraction. > > The proposed change adds a new virtual method > CollectedHeap::fill_with_dummy_object(), the default implementation > calls the existing CH::fill_with_object() from TLAB and PLAB like it was > done before. > > Tested: hotspot-tier1, Shenandoah tests with appropriate impl > > http://cr.openjdk.java.net/~rkennke/JDK-8204685/webrev.00/ I have doubts about setting "zap = false" unconditionally. As far as I can see, zapping is enabled for debug builds only anyway, so we better keep it "true". 82 size_t PLAB::retire_internal() { 83 size_t result = 0; 84 if (_top < _hard_end) { 85 Universe::heap()->fill_with_dummy_object(_top, _hard_end, false); <--- change to true? 86 result += invalidate(); 87 } 88 return result; 89 } 90 91 void PLAB::add_undo_waste(HeapWord* obj, size_t word_sz) { 92 Universe::heap()->fill_with_dummy_object(obj, obj + word_sz, false); <--- change to true? 93 _undo_wasted += word_sz; 94 } Thanks, -Aleksey
From kim.barrett at oracle.com Tue Jun 12 16:20:44 2018 From: kim.barrett at oracle.com (Kim Barrett) Date: Tue, 12 Jun 2018 12:20:44 -0400 Subject: RFR (XL): 8202845: Refactor reference processing for improved parallelism In-Reply-To: References: <61154A78-083C-456C-97F9-2D2986B67168@oracle.com> Message-ID: <17748553-CE38-4E0F-AFCE-43F0107FDDE1@oracle.com> > On Jun 6, 2018, at 1:13 PM, Thomas Schatzl wrote: > > I thought marks_oops_alive is an unnecessary quirk of the collectors > that found its way into the reference processing interface; its purpose > is to indicate whether the gc should perform stealing at the end of a > phase, as the complete closures drain the stacks completely by > themselves anyway. > > Stealing or not should be part of the complete closure (i.e. the gc, > and attempted by default by the gc) imho (it "completes" the phase), > like G1 does. > > I fixed it though. marks_oops_alive is another piece (really an entirely different mechanism) of the avoidance of stealing and the termination protocol when the keep_alive closure won't be marking any previously unmarked objects, so (maybe) won't be generating any potentially stealable work. It seems to me to be a better approach than sometimes not calling the complete_gc closure (the other mechanism, used by process_phase2), since it puts the decision on the specific collector that knows how its keep_alive and complete_gc closures interact. See further discussion below.
------------------------------------------------------------------------------ src/hotspot/share/gc/shared/referenceProcessor.cpp 315 if (log_is_enabled(Trace, gc, ref)) { 324 if (log_is_enabled(Trace, gc, ref)) { s/log_is_enabled/log_develop_is_enabled/ ------------------------------------------------------------------------------ src/hotspot/share/gc/shared/referenceProcessor.cpp 571 complete_gc.do_void(); I think this is unavoidable (at least for now), but I'm wondering if this is a performance issue for other than G1-young collections. In the phases where marks_oops_alive is false (like this one), for the other collectors (and G1 concurrent and full GC, I think), there won't be anything to steal. Starting the stealing process (and its associated termination protocol) is a waste of time in those cases. ParallelScavenge and ParallelCompaction both avoid creating the stealing tasks if marks_oops_alive is false. CMS avoids calling do_work_stealing when marks_oops_alive is false. ParNew doesn't seem to look at that bit though. G1 concurrent and full GCs also don't seem to look at that bit, and so I think will be negatively affected. Maybe this is a job for another RFE. Some comment revision might be called for though. It would be nice to not have to re-discover (no pun intended) all this again. ------------------------------------------------------------------------------ src/hotspot/share/gc/shared/referenceProcessor.cpp 457 // Close the reachable set 458 complete_gc->do_void(); Same issues here for PhantomReferences as earlier for Soft/Weak/FinalReferences. And I think the same solution; pay attention to the marks_oops_alive bit. ------------------------------------------------------------------------------ src/hotspot/share/gc/shared/referenceProcessor.cpp 595 : ProcessTask(ref_processor, true /* marks_oops_alive */, phase_times) { } For PhantomReferences, marks_oops_alive should be false, for the same reason as for soft/weak/final.
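Kim's point — that spinning up the stealing and termination machinery is wasted effort when the keep_alive closure cannot generate stealable work — reduces to a simple gate on the phase's properties. An illustrative sketch, not the actual ReferenceProcessor code (the queue type and function name are invented):

```cpp
#include <cassert>
#include <deque>

// Toy stand-in for a per-thread queue of follower work produced while
// keeping referents alive during a reference-processing phase.
using TaskQueue = std::deque<int>;

// If the phase's keep_alive closure cannot mark previously unmarked
// objects (marks_oops_alive == false) and nothing was queued, the
// drain-and-steal completion step has nothing to do and the
// termination protocol can be skipped entirely; otherwise the
// complete_gc closure must run to drain the queues.
bool should_run_complete_gc(bool marks_oops_alive, const TaskQueue& queue) {
  return marks_oops_alive || !queue.empty();
}
```

Letting the collector consult its own queues, as Thomas suggests later in the thread, is strictly more precise than the all-or-nothing marks_oops_alive flag, since a phase flagged as marking may still have produced no work on a given queue.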
------------------------------------------------------------------------------ From rkennke at redhat.com Tue Jun 12 16:40:33 2018 From: rkennke at redhat.com (Roman Kennke) Date: Tue, 12 Jun 2018 18:40:33 +0200 Subject: RFR: JDK-8204685: Abstraction for TLAB dummy object In-Reply-To: <32a53350-bdc5-7ab1-6ef3-a6a2ef187f62@redhat.com> References: <0da62d1d-02ea-e10d-3b4d-3d7c6aa48a80@redhat.com> <32a53350-bdc5-7ab1-6ef3-a6a2ef187f62@redhat.com> Message-ID: <06608a7f-b7e6-a567-0d63-0818b9e23f87@redhat.com> On 12.06.2018 15:58, Aleksey Shipilev wrote: > On 06/11/2018 06:02 PM, Roman Kennke wrote: >> Similar to how object allocations should be owned by the GC, so should >> TLAB dummy object 'allocation'. TLABs and PLABs are filling their >> remaining blocks with dummy objects. GCs like Shenandoah might need a >> slightly different layout for this, and therefore we need an abstraction. >> >> The proposed change adds a new virtual method >> CollectedHeap::fill_with_dummy_object(), the default implementation >> calls the existing CH::fill_with_object() from TLAB and PLAB like it was >> done before. >> >> Tested: hotspot-tier1, Shenandoah tests with appropriate impl >> >> http://cr.openjdk.java.net/~rkennke/JDK-8204685/webrev.00/ > > I have doubts about setting "zap = false" unconditionally. As far as I can see, zapping is enabled > for debug builds only anyway, so we better keep it "true". > > 82 size_t PLAB::retire_internal() { > 83 size_t result = 0; > 84 if (_top < _hard_end) { > 85 Universe::heap()->fill_with_dummy_object(_top, _hard_end, false); <--- change to true? > 86 result += invalidate(); > 87 } > 88 return result; > 89 } > 90 > 91 void PLAB::add_undo_waste(HeapWord* obj, size_t word_sz) { > 92 Universe::heap()->fill_with_dummy_object(obj, obj + word_sz, false); <--- change to true? > 93 _undo_wasted += word_sz; > 94 } > Right. http://cr.openjdk.java.net/~rkennke/JDK-8204685/webrev.01/ Good now? Thanks for reviewing!
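The shape of the abstraction agreed on above — a virtual hook whose default delegates to the existing filler, with zapping left on so debug-style poisoning of the filled gap is preserved — can be sketched with freestanding toy types (this is not the actual CollectedHeap interface; the marker-byte override is purely illustrative):

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

using HeapWord = unsigned char;  // toy stand-in for HotSpot's HeapWord

// Base "heap": the default dummy-object fill just delegates to the
// plain filler, mirroring what the webrev does via
// CollectedHeap::fill_with_object().
struct Heap {
  virtual ~Heap() {}
  virtual void fill_with_dummy_object(HeapWord* start, HeapWord* end,
                                      bool zap) {
    fill_with_object(start, end, zap);
  }
  void fill_with_object(HeapWord* start, HeapWord* end, bool zap) {
    if (zap && start < end) {
      // Poison the unused range so stale reads are easy to spot.
      std::memset(start, 0xBA, static_cast<std::size_t>(end - start));
    }
  }
};

// A collector needing a different filler layout overrides the hook;
// here it writes a distinct marker byte before the normal fill.
struct MarkerHeap : Heap {
  void fill_with_dummy_object(HeapWord* start, HeapWord* end,
                              bool zap) override {
    if (start < end) {
      *start = 0x5A;  // hypothetical custom "dummy object header"
      fill_with_object(start + 1, end, zap);
    }
  }
};
```

The virtual dispatch keeps TLAB and PLAB code collector-agnostic: they call fill_with_dummy_object() and each heap decides the layout.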
Roman From thomas.schatzl at oracle.com Tue Jun 12 18:19:35 2018 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Tue, 12 Jun 2018 20:19:35 +0200 Subject: RFR (XL): 8202845: Refactor reference processing for improved parallelism In-Reply-To: <17748553-CE38-4E0F-AFCE-43F0107FDDE1@oracle.com> References: <61154A78-083C-456C-97F9-2D2986B67168@oracle.com> <17748553-CE38-4E0F-AFCE-43F0107FDDE1@oracle.com> Message-ID: Hi Kim, thanks for your review. On Tue, 2018-06-12 at 12:20 -0400, Kim Barrett wrote: > > On Jun 6, 2018, at 1:13 PM, Thomas Schatzl wrote: > > > > I thought marks_oops_alive is an unnecessary quirk of the > > collectors that found its way into the reference processing > > interface; its purpose is to indicate whether the gc should perform > > stealing at the end of a phase, as the complete closures drain the > > stacks completely by themselves anyway. > > > > Stealing or not should be part of the complete closure (i.e. the > > gc, and attempted by default by the gc) imho (it "completes" the > > phase), like G1 does. > > > > I fixed it though. > > marks_oops_alive is another piece (really an entirely different > mechanism) of the avoidance of stealing and the termination protocol > when the keep_alive closure won't be marking any previously unmarked > objects, so (maybe) won't be generating any potentially stealable > work. It seems to me to be a better approach than sometimes not > calling the complete_gc closure (the other mechanism, used by > process_phase2), since it puts the decision on the specific collector > that knows how its keep_alive and complete_gc closures interact. See > further discussion below.
Since the collector (via the keep_alive closure) generates the potentially stealable work, it could already find out whether there is work even without the marks_oops_alive mechanism - in a way that is even more exact than the all-or-nothing marks_oops_alive flag. > > New webrevs: > > http://cr.openjdk.java.net/~tschatzl/8202845/webrev.0_to_1 (diff) > > http://cr.openjdk.java.net/~tschatzl/8202845/webrev.1 (full) > > Testing: > > hs-tier1-5,jdk-tier-1-3 (with +/-ParallelRefProcEnabled) > > > > Thanks for your quick review, > > Thomas > > Sorry this one took longer. > > ------------------------------------------------------------------- > ----------- > src/hotspot/share/gc/shared/referenceProcessor.cpp > 315 if (log_is_enabled(Trace, gc, ref)) { > 324 if (log_is_enabled(Trace, gc, ref)) { > > s/log_is_enabled/log_develop_is_enabled/ > > ------------------------------------------------------------------- > ----------- > src/hotspot/share/gc/shared/referenceProcessor.cpp > 571 complete_gc.do_void(); > > I think this is unavoidable (at least for now), but I'm wondering if > this is a performance issue for other than G1-young collections. > > In the phases where marks_oops_alive is false (like this one), for > the other collectors (and G1 concurrent and full GC, I think), there > won't be anything to steal. Why? Any imbalance in how fast the work will be processed will need stealing to keep the threads doing useful work. Particularly traversing and keeping alive followers is highly dependent on the object graph referenced by the referent. > Starting the stealing process (and its associated termination > protocol) is a waste of time in those cases. Starting the stealing process is basically free - at most what these do is looking through the work queues whether they are empty - this is a normal load of a few variables. Shenandoah's task terminator is very good at detecting these cases with very little or no work to be done and not waking up anyone (i.e.
terminating very quickly). I have been running almost all test runs for all patches sent in the last month or so with it with no problem, so it will likely be a change for the first few weeks of jdk12. > ParallelScavenge and ParallelCompaction both avoid creating the > stealing tasks if marks_oops_alive is false. This is actually the only case where the flag makes some sense, because parallel gc needs to set up the tasks beforehand. But we may want to look into making parallel gc use workgangs too, because the current implementation of the task queue in parallel is very problematic as it is a single point of contention every time a task ends, and it causes us to use lots of boilerplate code to support both workgang and what parallel gc uses. > CMS avoids calling do_work_stealing when marks_oops_alive is > false. ParNew doesn't seem to look at that bit though. > > G1 concurrent and full GCs also don't seem to look at that bit, and > so I think will be negatively affected. Maybe this is a job for > another RFE. Yes, we will probably need to spend more time in this area :) > Some comment revision might be called for though.
- fix the task terminator (we need that anyway) - let the collector decide whether it has work - fix the parallel gc work gang :) > > ------------------------------------------------------------------- > ----------- > src/hotspot/share/gc/shared/referenceProcessor.cpp > 595 : ProcessTask(ref_processor, true /* marks_oops_alive */, > phase_times) { } > > For PhantomReferences, marks_oops_alive should be false, for the same > reason as for soft/weak/final. > > ------------------------------------------------------------------- > ----------- > Fixed. http://cr.openjdk.java.net/~tschatzl/8202845/webrev.1_to_2 (diff) http://cr.openjdk.java.net/~tschatzl/8202845/webrev.2 (full) Thanks, Thomas From thomas.schatzl at oracle.com Tue Jun 12 18:50:35 2018 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Tue, 12 Jun 2018 20:50:35 +0200 Subject: RFR (S): 8204169: Humongous continues region remembered set states do not match the one from the corresponding humongous start region In-Reply-To: References: Message-ID: Hi all, *ping* Thanks for your reviews, Thomas On Mon, 2018-06-04 at 12:27 +0200, Thomas Schatzl wrote: > Hi all, > > can I have reviews for this small change that fixes up remembered > set states of humongous continues regions? In particular they are not > necessarily consistent with the humongous starts region at the > moment. > > This makes for some surprises when reading the logs. Otherwise there > is no impact: all decisions on how to treat the humongous object are > made on the humongous start region anyway. > > As for testing, I added code to the heap verification to check this > consistency requirement. This makes for a 100% failure rate in > gc/g1/TestEagerReclaimHumongousRegionsClearMarkBits if not fixed. 
> > CR: > https://bugs.openjdk.java.net/browse/JDK-8204169 > Webrev: > http://cr.openjdk.java.net/~tschatzl/8204169/webrev/ > Testing: > hs-tier1-3 > > Thanks, > Thomas From vinay.k.awasthi at intel.com Tue Jun 12 19:32:18 2018 From: vinay.k.awasthi at intel.com (Awasthi, Vinay K) Date: Tue, 12 Jun 2018 19:32:18 +0000 Subject: RFR(M): 8204908: Allocation of Old generation of Java Heap on alternate memory devices. Message-ID: Hello, I am requesting comments on POGC/G1GC supporting NVDIMM/DRAM heaps. When the user supplies AllocateOldGenAt=, the JVM divides the heap into two parts. The first part is on NVDIMM, where long-living objects go (OldGen), and the other part is on DRAM, where short-living objects reside (YoungGen). This is ONLY supported for the G1GC and POGC collectors on Linux and Windows. On Windows, OldGen resizing is NOT supported. On Linux, for G1GC, OldGen resizing is not supported; however, for POGC it is. A heap residing on DRAM is supported on Windows and Linux for POGC and G1GC. JEP to support allocating the Old generation on NV-DIMM - https://bugs.openjdk.java.net/browse/JDK-8202286 Patch is at http://cr.openjdk.java.net/~kkharbas/8202286/webrev.00/ SpecJbb2005/SpecJbb2015 etc. are passing with this patch, and one can test this by simply mounting a tmpfs of a certain size and passing that as an argument to AllocateOldGenAt. For G1GC, G1MaxNewSizePercent controls how much of the total heap will reside on DRAM; the rest of the heap then goes to NVDIMM. For POGC, MaxNewSize decides the DRAM-residing young gen size; the rest is mounted on NVDIMM. In all these implementations, the JVM ends up reserving more than the initial size determined by ergonomics (never more than Xmx). The JVM displays these messages and shows NVDIMM and DRAM reserved bytes. Thanks, Vinay
From kim.barrett at oracle.com Tue Jun 12 19:39:46 2018 From: kim.barrett at oracle.com (Kim Barrett) Date: Tue, 12 Jun 2018 15:39:46 -0400 Subject: RFR (XL): 8202845: Refactor reference processing for improved parallelism In-Reply-To: References: <61154A78-083C-456C-97F9-2D2986B67168@oracle.com> <17748553-CE38-4E0F-AFCE-43F0107FDDE1@oracle.com> Message-ID: <9A97AFF1-0A3B-471B-B995-C39627146AA6@oracle.com> > On Jun 12, 2018, at 2:19 PM, Thomas Schatzl wrote: > On Tue, 2018-06-12 at 12:20 -0400, Kim Barrett wrote: >> src/hotspot/share/gc/shared/referenceProcessor.cpp >> 571 complete_gc.do_void(); >> >> I think this is unavoidable (at least for now), but I'm wondering if >> this is a performance issue for other than G1-young collections. >> >> In the phases where marks_oops_alive is false (like this one), for >> the other collectors (and G1 concurrent and full GC, I think), there >> won't be anything to steal. > > Why? Any imbalance in how fast the work will be processed will need > stealing to keep the threads doing useful work. > Particularly traversing and keeping alive followers is highly dependent > on the object graph referenced by the referent. [Thomas and I discussed this offline. This is for anyone else who might be following along.] Because the threads aren't generating any work for others to steal, for collections other than G1-young/mixed. This is distinct from any allocation of references to be processed to threads. This is about dealing with referents to which the keep_alive closure was applied. In these phases, for some collections the keep_alive is just an expensive nop, finding that the referent is already marked live (which we knew because the is_alive closure already told us that). For others, it needs to forward the referent field to the already copied referent; no additional work (e.g. scanning the referent) is needed.
The only current keep_alive closure that creates any work for the complete_gc closure in these cases is G1CopyingKeepAliveClosure. The old process_phase2 assumed the complete_gc closure would never have any work to do, so didn't bother calling it. >> Starting the stealing process (and its associated termination >> protocol) is a waste of time in those cases. > > Starting the stealing process is basically free - at most what these do > is looking through the work queues to check whether they are empty - this is a > normal load of a few variables. > > Shenandoah's task terminator is very good at detecting these cases with > very little or no work to be done and not waking up anyone (i.e. > terminating very quickly). > > I have been running almost all test runs for all patches sent in the > last month or so with it with no problem, so it will likely be a change > for the first few weeks of jdk12. Oh, that sounds great! Looking forward to it. >> ParallelScavenge and ParallelCompaction both avoid creating the >> stealing tasks if marks_oops_alive is false. > > This is actually the only case where the flag makes some sense because > parallel gc needs to set the tasks beforehand. > > But we may want to look into making parallel gc use workgangs too, > because the current implementation of the task queue in parallel is > very problematic as it is a single point of contention every time a > task ends, and it causes us to use lots of boilerplate code to support > both workgang and what parallel gc uses. > >> CMS avoids calling do_work_stealing when marks_oops_alive is >> false. ParNew doesn't seem to look at that bit though. >> >> G1 concurrent and full GCs also don't seem to look at that bit, and >> so I think will be negatively affected. Maybe this is a job for >> another RFE. > > Yes, we will probably need to spend more time in this area :) Agreed. > >> Some comment revision might be called for though.
It would be nice >> to not have to re-discover (no pun intended) all this again. >> >> ------------------------------------------------------------------- >> ----------- >> src/hotspot/share/gc/shared/referenceProcessor.cpp >> 457 // Close the reachable set >> 458 complete_gc->do_void(); >> >> Same issues here for PhantomReferences as earlier for >> Soft/Weak/FinalReferences. And I think the same solution; pay >> attention to the marks_oops_alive bit. > > - fix the task terminator (we need that anyway) > - let the collector decide whether it has work > - fix the parallel gc work gang > > :) Agreed. > http://cr.openjdk.java.net/~tschatzl/8202845/webrev.1_to_2 (diff) > http://cr.openjdk.java.net/~tschatzl/8202845/webrev.2 (full) Looks good. From kim.barrett at oracle.com Tue Jun 12 21:18:25 2018 From: kim.barrett at oracle.com (Kim Barrett) Date: Tue, 12 Jun 2018 17:18:25 -0400 Subject: RFR (XL): 8202845: Refactor reference processing for improved parallelism In-Reply-To: <9A97AFF1-0A3B-471B-B995-C39627146AA6@oracle.com> References: <61154A78-083C-456C-97F9-2D2986B67168@oracle.com> <17748553-CE38-4E0F-AFCE-43F0107FDDE1@oracle.com> <9A97AFF1-0A3B-471B-B995-C39627146AA6@oracle.com> Message-ID: <66141559-E8EF-4654-9BBE-0D62E7FE4E02@oracle.com> > On Jun 12, 2018, at 3:39 PM, Kim Barrett wrote: > >> On Jun 12, 2018, at 2:19 PM, Thomas Schatzl wrote:http://cr.openjdk.java.net/~tschatzl/8202845/webrev.1_to_2 (diff) >> http://cr.openjdk.java.net/~tschatzl/8202845/webrev.2 (full) > > Looks good. One more thing I just noticed. src/hotspot/share/gc/shared/referenceProcessor.cpp 865 log_reflist("Phase2 Soft after", _discoveredSoftRefs, _max_num_queues); 866 log_reflist("Phase2 Weak after", _discoveredWeakRefs, _max_num_queues); at the end of process_soft_weak_final_refs. At this stage, I think there must be no soft or weak references. Better to assert that than log empty sets. Similarly at the end of process_phantom_refs. And the same is true for process_final_keep_alive. 
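[Editor's note: Thomas's claim in this thread, that starting the stealing process is basically free when no thread generates work, can be sketched in miniature. The following is a hypothetical single-threaded stand-in, not the HotSpot TaskQueue/TaskTerminator code: when every peer queue is empty, a steal attempt degenerates to a scan of a few plain loads followed by immediate termination.]

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Made-up task type and per-worker queues, purely for illustration.
using Task = int;

struct WorkQueues {
  std::vector<std::deque<Task>> queues;  // one queue per worker thread

  explicit WorkQueues(std::size_t n) : queues(n) {}

  // Try to steal a task for worker `self` from any peer queue.
  // Returns false when all peers are empty -- the cheap case discussed
  // above, where "starting the stealing process" costs only a scan of
  // empty queues before the thread offers termination.
  bool steal(std::size_t self, Task& out) {
    for (std::size_t i = 0; i < queues.size(); i++) {
      if (i == self || queues[i].empty()) continue;  // a plain load per queue
      out = queues[i].front();
      queues[i].pop_front();
      return true;
    }
    return false;  // nothing to steal -> terminate quickly
  }
};
```

[In the real code the queues are lock-free and termination is a multi-thread protocol, but the empty-queue fast path is the part being weighed in the exchange above.]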
From kim.barrett at oracle.com Wed Jun 13 01:19:16 2018 From: kim.barrett at oracle.com (Kim Barrett) Date: Tue, 12 Jun 2018 21:19:16 -0400 Subject: RFR (S): 8204169: Humongous continues region remembered set states do not match the one from the corresponding humongous start region In-Reply-To: References: Message-ID: <324AC60D-1C6D-492C-8BC7-DA118E455F8E@oracle.com> > On Jun 4, 2018, at 6:27 AM, Thomas Schatzl wrote: > > Hi all, > > can I have reviews for this small change that fixes up remembered set > states of humongous continues regions? In particular they are not > necessarily consistent with the humongous starts region at the moment. > > This makes for some surprises when reading the logs. Otherwise there is > no impact: all decisions on how to treat the humongous object are made > on the humongous start region anyway. > > As for testing, I added code to the heap verification to check this > consistency requirement. This makes for a 100% failure rate in > gc/g1/TestEagerReclaimHumongousRegionsClearMarkBits if not fixed. > > CR: > https://bugs.openjdk.java.net/browse/JDK-8204169 > Webrev: > http://cr.openjdk.java.net/~tschatzl/8204169/webrev/ > Testing: > hs-tier1-3 > > Thanks, > Thomas Looks good. From thomas.stuefe at gmail.com Wed Jun 13 06:11:17 2018 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Wed, 13 Jun 2018 08:11:17 +0200 Subject: RFR(M): 8204908: Allocation of Old generation of Java Heap on alternate memory devices. In-Reply-To: References: Message-ID: Hi, (adding hs-runtime) had a cursory glance at the proposed changes. I am taken aback by the amount of complexity added to ReservedSpace for what I understand is a narrow experimental feature only benefiting 1-2 Operating Systems and - I guess, the JEP is not really clear there - only x86 with certain hardware configurations? e.g. 
http://cr.openjdk.java.net/~kkharbas/8202286/webrev.00/share/memory/virtualspace.hpp.udiff.html The source having zero comments does not really help either. "The motivation behind this JEP is to provide an experimental feature to facilitate exploration of different use cases for these non-DRAM memories." Ok, but does this really have to be upstream, in this form, to experiment with it? I am not objecting against this feature in general. But I am unhappy about the monster ReservedSpace is turning into. IMHO before increasing complexity even further this should be revamped, otherwise it becomes too unwieldy to do anything with it. It already somehow takes care of a number of huge pages ("_special" and "_alignment"), implicit-null-checks-when-space-happens-to-be-used-as-part-of-java-heap ("_noaccess_prefix"), allocation at alternate file locations ("_fd_for_heap", introduced by you with 8190308). You also added a new variant to os::commit_memory() which directly goes down to mmap(). So far, for the most part, the os::{reserve|commit}_memory APIs have been agnostic to the underlying implementation. You pretty much tie it to mmap() now. This adds implicit restrictions to the API we did not have before (e.g. will not work if platform uses SysV shm APIs to implement these APIs). Best Regards, Thomas On Tue, Jun 12, 2018 at 9:32 PM, Awasthi, Vinay K wrote: > Hello, > > I am requesting comments on POGC/G1GC supporting NVDIMM/DRAM heaps. When > user supplies AllocateOldGenAt=, JVM divides heap into 2 > parts. First part is on NVDIMM where long living objects go (OldGen) and > other part is on DRAM where short living objects reside(YoungGen). This is > ONLY supported for G1GC and POGC collectors on Linux and Windows. > > On Windows, OldGen resizing is NOT supported. On Linux, for G1GC, OldGen > resizing is not supported however for POGC it is. Heap residing on DRAM is > supported for Windows and Linux for POGC and G1GC. 
> > JEP to support allocating Old generation on NV-DIMM - > https://bugs.openjdk.java.net/browse/JDK-8202286 > > Patch is at http://cr.openjdk.java.net/~kkharbas/8202286/webrev.00/ > > SpecJbb2005/SpecJbb2015 etc. are passing with this patch and one can test > this by simply mounting tmpfs of certain size and pass that as an argument > to AllocateOldGenAt. > > For G1GC, G1MaxNewSizePercent controls how much of total heap will reside on > DRAM. Rest of the heap then goes to NVDIMM. > > For POGC, MaxNewSize decides the DRAM residing young gen size. Rest is > mounted on NVDIMM. > > In all these implementations, JVM ends up reserving more than initial size > determined by ergonomics (never more than Xmx). JVM displays these messages > and shows NVDIMM and DRAM reserved bytes. > > Thanks, > > Vinay > > > > > > From shade at redhat.com Wed Jun 13 08:09:04 2018 From: shade at redhat.com (Aleksey Shipilev) Date: Wed, 13 Jun 2018 10:09:04 +0200 Subject: RFR: JDK-8204685: Abstraction for TLAB dummy object In-Reply-To: <06608a7f-b7e6-a567-0d63-0818b9e23f87@redhat.com> References: <0da62d1d-02ea-e10d-3b4d-3d7c6aa48a80@redhat.com> <32a53350-bdc5-7ab1-6ef3-a6a2ef187f62@redhat.com> <06608a7f-b7e6-a567-0d63-0818b9e23f87@redhat.com> Message-ID: <125758f1-1f78-6610-84c7-7c6899d0810f@redhat.com> On 06/12/2018 06:40 PM, Roman Kennke wrote: > Right. > http://cr.openjdk.java.net/~rkennke/JDK-8204685/webrev.01/ It looks okay. Looking closer to whole thing again. What is enforcing the use of this new method? I would expect only fill_with_dummy_object be available to non-CollectedHeap callers, which means other methods e.g. fill_with_object to become protected/private. Looking at how proliferated the use of those methods are within the GCs, it seems some "friend"-ing is in order. Can be done in a separate RFE. -Aleksey -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: OpenPGP digital signature URL: From rkennke at redhat.com Wed Jun 13 08:14:46 2018 From: rkennke at redhat.com (Roman Kennke) Date: Wed, 13 Jun 2018 10:14:46 +0200 Subject: RFR: JDK-8204685: Abstraction for TLAB dummy object In-Reply-To: <125758f1-1f78-6610-84c7-7c6899d0810f@redhat.com> References: <0da62d1d-02ea-e10d-3b4d-3d7c6aa48a80@redhat.com> <32a53350-bdc5-7ab1-6ef3-a6a2ef187f62@redhat.com> <06608a7f-b7e6-a567-0d63-0818b9e23f87@redhat.com> <125758f1-1f78-6610-84c7-7c6899d0810f@redhat.com> Message-ID: Am 13.06.2018 um 10:09 schrieb Aleksey Shipilev: > On 06/12/2018 06:40 PM, Roman Kennke wrote: >> Right. >> http://cr.openjdk.java.net/~rkennke/JDK-8204685/webrev.01/ > > It looks okay. > > Looking closer to whole thing again. What is enforcing the use of this new method? I would expect > only fill_with_dummy_object be available to non-CollectedHeap callers, which means other methods > e.g. fill_with_object to become protected/private. Looking at how proliferated the use of those > methods are within the GCs, it seems some "friend"-ing is in order. Can be done in a separate RFE. > > -Aleksey > Right. I filed: https://bugs.openjdk.java.net/browse/JDK-8204940 and will handle it there. Thanks for reviewing! Roman -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: OpenPGP digital signature URL: From thomas.schatzl at oracle.com Wed Jun 13 08:35:51 2018 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Wed, 13 Jun 2018 10:35:51 +0200 Subject: RFR (S): 8204169: Humongous continues region remembered set states do not match the one from the corresponding humongous start region In-Reply-To: <324AC60D-1C6D-492C-8BC7-DA118E455F8E@oracle.com> References: <324AC60D-1C6D-492C-8BC7-DA118E455F8E@oracle.com> Message-ID: <1b2779249b2f85cf6a7b56e8d004b7fe8c6dda7c.camel@oracle.com> Hi, On Tue, 2018-06-12 at 21:19 -0400, Kim Barrett wrote: > > On Jun 4, 2018, at 6:27 AM, Thomas Schatzl > om> wrote: > > > > Hi all, > > > > can I have reviews for this small change that fixes up remembered > > set states of humongous continues regions? [...] > > > > CR: > > https://bugs.openjdk.java.net/browse/JDK-8204169 > > Webrev: > > http://cr.openjdk.java.net/~tschatzl/8204169/webrev/ > > Testing: > > hs-tier1-3 > > > > Thanks, > > Thomas > > Looks good. > thanks for your review. Thomas From thomas.schatzl at oracle.com Wed Jun 13 08:55:08 2018 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Wed, 13 Jun 2018 10:55:08 +0200 Subject: RFR: JDK-8204685: Abstraction for TLAB dummy object In-Reply-To: <06608a7f-b7e6-a567-0d63-0818b9e23f87@redhat.com> References: <0da62d1d-02ea-e10d-3b4d-3d7c6aa48a80@redhat.com> <32a53350-bdc5-7ab1-6ef3-a6a2ef187f62@redhat.com> <06608a7f-b7e6-a567-0d63-0818b9e23f87@redhat.com> Message-ID: <908b8fdd1054fc2abe379f57e2f1762ac90ef9e1.camel@oracle.com> Hi, On Tue, 2018-06-12 at 18:40 +0200, Roman Kennke wrote: > Am 12.06.2018 um 15:58 schrieb Aleksey Shipilev: > > On 06/11/2018 06:02 PM, Roman Kennke wrote: > > > Similar to how object allocations should be owned by the GC, so > > > does TLAB dummy object 'allocation'. TLABs and PLABs are filling > > > their remaining blocks with dummy objects. 
GCs like Shenandoah > > > might need a slightly different layout for this, and therefore we > > > need an abstraction. > > > > > > The proposed change adds a new virtual method > > > CollectedHeap::fill_with_dummy_object(), the default > > > implementation calls the existing CH::fill_with_object() from > > > TLAB and PLAB like it was done before. > > > > > > Tested: hotspot-tier1, Shenandoah tests with appropriate impl > > > > > > http://cr.openjdk.java.net/~rkennke/JDK-8204685/webrev.00/ > > > > I have doubts about setting "zap = false" unconditionally. As far > > as I can see, zapping is enabled > > for debug builds only anyway, so we better keep it "true". > > [...] > > Right. > http://cr.openjdk.java.net/~rkennke/JDK-8204685/webrev.01/ > > Good now? > good. Thomas From thomas.schatzl at oracle.com Wed Jun 13 09:18:20 2018 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Wed, 13 Jun 2018 11:18:20 +0200 Subject: RFR (XL): 8202845: Refactor reference processing for improved parallelism In-Reply-To: <66141559-E8EF-4654-9BBE-0D62E7FE4E02@oracle.com> References: <61154A78-083C-456C-97F9-2D2986B67168@oracle.com> <17748553-CE38-4E0F-AFCE-43F0107FDDE1@oracle.com> <9A97AFF1-0A3B-471B-B995-C39627146AA6@oracle.com> <66141559-E8EF-4654-9BBE-0D62E7FE4E02@oracle.com> Message-ID: <7a8fc4917623be1649b07e1147c8df6463c14e8a.camel@oracle.com> Hi Kim, On Tue, 2018-06-12 at 17:18 -0400, Kim Barrett wrote: > > On Jun 12, 2018, at 3:39 PM, Kim Barrett > > wrote: > > > > > On Jun 12, 2018, at 2:19 PM, Thomas Schatzl > > > wrote:http://cr.openjdk.java.net/~tsc > > > hatzl/8202845/webrev.1_to_2 (diff) > > > http://cr.openjdk.java.net/~tschatzl/8202845/webrev.2 (full) > > > > Looks good. > > One more thing I just noticed. > > src/hotspot/share/gc/shared/referenceProcessor.cpp > 865 log_reflist("Phase2 Soft after", _discoveredSoftRefs, > _max_num_queues); > 866 log_reflist("Phase2 Weak after", _discoveredWeakRefs, > _max_num_queues); > > at the end of process_soft_weak_final_refs. 
At this stage, I think > there must be no soft or weak references. Better to assert that than > log empty sets. > > Similarly at the end of process_phantom_refs. > > And the same is true for process_final_keep_alive. > fixed in http://cr.openjdk.java.net/~tschatzl/8202845/webrev.2_to_3 (diff) and http://cr.openjdk.java.net/~tschatzl/8202845/webrev.3 (full) Currently running hs-tier1-3, but I do not expect issues. Thanks! Thomas From robbin.ehn at oracle.com Wed Jun 13 10:44:58 2018 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Wed, 13 Jun 2018 12:44:58 +0200 Subject: RFR(M): 8204613: StringTable: Calculates wrong number of uncleaned items. In-Reply-To: <8f27ac9e-8201-4cda-4bca-152c7ab992d3@oracle.com> References: <8f27ac9e-8201-4cda-4bca-152c7ab992d3@oracle.com> Message-ID: <3a927b10-6b01-f1c2-184c-5a9e3e2aefdb@oracle.com> Hi, Stefan pointed out that there are some useless method calls, since we don't remove any strings in some of the walks. Serial, Parallel and CMS only remove strings in the serial call to unlink. G1 only removes strings in StringAndSymbolCleaningTask. I reverted those unneeded changes, leaving the patch with only G1 and ZGC changes (+ stringtable): http://cr.openjdk.java.net/~rehn/8204613/v2/webrev/ Thanks Stefan! /Robbin On 06/11/2018 06:09 PM, Robbin Ehn wrote: > Hi all, please review. > > The StringTable lazily evicts dead strings; until a dead string is evicted it will > be counted as a dead string. If it is not evicted before the next GC cycle it is > counted again, skewing the count of uncleaned strings. > Also, ZGC walks the strings without using the stringtable GC API, but it needs to > be able to feed back the number of dead strings to get the cleaning functionality. > There is a big probability that ZGC makes it in before this change-set, so I > included the ZGC changes. > > There was a compile issue on slowdebug on Windows for create_archived_string(), > I added NOT_CDS_JAVA_HEAP_RETURN_(NULL) for it.
> > Change-set: http://cr.openjdk.java.net/~rehn/8204613/webrev/index.html > Bug: https://bugs.openjdk.java.net/browse/JDK-8204613 > > T1-3 with ZGC testing on, no related issues and manual JMH testing. > > Thanks, Robbin From per.liden at oracle.com Wed Jun 13 12:20:07 2018 From: per.liden at oracle.com (Per Liden) Date: Wed, 13 Jun 2018 14:20:07 +0200 Subject: RFR(M): 8204613: StringTable: Calculates wrong number of uncleaned items. In-Reply-To: <3a927b10-6b01-f1c2-184c-5a9e3e2aefdb@oracle.com> References: <8f27ac9e-8201-4cda-4bca-152c7ab992d3@oracle.com> <3a927b10-6b01-f1c2-184c-5a9e3e2aefdb@oracle.com> Message-ID: Looks good Robbin! /Per On 06/13/2018 12:44 PM, Robbin Ehn wrote: > Hi, Stefan pointed out that there are some useless methods calls since > we don't remove any strings in some of the walks. > > Serial, Parallel and CMS only removes strings in serial call to unlink. > G1 only removes strings in StringAndSymbolCleaningTask. > > I reverted does unneeded changes, leaving the patch with only G1 and ZGC > changes (+ stringtable): > http://cr.openjdk.java.net/~rehn/8204613/v2/webrev/ > > Thanks Stefan! > > /Robbin > > On 06/11/2018 06:09 PM, Robbin Ehn wrote: >> Hi all, please review. >> >> The StringTable lazy evicts dead string, until a dead string is >> evicted it will be counted as a dead string. If it is not evicted >> before next GC cycle it is counted again, making the count of >> uncleaned strings skew. >> Also ZGC walks the strings without using the stringtable GC API, but >> it needs to be-able to feedback the number of dead strings to get the >> cleaning functionality. >> There is a big probability that ZGC makes it in before this >> change-set, so I included ZGC changes. >> >> There was a compile issue on slowdebug on windows for >> create_archived_string(), I added NOT_CDS_JAVA_HEAP_RETURN_(NULL) for it. 
>> >> Change-set: http://cr.openjdk.java.net/~rehn/8204613/webrev/index.html >> Bug: https://bugs.openjdk.java.net/browse/JDK-8204613 >> >> T1-3 with ZGC testing on, no related issues and manual JMH testing. >> >> Thanks, Robbin From robbin.ehn at oracle.com Wed Jun 13 12:31:39 2018 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Wed, 13 Jun 2018 14:31:39 +0200 Subject: RFR(M): 8204613: StringTable: Calculates wrong number of uncleaned items. In-Reply-To: References: <8f27ac9e-8201-4cda-4bca-152c7ab992d3@oracle.com> <3a927b10-6b01-f1c2-184c-5a9e3e2aefdb@oracle.com> Message-ID: <0d066691-9ee6-8d08-3f5f-6a5507cf17a1@oracle.com> Thanks Per! /Robbin On 06/13/2018 02:20 PM, Per Liden wrote: > Looks good Robbin! > > /Per > > On 06/13/2018 12:44 PM, Robbin Ehn wrote: >> Hi, Stefan pointed out that there are some useless methods calls since we >> don't remove any strings in some of the walks. >> >> Serial, Parallel and CMS only removes strings in serial call to unlink. >> G1 only removes strings in StringAndSymbolCleaningTask. >> >> I reverted does unneeded changes, leaving the patch with only G1 and ZGC >> changes (+ stringtable): >> http://cr.openjdk.java.net/~rehn/8204613/v2/webrev/ >> >> Thanks Stefan! >> >> /Robbin >> >> On 06/11/2018 06:09 PM, Robbin Ehn wrote: >>> Hi all, please review. >>> >>> The StringTable lazy evicts dead string, until a dead string is evicted it >>> will be counted as a dead string. If it is not evicted before next GC cycle >>> it is counted again, making the count of uncleaned strings skew. >>> Also ZGC walks the strings without using the stringtable GC API, but it needs >>> to be-able to feedback the number of dead strings to get the cleaning >>> functionality. >>> There is a big probability that ZGC makes it in before this change-set, so I >>> included ZGC changes. >>> >>> There was a compile issue on slowdebug on windows for >>> create_archived_string(), I added NOT_CDS_JAVA_HEAP_RETURN_(NULL) for it. 
>>> >>> Change-set: http://cr.openjdk.java.net/~rehn/8204613/webrev/index.html >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8204613 >>> >>> T1-3 with ZGC testing on, no related issues and manual JMH testing. >>> >>> Thanks, Robbin From Derek.White at cavium.com Wed Jun 13 16:16:43 2018 From: Derek.White at cavium.com (White, Derek) Date: Wed, 13 Jun 2018 16:16:43 +0000 Subject: RFR(M): 8204524: Unnecessary memory barriers in G1ParScanThreadState::copy_to_survivor_space In-Reply-To: References: <9ba49130e8394e41beec0a9841504dc0@sap.com> <1b11d4beb061c88a657393e7bffc2e26682191f1.camel@oracle.com> Message-ID: Hi Michihiro, We are testing this patch on aarch64, and are seeing an (inexplicable) performance regression with G1GC on SPECjbb. The major difference in the generated code is removing a barrier as expected. We are investigating further, and will retest to rule out user error, but wanted throw out a caution flag. * Derek From: hotspot-gc-dev [mailto:hotspot-gc-dev-bounces at openjdk.java.net] On Behalf Of Michihiro Horie Sent: Tuesday, June 12, 2018 8:17 AM To: Thomas Schatzl Cc: david.holmes at oracle.com; hotspot-gc-dev at openjdk.java.net Subject: Re: RFR(M): 8204524: Unnecessary memory barriers in G1ParScanThreadState::copy_to_survivor_space Hi Thomas, Thank you for telling that some people get a problem in handle_evacuation_failure. The handle_evacuation_failure is invoked from copy_to_survivor_space and also keeps the following two: * There is no copy that must finish before the CAS. * Threads that failed in the CAS must not dereference the forwardee. 
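[Editor's note: the two invariants just listed are what make the barrier relaxation safe: the CAS that installs the forwarding pointer uses release semantics so the finished copy is published, and a losing thread only uses the returned pointer's value, never dereferencing the forwardee at that point. A rough sketch with made-up types (a std::atomic model, not the actual markWord-based HotSpot code):]

```cpp
#include <atomic>
#include <cassert>

// Hypothetical object layout for illustration only.
struct Obj {
  std::atomic<Obj*> forwardee{nullptr};
  int payload = 0;
};

// Attempt to claim `from` by installing `copy` as its forwardee.
// Success uses release order so the completed copy is visible before
// the forwarding pointer; failure can be relaxed because the loser
// must only use the pointer value (e.g. to update a reference field),
// not read through it here.
Obj* install_forwardee(Obj* from, Obj* copy) {
  Obj* expected = nullptr;
  if (from->forwardee.compare_exchange_strong(
          expected, copy,
          std::memory_order_release,      // success ordering
          std::memory_order_relaxed)) {   // failure ordering
    return copy;      // we won the race; our copy is the forwardee
  }
  return expected;    // someone else won; use their pointer value only
}
```

[A losing thread that later needs to read the forwardee's fields would need an acquire load at that point; the relaxation above only covers the CAS itself.]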
Best regards, -- Michihiro, IBM Research - Tokyo From: Thomas Schatzl To: Michihiro Horie, "Doerr, Martin" Cc: "david.holmes at oracle.com", "hotspot-gc-dev at openjdk.java.net" Date: 2018/06/12 20:26 Subject: Re: RFR(M): 8204524: Unnecessary memory barriers in G1ParScanThreadState::copy_to_survivor_space Hi, On Tue, 2018-06-12 at 20:18 +0900, Michihiro Horie wrote: > Hi Martin, > > Thank you for your comments. Yes, this change is significant on PPC64 > as I showed a big improvement in SPECjbb2015 (27% better critical-jOPS). > > Changing the handle_evacuation_failure_par is not necessary. I could > not observe the performance bottleneck in > handle_evacuation_failure_par from the profiles, > > New webrev: http://cr.openjdk.java.net/~mhorie/8204524/webrev.01/ > of course handle_evacuation_failure_par() will not show up in the profiles if there is no evacuation failure. However during evacuation failure, this method will be stressed a lot. While from the outside it might not look like it is worth optimizing, lack of performance in evacuation failure has been a rather often voiced complaint (for the users where this occurs). This is only a side remark: I have not looked through the code for issues due to relaxing memory order recently, but I remember having done so when the similar change for parallel gc had been proposed last year. I remember that Michihiro's reasoning why this works for G1 the way it does is sound, but as mentioned please wait for a proper review. Thanks, Thomas -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed...
Name: image001.gif Type: image/gif Size: 105 bytes Desc: image001.gif URL: From vinay.k.awasthi at intel.com Wed Jun 13 18:17:46 2018 From: vinay.k.awasthi at intel.com (Awasthi, Vinay K) Date: Wed, 13 Jun 2018 18:17:46 +0000 Subject: RFR(M): 8204908: Allocation of Old generation of Java Heap on alternate memory devices. In-Reply-To: References: Message-ID: First Thanks Thomas for your comments... 1. All changes are kept within AllocateOldGenAt variable being set enclosure... We do not want any changes to happen if this is NOT set. On virtualspace.cpp All changes are related to the fact that now Xmx gives overall memory to map, out of that part of the memory is mapped in for DRAM rest is reserved and later committed during expand--> calling make_region_available etc.. for G1GC. We no longer have one shot reservation/commit for entire space. We are only setting variables here (if _fd_for_nvdimm is set which is coming from AllocateOldGenAt...)... These setting are later used in g1PageBasedVirtualMemory (see os::nvdimm_heapbase() etc..). All changes again are kept out from leaking (except class variables in reservespace... On ReserveSpace: 2. Changes in ReservedSpace are related to following: 1. We want to reserve enough virtual memory so that later there is room to map NVDIMM. a. We envision configurations where 1 TB NVDIMM is being used within a system with 32-64 GB memory. User will provide -Xmx and G1MaxNewSizePercent (this will limit DRAM allocation of YoungGen) and rest will be mapped to NVDIMM.. currently ERGO is not aware of 2 kinds of memories present in the system with different performance characteristics (read/write latencies etc...)... I did not want to make any other changes any where else... b. We setup nvdimm_base memory and sizes etc.. for NVDIMM and commit it when expand is called in g1PageBasedVirtualMemory. c. All committing is happening with in map_memory_to_file which is called when AllocateHeapAt or AllocateOldGenAt is used. 
The rest of the VM, in the Linux case, uses os::Linux::commit_memory_impl, which I did not want to modify, to keep changes local (as we need MAP_SHARED in the case of file descriptors..) On os::commit_memory() 3. Since I did not want to have *ANY* change become active if AllocateOldGenAt is not used... I added a commit_memory variant that takes a file descriptor and offset as additional parameters.. This is *ONLY* called when AllocateOldGenAt is used, by OldGen, to commit memory incrementally, as we got feedback earlier that the JVM team would like to have support for UseAdaptiveSizePolicy, which resizes OldGen (in our case the NVDIMM mapping) and YoungGen (using DRAM) as needed. We see about a 20-28% gain in performance using tmpfs for SpecJbb2005 (vs fixing OldGen to a fixed size and mapping it all in one shot.... On Windows, there is NO way to do incremental mapping, so we are committing OldGen to the maximum allowed from the get-go)... The JVM continues to use commit_memory without file descriptor & offset in all cases if AllocateOldGenAt is not used. If AllocateOldGenAt is used, commit_memory is ONLY called by OldGen to map using map_memory_to_file (again, a stand-alone function that the rest of the VM does not use) to do incremental mapping. _fd_for_heap was added to map the whole heap on NVDIMM. These are the minimum set of changes that we need to support NVDIMM (i.e. we need to open a file, map it and commit)... Other changes are related to option handling etc.. i.e. do not allow AllocateOldGenAt if not G1GC and POGC... With this context in place, I am looking for any and all feedback... Thanks, Vinay Regarding ReservedSpace: this implementation uses map_memory_to_file in all cases, which is kept separate. -----Original Message----- From: Thomas Stüfe [mailto:thomas.stuefe at gmail.com] Sent: Tuesday, June 12, 2018 11:11 PM To: Awasthi, Vinay K Cc: hotspot-gc-dev at openjdk.java.net; Kharbas, Kishor ; Viswanathan, Sandhya ; Hotspot dev runtime Subject: Re: RFR(M): 8204908: Allocation of Old generation of Java Heap on alternate memory devices. Hi, (adding hs-runtime) had a cursory glance at the proposed changes. I am taken aback by the amount of complexity added to ReservedSpace for what I understand is a narrow experimental feature only benefiting 1-2 Operating Systems and - I guess, the JEP is not really clear there - only x86 with certain hardware configurations? e.g. http://cr.openjdk.java.net/~kkharbas/8202286/webrev.00/share/memory/virtualspace.hpp.udiff.html The source having zero comments does not really help either. "The motivation behind this JEP is to provide an experimental feature to facilitate exploration of different use cases for these non-DRAM memories." Ok, but does this really have to be upstream, in this form, to experiment with it? I am not objecting against this feature in general. But I am unhappy about the monster ReservedSpace is turning into. IMHO before increasing complexity even further this should be revamped, otherwise it becomes too unwieldy to do anything with it. It already somehow takes care of a number of huge pages ("_special" and "_alignment"), implicit-null-checks-when-space-happens-to-be-used-as-part-of-java-heap ("_noaccess_prefix"), allocation at alternate file locations ("_fd_for_heap", introduced by you with 8190308). You also added a new variant to os::commit_memory() which directly goes down to mmap(). So far, for the most part, the os::{reserve|commit}_memory APIs have been agnostic to the underlying implementation. You pretty much tie it to mmap() now. This adds implicit restrictions to the API we did not have before (e.g. will not work if platform uses SysV shm APIs to implement these APIs).
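[Editor's note: the fd-plus-offset commit path being debated here boils down, on Linux, to a MAP_SHARED mmap of the backing file (a tmpfs or DAX mount), which is exactly why Thomas observes that the API becomes mmap-specific. A minimal sketch with a hypothetical function name, not the actual os::commit_memory signature from the webrev:]

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

// Commit `len` bytes backed by `fd` at `offset` (both page-aligned).
// MAP_SHARED is required so that stores actually reach the backing
// file -- the implicit mmap dependency discussed in the thread.
char* commit_at_offset(int fd, std::size_t offset, std::size_t len) {
  void* p = mmap(nullptr, len, PROT_READ | PROT_WRITE,
                 MAP_SHARED, fd, static_cast<off_t>(offset));
  return p == MAP_FAILED ? nullptr : static_cast<char*>(p);
}
```

[Offsets must be page-aligned; incrementally growing the mapped old gen, as described for the Linux POGC case, would extend the file with ftruncate and map further offsets.]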
Best Regards, Thomas On Tue, Jun 12, 2018 at 9:32 PM, Awasthi, Vinay K wrote: > Hello, > > I am requesting comments on POGC/G1GC supporting NVDIMM/DRAM heaps. > When user supplies AllocateOldGenAt=, JVM divides > heap into 2 parts. First part is on NVDIMM where long living objects > go (OldGen) and other part is on DRAM where short living objects > reside(YoungGen). This is ONLY supported for G1GC and POGC collectors on Linux and Windows. > > On Windows, OldGen resizing is NOT supported. On Linux, for G1GC, > OldGen resizing is not supported however for POGC it is. Heap residing > on DRAM is supported for Windows and Linux for POGC and G1GC. > > JEP to support allocating Old generation on NV-DIMM - > https://bugs.openjdk.java.net/browse/JDK-8202286 > > Patch is at http://cr.openjdk.java.net/~kkharbas/8202286/webrev.00/ > > SpecJbb2005/SpecJbb2015 etc. are passing with this patch and one can > test this by simply mounting tmpfs of certain size and pass that as an > argument to AllocateOldGenAt. > > For G1GC, G1MaxNewSizePercent controls how much of total heap will > reside on DRAM. Rest of the heap then goes to NVDIMM. > > For POGC, MaxNewSize decides the DRAM residing young gen size. Rest is > mounted on NVDIMM. > > In all these implementations, JVM ends up reserving more than initial > size determined by ergonomics (never more than Xmx). JVM displays > these messages and shows NVDIMM and DRAM reserved bytes. 
> > Thanks, > > Vinay > > > > > > From kim.barrett at oracle.com Wed Jun 13 20:08:27 2018 From: kim.barrett at oracle.com (Kim Barrett) Date: Wed, 13 Jun 2018 16:08:27 -0400 Subject: RFR (XL): 8202845: Refactor reference processing for improved parallelism In-Reply-To: <7a8fc4917623be1649b07e1147c8df6463c14e8a.camel@oracle.com> References: <61154A78-083C-456C-97F9-2D2986B67168@oracle.com> <17748553-CE38-4E0F-AFCE-43F0107FDDE1@oracle.com> <9A97AFF1-0A3B-471B-B995-C39627146AA6@oracle.com> <66141559-E8EF-4654-9BBE-0D62E7FE4E02@oracle.com> <7a8fc4917623be1649b07e1147c8df6463c14e8a.camel@oracle.com> Message-ID: <72FB6B58-45FC-4B57-8484-915C6B63777A@oracle.com> > On Jun 13, 2018, at 5:18 AM, Thomas Schatzl wrote: > > Hi Kim, > > On Tue, 2018-06-12 at 17:18 -0400, Kim Barrett wrote: >>> On Jun 12, 2018, at 3:39 PM, Kim Barrett >>> wrote: >>> >>>> On Jun 12, 2018, at 2:19 PM, Thomas Schatzl >>>> wrote:http://cr.openjdk.java.net/~tsc >>>> hatzl/8202845/webrev.1_to_2 (diff) >>>> http://cr.openjdk.java.net/~tschatzl/8202845/webrev.2 (full) >>> >>> Looks good. >> >> One more thing I just noticed. >> >> src/hotspot/share/gc/shared/referenceProcessor.cpp >> 865 log_reflist("Phase2 Soft after", _discoveredSoftRefs, >> _max_num_queues); >> 866 log_reflist("Phase2 Weak after", _discoveredWeakRefs, >> _max_num_queues); >> >> at the end of process_soft_weak_final_refs. At this stage, I think >> there must be no soft or weak references. Better to assert that than >> log empty sets. >> >> Similarly at the end of process_phantom_refs. >> >> And the same is true for process_final_keep_alive. >> > > fixed in > > http://cr.openjdk.java.net/~tschatzl/8202845/webrev.2_to_3 (diff) > and > http://cr.openjdk.java.net/~tschatzl/8202845/webrev.3 (full) > > Currently running hs-tier1-3, but I do not expect issues. > > Thanks! > > Thomas Looks good, except verify_total_count_zero should be debug-only rather than !PRODUCT. 
In an "optimized" build it does nothing except waste time (calls total_count but does nothing with the result because assert is suppressed). From kim.barrett at oracle.com Wed Jun 13 21:15:41 2018 From: kim.barrett at oracle.com (Kim Barrett) Date: Wed, 13 Jun 2018 17:15:41 -0400 Subject: RFR(M): 8204524: Unnecessary memory barriers in G1ParScanThreadState::copy_to_survivor_space In-Reply-To: References: Message-ID: <82571A7F-6113-4C0F-A4DB-C18C2E48D8E2@oracle.com> > On Jun 7, 2018, at 2:01 AM, Michihiro Horie wrote: > > Dear all, > > Would you please review the following change? > > Bug: https://bugs.openjdk.java.net/browse/JDK-8204524 > Webrev: http://cr.openjdk.java.net/~mhorie/8204524/webrev.00 I was going to say that this looks good to me. But then I saw Derek White's reply about an unexpected performance regression. I'd like to wait until he reports back. From thomas.schatzl at oracle.com Wed Jun 13 21:26:08 2018 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Wed, 13 Jun 2018 23:26:08 +0200 Subject: RFR (XL): 8202845: Refactor reference processing for improved parallelism In-Reply-To: <72FB6B58-45FC-4B57-8484-915C6B63777A@oracle.com> References: <61154A78-083C-456C-97F9-2D2986B67168@oracle.com> <17748553-CE38-4E0F-AFCE-43F0107FDDE1@oracle.com> <9A97AFF1-0A3B-471B-B995-C39627146AA6@oracle.com> <66141559-E8EF-4654-9BBE-0D62E7FE4E02@oracle.com> <7a8fc4917623be1649b07e1147c8df6463c14e8a.camel@oracle.com> <72FB6B58-45FC-4B57-8484-915C6B63777A@oracle.com> Message-ID: <6d103b4fe2caf32c7ff8d3599522338ba8021ccc.camel@oracle.com> Hi Kim, On Wed, 2018-06-13 at 16:08 -0400, Kim Barrett wrote: > > On Jun 13, 2018, at 5:18 AM, Thomas Schatzl > com> wrote: > > > > Hi Kim, > > > > On Tue, 2018-06-12 at 17:18 -0400, Kim Barrett wrote: > > > > On Jun 12, 2018, at 3:39 PM, Kim Barrett > > > m> > > > > wrote: > > > > > > > > > On Jun 12, 2018, at 2:19 PM, Thomas Schatzl > > > > > wrote:http://cr.openjdk.java.net/ > > > > > ~tsc > > > > > hatzl/8202845/webrev.1_to_2
(diff) > > > > > http://cr.openjdk.java.net/~tschatzl/8202845/webrev.2 (full) > > > > > > > > Looks good. > > > > > > One more thing I just noticed. > > > > > > src/hotspot/share/gc/shared/referenceProcessor.cpp > > > 865 log_reflist("Phase2 Soft after", _discoveredSoftRefs, > > > _max_num_queues); > > > 866 log_reflist("Phase2 Weak after", _discoveredWeakRefs, > > > _max_num_queues); > > > > > > at the end of process_soft_weak_final_refs. At this stage, I > > > think there must be no soft or weak references. Better to assert > > > that than log empty sets. > > > > > > Similarly at the end of process_phantom_refs. > > > > > > And the same is true for process_final_keep_alive. > > > > > > > fixed in > > > > http://cr.openjdk.java.net/~tschatzl/8202845/webrev.2_to_3 (diff) > > and > > http://cr.openjdk.java.net/~tschatzl/8202845/webrev.3 (full) > > > > Currently running hs-tier1-3, but I do not expect issues. > > No issues. > > Thomas > > Looks good, except verify_total_count_zero should be debug-only > rather than !PRODUCT. > In an ?optimized" build it does nothing except waste time (calls > total_count but does nothing > with the result because assert is suppressed). I fixed that in place in the webrevs, replacing the NOT_PRODUCT_RETURN with a NOT_DEBUG_RETURN and the #ifndef PRODUCT with an #ifdef ASSERT. 
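For readers following the NOT_PRODUCT vs. ASSERT discussion above, the distinction can be sketched with stand-in macros. This is a simplified mock-up for illustration only (the names verify_not_product, verify_debug_only, and total_count here are invented; these are not HotSpot's real PRODUCT/ASSERT definitions): in an "optimized" build, neither PRODUCT nor ASSERT is in effect, so code guarded only by #ifndef PRODUCT is still compiled and still pays for its counting work even though the assert itself is disabled.

```cpp
#include <cassert>

// Stand-ins for the build modes: a product build would define PRODUCT, a
// debug build would define ASSERT. An "optimized" build defines neither.
// (Simplified mock-ups for illustration, not the real HotSpot macros.)

static int g_count_calls = 0;

static int total_count() {  // stands in for the costly queue walk
  ++g_count_calls;
  return 0;
}

// Guarded the original way: also compiled into optimized builds, where the
// assert is disabled and the call to total_count() is pure waste.
#ifndef PRODUCT
static void verify_not_product() {
  int n = total_count();
  assert(n == 0 && "expected no remaining refs");
  (void)n;  // silence unused-variable warnings when asserts are off
}
#else
static void verify_not_product() {}
#endif

// Guarded the fixed way: the costly walk exists only when asserts do.
#ifdef ASSERT
static void verify_debug_only() {
  assert(total_count() == 0 && "expected no remaining refs");
}
#else
static void verify_debug_only() {}
#endif
```

Compiled with neither macro defined (the "optimized" configuration), calling both helpers shows that only the !PRODUCT variant ever invokes total_count().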
Thanks, Thomas From kim.barrett at oracle.com Wed Jun 13 21:53:40 2018 From: kim.barrett at oracle.com (Kim Barrett) Date: Wed, 13 Jun 2018 17:53:40 -0400 Subject: RFR (XL): 8202845: Refactor reference processing for improved parallelism In-Reply-To: <6d103b4fe2caf32c7ff8d3599522338ba8021ccc.camel@oracle.com> References: <61154A78-083C-456C-97F9-2D2986B67168@oracle.com> <17748553-CE38-4E0F-AFCE-43F0107FDDE1@oracle.com> <9A97AFF1-0A3B-471B-B995-C39627146AA6@oracle.com> <66141559-E8EF-4654-9BBE-0D62E7FE4E02@oracle.com> <7a8fc4917623be1649b07e1147c8df6463c14e8a.camel@oracle.com> <72FB6B58-45FC-4B57-8484-915C6B63777A@oracle.com> <6d103b4fe2caf32c7ff8d3599522338ba8021ccc.camel@oracle.com> Message-ID: <5EB91B90-6A06-42E3-A138-859600F4D02D@oracle.com> > On Jun 13, 2018, at 5:26 PM, Thomas Schatzl wrote: > >>> >>> http://cr.openjdk.java.net/~tschatzl/8202845/webrev.2_to_3 (diff) >>> and >>> http://cr.openjdk.java.net/~tschatzl/8202845/webrev.3 (full) >>> >>> Currently running hs-tier1-3, but I do not expect issues. >>> > > No issues. > >>> Thomas >> >> Looks good, except verify_total_count_zero should be debug-only >> rather than !PRODUCT. >> In an ?optimized" build it does nothing except waste time (calls >> total_count but does nothing >> with the result because assert is suppressed). > > I fixed that in place in the webrevs, replacing the NOT_PRODUCT_RETURN > with a NOT_DEBUG_RETURN and the #ifndef PRODUCT with an #ifdef ASSERT. 
> > Thanks, > Thomas Looks good From kim.barrett at oracle.com Wed Jun 13 22:05:19 2018 From: kim.barrett at oracle.com (Kim Barrett) Date: Wed, 13 Jun 2018 18:05:19 -0400 Subject: RFR(M) 8043575: Dynamically parallelize reference processing work In-Reply-To: References: <8af7594e-d5f4-ca0d-de73-6547ea131f3e@oracle.com> <42bc259b93355a98d66a8a14b1ba8d04ed33277c.camel@oracle.com> <25c462ac-4fef-6661-2280-635f280ad7a7@oracle.com> Message-ID: > On Jun 8, 2018, at 10:52 AM, Thomas Schatzl wrote: > > Webrev is at http://cr.openjdk.java.net/~tschatzl/8043575/webrev.3/ . > > This webrev is based on top of latest jdk/jdk and https://bugs.openjdk. > java.net/browse/JDK-8202845 . > > If you want to test parallel gc, you also need the fixes for JDK- > 8204617 and JDK-8204618 currently out for review. > > Testing: > hs-tier1-4,jdk-tier1-3 with +/-ParallelRefProcEnabled > > Thanks, > Thomas Looks good. From HORIE at jp.ibm.com Wed Jun 13 23:47:39 2018 From: HORIE at jp.ibm.com (Michihiro Horie) Date: Thu, 14 Jun 2018 08:47:39 +0900 Subject: RFR(M): 8204524: Unnecessary memory barriers in G1ParScanThreadState::copy_to_survivor_space In-Reply-To: <82571A7F-6113-4C0F-A4DB-C18C2E48D8E2@oracle.com> References: <82571A7F-6113-4C0F-A4DB-C18C2E48D8E2@oracle.com> Message-ID: Hi Derek, Kim, I agree Derek's further report on an unexpected performance regression is needed. I would like to know the root cause if any. Just for reference, we once measured the copy_to_survivor_space change in ParallelGC (JDK-8154736) with Cavium ThunderX ARMv8 (2.0GHz, 38 cores, 1 SMT for each core). There was no performance regression in SPECjbb2015.
Best regards, -- Michihiro, IBM Research - Tokyo From: Kim Barrett To: Michihiro Horie Cc: "hotspot-gc-dev at openjdk.java.net" , Gustavo Bueno Romero , "david.holmes at oracle.com" , Erik Osterlund , "Doerr, Martin" Date: 2018/06/14 06:15 Subject: Re: RFR(M): 8204524: Unnecessary memory barriers in G1ParScanThreadState::copy_to_survivor_space > On Jun 7, 2018, at 2:01 AM, Michihiro Horie wrote: > > Dear all, > > Would you please review the following change? > > Bug: https://bugs.openjdk.java.net/browse/JDK-8204524 > Webrev: http://cr.openjdk.java.net/~mhorie/8204524/webrev.00 I was going to say that this looks good to me. But then I saw Derek White's reply about an unexpected performance regression. I'd like to wait until he reports back. From vinay.k.awasthi at intel.com Thu Jun 14 00:47:56 2018 From: vinay.k.awasthi at intel.com (Awasthi, Vinay K) Date: Thu, 14 Jun 2018 00:47:56 +0000 Subject: RFR(M): 8204908: NVDIMM for POGC and G1GC - Updated patch that does not allow AllocateHeapAt and AllocateOldGenAt to be set at the same time... Message-ID: By preventing this, I repurposed _fd_for_heap for NVDIMM mapping. There is now no _fd_for_nvdimm. JEP to support allocating Old generation on NV-DIMM - https://bugs.openjdk.java.net/browse/JDK-8202286 Here is the implementation bug link: https://bugs.openjdk.java.net/browse/JDK-8204908 Patch is Uploaded at (full patch/incremental patch) http://cr.openjdk.java.net/~kkharbas/8204908/webrev.01/ http://cr.openjdk.java.net/~kkharbas/8204908/webrev.01_to_00/ Tested default setup (i.e. no file is being passed for heap) and AllocateHeapAt/AllocateOldGenAt with POGC and G1GC.. all passing... Any and all comments are welcome!
Thanks, Vinay From david.holmes at oracle.com Thu Jun 14 04:59:58 2018 From: david.holmes at oracle.com (David Holmes) Date: Thu, 14 Jun 2018 14:59:58 +1000 Subject: RFR(L): 8195097: Make it possible to process StringTable outside safepoint In-Reply-To: References: <0d09f2ad-47b9-2fca-e5ec-17547e923cbb@oracle.com> <14230A21-5B54-451C-884C-E9E922967A25@oracle.com> Message-ID: Hi Robbin and reviewers, On 5/06/2018 1:15 AM, Robbin Ehn wrote: > Hi Jiangli, > > On 2018-05-30 21:47, Jiangli Zhou wrote: >> Hi Robbin, >> >> I mainly focused on the archived string part during review. Here are >> my comments, which are minor issues mostly. >> >> - stringTable.hpp >> >> Archived string is only supported with INCLUDE_CDS_JAVA_HEAP. Please >> add NOT_CDS_JAVA_HEAP_RETURN_(NULL) for lookup_shared() and >> create_archived_string() below so their call sites are handled >> properly when java heap object archiving is not supported. >> >> 153 oop lookup_shared(jchar* name, int len, unsigned int hash); >> 154 static oop create_archived_string(oop s, Thread* THREAD); >> > > Fixed create_archived_string() was not fixed. https://bugs.openjdk.java.net/browse/JDK-8205002 Cheers, David >> - stringTable.cpp >> >> How about renaming CopyArchive to CopyToArchive, so it's more >> descriptive? > > Fixed > >> >> Looks like the 'bool' return type is not needed since we always return >> with true and the result is not checked. How about changing it to >> return 'void'? >> >> 774 bool operator()(WeakHandle* val) { >> > > The scanning done by the hashtable is stopped if we return false here. > So the return value is used by the hashtable to know if it should > continue the walk over all items. > >> >> - genCollectedHeap.cpp >> >> Based on the assert at line 863, looks like 'par_state_string' is not >> NULL when 'scope->n_threads() > 1'.
Maybe the if condition at line 865 >> could be simplified to be just ?if (scope->n_threads() > 1)?? >> >> ? 862?? // Either we should be single threaded or have a ParState >> ? 863?? assert((scope->n_threads() <= 1) || par_state_string != NULL, >> "Parallel but not ParState"); >> ? 864 >> ? 865?? if (scope->n_threads() > 1 && par_state_string != NULL) { > > Fixed, I'll send new version to rfr mail. > > Thanks, Robbin > >> >> >> Thanks, >> Jiangli >> >>> On May 28, 2018, at 6:19 AM, Robbin Ehn wrote: >>> >>> Hi all, please review. >>> >>> This implements the StringTable with the ConcurrentHashtable for >>> managing the >>> strings using oopStorage for backing the actual oops via WeakHandles. >>> >>> The unlinking and freeing of hashtable nodes is moved outside the >>> safepoint, >>> which means GC only needs to walk the oopStorage, either concurrently >>> or in a >>> safepoint. Walking oopStorage is also faster so there is a good >>> effect on all >>> safepoints visiting the oops. >>> >>> The unlinking and freeing happens during inserts when dead weak oops are >>> encountered in that bucket. In any normal workload the stringtable >>> self-cleans >>> without needing any additional cleaning. Cleaning/unlinking can also >>> be done >>> concurrently via the ServiceThread, it is started when we have a high >>> ?dead >>> factor?. E.g. application have a lot of interned string removes the >>> references >>> and never interns again. The ServiceThread also concurrently grows >>> the table if >>> ?load factor? is high. Both the cleaning and growing take care to not >>> prolonging >>> time to safepoint, at the cost of some speed. >>> >>> Kitchensink24h, multiple tier1-5 with no issue that I can relate to this >>> changeset, various benchmark such as JMH, specJBB2015. 
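The self-cleaning insert Robbin describes above can be modeled with a toy bucket list. This is purely an illustration with invented names (SelfCleaningTable, Node), not the actual StringTable/ConcurrentHashtable code: the weakly-held oop is simulated by an alive flag that a GC would clear, and an insert unlinks and frees any dead node it walks past in the same bucket.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy model of a weak, self-cleaning interning table (illustration only).
struct Node {
  std::string value;
  bool alive;   // stands in for the weakly-held oop still being live
  Node* next;
};

class SelfCleaningTable {
  std::vector<Node*> buckets_;
 public:
  explicit SelfCleaningTable(size_t n) : buckets_(n, nullptr) {}

  ~SelfCleaningTable() {
    for (Node* head : buckets_)
      while (head) { Node* n = head; head = head->next; delete n; }
  }

  size_t bucket_len(size_t b) const {
    size_t len = 0;
    for (const Node* n = buckets_[b]; n; n = n->next) ++len;
    return len;
  }

  // Insert prunes dead entries it encounters in the bucket, so the table
  // self-cleans during normal use without a separate cleanup pass.
  Node* insert(size_t b, const std::string& v) {
    Node** link = &buckets_[b];
    while (*link) {
      Node* n = *link;
      if (!n->alive) {            // dead weak entry: unlink and free now
        *link = n->next;
        delete n;
      } else if (n->value == v) { // already interned and still live
        return n;
      } else {
        link = &n->next;
      }
    }
    Node* n = new Node{v, true, buckets_[b]};  // push new entry at head
    buckets_[b] = n;
    return n;
  }
};
```

In the real implementation the cleaning can additionally be driven concurrently (the ServiceThread in the patch); the point of the sketch is only that inserts alone keep a bucket free of dead entries.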
>>> >>> Issue: https://bugs.openjdk.java.net/browse/JDK-8195097 >>> Webrev: http://cr.openjdk.java.net/~rehn/8195097/v0/webrev/ >>> >>> Thanks, Robbin >> From robbin.ehn at oracle.com Thu Jun 14 05:20:57 2018 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Thu, 14 Jun 2018 07:20:57 +0200 Subject: RFR(L): 8195097: Make it possible to process StringTable outside safepoint In-Reply-To: References: <0d09f2ad-47b9-2fca-e5ec-17547e923cbb@oracle.com> <14230A21-5B54-451C-884C-E9E922967A25@oracle.com> Message-ID: Hi David, thanks noting. Yes I'm aware of that, I included that fix in: 8204613: StringTable: Calculates wrong number of uncleaned items. Which is now reviewed and about to be pushed. /Robbin On 2018-06-14 06:59, David Holmes wrote: > Hi Robbin and reviewers, > > On 5/06/2018 1:15 AM, Robbin Ehn wrote: >> Hi Jiangli, >> >> On 2018-05-30 21:47, Jiangli Zhou wrote: >>> Hi Robbin, >>> >>> I mainly focused on the archived string part during review. Here are my >>> comments, which are minor issues mostly. >>> >>> - stringTable.hpp >>> >>> Archived string is only supported with INCLUDE_CDS_JAVA_HEAP. Please add >>> NOT_CDS_JAVA_HEAP_RETURN_(NULL) for lookup_shared() and >>> create_archived_string() below so their call sites are handled properly when >>> java heap object archiving is not supported. >>> >>> ? 153?? oop lookup_shared(jchar* name, int len, unsigned int hash); >>> ? 154?? static oop create_archived_string(oop s, Thread* THREAD); >>> >> >> Fixed > > create_archived_string() was not fixed. > > https://bugs.openjdk.java.net/browse/JDK-8205002 > > Cheers, > David > >>> - stringTable.cpp >>> >>> How about renaming CopyArchive to CopyToArchive, so it?s more descriptive? >> >> Fixed >> >>> >>> Looks like the ?bool? return type is not needed since we always return with >>> true and the result is not checked. How able changing it to return ?void?? >>> >>> ? 774?? 
bool operator()(WeakHandle* val) { >>> >> >> The scanning done by the hashtable is stopped if we return false here. >> So the return value is used by the hashtable to know if it should continue the >> walk over all items. >> >>> >>> - genCollectedHeap.cpp >>> >>> Based on the assert at line 863, looks like ?par_state_string? is not NULL >>> when 'scope->n_threads() > 1?. Maybe the if condition at line 865 could be >>> simplified to be just ?if (scope->n_threads() > 1)?? >>> >>> ? 862?? // Either we should be single threaded or have a ParState >>> ? 863?? assert((scope->n_threads() <= 1) || par_state_string != NULL, >>> "Parallel but not ParState"); >>> ? 864 >>> ? 865?? if (scope->n_threads() > 1 && par_state_string != NULL) { >> >> Fixed, I'll send new version to rfr mail. >> >> Thanks, Robbin >> >>> >>> >>> Thanks, >>> Jiangli >>> >>>> On May 28, 2018, at 6:19 AM, Robbin Ehn wrote: >>>> >>>> Hi all, please review. >>>> >>>> This implements the StringTable with the ConcurrentHashtable for managing the >>>> strings using oopStorage for backing the actual oops via WeakHandles. >>>> >>>> The unlinking and freeing of hashtable nodes is moved outside the safepoint, >>>> which means GC only needs to walk the oopStorage, either concurrently or in a >>>> safepoint. Walking oopStorage is also faster so there is a good effect on all >>>> safepoints visiting the oops. >>>> >>>> The unlinking and freeing happens during inserts when dead weak oops are >>>> encountered in that bucket. In any normal workload the stringtable self-cleans >>>> without needing any additional cleaning. Cleaning/unlinking can also be done >>>> concurrently via the ServiceThread, it is started when we have a high ?dead >>>> factor?. E.g. application have a lot of interned string removes the references >>>> and never interns again. The ServiceThread also concurrently grows the table if >>>> ?load factor? is high. 
Both the cleaning and growing take care to not >>>> prolonging >>>> time to safepoint, at the cost of some speed. >>>> >>>> Kitchensink24h, multiple tier1-5 with no issue that I can relate to this >>>> changeset, various benchmark such as JMH, specJBB2015. >>>> >>>> Issue: https://bugs.openjdk.java.net/browse/JDK-8195097 >>>> Webrev: http://cr.openjdk.java.net/~rehn/8195097/v0/webrev/ >>>> >>>> Thanks, Robbin >>> From thomas.schatzl at oracle.com Thu Jun 14 07:41:59 2018 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Thu, 14 Jun 2018 09:41:59 +0200 Subject: RFR(M) 8043575: Dynamically parallelize reference processing work In-Reply-To: References: <8af7594e-d5f4-ca0d-de73-6547ea131f3e@oracle.com> <42bc259b93355a98d66a8a14b1ba8d04ed33277c.camel@oracle.com> <25c462ac-4fef-6661-2280-635f280ad7a7@oracle.com> Message-ID: Hi Kim and Sangheon, On Wed, 2018-06-13 at 18:05 -0400, Kim Barrett wrote: > > On Jun 8, 2018, at 10:52 AM, Thomas Schatzl > com> wrote: > > > > Webrev is at http://cr.openjdk.java.net/~tschatzl/8043575/webrev.3/ > > . > > > > This webrev is based on top of latest jdk/jdk and > > https://bugs.openjdk.java.net/browse/JDK-8202845 . > > > > If you want to test parallel gc, you also need the fixes for JDK- > > 8204617 and JDK-8204618 currently out for review. > > > > Testing: > > hs-tier1-4,jdk-tier1-3 with +/-ParallelRefProcEnabled > > > > Thanks, > > Thomas > > Looks good. > thanks for your reviews. 
Thomas From stefan.johansson at oracle.com Thu Jun 14 08:46:20 2018 From: stefan.johansson at oracle.com (Stefan Johansson) Date: Thu, 14 Jun 2018 10:46:20 +0200 Subject: RFR (S): 8204169: Humongous continues region remembered set states do not match the one from the corresponding humongous start region In-Reply-To: <1b2779249b2f85cf6a7b56e8d004b7fe8c6dda7c.camel@oracle.com> References: <324AC60D-1C6D-492C-8BC7-DA118E455F8E@oracle.com> <1b2779249b2f85cf6a7b56e8d004b7fe8c6dda7c.camel@oracle.com> Message-ID: <81ead50a-106a-1c06-70d5-56325d6ab207@oracle.com> On 2018-06-13 10:35, Thomas Schatzl wrote: > Hi, > > On Tue, 2018-06-12 at 21:19 -0400, Kim Barrett wrote: >>> On Jun 4, 2018, at 6:27 AM, Thomas Schatzl >> om> wrote: >>> >>> Hi all, >>> >>> can I have reviews for this small change that fixes up remembered >>> set states of humongous continues regions? > [...] >>> >>> CR: >>> https://bugs.openjdk.java.net/browse/JDK-8204169 >>> Webrev: >>> http://cr.openjdk.java.net/~tschatzl/8204169/webrev/ Looks good to me too, Stefan >>> Testing: >>> hs-tier1-3 >>> >>> Thanks, >>> Thomas >> >> Looks good. >> > > thanks for your review. 
> > Thomas > From thomas.schatzl at oracle.com Thu Jun 14 09:08:05 2018 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Thu, 14 Jun 2018 11:08:05 +0200 Subject: RFR (S): 8204169: Humongous continues region remembered set states do not match the one from the corresponding humongous start region In-Reply-To: <81ead50a-106a-1c06-70d5-56325d6ab207@oracle.com> References: <324AC60D-1C6D-492C-8BC7-DA118E455F8E@oracle.com> <1b2779249b2f85cf6a7b56e8d004b7fe8c6dda7c.camel@oracle.com> <81ead50a-106a-1c06-70d5-56325d6ab207@oracle.com> Message-ID: <7330a9b9a78477886c6a9facfdb0dab075b5c6b7.camel@oracle.com> Hi Stefan, Kim, On Thu, 2018-06-14 at 10:46 +0200, Stefan Johansson wrote: > > On 2018-06-13 10:35, Thomas Schatzl wrote: > > Hi, > > > > On Tue, 2018-06-12 at 21:19 -0400, Kim Barrett wrote: > > > > On Jun 4, 2018, at 6:27 AM, Thomas Schatzl > > > le.c > > > > om> wrote: > > > > > > > > Hi all, > > > > > > > > can I have reviews for this small change that fixes up > > > > remembered > > > > set states of humongous continues regions? > > > > [...] > > > > > > > > CR: > > > > https://bugs.openjdk.java.net/browse/JDK-8204169 > > > > Webrev: > > > > http://cr.openjdk.java.net/~tschatzl/8204169/webrev/ > > Looks good to me too, > Stefan > thanks for your reviews, Thomas From per.liden at oracle.com Thu Jun 14 09:34:13 2018 From: per.liden at oracle.com (Per Liden) Date: Thu, 14 Jun 2018 11:34:13 +0200 Subject: RFR: 8205020: ZGC: Apply workaround for buggy sem_post() in glibc < 2.21 Message-ID: <89b4bb2d-ff20-22d4-02c5-330178df81f5@oracle.com> We should apply a workaround for a bug in sem_post() in glibc < 2.21, where it's not safe to destroy the semaphore immediately after returning from sem_wait(). The reason is that sem_post() can touch the semaphore after a waiting thread has returned from sem_wait(). To avoid this race we should force the waiting thread to acquire/release the lock held by the posting thread.
This workaround is a bit ugly, but I propose we use this until we find a better solution. I can also add that we've never actually seen any problems related to the specific cases addressed in this patch, but the sem_post() bug has shown up in the Handshake code (JDK-8204166), so we want to protect these paths too. For more details about the sem_post() bug, see https://sourceware.org/bugzilla/show_bug.cgi?id=12674 Bug: https://bugs.openjdk.java.net/browse/JDK-8205020 Webrev: http://cr.openjdk.java.net/~pliden/8205020/webrev.0/ Testing: Passed tier{1,2,3,4} cheers, Per From erik.osterlund at oracle.com Thu Jun 14 09:41:35 2018 From: erik.osterlund at oracle.com (Erik Österlund) Date: Thu, 14 Jun 2018 11:41:35 +0200 Subject: RFR: 8205020: ZGC: Apply workaround for buggy sem_post() in glibc < 2.21 In-Reply-To: <89b4bb2d-ff20-22d4-02c5-330178df81f5@oracle.com> References: <89b4bb2d-ff20-22d4-02c5-330178df81f5@oracle.com> Message-ID: <5B22384F.5070404@oracle.com> Hi Per, Looks good. Thanks, /Erik On 2018-06-14 11:34, Per Liden wrote: > We should apply a workaround for a bug in sem_post() in glibc < 2.21, > where it's not safe to destroy the semaphore immediately after > returning from sem_wait(). The reason is that sem_post() can touch the > semaphore after a waiting thread has returned from sem_wait(). To > avoid this race we should force the waiting thread to acquire/release > the lock held by the posting thread. This workaround is a bit ugly, > but I propose we use this until we find a better solution. I can also > add that we've never actually seen any problems related to the > specific cases addressed in this patch, but the sem_post() bug has > shown up in the Handshake code (JDK-8204166), so we want to protect > these paths too.
> > For more details about the sem_post() bug, see > https://sourceware.org/bugzilla/show_bug.cgi?id=12674 > > Bug: https://bugs.openjdk.java.net/browse/JDK-8205020 > Webrev: http://cr.openjdk.java.net/~pliden/8205020/webrev.0/ > > Testing: Passed tier{1,2,3,4} > > cheers, > Per From stefan.karlsson at oracle.com Thu Jun 14 09:39:52 2018 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Thu, 14 Jun 2018 11:39:52 +0200 Subject: RFR: 8205020: ZGC: Apply workaround for buggy sem_post() in glibc < 2.21 In-Reply-To: <89b4bb2d-ff20-22d4-02c5-330178df81f5@oracle.com> References: <89b4bb2d-ff20-22d4-02c5-330178df81f5@oracle.com> Message-ID: <5d588c95-4245-8da0-09e9-b1c46806cb35@oracle.com> Looks good. StefanK On 2018-06-14 11:34, Per Liden wrote: > We should apply a workaround for a bug in sem_post() in glibc < 2.21, > where it's not safe to destroy the semaphore immediately after returning > from sem_wait(). The reason is that sem_post() can touch the semaphore > after a waiting thread have returned from sem_wait(). To avoid this race > we should force the waiting thread to acquire/release the lock held by > the posting thread. This workaround is a bit ugly, but I propose we use > this until we find a batter solution. I can also add that we've never > actually seen any problems related to the specific cases addressed in > this patch, but the sem_post() bug has shown up in the Handshake code > (JDK-8204166), so we want to protect these paths too. 
> > For more details about the sem_post() bug, see > https://sourceware.org/bugzilla/show_bug.cgi?id=12674 > > Bug: https://bugs.openjdk.java.net/browse/JDK-8205020 > Webrev: http://cr.openjdk.java.net/~pliden/8205020/webrev.0/ > > Testing: Passed tier{1,2,3,4} > > cheers, > Per From per.liden at oracle.com Thu Jun 14 09:42:36 2018 From: per.liden at oracle.com (Per Liden) Date: Thu, 14 Jun 2018 11:42:36 +0200 Subject: RFR: 8205022: ZGC: SoftReferences not always cleared before throwing OOME Message-ID: The spec says "All soft references to softly-reachable objects are guaranteed to have been cleared before the virtual machine throws an OutOfMemoryError". We currently have a window/race in ZGC, where we can throw OOME before clearing all SoftReferences with a softly-reachable referent. The reason is that the current logic only guarantees that we wait with throwing OOME until the allocating thread has been part of a complete GC cycle, but there is no guarantee that the last GC cycle also cleared SoftReferences. This race is causing rare/intermittent failures in tier2 testing (TestSoftReferencesBehaviorOnOOME). Bug: https://bugs.openjdk.java.net/browse/JDK-8205022 Webrev: http://cr.openjdk.java.net/~pliden/8205022/webrev.0/ Testing: tier{1,2,3,4} /Per From per.liden at oracle.com Thu Jun 14 09:47:37 2018 From: per.liden at oracle.com (Per Liden) Date: Thu, 14 Jun 2018 11:47:37 +0200 Subject: RFR: 8205024: ZGC: Worker threads boost mode not always enabled when it should be Message-ID: <7b02909d-fe55-3dea-2233-d16a2727e077@oracle.com> There's a race when deciding if worker threads should be boosted (very similar to the race in JDK-8205022), which can result in worker threads not being boosted when they should. The effect of this is that the GC cycle takes longer to complete, which in turn means that stalled threads wait longer than they would have done otherwise. The solution to this is very similar to the solution to JDK-8205022.
Bug: https://bugs.openjdk.java.net/browse/JDK-8205024 Webrev: http://cr.openjdk.java.net/~pliden/8205024/webrev.0 Testing: tier{1,2,3,4} /Per From per.liden at oracle.com Thu Jun 14 09:50:24 2018 From: per.liden at oracle.com (Per Liden) Date: Thu, 14 Jun 2018 11:50:24 +0200 Subject: RFR: 8205028: ZGC: Remove incorrect comment in ZHeap::object_iterate() Message-ID: <2381f06b-c79b-3f92-7551-5528c273bbcb@oracle.com> The comment in ZHeap::object_iterate() saying "Should only be called in a safepoint after mark end" is incorrect and should be removed. It's still true that this should only be called in a safepoint (which the assert below checks), but the part about "after mark end" is not true. Bug: https://bugs.openjdk.java.net/browse/JDK-8205028 Webrev: http://cr.openjdk.java.net/~pliden/8205028/webrev.0 /Per From per.liden at oracle.com Thu Jun 14 09:53:33 2018 From: per.liden at oracle.com (Per Liden) Date: Thu, 14 Jun 2018 11:53:33 +0200 Subject: RFR: 8205020: ZGC: Apply workaround for buggy sem_post() in glibc < 2.21 In-Reply-To: <5B22384F.5070404@oracle.com> References: <89b4bb2d-ff20-22d4-02c5-330178df81f5@oracle.com> <5B22384F.5070404@oracle.com> Message-ID: <50a7dace-463a-b93b-4c40-fa13f525d6b1@oracle.com> Thanks Erik and Stefan! /Per On 06/14/2018 11:41 AM, Erik ?sterlund wrote: > Hi Per, > > Looks good. > > Thanks, > /Erik > > On 2018-06-14 11:34, Per Liden wrote: >> We should apply a workaround for a bug in sem_post() in glibc < 2.21, >> where it's not safe to destroy the semaphore immediately after >> returning from sem_wait(). The reason is that sem_post() can touch the >> semaphore after a waiting thread have returned from sem_wait(). To >> avoid this race we should force the waiting thread to acquire/release >> the lock held by the posting thread. This workaround is a bit ugly, >> but I propose we use this until we find a batter solution. 
I can also >> add that we've never actually seen any problems related to the >> specific cases addressed in this patch, but the sem_post() bug has >> shown up in the Handshake code (JDK-8204166), so we want to protect >> these paths too. >> >> For more details about the sem_post() bug, see >> https://sourceware.org/bugzilla/show_bug.cgi?id=12674 >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8205020 >> Webrev: http://cr.openjdk.java.net/~pliden/8205020/webrev.0/ >> >> Testing: Passed tier{1,2,3,4} >> >> cheers, >> Per > From stefan.karlsson at oracle.com Thu Jun 14 10:22:44 2018 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Thu, 14 Jun 2018 12:22:44 +0200 Subject: RFR: 8205028: ZGC: Remove incorrect comment in ZHeap::object_iterate() In-Reply-To: <2381f06b-c79b-3f92-7551-5528c273bbcb@oracle.com> References: <2381f06b-c79b-3f92-7551-5528c273bbcb@oracle.com> Message-ID: Looks good. StefanK On 2018-06-14 11:50, Per Liden wrote: > The comment in ZHeap::object_iterate() saying "Should only be called in > a safepoint after mark end" is incorrect and should be removed. It's > still true that this should only be called in a safepoint (which the > assert below checks), but the part about "after mark end" is not true. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8205028 > Webrev: http://cr.openjdk.java.net/~pliden/8205028/webrev.0 > > /Per From per.liden at oracle.com Thu Jun 14 10:26:11 2018 From: per.liden at oracle.com (Per Liden) Date: Thu, 14 Jun 2018 12:26:11 +0200 Subject: RFR: 8205028: ZGC: Remove incorrect comment in ZHeap::object_iterate() In-Reply-To: References: <2381f06b-c79b-3f92-7551-5528c273bbcb@oracle.com> Message-ID: <8540b4d2-3feb-ee41-3ff2-6a6e38186035@oracle.com> Thanks Stefan! /Per On 06/14/2018 12:22 PM, Stefan Karlsson wrote: > Looks good. 
> > StefanK > > On 2018-06-14 11:50, Per Liden wrote: >> The comment in ZHeap::object_iterate() saying "Should only be called in >> a safepoint after mark end" is incorrect and should be removed. It's >> still true that this should only be called in a safepoint (which the >> assert below checks), but the part about "after mark end" is not true. >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8205028 >> Webrev: http://cr.openjdk.java.net/~pliden/8205028/webrev.0 >> >> /Per From erik.osterlund at oracle.com Thu Jun 14 10:30:03 2018 From: erik.osterlund at oracle.com (Erik Österlund) Date: Thu, 14 Jun 2018 12:30:03 +0200 Subject: RFR: 8205028: ZGC: Remove incorrect comment in ZHeap::object_iterate() In-Reply-To: <2381f06b-c79b-3f92-7551-5528c273bbcb@oracle.com> References: <2381f06b-c79b-3f92-7551-5528c273bbcb@oracle.com> Message-ID: <5B2243AB.3040702@oracle.com> Hi Per, Looks good. Thanks, /Erik On 2018-06-14 11:50, Per Liden wrote: > The comment in ZHeap::object_iterate() saying "Should only be called > in a safepoint after mark end" is incorrect and should be removed. > It's still true that this should only be called in a safepoint (which > the assert below checks), but the part about "after mark end" is not > true. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8205028 > Webrev: http://cr.openjdk.java.net/~pliden/8205028/webrev.0 > > /Per From stefan.karlsson at oracle.com Thu Jun 14 10:36:17 2018 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Thu, 14 Jun 2018 12:36:17 +0200 Subject: RFR: 8205024: ZGC: Worker threads boost mode not always enabled when it should be In-Reply-To: <7b02909d-fe55-3dea-2233-d16a2727e077@oracle.com> References: <7b02909d-fe55-3dea-2233-d16a2727e077@oracle.com> Message-ID: <78783c66-3322-2388-050b-1e5cda5afc4e@oracle.com> Looks good.
StefanK On 2018-06-14 11:47, Per Liden wrote: > There's a race when deciding if worker threads should be boosted (very > similar to the race in JDK-8205022), which can result in worker threads > not being boosted when they should. The effect of this is that the GC > cycle takes longer to complete, which in turn means that stalled threads > wait longer than they would have done otherwise. > > The solution to this is very similar to the solution to JDK-8205022. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8205024 > Webrev: http://cr.openjdk.java.net/~pliden/8205024/webrev.0 > > Testing: tier{1,2,3,4} > > /Per From thomas.schatzl at oracle.com Thu Jun 14 11:34:00 2018 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Thu, 14 Jun 2018 13:34:00 +0200 Subject: RFR(M) 8043575: Dynamically parallelize reference processing work In-Reply-To: References: <8af7594e-d5f4-ca0d-de73-6547ea131f3e@oracle.com> <42bc259b93355a98d66a8a14b1ba8d04ed33277c.camel@oracle.com> <25c462ac-4fef-6661-2280-635f280ad7a7@oracle.com> Message-ID: Hi all, after some talk about making parallel ref processing default for G1 the suggestion came up to extract that part into a separate CR. I will post that one shortly. 
However, this means that there is a trivial change in this webrev to be looked at: http://cr.openjdk.java.net/~tschatzl/8043575/webrev.3_to_4 (diff) http://cr.openjdk.java.net/~tschatzl/8043575/webrev.4 (full) This is the full change, for reference: --- old/src/hotspot/share/gc/g1/g1Arguments.cpp 2018-06-14 13:30:08.166027425 +0200 +++ new/src/hotspot/share/gc/g1/g1Arguments.cpp 2018-06-14 13:30:07.818016510 +0200 @@ -122,10 +122,6 @@ FLAG_SET_DEFAULT(GCPauseIntervalMillis, MaxGCPauseMillis + 1); } - if (FLAG_IS_DEFAULT(ParallelRefProcEnabled) && ParallelGCThreads > 1) { - FLAG_SET_DEFAULT(ParallelRefProcEnabled, true); - } - log_trace(gc)("MarkStackSize: %uk MarkStackSizeMax: %uk", (unsigned int) (MarkStackSize / K), (uint) (MarkStackSizeMax / K)); // By default do not let the target stack size to be more than 1/4 of the entries Thanks, Thomas On Thu, 2018-06-14 at 09:41 +0200, Thomas Schatzl wrote: > Hi Kim and Sangheon, > > On Wed, 2018-06-13 at 18:05 -0400, Kim Barrett wrote: > > > On Jun 8, 2018, at 10:52 AM, Thomas Schatzl > > e. > > > com> wrote: > > > > > > Webrev is at http://cr.openjdk.java.net/~tschatzl/8043575/webrev. > > > 3/ > > > . > > > > > > This webrev is based on top of latest jdk/jdk and > > > https://bugs.openjdk.java.net/browse/JDK-8202845 . > > > > > > If you want to test parallel gc, you also need the fixes for JDK- > > > 8204617 and JDK-8204618 currently out for review. > > > > > > Testing: > > > hs-tier1-4,jdk-tier1-3 with +/-ParallelRefProcEnabled > > > > > > Thanks, > > > Thomas > > > > Looks good. > > > > thanks for your reviews. 
> > Thomas > From thomas.schatzl at oracle.com Thu Jun 14 12:08:43 2018 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Thu, 14 Jun 2018 14:08:43 +0200 Subject: RFR (XS): 8205043: Make parallel reference processing default for G1 Message-ID: <7daf58dfd770761768a743a6fca330cf8dfb76ba.camel@oracle.com> Hi all, can I have reviews for this split-off of JDK-8043575 to make parallel reference processing in conjunction with dynamic number of thread sizing default for G1? We think that with recent changes to parallel reference processing it is useful to do so, further nowadays we expect that most VMs are run with more than one GC thread. So reference processing should benefit from that as well by default. Threads are by default automatically limited by the functionality introduced with JDK-8043575 to avoid actually being slower than before if using too many threads. This is also the reason why we only suggest to make ParallelRefProcEnabled default for G1: the thread sizing does not work with other collectors. There is also a linked CSR for that change. CR: https://bugs.openjdk.java.net/browse/JDK-8205043 Webrev: http://cr.openjdk.java.net/~tschatzl/8205043/webrev/ Testing: new included test case Thanks, Thomas From stefan.karlsson at oracle.com Thu Jun 14 12:13:07 2018 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Thu, 14 Jun 2018 14:13:07 +0200 Subject: RFR: 8205022: ZGC: SoftReferences not always cleared before throwing OOME In-Reply-To: References: Message-ID: <06bdfd8c-3ded-39b5-49c6-b6e49131883a@oracle.com> Looks good. StefanK On 2018-06-14 11:42, Per Liden wrote: > The spec says "All soft references to softly-reachable objects are > guaranteed to have been cleared before the virtual machine throws an > OutOfMemoryError". > > We currently have a window/race in ZGC, where we can throw OOME before > clearing all SoftReferences with a softly-reachable referent. 
The reason > is that the current logic only guarantees that we wait with throwing > OOME until the allocating thread have been part of a complete GC cycle, > but there is no guarantee that the last GC cycle also cleared > SoftReferences. > > This race is causing rare/intermittent failures in tier2 testing > (TestSoftRerencesBahviorOnOOME). > > Bug: https://bugs.openjdk.java.net/browse/JDK-8205022 > Webrev: http://cr.openjdk.java.net/~pliden/8205022/webrev.0/ > > Testing: tier{1,2,3,4} > > /Per From sangheon.kim at oracle.com Thu Jun 14 12:22:33 2018 From: sangheon.kim at oracle.com (sangheon.kim at oracle.com) Date: Thu, 14 Jun 2018 21:22:33 +0900 Subject: RFR(M) 8043575: Dynamically parallelize reference processing work In-Reply-To: References: <8af7594e-d5f4-ca0d-de73-6547ea131f3e@oracle.com> <42bc259b93355a98d66a8a14b1ba8d04ed33277c.camel@oracle.com> <25c462ac-4fef-6661-2280-635f280ad7a7@oracle.com> Message-ID: Hi Thomas, > On Jun 14, 2018, at 8:34 PM, Thomas Schatzl wrote: > > Hi all, > > after some talk about making parallel ref processing default for G1 > the suggestion came up to extract that part into a separate CR. I will > post that one shortly. > > However, this means that there is a trivial change in this webrev to be > looked at: > > http://cr.openjdk.java.net/~tschatzl/8043575/webrev.3_to_4 (diff) > http://cr.openjdk.java.net/~tschatzl/8043575/webrev.4 (full) Webrev.4 still looks good. 
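As an aside for readers following the ParallelRefProcEnabled discussion in this thread: the value the running VM actually resolved for a -XX flag (whether set on the command line or by ergonomics such as the proposed JDK-8205043 default) can be queried at runtime through the standard HotSpotDiagnosticMXBean API. A minimal illustrative sketch — the class and method names here are invented for the example, not part of any webrev:

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

public class VMFlags {
    // Reads the value the running HotSpot VM resolved for a -XX flag,
    // whether it was set on the command line or by ergonomics.
    static String flag(String name) {
        HotSpotDiagnosticMXBean diag =
            ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        return diag.getVMOption(name).getValue();
    }

    public static void main(String[] args) {
        System.out.println("ParallelRefProcEnabled = " + flag("ParallelRefProcEnabled"));
    }
}
```

Running this under -XX:+UseG1GC would show what ergonomics picked; with a change like JDK-8205043 applied, one would expect "true" by default under G1.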
Thanks, Sangheon > > This is the full change, for reference: > > --- old/src/hotspot/share/gc/g1/g1Arguments.cpp 2018-06-14 > 13:30:08.166027425 +0200 > +++ new/src/hotspot/share/gc/g1/g1Arguments.cpp 2018-06-14 > 13:30:07.818016510 +0200 > @@ -122,10 +122,6 @@ > FLAG_SET_DEFAULT(GCPauseIntervalMillis, MaxGCPauseMillis + 1); > } > > - if (FLAG_IS_DEFAULT(ParallelRefProcEnabled) && ParallelGCThreads > > 1) { > - FLAG_SET_DEFAULT(ParallelRefProcEnabled, true); > - } > - > log_trace(gc)("MarkStackSize: %uk MarkStackSizeMax: %uk", (unsigned > int) (MarkStackSize / K), (uint) (MarkStackSizeMax / K)); > > // By default do not let the target stack size to be more than 1/4 > of the entries > > Thanks, > Thomas > >> On Thu, 2018-06-14 at 09:41 +0200, Thomas Schatzl wrote: >> Hi Kim and Sangheon, >> >> On Wed, 2018-06-13 at 18:05 -0400, Kim Barrett wrote: >>>> On Jun 8, 2018, at 10:52 AM, Thomas Schatzl >>> e. >>>> com> wrote: >>>> >>>> Webrev is at http://cr.openjdk.java.net/~tschatzl/8043575/webrev. >>>> 3/ >>>> . >>>> >>>> This webrev is based on top of latest jdk/jdk and >>>> https://bugs.openjdk.java.net/browse/JDK-8202845 . >>>> >>>> If you want to test parallel gc, you also need the fixes for JDK- >>>> 8204617 and JDK-8204618 currently out for review. >>>> >>>> Testing: >>>> hs-tier1-4,jdk-tier1-3 with +/-ParallelRefProcEnabled >>>> >>>> Thanks, >>>> Thomas >>> >>> Looks good. >>> >> >> thanks for your reviews. >> >> Thomas >> > From thomas.schatzl at oracle.com Thu Jun 14 13:02:06 2018 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Thu, 14 Jun 2018 15:02:06 +0200 Subject: RFR (S): 8204082: Make names of Young GCs more uniform in logs In-Reply-To: <01cf559a1c1762d66d26c1ac9ad7e473bb8cb1ef.camel@oracle.com> References: <360d423eb720e93a1921e66dae71ee2e794439ec.camel@oracle.com> <01cf559a1c1762d66d26c1ac9ad7e473bb8cb1ef.camel@oracle.com> Message-ID: Hi all, another round of reviews after some more internal remarks :P I also changed the title of the CR. 
The set of "final" tags would be: Pause Young (Normal) ... Pause Young (Concurrent Start) ... Pause Young (Concurrent End) ... Pause Young (Mixed) ... I also adapted the strings in the GCVerifyType functionality. http://cr.openjdk.java.net/~tschatzl/8204082/webrev.1_to_2 (diff) http://cr.openjdk.java.net/~tschatzl/8204082/webrev.2 (full) Testing: running through all gc tests locally Thanks, Thomas From per.liden at oracle.com Thu Jun 14 14:44:14 2018 From: per.liden at oracle.com (Per Liden) Date: Thu, 14 Jun 2018 16:44:14 +0200 Subject: RFR: 8205050: ZGC: Incorrect use of RootAccess in ZHeapIterator Message-ID: <934e4de9-fb5e-d4d5-e17d-91e20fb0bb14@oracle.com> In ZHeapIteratorRootOopClosure::do_oop() we're using RootAccess<>::oop_load() to load oops. However, as the comment right above that line suggests, that's incorrect and we should have a load barrier here. In fact, we used to have a load barrier here, but this was for some mysterious reason changed. We should revert back to the code that used to be there. Bug: https://bugs.openjdk.java.net/browse/JDK-8205050 Webrev: http://cr.openjdk.java.net/~pliden/8205050/webrev.0 cheers, Per From stefan.karlsson at oracle.com Thu Jun 14 15:14:48 2018 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Thu, 14 Jun 2018 17:14:48 +0200 Subject: RFR: 8205050: ZGC: Incorrect use of RootAccess in ZHeapIterator In-Reply-To: <934e4de9-fb5e-d4d5-e17d-91e20fb0bb14@oracle.com> References: <934e4de9-fb5e-d4d5-e17d-91e20fb0bb14@oracle.com> Message-ID: Looks good. StefanK On 2018-06-14 16:44, Per Liden wrote: > In ZHeapIteratorRootOopClosure::do_oop() we're using > RootAccess<>::oop_load() to load oops. However, as the comment right > above that line suggests, that's incorrect and we should have a load > barrier here. In fact, we used to have a load barrier here, but this > was for some mysterious reason changed. We should revert back to the > code that used to be there. 
> > Bug: https://bugs.openjdk.java.net/browse/JDK-8205050 > Webrev: http://cr.openjdk.java.net/~pliden/8205050/webrev.0 > > cheers, > Per From per.liden at oracle.com Thu Jun 14 15:17:30 2018 From: per.liden at oracle.com (Per Liden) Date: Thu, 14 Jun 2018 17:17:30 +0200 Subject: RFR: 8205028: ZGC: Remove incorrect comment in ZHeap::object_iterate() In-Reply-To: <5B2243AB.3040702@oracle.com> References: <2381f06b-c79b-3f92-7551-5528c273bbcb@oracle.com> <5B2243AB.3040702@oracle.com> Message-ID: <7ded25e2-cdab-cbbc-5da1-07077e5cba53@oracle.com> Thanks Erik! /Per On 06/14/2018 12:30 PM, Erik Österlund wrote: > Hi Per, > > Looks good. > > Thanks, > /Erik > > On 2018-06-14 11:50, Per Liden wrote: >> The comment in ZHeap::object_iterate() saying "Should only be called >> in a safepoint after mark end" is incorrect and should be removed. >> It's still true that this should only be called in a safepoint (which >> the assert below checks), but the part about "after mark end" is not >> true. >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8205028 >> Webrev: http://cr.openjdk.java.net/~pliden/8205028/webrev.0 >> >> /Per > From per.liden at oracle.com Thu Jun 14 15:17:51 2018 From: per.liden at oracle.com (Per Liden) Date: Thu, 14 Jun 2018 17:17:51 +0200 Subject: RFR: 8205024: ZGC: Worker threads boost mode not always enabled when is should be In-Reply-To: <78783c66-3322-2388-050b-1e5cda5afc4e@oracle.com> References: <7b02909d-fe55-3dea-2233-d16a2727e077@oracle.com> <78783c66-3322-2388-050b-1e5cda5afc4e@oracle.com> Message-ID: Thanks Stefan! /Per On 06/14/2018 12:36 PM, Stefan Karlsson wrote: > Looks good. > > StefanK > > On 2018-06-14 11:47, Per Liden wrote: >> There's a race when deciding if worker threads should be boosted (very >> similar to the race in JDK-8205022), which can result in worker >> threads not being boosted when they should.
The effect of this is that >> the GC cycle takes longer to complete, which in turn means that >> stalled threads wait longer than they would have done otherwise. >> >> The solution to this is very similar to the solution to JDK-8205022. >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8205024 >> Webrev: http://cr.openjdk.java.net/~pliden/8205024/webrev.0 >> >> Testing: tier{1,2,3,4} >> >> /Per From per.liden at oracle.com Thu Jun 14 15:18:36 2018 From: per.liden at oracle.com (Per Liden) Date: Thu, 14 Jun 2018 17:18:36 +0200 Subject: RFR: 8205022: ZGC: SoftReferences not always cleared before throwing OOME In-Reply-To: <06bdfd8c-3ded-39b5-49c6-b6e49131883a@oracle.com> References: <06bdfd8c-3ded-39b5-49c6-b6e49131883a@oracle.com> Message-ID: Thanks Stefan! /Per On 06/14/2018 02:13 PM, Stefan Karlsson wrote: > Looks good. > > StefanK > > On 2018-06-14 11:42, Per Liden wrote: >> The spec says "All soft references to softly-reachable objects are >> guaranteed to have been cleared before the virtual machine throws an >> OutOfMemoryError". >> >> We currently have a window/race in ZGC, where we can throw OOME before >> clearing all SoftReferences with a softly-reachable referent. The >> reason is that the current logic only guarantees that we wait with >> throwing OOME until the allocating thread have been part of a complete >> GC cycle, but there is no guarantee that the last GC cycle also >> cleared SoftReferences. >> >> This race is causing rare/intermittent failures in tier2 testing >> (TestSoftRerencesBahviorOnOOME). 
>> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8205022 >> Webrev: http://cr.openjdk.java.net/~pliden/8205022/webrev.0/ >> >> Testing: tier{1,2,3,4} >> >> /Per From per.liden at oracle.com Thu Jun 14 15:19:03 2018 From: per.liden at oracle.com (Per Liden) Date: Thu, 14 Jun 2018 17:19:03 +0200 Subject: RFR: 8205050: ZGC: Incorrect use of RootAccess in ZHeapIterator In-Reply-To: References: <934e4de9-fb5e-d4d5-e17d-91e20fb0bb14@oracle.com> Message-ID: Thanks Stefan! /Per On 06/14/2018 05:14 PM, Stefan Karlsson wrote: > Looks good. > > StefanK > > On 2018-06-14 16:44, Per Liden wrote: >> In ZHeapIteratorRootOopClosure::do_oop() we're using >> RootAccess<>::oop_load() to load oops. However, as the comment right >> above that line suggests, that's incorrect and we should have a load >> barrier here. In fact, we used to have a load barrier here, but this >> was for some mysterious reason changed. We should revert back to the >> code that used to be there. >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8205050 >> Webrev: http://cr.openjdk.java.net/~pliden/8205050/webrev.0 >> >> cheers, >> Per > > From erik.osterlund at oracle.com Thu Jun 14 15:31:02 2018 From: erik.osterlund at oracle.com (=?UTF-8?Q?Erik_=c3=96sterlund?=) Date: Thu, 14 Jun 2018 17:31:02 +0200 Subject: RFR: 8205050: ZGC: Incorrect use of RootAccess in ZHeapIterator In-Reply-To: <934e4de9-fb5e-d4d5-e17d-91e20fb0bb14@oracle.com> References: <934e4de9-fb5e-d4d5-e17d-91e20fb0bb14@oracle.com> Message-ID: <5B228A36.6060303@oracle.com> Hi Per, Looks good. Thanks, /Erik On 2018-06-14 16:44, Per Liden wrote: > In ZHeapIteratorRootOopClosure::do_oop() we're using > RootAccess<>::oop_load() to load oops. However, as the comment right > above that line suggests, that's incorrect and we should have a load > barrier here. In fact, we used to have a load barrier here, but this > was for some mysterious reason changed. We should revert back to the > code that used to be there. 
> > Bug: https://bugs.openjdk.java.net/browse/JDK-8205050 > Webrev: http://cr.openjdk.java.net/~pliden/8205050/webrev.0 > > cheers, > Per From erik.osterlund at oracle.com Thu Jun 14 15:40:47 2018 From: erik.osterlund at oracle.com (=?UTF-8?Q?Erik_=c3=96sterlund?=) Date: Thu, 14 Jun 2018 17:40:47 +0200 Subject: RFR: 8205022: ZGC: SoftReferences not always cleared before throwing OOME In-Reply-To: References: Message-ID: <5B228C7F.8000105@oracle.com> Hi Per, Looks good. Thanks, /Erik On 2018-06-14 11:42, Per Liden wrote: > The spec says "All soft references to softly-reachable objects are > guaranteed to have been cleared before the virtual machine throws an > OutOfMemoryError". > > We currently have a window/race in ZGC, where we can throw OOME before > clearing all SoftReferences with a softly-reachable referent. The > reason is that the current logic only guarantees that we wait with > throwing OOME until the allocating thread have been part of a complete > GC cycle, but there is no guarantee that the last GC cycle also > cleared SoftReferences. > > This race is causing rare/intermittent failures in tier2 testing > (TestSoftRerencesBahviorOnOOME). > > Bug: https://bugs.openjdk.java.net/browse/JDK-8205022 > Webrev: http://cr.openjdk.java.net/~pliden/8205022/webrev.0/ > > Testing: tier{1,2,3,4} > > /Per From erik.osterlund at oracle.com Thu Jun 14 15:42:22 2018 From: erik.osterlund at oracle.com (=?UTF-8?Q?Erik_=c3=96sterlund?=) Date: Thu, 14 Jun 2018 17:42:22 +0200 Subject: RFR: 8205024: ZGC: Worker threads boost mode not always enabled when is should be In-Reply-To: <7b02909d-fe55-3dea-2233-d16a2727e077@oracle.com> References: <7b02909d-fe55-3dea-2233-d16a2727e077@oracle.com> Message-ID: <5B228CDE.5010305@oracle.com> Hi Per, Looks good. 
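A side note on the SoftReference contract quoted in the 8205022 thread above ("all soft references to softly-reachable objects are guaranteed to have been cleared before the virtual machine throws an OutOfMemoryError"): from the application side, this also means code must always be prepared for get() to return null. A small illustrative sketch — this is not ZGC code, and the helper names are invented:

```java
import java.lang.ref.SoftReference;

public class SoftRefCache {
    // Wraps a value in a SoftReference. The GC may clear the referent under
    // memory pressure, and the spec guarantees all softly-reachable referents
    // are cleared before the VM ever throws OutOfMemoryError.
    static <T> SoftReference<T> cache(T value) {
        return new SoftReference<>(value);
    }

    // Callers must always be prepared for get() to return null.
    static <T> T lookup(SoftReference<T> ref, T fallback) {
        T value = ref.get();
        return (value != null) ? value : fallback;
    }
}
```

While the referent is still strongly reachable the reference will not be cleared; once it is only softly reachable, clearing can happen at any point before an OOME would otherwise be thrown.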
Thanks, /Erik On 2018-06-14 11:47, Per Liden wrote: > There's a race when deciding if worker threads should be boosted (very > similar to the race in JDK-8205022), which can result in worker > threads not being boosted when they should. The effect of this is that > the GC cycle takes longer to complete, which in turn means that > stalled threads wait longer than they would have done otherwise. > > The solution to this is very similar to the solution to JDK-8205022. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8205024 > Webrev: http://cr.openjdk.java.net/~pliden/8205024/webrev.0 > > Testing: tier{1,2,3,4} > > /Per From per.liden at oracle.com Thu Jun 14 15:48:14 2018 From: per.liden at oracle.com (Per Liden) Date: Thu, 14 Jun 2018 17:48:14 +0200 Subject: RFR: 8205024: ZGC: Worker threads boost mode not always enabled when is should be In-Reply-To: <5B228CDE.5010305@oracle.com> References: <7b02909d-fe55-3dea-2233-d16a2727e077@oracle.com> <5B228CDE.5010305@oracle.com> Message-ID: <39c9e82b-6fb1-4365-5501-a441c1ec7646@oracle.com> Thanks Erik! /Per On 06/14/2018 05:42 PM, Erik Österlund wrote: > Hi Per, > > Looks good. > > Thanks, > /Erik > > On 2018-06-14 11:47, Per Liden wrote: >> There's a race when deciding if worker threads should be boosted (very >> similar to the race in JDK-8205022), which can result in worker >> threads not being boosted when they should. The effect of this is that >> the GC cycle takes longer to complete, which in turn means that >> stalled threads wait longer than they would have done otherwise. >> >> The solution to this is very similar to the solution to JDK-8205022.
>> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8205024 >> Webrev: http://cr.openjdk.java.net/~pliden/8205024/webrev.0 >> >> Testing: tier{1,2,3,4} >> >> /Per > From per.liden at oracle.com Thu Jun 14 15:48:23 2018 From: per.liden at oracle.com (Per Liden) Date: Thu, 14 Jun 2018 17:48:23 +0200 Subject: RFR: 8205022: ZGC: SoftReferences not always cleared before throwing OOME In-Reply-To: <5B228C7F.8000105@oracle.com> References: <5B228C7F.8000105@oracle.com> Message-ID: <254b9941-bc76-0647-6f2f-93f0ba634470@oracle.com> Thanks Erik! /Per On 06/14/2018 05:40 PM, Erik Österlund wrote: > Hi Per, > > Looks good. > > Thanks, > /Erik > > On 2018-06-14 11:42, Per Liden wrote: >> The spec says "All soft references to softly-reachable objects are >> guaranteed to have been cleared before the virtual machine throws an >> OutOfMemoryError". >> >> We currently have a window/race in ZGC, where we can throw OOME before >> clearing all SoftReferences with a softly-reachable referent. The >> reason is that the current logic only guarantees that we wait with >> throwing OOME until the allocating thread have been part of a complete >> GC cycle, but there is no guarantee that the last GC cycle also >> cleared SoftReferences. >> >> This race is causing rare/intermittent failures in tier2 testing >> (TestSoftRerencesBahviorOnOOME). >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8205022 >> Webrev: http://cr.openjdk.java.net/~pliden/8205022/webrev.0/ >> >> Testing: tier{1,2,3,4} >> >> /Per > From per.liden at oracle.com Thu Jun 14 15:48:29 2018 From: per.liden at oracle.com (Per Liden) Date: Thu, 14 Jun 2018 17:48:29 +0200 Subject: RFR: 8205050: ZGC: Incorrect use of RootAccess in ZHeapIterator In-Reply-To: <5B228A36.6060303@oracle.com> References: <934e4de9-fb5e-d4d5-e17d-91e20fb0bb14@oracle.com> <5B228A36.6060303@oracle.com> Message-ID: <2c493e35-dc3e-286e-cc26-a9cc5b5bbc8d@oracle.com> Thanks Erik! /Per On 06/14/2018 05:31 PM, Erik Österlund wrote: > Hi Per, > > Looks good.
> > Thanks, > /Erik > > On 2018-06-14 16:44, Per Liden wrote: >> In ZHeapIteratorRootOopClosure::do_oop() we're using >> RootAccess<>::oop_load() to load oops. However, as the comment right >> above that line suggests, that's incorrect and we should have a load >> barrier here. In fact, we used to have a load barrier here, but this >> was for some mysterious reason changed. We should revert back to the >> code that used to be there. >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8205050 >> Webrev: http://cr.openjdk.java.net/~pliden/8205050/webrev.0 >> >> cheers, >> Per > From jiangli.zhou at oracle.com Thu Jun 14 16:56:54 2018 From: jiangli.zhou at oracle.com (Jiangli Zhou) Date: Thu, 14 Jun 2018 09:56:54 -0700 Subject: RFR(L): 8195097: Make it possible to process StringTable outside safepoint In-Reply-To: References: <0d09f2ad-47b9-2fca-e5ec-17547e923cbb@oracle.com> <14230A21-5B54-451C-884C-E9E922967A25@oracle.com> Message-ID: <933CF072-AE6A-4C17-9F85-5D077DC3B00E@oracle.com> Hi Robbin and David, > On Jun 13, 2018, at 10:20 PM, Robbin Ehn wrote: > > Hi David, thanks noting. > > Yes I'm aware of that, I included that fix in: > 8204613: StringTable: Calculates wrong number of uncleaned items. > Which is now reviewed and about to be pushed. Thanks for noticing that and fixing quickly. I didn't catch it in the updated review. Thanks! Jiangli > > /Robbin > > On 2018-06-14 06:59, David Holmes wrote: >> Hi Robbin and reviewers, >> On 5/06/2018 1:15 AM, Robbin Ehn wrote: >>> Hi Jiangli, >>> >>> On 2018-05-30 21:47, Jiangli Zhou wrote: >>>> Hi Robbin, >>>> >>>> I mainly focused on the archived string part during review. Here are my comments, which are minor issues mostly. >>>> >>>> - stringTable.hpp >>>> >>>> Archived string is only supported with INCLUDE_CDS_JAVA_HEAP. Please add NOT_CDS_JAVA_HEAP_RETURN_(NULL) for lookup_shared() and create_archived_string() below so their call sites are handled properly when java heap object archiving is not supported.
>>>> >>>> 153 oop lookup_shared(jchar* name, int len, unsigned int hash); >>>> 154 static oop create_archived_string(oop s, Thread* THREAD); >>>> >>> >>> Fixed >> create_archived_string() was not fixed. >> https://bugs.openjdk.java.net/browse/JDK-8205002 >> Cheers, >> David >>>> - stringTable.cpp >>>> >>>> How about renaming CopyArchive to CopyToArchive, so it's more descriptive? >>> >>> Fixed >>> >>>> >>>> Looks like the 'bool' return type is not needed since we always return with true and the result is not checked. How about changing it to return 'void'? >>>> >>>> 774 bool operator()(WeakHandle* val) { >>>> >>> >>> The scanning done by the hashtable is stopped if we return false here. >>> So the return value is used by the hashtable to know if it should continue the walk over all items. >>> >>>> >>>> - genCollectedHeap.cpp >>>> >>>> Based on the assert at line 863, looks like 'par_state_string' is not NULL when 'scope->n_threads() > 1'. Maybe the if condition at line 865 could be simplified to be just 'if (scope->n_threads() > 1)'? >>>> >>>> 862 // Either we should be single threaded or have a ParState >>>> 863 assert((scope->n_threads() <= 1) || par_state_string != NULL, "Parallel but not ParState"); >>>> 864 >>>> 865 if (scope->n_threads() > 1 && par_state_string != NULL) { >>> >>> Fixed, I'll send new version to rfr mail. >>> >>> Thanks, Robbin >>> >>>> >>>> >>>> Thanks, >>>> Jiangli >>>> >>>>> On May 28, 2018, at 6:19 AM, Robbin Ehn wrote: >>>>> >>>>> Hi all, please review. >>>>> >>>>> This implements the StringTable with the ConcurrentHashtable for managing the >>>>> strings using oopStorage for backing the actual oops via WeakHandles. >>>>> >>>>> The unlinking and freeing of hashtable nodes is moved outside the safepoint, >>>>> which means GC only needs to walk the oopStorage, either concurrently or in a >>>>> safepoint. Walking oopStorage is also faster so there is a good effect on all >>>>> safepoints visiting the oops.
>>>>> The unlinking and freeing happens during inserts when dead weak oops are >>>>> encountered in that bucket. In any normal workload the stringtable self-cleans >>>>> without needing any additional cleaning. Cleaning/unlinking can also be done >>>>> concurrently via the ServiceThread, it is started when we have a high 'dead >>>>> factor'. E.g. application have a lot of interned string removes the references >>>>> and never interns again. The ServiceThread also concurrently grows the table if >>>>> 'load factor' is high. Both the cleaning and growing take care to not prolonging >>>>> time to safepoint, at the cost of some speed. >>>>> >>>>> Kitchensink24h, multiple tier1-5 with no issue that I can relate to this >>>>> changeset, various benchmark such as JMH, specJBB2015. >>>>> >>>>> Issue: https://bugs.openjdk.java.net/browse/JDK-8195097 >>>>> Webrev: http://cr.openjdk.java.net/~rehn/8195097/v0/webrev/ >>>>> >>>>> Thanks, Robbin >>>> From per.liden at oracle.com Thu Jun 14 20:34:15 2018 From: per.liden at oracle.com (Per Liden) Date: Thu, 14 Jun 2018 22:34:15 +0200 Subject: RFR: 8205064: Fail immediately if an unsupported GC is selected Message-ID: <362ec4a9-2165-fcff-6b94-34bbe78120cf@oracle.com> If an unsupported GC (i.e. a GC that is not built into the VM) is selected by the user, the VM issues a warning and then continues and (silently) selects a different GC. Aleksey brought this up on the ZGC list [1]. I agree that this behavior seems dubious. With this patch we instead fail immediately to avoid unnecessary confusion.
Bug: https://bugs.openjdk.java.net/browse/JDK-8205064 Webrev: http://cr.openjdk.java.net/~pliden/8205064/webrev.0 Testing: manual + currently running in mach5 t{1,2,3} /Per [1] http://mail.openjdk.java.net/pipermail/zgc-dev/2018-June/000422.html From vinay.k.awasthi at intel.com Thu Jun 14 20:49:44 2018 From: vinay.k.awasthi at intel.com (Awasthi, Vinay K) Date: Thu, 14 Jun 2018 20:49:44 +0000 Subject: RFR(M): 8204908: NVDIMM for POGC and G1GC - ReserveSpace.cpp changes are mostly eliminated/no collector specific code. Message-ID: Now ReservedSpace.cpp has logic to only open NVDIMM File (as it was done for AllocateheapAt).. if successful, set up 3 flags (base/nvdimm_present/file handle) at the end. There is *NO* collector specific code. All work has been moved to g1PagebasedVirtualSpace.cpp.. I am committing memory here and setting dram_heapbase used by g1 here. JEP to support allocating Old generation on NV-DIMM - https://bugs.openjdk.java.net/browse/JDK-8202286 Here is the implementation bug link: https://bugs.openjdk.java.net/browse/JDK-8204908 Patch is Uploaded at (full patch/incremental patch) http://cr.openjdk.java.net/~kkharbas/8204908/webrev.02/ http://cr.openjdk.java.net/~kkharbas/8204908/webrev.02_to_01/ Tested default setup (i.e. no file is being passed for heap) and AllocateHeapAt/AllocateOldGenAt with POGC and G1GC.. all passing... Any and all comments are welcome! Thanks, Vinay -------------- next part -------------- An HTML attachment was scrubbed... URL: From rkennke at redhat.com Thu Jun 14 22:15:14 2018 From: rkennke at redhat.com (Roman Kennke) Date: Fri, 15 Jun 2018 00:15:14 +0200 Subject: RFR: 8205064: Fail immediately if an unsupported GC is selected In-Reply-To: <362ec4a9-2165-fcff-6b94-34bbe78120cf@oracle.com> References: <362ec4a9-2165-fcff-6b94-34bbe78120cf@oracle.com> Message-ID: <88c80596-0e78-dfc0-17ef-057759d5b5f9@redhat.com> Am 14.06.2018 um 22:34 schrieb Per Liden: > If an unsupported GC (i.e. 
a GC that is not built into the VM) is > selected by the user, the VM issues a warning and then continues and > (silently) selects a different GC. Aleksey brought this up on the ZGC > list [1]. I agree that this behavior seems dubious. With this patch we > instead fail immediately to avoid unnecessary confusion. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8205064 > Webrev: http://cr.openjdk.java.net/~pliden/8205064/webrev.0 > > Testing: manual + currently running in mach5 t{1,2,3} > > /Per > > [1] http://mail.openjdk.java.net/pipermail/zgc-dev/2018-June/000422.html Hi Per, I am not sure I'd ever set Epsilon as default, but it's the last in line, i.e. selected when built *only* with Epsilon, is that right? Why not include CMS in that list? If I build with CMS and Epsilon, I get Epsilon selected? Other than that, I think the patch is good. Thanks, Roman -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: OpenPGP digital signature URL: From shade at redhat.com Fri Jun 15 05:51:50 2018 From: shade at redhat.com (Aleksey Shipilev) Date: Fri, 15 Jun 2018 07:51:50 +0200 Subject: RFR: 8205064: Fail immediately if an unsupported GC is selected In-Reply-To: <88c80596-0e78-dfc0-17ef-057759d5b5f9@redhat.com> References: <362ec4a9-2165-fcff-6b94-34bbe78120cf@oracle.com> <88c80596-0e78-dfc0-17ef-057759d5b5f9@redhat.com> Message-ID: On 06/15/2018 12:15 AM, Roman Kennke wrote: > Am 14.06.2018 um 22:34 schrieb Per Liden: >> If an unsupported GC (i.e. a GC that is not built into the VM) is >> selected by the user, the VM issues a warning and then continues and >> (silently) selects a different GC. Aleksey brought this up on the ZGC >> list [1]. I agree that this behavior seems dubious. With this patch we >> instead fail immediately to avoid unnecessary confusion. 
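Until a fail-fast check like the one proposed in JDK-8205064 is in place, one way to notice a silent GC fallback is to ask the running VM which collectors it actually selected, via the standard GarbageCollectorMXBean API. An illustrative helper (the class and method names are invented for this sketch):

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

public class WhichGC {
    // Names of the collector(s) the running VM actually selected,
    // e.g. "G1 Young Generation" and "G1 Old Generation" under G1.
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println("Selected collectors: " + collectorNames());
    }
}
```

If the names printed do not match the collector requested on the command line, the VM fell back to a different GC.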
>> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8205064 >> Webrev: http://cr.openjdk.java.net/~pliden/8205064/webrev.0 Thanks! Looks good, modulo the comment below: > I am not sure I'd ever set Epsilon as default, but it's the last in > line, i.e. selected when built *only* with Epsilon, is that right? > Why not include CMS in that list? If I build with CMS and Epsilon, I get > Epsilon selected? Also, IIRC, if we do autoselect either ZGC or Epsilon, the argument checking would fail right away, because we need to unlock them with UnlockExperimentalVMOptions first: experimental(bool, UseEpsilonGC, false, \ "Use the Epsilon (no-op) garbage collector") \ \ experimental(bool, UseZGC, false, \ "Use the Z garbage collector") \ \ I think we better avoid adding experimental GCs to auto-selection, and just leave GCConfig::select_gc_ergonomically alone. -Aleksey -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: OpenPGP digital signature URL: From per.liden at oracle.com Fri Jun 15 06:34:44 2018 From: per.liden at oracle.com (Per Liden) Date: Fri, 15 Jun 2018 08:34:44 +0200 Subject: RFR: 8205064: Fail immediately if an unsupported GC is selected In-Reply-To: <88c80596-0e78-dfc0-17ef-057759d5b5f9@redhat.com> References: <362ec4a9-2165-fcff-6b94-34bbe78120cf@oracle.com> <88c80596-0e78-dfc0-17ef-057759d5b5f9@redhat.com> Message-ID: <78243003-7e48-2b16-6507-ecd6b02be4d8@oracle.com> On 2018-06-15 00:15, Roman Kennke wrote: > Am 14.06.2018 um 22:34 schrieb Per Liden: >> If an unsupported GC (i.e. a GC that is not built into the VM) is >> selected by the user, the VM issues a warning and then continues and >> (silently) selects a different GC. Aleksey brought this up on the ZGC >> list [1]. I agree that this behavior seems dubious. With this patch we >> instead fail immediately to avoid unnecessary confusion. 
>> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8205064 >> Webrev: http://cr.openjdk.java.net/~pliden/8205064/webrev.0 >> >> Testing: manual + currently running in mach5 t{1,2,3} >> >> /Per >> >> [1] http://mail.openjdk.java.net/pipermail/zgc-dev/2018-June/000422.html > > Hi Per, > > I am not sure I'd ever set Epsilon as default, but it's the last in > line, i.e. selected when built *only* with Epsilon, is that right? > > Why not include CMS in that list? If I build with CMS and Epsilon, I get > Epsilon selected? Oops, I never intended to keep that part of the patch, but forgot to remove. Dropping this part. +#elif INCLUDE_ZGC + FLAG_SET_ERGO_IF_DEFAULT(bool, UseZGC, true); +#elif INCLUDE_EPSILONGC + FLAG_SET_ERGO_IF_DEFAULT(bool, UseEpsilonGC, true); > > Other than that, I think the patch is good. Thanks! /Per > > Thanks, Roman > From per.liden at oracle.com Fri Jun 15 06:36:39 2018 From: per.liden at oracle.com (Per Liden) Date: Fri, 15 Jun 2018 08:36:39 +0200 Subject: RFR: 8205064: Fail immediately if an unsupported GC is selected In-Reply-To: References: <362ec4a9-2165-fcff-6b94-34bbe78120cf@oracle.com> <88c80596-0e78-dfc0-17ef-057759d5b5f9@redhat.com> Message-ID: On 2018-06-15 07:51, Aleksey Shipilev wrote: > On 06/15/2018 12:15 AM, Roman Kennke wrote: >> Am 14.06.2018 um 22:34 schrieb Per Liden: >>> If an unsupported GC (i.e. a GC that is not built into the VM) is >>> selected by the user, the VM issues a warning and then continues and >>> (silently) selects a different GC. Aleksey brought this up on the ZGC >>> list [1]. I agree that this behavior seems dubious. With this patch we >>> instead fail immediately to avoid unnecessary confusion. >>> >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8205064 >>> Webrev: http://cr.openjdk.java.net/~pliden/8205064/webrev.0 > > Thanks! Looks good, modulo the comment below: > >> I am not sure I'd ever set Epsilon as default, but it's the last in >> line, i.e. selected when built *only* with Epsilon, is that right? 
>> Why not include CMS in that list? If I build with CMS and Epsilon, I get >> Epsilon selected? > > Also, IIRC, if we do autoselect either ZGC or Epsilon, the argument checking would fail right away, > because we need to unlock them with UnlockExperimentalVMOptions first: > > experimental(bool, UseEpsilonGC, false, \ > "Use the Epsilon (no-op) garbage collector") \ > \ > experimental(bool, UseZGC, false, \ > "Use the Z garbage collector") \ > \ > > I think we better avoid adding experimental GCs to auto-selection, and just leave > GCConfig::select_gc_ergonomically alone. > Thanks for reviewing Aleksey. Dropping the auto-select part. /Per > -Aleksey > From kirk at kodewerk.com Fri Jun 15 06:50:00 2018 From: kirk at kodewerk.com (Kirk Pepperdine) Date: Fri, 15 Jun 2018 09:50:00 +0300 Subject: RFR (S): 8204082: Make names of Young GCs more uniform in logs In-Reply-To: References: <360d423eb720e93a1921e66dae71ee2e794439ec.camel@oracle.com> <01cf559a1c1762d66d26c1ac9ad7e473bb8cb1ef.camel@oracle.com> Message-ID: Hi Thomas, From a parsing POV, "Pause Young Normal" would be preferable to "Pause Young (Normal)". I know it's a small change but it does make a difference. The rest of the code is fine (of course not an official review). As for the change itself, [73.077s][info ][gc,start ] GC(262) Pause Initial Mark (G1 Humongous Allocation) is already perfectly clear to me. I'm not sure how [73.077s][info ][gc,start ] GC(262) Pause Young (Initial Mark) (G1 Humongous Allocation) clarifies things. IMO, this change only adds to the already high level of noise in the GC logs. On that note, I have time scheduled in July to see what can be done to reduce redundancy in the logs. Kind regards, Kirk On Thu, Jun 14, 2018 at 4:02 PM, Thomas Schatzl wrote: > Hi all, > > another round of reviews after some more internal remarks :P > > I also changed the title of the CR. > > The set of "final" tags would be: > > Pause Young (Normal) ...
> Pause Young (Concurrent Start) ... > Pause Young (Concurrent End) ... > Pause Young (Mixed) ... > > I also adapted the strings in the GCVerifyType functionality. > > http://cr.openjdk.java.net/~tschatzl/8204082/webrev.1_to_2 (diff) > http://cr.openjdk.java.net/~tschatzl/8204082/webrev.2 (full) > > Testing: > running through all gc tests locally > > Thanks, > Thomas > From stefan.johansson at oracle.com Fri Jun 15 08:46:06 2018 From: stefan.johansson at oracle.com (Stefan Johansson) Date: Fri, 15 Jun 2018 10:46:06 +0200 Subject: RFR (S): 8204082: Make names of Young GCs more uniform in logs In-Reply-To: References: <360d423eb720e93a1921e66dae71ee2e794439ec.camel@oracle.com> <01cf559a1c1762d66d26c1ac9ad7e473bb8cb1ef.camel@oracle.com> Message-ID: <908f4150-ebf9-f374-3e77-a7e9c00293a4@oracle.com> Hi Kirk, On 2018-06-15 08:50, Kirk Pepperdine wrote: > Hi Thomas, > > From a parsing POV, "Pause Young Normal" would be > preferable to "Pause Young (Normal)". I know it's a small change but it > does make a difference. The rest of the code is fine (of course not an > official review). > > As for the change itself, [73.077s][info ][gc,start ] GC(262) > Pause Initial Mark (G1 Humongous Allocation) is already perfectly clear > to me. I'm not sure how > [73.077s][info ][gc,start ] GC(262) Pause Young (Initial Mark) (G1 > Humongous Allocation) clarifies things. IMO, this change only adds to > the already high level of noise in the GC logs. I think this makes it more clear that Initial Mark is actually doing a Young GC as well as initiating the concurrent cycle and with the new proposed tags it will be even more clear IMHO. Cheers, Stefan > On that note, I have > time scheduled in July to see what can be done to reduce redundancy in > the logs. > > Kind regards, > Kirk > > > > On Thu, Jun 14, 2018 at 4:02 PM, Thomas Schatzl > > wrote: > > Hi all, > >
another round of reviews after some more internal remarks :P > > I also changed the title of the CR. > > The set of "final" tags would be: > > Pause Young (Normal) ... > Pause Young (Concurrent Start) ... > Pause Young (Concurrent End) ... > Pause Young (Mixed) ... > > I also adapted the strings in the GCVerifyType functionality. > > http://cr.openjdk.java.net/~tschatzl/8204082/webrev.1_to_2 > (diff) > http://cr.openjdk.java.net/~tschatzl/8204082/webrev.2 > (full) > > Testing: > running through all gc tests locally > > Thanks, > Thomas > > From kirk at kodewerk.com Fri Jun 15 09:44:38 2018 From: kirk at kodewerk.com (Kirk Pepperdine) Date: Fri, 15 Jun 2018 12:44:38 +0300 Subject: RFR (S): 8204082: Make names of Young GCs more uniform in logs In-Reply-To: <908f4150-ebf9-f374-3e77-a7e9c00293a4@oracle.com> References: <360d423eb720e93a1921e66dae71ee2e794439ec.camel@oracle.com> <01cf559a1c1762d66d26c1ac9ad7e473bb8cb1ef.camel@oracle.com> <908f4150-ebf9-f374-3e77-a7e9c00293a4@oracle.com> Message-ID: > On Jun 15, 2018, at 11:46 AM, Stefan Johansson wrote: > > Hi Kirk, > > On 2018-06-15 08:50, Kirk Pepperdine wrote: >> Hi Thomas, >> From a parsing POV, "Pause Young Normal" would be preferable to "Pause Young (Normal)". I know it's a small change but it does make a difference. The rest of the code is fine (of course not an official review). >> As for the change itself, [73.077s][info ][gc,start ] GC(262) Pause Initial Mark (G1 Humongous Allocation) is already perfectly clear to me. I'm not sure how >> [73.077s][info ][gc,start ] GC(262) Pause Young (Initial Mark) (G1 Humongous Allocation) clarifies things. IMO, this change only adds to the already high level of noise in the GC logs. > I think this makes it more clear that Initial Mark is actually doing a Young GC as well as initiating the concurrent cycle and with the new proposed tags it will be even more clear IMHO. Sure... but a GC log is really an audit trail of GC events...
when the collector ran, what the collector did, and how long it took to get the work done. In order to understand the audit trail you have to have some knowledge of how the collector works. Thus adding "Pause" and "Young" to Initial Mark is redundant to those that know and not any more helpful to those that don't know how the collector works. And when you consider the use case for GC logs, it's mostly going to be managed by tooling anyways. If you decide to keep the redundant information I would still like to request that the braces be dropped. Kind regards, Kirk >> On Thu, Jun 14, 2018 at 4:02 PM, Thomas Schatzl > wrote: >> Hi all, >> another round of reviews after some more internal remarks :P >> I also changed the title of the CR. >> The set of "final" tags would be: >> Pause Young (Normal) ... >> Pause Young (Concurrent Start) ... >> Pause Young (Concurrent End) ... >> Pause Young (Mixed) ... >> I also adapted the strings in the GCVerifyType functionality. >> http://cr.openjdk.java.net/~tschatzl/8204082/webrev.1_to_2 >> (diff) >> http://cr.openjdk.java.net/~tschatzl/8204082/webrev.2 >> (full) >> Testing: >> running through all gc tests locally >> Thanks, >> Thomas From per.liden at oracle.com Fri Jun 15 12:09:11 2018 From: per.liden at oracle.com (Per Liden) Date: Fri, 15 Jun 2018 14:09:11 +0200 Subject: RFR: 8205064: Fail immediately if an unsupported GC is selected In-Reply-To: References: <362ec4a9-2165-fcff-6b94-34bbe78120cf@oracle.com> <88c80596-0e78-dfc0-17ef-057759d5b5f9@redhat.com> Message-ID: Updated webrev with the auto-select stuff removed. http://cr.openjdk.java.net/~pliden/8205064/webrev.1 I need to check if this change requires a CSR, in which case this might or might not make it into 11, but I'll try. /Per On 06/15/2018 08:36 AM, Per Liden wrote: > On 2018-06-15 07:51, Aleksey Shipilev wrote: >> On 06/15/2018 12:15 AM, Roman Kennke wrote: >>> Am 14.06.2018 um 22:34 schrieb Per Liden: >>>> If an unsupported GC (i.e.
a GC that is not built into the VM) is >>>> selected by the user, the VM issues a warning and then continues and >>>> (silently) selects a different GC. Aleksey brought this up on the ZGC >>>> list [1]. I agree that this behavior seems dubious. With this patch we >>>> instead fail immediately to avoid unnecessary confusion. >>>> >>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8205064 >>>> Webrev: http://cr.openjdk.java.net/~pliden/8205064/webrev.0 >> >> Thanks! Looks good, modulo the comment below: >> >>> I am not sure I'd ever set Epsilon as default, but it's the last in >>> line, i.e. selected when built *only* with Epsilon, is that right? >>> Why not include CMS in that list? If I build with CMS and Epsilon, I get >>> Epsilon selected? >> >> Also, IIRC, if we do autoselect either ZGC or Epsilon, the argument >> checking would fail right away, >> because we need to unlock them with UnlockExperimentalVMOptions first: >> >> experimental(bool, UseEpsilonGC, false, \ >> "Use the Epsilon (no-op) garbage collector") \ >> \ >> experimental(bool, UseZGC, false, \ >> "Use the Z garbage collector") \ >> \ >> >> I think we better avoid adding experimental GCs to auto-selection, and >> just leave >> GCConfig::select_gc_ergonomically alone. >> > > Thanks for reviewing Aleksey. Dropping the auto-select part.
> > /Per > >> -Aleksey >> From shade at redhat.com Fri Jun 15 12:11:43 2018 From: shade at redhat.com (Aleksey Shipilev) Date: Fri, 15 Jun 2018 14:11:43 +0200 Subject: RFR: 8205064: Fail immediately if an unsupported GC is selected In-Reply-To: References: <362ec4a9-2165-fcff-6b94-34bbe78120cf@oracle.com> <88c80596-0e78-dfc0-17ef-057759d5b5f9@redhat.com> Message-ID: <658e2c12-ad52-e73e-9b66-2c0a9a56e326@redhat.com> On 06/15/2018 02:09 PM, Per Liden wrote: > Updated webrev with the auto-select stuff removed. > > http://cr.openjdk.java.net/~pliden/8205064/webrev.1 Looks good! > I need to check if this change requires a CSR, in which case this might or might not make it into > 11, but I'll try. Thank you. Once CSR is there, I would vouch for this fix to be low risk. -Aleksey From stefan.johansson at oracle.com Fri Jun 15 12:24:49 2018 From: stefan.johansson at oracle.com (Stefan Johansson) Date: Fri, 15 Jun 2018 14:24:49 +0200 Subject: RFR (XL): 8202845: Refactor reference processing for improved parallelism In-Reply-To: <5EB91B90-6A06-42E3-A138-859600F4D02D@oracle.com> References: <61154A78-083C-456C-97F9-2D2986B67168@oracle.com> <17748553-CE38-4E0F-AFCE-43F0107FDDE1@oracle.com> <9A97AFF1-0A3B-471B-B995-C39627146AA6@oracle.com> <66141559-E8EF-4654-9BBE-0D62E7FE4E02@oracle.com> <7a8fc4917623be1649b07e1147c8df6463c14e8a.camel@oracle.com> <72FB6B58-45FC-4B57-8484-915C6B63777A@oracle.com> <6d103b4fe2caf32c7ff8d3599522338ba8021ccc.camel@oracle.com> <5EB91B90-6A06-42E3-A138-859600F4D02D@oracle.com> Message-ID: <0858e040-c781-7022-4917-84209223204f@oracle.com> On 2018-06-13 23:53, Kim Barrett wrote: >> On Jun 13, 2018, at 5:26 PM, Thomas Schatzl wrote: >> >>>> >>>> http://cr.openjdk.java.net/~tschatzl/8202845/webrev.2_to_3 (diff) >>>> and >>>>
http://cr.openjdk.java.net/~tschatzl/8202845/webrev.3 (full) >> >> Currently running hs-tier1-3, but I do not expect issues. >> >> /Per >> >> [1] http://mail.openjdk.java.net/pipermail/zgc-dev/2018-June/000422.html > > Hi Per, > > I am not sure I'd ever set Epsilon as default, but it's the last in > line, i.e. selected when built *only* with Epsilon, is that right? > > Why not include CMS in that list? If I build with CMS and Epsilon, I get > Epsilon selected? Oops, I never intended to keep that part of the patch, but forgot to remove. Dropping this part. +#elif INCLUDE_ZGC + FLAG_SET_ERGO_IF_DEFAULT(bool, UseZGC, true); +#elif INCLUDE_EPSILONGC + FLAG_SET_ERGO_IF_DEFAULT(bool, UseEpsilonGC, true); > > Other than that, I think the patch is good. Thanks! /Per > > Thanks, Roman > From per.liden at oracle.com Fri Jun 15 06:36:39 2018 From: per.liden at oracle.com (Per Liden) Date: Fri, 15 Jun 2018 08:36:39 +0200 Subject: RFR: 8205064: Fail immediately if an unsupported GC is selected In-Reply-To: References: <362ec4a9-2165-fcff-6b94-34bbe78120cf@oracle.com> <88c80596-0e78-dfc0-17ef-057759d5b5f9@redhat.com> Message-ID: On 2018-06-15 07:51, Aleksey Shipilev wrote: > On 06/15/2018 12:15 AM, Roman Kennke wrote: >> Am 14.06.2018 um 22:34 schrieb Per Liden: >>> If an unsupported GC (i.e. a GC that is not built into the VM) is >>> selected by the user, the VM issues a warning and then continues and >>> (silently) selects a different GC. Aleksey brought this up on the ZGC >>> list [1]. I agree that this behavior seems dubious. With this patch we >>> instead fail immediately to avoid unnecessary confusion. >>> >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8205064 >>> Webrev: http://cr.openjdk.java.net/~pliden/8205064/webrev.0 > > Thanks! Looks good, modulo the comment below: > >> I am not sure I'd ever set Epsilon as default, but it's the last in >> line, i.e. selected when built *only* with Epsilon, is that right?
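[Editorial sketch, not part of the original mail: the "dynamic number of threads" idea referenced in this thread can be modeled roughly as below. The method name and exact clamping are assumptions for illustration only; the real heuristic lives in HotSpot's C++ reference processor and is governed by flags such as ParallelGCThreads and ReferencesPerThread.]

```java
public class RefProcSizing {
    // Illustrative model of dynamic reference-processing thread sizing:
    // use roughly one worker per refsPerThread discovered references,
    // clamped to [1, maxWorkers]. Not the actual HotSpot code.
    static int workers(long discoveredRefs, int maxWorkers, int refsPerThread) {
        if (refsPerThread <= 0) {
            return maxWorkers; // treat 0 as "no scaling": use all workers
        }
        long wanted = (discoveredRefs + refsPerThread - 1) / refsPerThread;
        return (int) Math.max(1, Math.min(maxWorkers, wanted));
    }

    public static void main(String[] args) {
        System.out.println(workers(0, 8, 1000));         // few refs -> 1 worker
        System.out.println(workers(2500, 8, 1000));      // -> 3 workers
        System.out.println(workers(1_000_000, 8, 1000)); // clamped -> 8 workers
    }
}
```

The point of the clamp to a minimum of one worker and a maximum of the configured thread count is exactly the concern raised in the review: avoid being slower than the old single-threaded path when there are few references, while still scaling up for reference-heavy workloads.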
> > This is also the reason why we only suggest to make > ParallelRefProcEnabled default for G1: the thread sizing does not work > with other collectors. > > There is also a linked CSR for that change. > > CR: > https://bugs.openjdk.java.net/browse/JDK-8205043 > Webrev: > http://cr.openjdk.java.net/~tschatzl/8205043/webrev/ Looks good, StefanJ > Testing: > new included test case > > Thanks, > Thomas > From thomas.schatzl at oracle.com Fri Jun 15 13:47:02 2018 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Fri, 15 Jun 2018 15:47:02 +0200 Subject: RFR (S): 8204082: Make names of Young GCs more uniform in logs In-Reply-To: References: <360d423eb720e93a1921e66dae71ee2e794439ec.camel@oracle.com> <01cf559a1c1762d66d26c1ac9ad7e473bb8cb1ef.camel@oracle.com> Message-ID: Hi, during some more discussion about the messages there was the concern that the "Concurrent End" message is not really an indication of the concurrent marking end (this happens asynchronously), also that pause may not occur (if no mixed gc starts after marking), so we came up with "Prepare Mixed" for it. "Concurrent End" would probably fit better for the current "Cleanup" pause. I.e. Pause Young (Normal) ... Pause Young (Concurrent Start) ... Pause Young (Prepare Mixed) ... Pause Young (Mixed) ... http://cr.openjdk.java.net/~tschatzl/8204082/webrev.2_to_3 (diff) http://cr.openjdk.java.net/~tschatzl/8204082/webrev.3 (full) Thanks, Thomas On Thu, 2018-06-14 at 15:02 +0200, Thomas Schatzl wrote: > Hi all, > > another round of reviews after some more internal remarks :P > > I also changed the title of the CR. > > The set of "final" tags would be: > > Pause Young (Normal) ... > Pause Young (Concurrent Start) ... > Pause Young (Concurrent End) ... > Pause Young (Mixed) ... > > I also adapted the strings in the GCVerifyType functionality.
> > http://cr.openjdk.java.net/~tschatzl/8204082/webrev.1_to_2 (diff) > http://cr.openjdk.java.net/~tschatzl/8204082/webrev.2 (full) > > Testing: > running through all gc tests locally > > Thanks, > Thomas > From thomas.schatzl at oracle.com Fri Jun 15 13:53:27 2018 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Fri, 15 Jun 2018 15:53:27 +0200 Subject: RFR(M): 8204908: NVDIMM for POGC and G1GC - ReserveSpace.cpp changes are mostly eliminated/no collector specific code. In-Reply-To: References: Message-ID: Hi Vinay, On Thu, 2018-06-14 at 20:49 +0000, Awasthi, Vinay K wrote: > Now ReservedSpace.cpp has logic to only open NVDIMM File (as it was > done for AllocateheapAt).. if successful, set up 3 flags > (base/nvdimm_present/file handle) at the end. There is *NO* collector > specific code. > > All work has been moved to g1PagebasedVirtualSpace.cpp.. I am > committing memory here and setting dram_heapbase used by g1 here. > > JEP to support allocating Old generation on NV-DIMM - https://bugs.op > enjdk.java.net/browse/JDK-8202286 > Here is the implementation bug link: https://bugs.openjdk.java.net/br > owse/JDK-8204908 > > > Patch is Uploaded at (full patch/incremental patch) > > http://cr.openjdk.java.net/~kkharbas/8204908/webrev.02/ > http://cr.openjdk.java.net/~kkharbas/8204908/webrev.02_to_01/ > Tested default setup (i.e. no file is being passed for heap) and > AllocateHeapAt/AllocateOldGenAt with POGC and G1GC.. all passing... > Any and all comments are welcome! > looking briefly through the changes, I think they look much better already to move the G1 specific stuff into G1 code; however I would like to think about how we could reduce the complexity further and solve the case of allowing multiple mapping sources (tmpfs file, nvram, different "types" of RAM) for different parts of the heap in an even cleaner way.
Thanks, Thomas From thomas.schatzl at oracle.com Fri Jun 15 14:14:10 2018 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Fri, 15 Jun 2018 16:14:10 +0200 Subject: RFR (S): 8204082: Make names of Young GCs more uniform in logs In-Reply-To: References: <360d423eb720e93a1921e66dae71ee2e794439ec.camel@oracle.com> <01cf559a1c1762d66d26c1ac9ad7e473bb8cb1ef.camel@oracle.com> <908f4150-ebf9-f374-3e77-a7e9c00293a4@oracle.com> Message-ID: <739d3bfae6445db15a7b9d51892e1ffc20cdc1ae.camel@oracle.com> Hi, On Fri, 2018-06-15 at 12:44 +0300, Kirk Pepperdine wrote: > > On Jun 15, 2018, at 11:46 AM, Stefan Johansson > acle.com> wrote: > > > > Hi Kirk, > > > > On 2018-06-15 08:50, Kirk Pepperdine wrote: > > > Hi Thomas, > > > From a parsing POV, "Pause Young Normal" would > > > be preferable to "Pause Young (Normal)". I know it's a small > > > change but it does make a difference. The rest of the code is > > > fine (of course not an official review). > > > As for the change itself, [73.077s][info ][gc,start ] > > > GC(262) Pause Initial Mark (G1 Humongous Allocation) is already > > > perfectly clear to me. I'm not sure how > > > [73.077s][info ][gc,start ] GC(262) Pause Young (Initial > > > Mark) (G1 Humongous Allocation) clarifies things. IMO, this > > > change only adds to the already high level of noise in the GC > > > logs. > > > > I think this makes it more clear that Initial Mark is actually > > doing a Young GC as well as initiating the concurrent cycle and > > with the new proposed tags it will be even more clear IMHO. > > Sure... but a GC log is really an audit trail of GC events... when the > collector ran, what the collector did, and how long it took to get > the work done. In order to understand the audit trail you have to > have some knowledge of how the collector works. Thus adding "Pause" > and "Young" to Initial Mark is redundant to those that know and not > any more helpful to those that don't know how the collector works.
it is helpful to me, e.g., to filter out anything that does not > start with "Pause Young" immediately on a single look. > >> And when you consider the use case for GC logs, it's mostly going to >> be managed by tooling anyways. > > As mentioned in the original name of the CR ("Add indication that this > is the "Last Young GC before Mixed" to logs") this change is mostly > intended to ease staring at crash logs (debugging) where you typically > do not have the chance to do another run (or adding more extensive > logging makes the problem not appear) or turn on more logging. > > I.e. analysis with no tools at all. Or at most something like grep, awk > and sed at quick disposal. > > So this is for a different purpose than auditing. Since JFR is open > source now I recommend moving to feeding tools from that longer term. > I agree that at the moment it does not give you the same information, > but I assume the community will improve that over time. > > Thanks, > Thomas From kirk at kodewerk.com Fri Jun 15 14:58:50 2018 From: kirk at kodewerk.com (Kirk Pepperdine) Date: Fri, 15 Jun 2018 17:58:50 +0300 Subject: RFR (S): 8204082: Make names of Young GCs more uniform in logs In-Reply-To: <739d3bfae6445db15a7b9d51892e1ffc20cdc1ae.camel@oracle.com> References: <360d423eb720e93a1921e66dae71ee2e794439ec.camel@oracle.com> <01cf559a1c1762d66d26c1ac9ad7e473bb8cb1ef.camel@oracle.com> <908f4150-ebf9-f374-3e77-a7e9c00293a4@oracle.com> <739d3bfae6445db15a7b9d51892e1ffc20cdc1ae.camel@oracle.com> Message-ID: > On Jun 15, 2018, at 5:14 PM, Thomas Schatzl wrote: > > Hi, > > On Fri, 2018-06-15 at 12:44 +0300, Kirk Pepperdine wrote: >>> On Jun 15, 2018, at 11:46 AM, Stefan Johansson >> acle.com> wrote: >>> >>> Hi Kirk, >>> >>> On 2018-06-15 08:50, Kirk Pepperdine wrote: >>>> Hi Thomas, >>>> From a parsing POV, "Pause Young Normal" would >>>> be preferable to "Pause Young (Normal)". I know it's a small >>>> change but it does make a difference.
The rest of the code is >>>> fine (of course not an official review). >>>> As for the change itself, [73.077s][info ][gc,start ] >>>> GC(262) Pause Initial Mark (G1 Humongous Allocation) is already >>>> perfectly clear to me. I'm not sure how >>>> [73.077s][info ][gc,start ] GC(262) Pause Young (Initial >>>> Mark) (G1 Humongous Allocation) clarifies things. IMO, this >>>> change only adds to the already high level of noise in the GC >>>> logs. >>> >>> I think this makes it more clear that Initial Mark is actually >>> doing a Young GC as well as initiating the concurrent cycle and >>> with the new proposed tags it will be even more clear IMHO. >> >> Sure... but a GC log is really an audit trail of GC events... when the >> collector ran, what the collector did, and how long it took to get >> the work done. In order to understand the audit trail you have to >> have some knowledge of how the collector works. Thus adding "Pause" >> and "Young" to Initial Mark is redundant to those that know and not >> any more helpful to those that don't know how the collector works. > > it is helpful to me, e.g., to filter out anything that does not > start with "Pause Young" immediately on a single look. Then, we should defer to your suggested format as it adds qualitative value to how you process these logs. The only thing that I would ask is if we can drop the braces as it makes it more difficult for me to support rule sets that span multiple versions of the JVM. > >> And when you consider the use case for GC logs, it's mostly going to >> be managed by tooling anyways. > > As mentioned in the original name of the CR ("Add indication that this > is the "Last Young GC before Mixed" to logs") this change is mostly > intended to ease staring at crash logs (debugging) where you typically > do not have the chance to do another run (or adding more extensive > logging makes the problem not appear) or turn on more logging. > > I.e. analysis with no tools at all.
Or at most something like grep, awk > and sed at quick disposal. > > So this is for a different purpose than auditing. On one shot debugging.. yup, been there.. try debugging device drivers.. if you mess up there, it not only freezes the kernel (forget about any messages at all) but also most likely corrupts your boot disk, which will need to be reformatted... As for the logging comment, myself and a few others have been trying to get people to distinguish between logging (primarily for errors) and journaling. Frameworks such as log4j have been historically (ab)used for the purposes of journaling which generally results in massive performance regressions. Journaling things such as GC events really creates an audit trail and there are far better solutions for journaling than log4j. > Since JFR is open > source now I recommend moving to feeding tools from that longer term. > I agree that at the moment it does not give you the same information, > but I assume the community will improve that over time. I would love to move away from our dependency on GC logging. It's not only a pain for you to hear from me on what are essentially bike shedding issues but it's also a pain for us to deal with the logs. Logging is the worst source of information save all the others. Unfortunately I think it's going to be a while before this can happen as our analytics for use in production systems rely on data that is only found in the logs.
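[Editorial sketch, not part of the original mail: to make the parsing concern in this thread concrete, here is a minimal matcher for the unified-logging pause lines being debated. The class, pattern, and field layout are illustrative assumptions; real tooling must handle many more line shapes, and old-style names like "Pause Initial Mark" would need their own handling.]

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GcPauseLine {
    // Matches e.g. "GC(262) Pause Young (Concurrent Start) (G1 Humongous Allocation)".
    // Both parenthesized parts are optional, so a bare "Pause Young" also matches.
    static final Pattern PAUSE = Pattern.compile(
            "GC\\((\\d+)\\) Pause (\\w+)(?: \\(([^)]+)\\))?(?: \\(([^)]+)\\))?");

    // Returns {id, pauseKind, subtype, cause}, with nulls for absent groups,
    // or null if the line is not a pause line at all.
    static String[] parse(String line) {
        Matcher m = PAUSE.matcher(line);
        if (!m.find()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2), m.group(3), m.group(4) };
    }

    public static void main(String[] args) {
        String line = "[73.077s][info ][gc,start     ] GC(262) "
                + "Pause Young (Concurrent Start) (G1 Humongous Allocation)";
        String[] f = parse(line);
        System.out.println(f[0] + " / " + f[1] + " / " + f[2] + " / " + f[3]);
    }
}
```

Keeping the subtype in parentheses (as in the proposed format) actually makes it separable from the GC cause by position alone, which is one way to read the disagreement above: the parentheses cost a slightly hairier regex but buy unambiguous fields.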
Kind regards, Kirk From per.liden at oracle.com Fri Jun 15 15:15:24 2018 From: per.liden at oracle.com (Per Liden) Date: Fri, 15 Jun 2018 17:15:24 +0200 Subject: RFR: 8205064: Fail immediately if an unsupported GC is selected In-Reply-To: <658e2c12-ad52-e73e-9b66-2c0a9a56e326@redhat.com> References: <362ec4a9-2165-fcff-6b94-34bbe78120cf@oracle.com> <88c80596-0e78-dfc0-17ef-057759d5b5f9@redhat.com> <658e2c12-ad52-e73e-9b66-2c0a9a56e326@redhat.com> Message-ID: <8943afb7-b36f-eeae-6c4a-45a3d991816e@oracle.com> Hi Aleksey, On 06/15/2018 02:11 PM, Aleksey Shipilev wrote: > On 06/15/2018 02:09 PM, Per Liden wrote: >> Updated webrev with the auto-select stuff removed. >> >> http://cr.openjdk.java.net/~pliden/8205064/webrev.1 > > Looks good! > >> I need to check if this change requires a CSR, in which case this might or might not make it into >> 11, but I'll try. > > Thank you. Once CSR is there, I would vouch for this fix to be low risk. Please review the CSR, and add yourself as a reviewer. Others are of course also welcome to do so. https://bugs.openjdk.java.net/browse/JDK-8205109 Thanks! /Per > > -Aleksey > From kim.barrett at oracle.com Fri Jun 15 16:42:28 2018 From: kim.barrett at oracle.com (Kim Barrett) Date: Fri, 15 Jun 2018 12:42:28 -0400 Subject: RFR(M) 8043575: Dynamically parallelize reference processing work In-Reply-To: References: <8af7594e-d5f4-ca0d-de73-6547ea131f3e@oracle.com> <42bc259b93355a98d66a8a14b1ba8d04ed33277c.camel@oracle.com> <25c462ac-4fef-6661-2280-635f280ad7a7@oracle.com> Message-ID: <2AF2D665-4B8C-4DCB-8A96-520CB8024D08@oracle.com> > On Jun 14, 2018, at 7:34 AM, Thomas Schatzl wrote: > > Hi all, > > after some talk about making parallel ref processing default for G1 > the suggestion came up to extract that part into a separate CR. I will > post that one shortly. 
> > However, this means that there is a trivial change in this webrev to be > looked at: > > http://cr.openjdk.java.net/~tschatzl/8043575/webrev.3_to_4 (diff) > http://cr.openjdk.java.net/~tschatzl/8043575/webrev.4 (full) > > This is the full change, for reference: > > --- old/src/hotspot/share/gc/g1/g1Arguments.cpp 2018-06-14 > 13:30:08.166027425 +0200 > +++ new/src/hotspot/share/gc/g1/g1Arguments.cpp 2018-06-14 > 13:30:07.818016510 +0200 > @@ -122,10 +122,6 @@ > FLAG_SET_DEFAULT(GCPauseIntervalMillis, MaxGCPauseMillis + 1); > } > > - if (FLAG_IS_DEFAULT(ParallelRefProcEnabled) && ParallelGCThreads > > 1) { > - FLAG_SET_DEFAULT(ParallelRefProcEnabled, true); > - } > - > log_trace(gc)("MarkStackSize: %uk MarkStackSizeMax: %uk", (unsigned > int) (MarkStackSize / K), (uint) (MarkStackSizeMax / K)); > > // By default do not let the target stack size to be more than 1/4 > of the entries > > Thanks, > Thomas Looks good. From kim.barrett at oracle.com Fri Jun 15 16:45:12 2018 From: kim.barrett at oracle.com (Kim Barrett) Date: Fri, 15 Jun 2018 12:45:12 -0400 Subject: RFR (XS): 8205043: Make parallel reference processing default for G1 In-Reply-To: <7daf58dfd770761768a743a6fca330cf8dfb76ba.camel@oracle.com> References: <7daf58dfd770761768a743a6fca330cf8dfb76ba.camel@oracle.com> Message-ID: > On Jun 14, 2018, at 8:08 AM, Thomas Schatzl wrote: > > Hi all, > > can I have reviews for this split-off of JDK-8043575 to make parallel > reference processing in conjunction with dynamic number of thread > sizing default for G1? > > We think that with recent changes to parallel reference processing it > is useful to do so, further nowadays we expect that most VMs are run > with more than one GC thread. So reference processing should benefit > from that as well by default. > > Threads are by default automatically limited by the functionality > introduced with JDK-8043575 to avoid actually being slower than before > if using too many threads. 
> > This is also the reason why we only suggest to make > ParallelRefProcEnabled default for G1: the thread sizing does not work > with other collectors. > > There is also a linked CSR for that change. > > CR: > https://bugs.openjdk.java.net/browse/JDK-8205043 > Webrev: > http://cr.openjdk.java.net/~tschatzl/8205043/webrev/ > Testing: > new included test case > > Thanks, > Thomas Looks good. From vinay.k.awasthi at intel.com Fri Jun 15 17:59:28 2018 From: vinay.k.awasthi at intel.com (Awasthi, Vinay K) Date: Fri, 15 Jun 2018 17:59:28 +0000 Subject: RFR(M): 8204908: NVDIMM for POGC and G1GC - ReserveSpace.cpp changes are mostly eliminated/no collector specific code. In-Reply-To: References: Message-ID: HI Thomas, Thanks for your input.. Now there is *no* change in virtualspace.cpp... I moved reserve and commit (this is how memory backed by file is handled) from reserve space to commit places in respective gcs... All changes are again localized and isolated with os::has_nvdimm()/AllocateOldGenAT. There are also fixes (1 line changes) added related to alignment and there is no un-mapping etc.. before mapping nvdimm backed dax file. The full patch is here.. http://cr.openjdk.java.net/~kkharbas/8204908/webrev.04 Any input is welcome. Thanks, Vinay -----Original Message----- From: Thomas Schatzl [mailto:thomas.schatzl at oracle.com] Sent: Friday, June 15, 2018 6:53 AM To: Awasthi, Vinay K ; 'Paul Su' ; 'hotspot-gc-dev at openjdk.java.net' ; 'Hotspot dev runtime' Cc: Kharbas, Kishor ; Aundhe, Shirish ; Viswanathan, Sandhya Subject: Re: RFR(M): 8204908: NVDIMM for POGC and G1GC - ReserveSpace.cpp changes are mostly eliminated/no collector specific code. Hi Vinay, On Thu, 2018-06-14 at 20:49 +0000, Awasthi, Vinay K wrote: > Now ReservedSpace.cpp has logic to only open NVDIMM File (as it was > done for AllocateheapAt).. if successful, set up 3 flags > (base/nvdimm_present/file handle) at the end. There is *NO* collector > specific code.
> > All work has been moved to g1PagebasedVirtualSpace.cpp.. I am > committing memory here and setting dram_heapbase used by g1 here. > > JEP to support allocating Old generation on NV-DIMM - https://bugs.op > enjdk.java.net/browse/JDK-8202286 > Here is the implementation bug link: https://bugs.openjdk.java.net/br > owse/JDK-8204908 > > > Patch is Uploaded at (full patch/incremental patch) > > http://cr.openjdk.java.net/~kkharbas/8204908/webrev.02/ > http://cr.openjdk.java.net/~kkharbas/8204908/webrev.02_to_01/ > Tested default setup (i.e. no file is being passed for heap) and > AllocateHeapAt/AllocateOldGenAt with POGC and G1GC.. all passing... > Any and all comments are welcome! > looking briefly through the changes, I think they look much better already to move the G1 specific stuff into G1 code; however I would like to think about how we could reduce the complexity further and solve the case of allowing multiple mapping sources (tmpfs file, nvram, different "types" of RAM) for different parts of the heap in an even cleaner way. Thanks, Thomas From hohensee at amazon.com Fri Jun 15 20:21:59 2018 From: hohensee at amazon.com (Hohensee, Paul) Date: Fri, 15 Jun 2018 20:21:59 +0000 Subject: RFR (S): 8195115: G1 Old Gen MemoryPool CollectionUsage.used values don't reflect mixed GC results Message-ID: After some difficulty with the submit cluster, with which Erik helped me out, the patch passes. It also passed fastdebug hotspot tier 1 testing on my Mac laptop, the former of which includes the new test. I had to increase -Xmx and -Xms to 12m in order to get TestOldGenCollectionUsage to pass on the submit cluster, though the old 10m works fine on my Mac. New webrev: http://cr.openjdk.java.net/~phh/8195115/webrev.03/ Thanks, Paul On 6/12/18, 6:52 AM, "Erik Helin" wrote: (adding back serviceability-dev, please keep both hotspot-gc-dev and serviceability-dev) Hi Paul, before I start re-reviewing, did you test the new version of the patch via the jdk/submit repository [0]?
Thanks, Erik [0]: http://hg.openjdk.java.net/jdk/submit On 06/09/2018 03:29 PM, Hohensee, Paul wrote: > Didn't seem to make it to hotspot-gc-dev... > > On 6/8/18, 10:14 AM, "serviceability-dev on behalf of Hohensee, Paul" wrote: > > Back after a long hiatus... > > Thanks, Eric, for your review. Here's a new webrev incorporating your recommendations. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8195115 > Webrev: http://cr.openjdk.java.net/~phh/8195115/webrev.02/ > > TIA for your re-review. Plus, may I have another reviewer look at it please? > > Paul > > On 2/26/18, 8:47 AM, "Erik Helin" wrote: > > Hi Paul, > > a couple of comments on the patch: > > - memoryService.hpp: > + 150 bool countCollection, > + 151 bool allMemoryPoolsAffected = true); > > There is no need to use a default value for the parameter > allMemoryPoolsAffected here. Skipping the default value also allows > you to put allMemoryPoolsAffected to TraceMemoryManager::initialize > in the same relative position as for the constructor parameter (this > will make the code more uniform and easier to follow). > > - memoryManager.cpp > > Instead of adding a default parameter, maybe add a new method? > Something like GCMemoryManager::add_not_always_affected_pool() > (I couldn't come up with a shorter name at the moment). > > - TestMixedOldGenCollectionUsage.java > > The test is too strict about how and when collections should > occur. Tests written this way often become very brittle, they might > e.g. fail to finish a concurrent mark on time on a very slow, single > core, machine. It is better to either force collections by using the > WhiteBox API or make the test more lenient. > > Thanks, > Erik > > On 02/22/2018 09:54 PM, Hohensee, Paul wrote: > > Ping for a review please. 
> > > > Thanks, > > > > Paul > > > > On 2/16/18, 12:26 PM, "serviceability-dev on behalf of Hohensee, Paul" wrote: > > > > The CSR https://bugs.openjdk.java.net/browse/JDK-8196719 for the original fix has been approved, so I'm back to requesting a code review, please. > > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8195115 > > > > Webrev: http://cr.openjdk.java.net/~phh/8195115/webrev.hs.01/ > > > > Passed a submit repo run, passes its jtreg test, and a JDK8 version is in production use at Amazon. > > > > From the original RR: > > > > > The bug is that from the JMX point of view, G1's incremental collector > > > > (misnamed as the "G1 Young Generation" collector) only affects G1's > > > > survivor and eden spaces. In fact, mixed collections run by this > > > > collector also affect the G1 old generation. > > > > > > This proposed fix is to record, for each of a JMX garbage collector's > > > memory pools, whether that memory pool is affected by all collections > > > using that collector. And, for each collection, record whether or not > > > all the collector's memory pools are affected. After each collection, > > > for each memory pool, if either all the collector's memory pools were > > > affected or the memory pool is affected for all collections, record > > > CollectionUsage for that pool. > > > > > > For collectors other than G1 Young Generation, all pools are recorded as > > > affected by all collections and every collection is recorded as > > > affecting all the collector's memory pools. For the G1 Young Generation > > > collector, the G1 Old Gen pool is recorded as not being affected by all > > > collections, and non-mixed collections are recorded as not affecting all > > > memory pools. The result is that for non-mixed collections, > > > CollectionUsage is recorded after a collection only for the G1 Eden Space > > > and G1 Survivor Space pools, while for mixed collections CollectionUsage > > > is recorded for G1 Old Gen as well.
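The CollectionUsage bookkeeping described in the review request above is visible to applications through the standard java.lang.management API. A minimal sketch for illustration (the class name is invented; which collectors and pool names appear depends on the GC the JVM is running — with -XX:+UseG1GC and this fix, the "G1 Young Generation" collector lists three pools, including "G1 Old Gen"):

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class GcPoolLister {
    // One line per collector: "<collector name>: <memory pool names>".
    public static List<String> describeCollectors() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            lines.add(gc.getName() + ": " + Arrays.asList(gc.getMemoryPoolNames()));
        }
        return lines;
    }

    public static void main(String[] args) {
        // Prints the collector/pool mapping for whichever GC is in use.
        describeCollectors().forEach(System.out::println);
    }
}
```

The same MXBeans also expose getLastGcInfo() (via com.sun.management.GarbageCollectorMXBean), which is where the per-pool CollectionUsage values discussed in the patch surface.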
> > > > > > Other than the effect of the fix on G1 Old Gen MemoryPool. > > > CollectionUsage, the only external behavior change is that > > > GarbageCollectorMXBean.getMemoryPoolNames will now return 3 pool names > > > rather than 2. > > > > > > With this fix, a collector's memory pools can be divided into two > > > disjoint subsets, one that participates in all collections and one that > > > doesn't. This is a bit more general than the minimum necessary to fix > > > G1, but not by much. Because I expect it to apply to other incremental > > > region-based collectors, I went with the more general solution. I > > > minimized the amount of code I had to touch by using default parameters > > > for GCMemoryManager::add_pool and the TraceMemoryManagerStats constructors. > > > > > > > > > > From Derek.White at cavium.com Fri Jun 15 22:55:07 2018 From: Derek.White at cavium.com (White, Derek) Date: Fri, 15 Jun 2018 22:55:07 +0000 Subject: RFR(M): 8204524: Unnecessary memory barriers in G1ParScanThreadState::copy_to_survivor_space In-Reply-To: <82571A7F-6113-4C0F-A4DB-C18C2E48D8E2@oracle.com> References: <82571A7F-6113-4C0F-A4DB-C18C2E48D8E2@oracle.com> Message-ID: Hi Michihiro, Status update: My colleague and I are getting inconsistent results with this patch: -23% to +7% on SPECjbb, so we're trying to verify what's going on. On an unrelated note, the aarch64 port relies on GCC's __atomic_compare_exchange to implement the relaxed case of Atomic::PlatformCmpxchg, and gcc 6 and earlier sometimes do a poor job on it. Not enough to account for the numbers we saw though. I hope to have an answer by Monday.
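The relaxed-versus-fenced compare-and-exchange distinction discussed above lives in HotSpot's C++ code (Atomic::PlatformCmpxchg), but the same ordering modes can be illustrated at the Java level with VarHandle. A hedged sketch, not the HotSpot code itself (class and method names are invented for illustration):

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;

public class RelaxedCasDemo {
    static int value;

    static final VarHandle VALUE;
    static {
        try {
            VALUE = MethodHandles.lookup()
                    .findStaticVarHandle(RelaxedCasDemo.class, "value", int.class);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    // Increment via a "plain" (relaxed) CAS: atomic, but with no ordering
    // guarantees for surrounding memory accesses, analogous to the
    // memory_order_relaxed case being discussed. weakCompareAndSetPlain
    // may fail spuriously, hence the retry loop.
    static void increment() {
        int v;
        do {
            v = (int) VALUE.getOpaque();
        } while (!VALUE.weakCompareAndSetPlain(v, v + 1));
    }

    // Increment via the fully fenced CAS (volatile semantics) — the
    // stronger ordering the patch under review is trying to avoid paying for.
    static void incrementOrdered() {
        int v;
        do {
            v = (int) VALUE.getVolatile();
        } while (!VALUE.compareAndSet(v, v + 1));
    }

    public static void main(String[] args) {
        increment();
        incrementOrdered();
        System.out.println(value); // 2 when run single-threaded
    }
}
```

Single-threaded the two variants behave identically; the difference only shows up as fence instructions in the generated code and in what reorderings other threads may observe.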
- Derek > -----Original Message----- > From: hotspot-gc-dev [mailto:hotspot-gc-dev-bounces at openjdk.java.net] > On Behalf Of Kim Barrett > Sent: Wednesday, June 13, 2018 5:16 PM > To: Michihiro Horie > Cc: Gustavo Bueno Romero ; > david.holmes at oracle.com; hotspot-gc-dev at openjdk.java.net > Subject: Re: RFR(M): 8204524: Unnecessary memory barriers in > G1ParScanThreadState::copy_to_survivor_space > > External Email > > > On Jun 7, 2018, at 2:01 AM, Michihiro Horie wrote: > > > > Dear all, > > > > Would you please review the following change? > > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8204524 > > Webrev: http://cr.openjdk.java.net/~mhorie/8204524/webrev.00 > > I was going to say that this looks good to me. > > But then I saw Derek White's reply about an unexpected performance > regression. > I'd like to wait until he reports back. From HORIE at jp.ibm.com Sat Jun 16 03:44:08 2018 From: HORIE at jp.ibm.com (Michihiro Horie) Date: Fri, 15 Jun 2018 22:44:08 -0500 Subject: RFR(M): 8204524: Unnecessary memory barriers in G1ParScanThreadState::copy_to_survivor_space In-Reply-To: References: <82571A7F-6113-4C0F-A4DB-C18C2E48D8E2@oracle.com> Message-ID: Hi Derek, Thank you for sharing your status of still having inconsistent results with the patch. I would wait for your updates. Thanks again, Best regards, -- Michihiro, IBM Research - Tokyo From: "White, Derek" To: Kim Barrett , Michihiro Horie Cc: Gustavo Bueno Romero , "david.holmes at oracle.com" , "hotspot-gc-dev at openjdk.java.net" Date: 2018/06/15 17:55 Subject: RE: RFR(M): 8204524: Unnecessary memory barriers in G1ParScanThreadState::copy_to_survivor_space Hi Michihiro, Status update: My colleague and I are getting inconsistent results with this patch: -23% to +7% on SPECjbb, so we're trying to verify what's going on.
On an unrelated note, the aarch64 port relies on GCC's __atomic_compare_exchange to implemented the relaxed case of Atomic::PlatformCmpxchg, and gcc 6 and earlier sometimes do a poor job on it. Not enough to account for the numbers we saw though. I hope to have an answer by Monday. - Derek > -----Original Message----- > From: hotspot-gc-dev [mailto:hotspot-gc-dev-bounces at openjdk.java.net] > On Behalf Of Kim Barrett > Sent: Wednesday, June 13, 2018 5:16 PM > To: Michihiro Horie > Cc: Gustavo Bueno Romero ; > david.holmes at oracle.com; hotspot-gc-dev at openjdk.java.net > Subject: Re: RFR(M): 8204524: Unnecessary memory barriers in > G1ParScanThreadState::copy_to_survivor_space > > External Email > > > On Jun 7, 2018, at 2:01 AM, Michihiro Horie wrote: > > > > Dear all, > > > > Would you please review the following change? > > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8204524 > > Webrev: http://cr.openjdk.java.net/~mhorie/8204524/webrev.00 > > I was going to say that this looks good to me. > > But then I saw Derek White?s reply about an unexpected performance > regression. > I?d like to wait until he reports back. -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From erik.helin at oracle.com Sat Jun 16 08:23:59 2018 From: erik.helin at oracle.com (Erik Helin) Date: Sat, 16 Jun 2018 10:23:59 +0200 Subject: RFR (S): 8195115: G1 Old Gen MemoryPool CollectionUsage.used values don't reflect mixed GC results In-Reply-To: References: Message-ID: <958dd448-efde-2414-9c88-a0f1c869dd56@oracle.com> On 06/15/2018 10:21 PM, Hohensee, Paul wrote: > After some difficulty with the submit cluster, with which Erik helped me out, the patch passes. It also passed fastdebug hotspot tier 1 testing on my Mac laptop, which former includes the new test. 
> > I had to increase -Xmx and -Xms to 12m in order to get TestOldGenCollectionUsage to pass on the submit cluster, though the old 10m works fine on my Mac. New webrev: Thanks, the change of -Xmx and -Xms to 12m now also makes the test pass on my workstation. > http://cr.openjdk.java.net/~phh/8195115/webrev.03/ There seems to be some trailing whitespace in the patch, have you run jcheck (or `hg diff` which highlights trailing whitespace in red)? Please see + TraceMemoryManagerStats tms(&_memory_manager, gc_cause(), + collector_state()->yc_type() == Mixed /* allMemoryPoolsAffected */); + ^---- whitespace and +int MemoryManager::add_pool(MemoryPool* pool) { + int index = _num_pools; ^---- whitespace Another small comment, I would have written +void GCMemoryManager::add_pool(MemoryPool* pool) { + int index = MemoryManager::add_pool(pool); + _pool_always_affected_by_gc[index] = true; +} + +void GCMemoryManager::add_pool(MemoryPool* pool, bool always_affected_by_gc) { + int index = MemoryManager::add_pool(pool); + _pool_always_affected_by_gc[index] = always_affected_by_gc; +} + as +void GCMemoryManager::add_pool(MemoryPool* pool) { + add_pool(pool, true); +} + +void GCMemoryManager::add_pool(MemoryPool* pool, bool always_affected_by_gc) { + int index = MemoryManager::add_pool(pool); + _pool_always_affected_by_gc[index] = always_affected_by_gc; +} + to not have two duplicate implementations of GCMemoryManager::add_pool. Would you mind updating the patch with this change (and remove the trailing whitespace)? Thanks, Erik > Thanks, > > Paul > > On 6/12/18, 6:52 AM, "Erik Helin" wrote: > > (adding back serviceability-dev, please keep both hotspot-gc-dev and > serviceability-dev) > > Hi Paul, > > before I start re-reviewing, did you test the new version of the patch > via the jdk/submit repository [0]? > > Thanks, > Erik > > [0]: http://hg.openjdk.java.net/jdk/submit > > On 06/09/2018 03:29 PM, Hohensee, Paul wrote: > > Didn't seem to make it to hotspot-gc-dev...
> > > > On 6/8/18, 10:14 AM, "serviceability-dev on behalf of Hohensee, Paul" wrote: > > > > Back after a long hiatus... > > > > Thanks, Eric, for your review. Here's a new webrev incorporating your recommendations. > > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8195115 > > Webrev: http://cr.openjdk.java.net/~phh/8195115/webrev.02/ > > > > TIA for your re-review. Plus, may I have another reviewer look at it please? > > > > Paul > > > > On 2/26/18, 8:47 AM, "Erik Helin" wrote: > > > > Hi Paul, > > > > a couple of comments on the patch: > > > > - memoryService.hpp: > > + 150 bool countCollection, > > + 151 bool allMemoryPoolsAffected = true); > > > > There is no need to use a default value for the parameter > > allMemoryPoolsAffected here. Skipping the default value also allows > > you to put allMemoryPoolsAffected to TraceMemoryManager::initialize > > in the same relative position as for the constructor parameter (this > > will make the code more uniform and easier to follow). > > > > - memoryManager.cpp > > > > Instead of adding a default parameter, maybe add a new method? > > Something like GCMemoryManager::add_not_always_affected_pool() > > (I couldn't come up with a shorter name at the moment). > > > > - TestMixedOldGenCollectionUsage.java > > > > The test is too strict about how and when collections should > > occur. Tests written this way often become very brittle, they might > > e.g. fail to finish a concurrent mark on time on a very slow, single > > core, machine. It is better to either force collections by using the > > WhiteBox API or make the test more lenient. > > > > Thanks, > > Erik > > > > On 02/22/2018 09:54 PM, Hohensee, Paul wrote: > > > Ping for a review please. 
> > > > > > Thanks, > > > > > > Paul > > > > > > On 2/16/18, 12:26 PM, "serviceability-dev on behalf of Hohensee, Paul" wrote: > > > > > > The CSR https://bugs.openjdk.java.net/browse/JDK-8196719 for the original fix has been approved, so I?m back to requesting a code review, please. > > > > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8195115 > > > Webrev: http://cr.openjdk.java.net/~phh/8195115/webrev.hs.01/ > > > > > > Passed a submit repo run, passes its jtreg test, and a JDK8 version is in production use at Amazon. > > > > > > From the original RR: > > > > > > > The bug is that from the JMX point of view, G1?s incremental collector > > > > (misnamed as the ?G1 Young Generation? collector) only affects G1?s > > > > survivor and eden spaces. In fact, mixed collections run by this > > > > collector also affect the G1 old generation. > > > > > > > > This proposed fix is to record, for each of a JMX garbage collector's > > > > memory pools, whether that memory pool is affected by all collections > > > > using that collector. And, for each collection, record whether or not > > > > all the collector's memory pools are affected. After each collection, > > > > for each memory pool, if either all the collector's memory pools were > > > > affected or the memory pool is affected for all collections, record > > > > CollectionUsage for that pool. > > > > > > > > For collectors other than G1 Young Generation, all pools are recorded as > > > > affected by all collections and every collection is recorded as > > > > affecting all the collector?s memory pools. For the G1 Young Generation > > > > collector, the G1 Old Gen pool is recorded as not being affected by all > > > > collections, and non-mixed collections are recorded as not affecting all > > > > memory pools. 
The result is that for non-mixed collections, > > > > CollectionUsage is recorded after a collection only the G1 Eden Space > > > > and G1 Survivor Space pools, while for mixed collections CollectionUsage > > > > is recorded for G1 Old Gen as well. > > > > > > > > Other than the effect of the fix on G1 Old Gen MemoryPool. > > > > CollectionUsage, the only external behavior change is that > > > > GarbageCollectorMXBean.getMemoryPoolNames will now return 3 pool names > > > > rather than 2. > > > > > > > > With this fix, a collector?s memory pools can be divided into two > > > > disjoint subsets, one that participates in all collections and one that > > > > doesn?t. This is a bit more general than the minimum necessary to fix > > > > G1, but not by much. Because I expect it to apply to other incremental > > > > region-based collectors, I went with the more general solution. I > > > > minimized the amount of code I had to touch by using default parameters > > > > for GCMemoryManager::add_pool and the TraceMemoryManagerStats constructors. > > > > > > > > > > > > > > > > > > > From hohensee at amazon.com Sat Jun 16 19:00:15 2018 From: hohensee at amazon.com (Hohensee, Paul) Date: Sat, 16 Jun 2018 19:00:15 +0000 Subject: RFR (S): 8195115: G1 Old Gen MemoryPool CollectionUsage.used values don't reflect mixed GC results In-Reply-To: <958dd448-efde-2414-9c88-a0f1c869dd56@oracle.com> References: <958dd448-efde-2414-9c88-a0f1c869dd56@oracle.com> Message-ID: <9F239517-8707-4786-B7E9-7F88CC0D4C6E@amazon.com> Thanks for the re-review, Erik. New webrev with your fixes: http://cr.openjdk.java.net/~phh/8195115/webrev.04/ Need another reviewer, please. Thanks, Paul ?On 6/16/18, 1:25 AM, "Erik Helin" wrote: On 06/15/2018 10:21 PM, Hohensee, Paul wrote: > After some difficulty with the submit cluster, with which Erik helped me out, the patch passes. It also passed fastdebug hotspot tier 1 testing on my Mac laptop, which former includes the new test. 
> > I had to increase -Xmx and -Xms to 12m in order to get TestOldGenCollectionUsage to pass on the submit cluster, though the old 10m works fine on my Mac. New webrev: Thanks, the change of -Xmx and -Xms to 12m now also makes the test pass on my workstation. > http://cr.openjdk.java.net/~phh/8195115/webrev.03/ There seems to be some trailing whitespace in the patch, have you run jcheck (or `hg diff` which highlights trailing whitespace in red)? Please see + TraceMemoryManagerStats tms(&_memory_manager, gc_cause(), + collector_state()->yc_type() == Mixed /* allMemoryPoolsAffected */); + ^---- whitespace and +int MemoryManager::add_pool(MemoryPool* pool) { + int index = _num_pools; ^---- whitespace Another small comment, I would have written +void GCMemoryManager::add_pool(MemoryPool* pool) { + int index = MemoryManager::add_pool(pool); + _pool_always_affected_by_gc[index] = true; +} + +void GCMemoryManager::add_pool(MemoryPool* pool, bool always_affected_by_gc) { + int index = MemoryManager::add_pool(pool); + _pool_always_affected_by_gc[index] = always_affected_by_gc; +} + as +void GCMemoryManager::add_pool(MemoryPool* pool) { + add_pool(pool, true); +} + +void GCMemoryManager::add_pool(MemoryPool* pool, bool always_affected_by_gc) { + int index = MemoryManager::add_pool(pool); + _pool_always_affected_by_gc[index] = always_affected_by_gc; +} + to not have to two duplicate implementations of GCMemoryManager::add_pool. Would you mind updating the patch with this change (and remove the trailing whitespace)? Thanks, Erik > Thanks, > > Paul > > On 6/12/18, 6:52 AM, "Erik Helin" wrote: > > (adding back serviceability-dev, please keep both hotspot-gc-dev and > serviceability-dev) > > Hi Paul, > > before I start re-reviewing, did you test the new version of the patch > via the jdk/submit repository [0]? > > Thanks, > Erik > > [0]: http://hg.openjdk.java.net/jdk/submit > > On 06/09/2018 03:29 PM, Hohensee, Paul wrote: > > Didn't seem to make it to hotspot-gc-dev... 
> > > > On 6/8/18, 10:14 AM, "serviceability-dev on behalf of Hohensee, Paul" wrote: > > > > Back after a long hiatus... > > > > Thanks, Eric, for your review. Here's a new webrev incorporating your recommendations. > > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8195115 > > Webrev: http://cr.openjdk.java.net/~phh/8195115/webrev.02/ > > > > TIA for your re-review. Plus, may I have another reviewer look at it please? > > > > Paul > > > > On 2/26/18, 8:47 AM, "Erik Helin" wrote: > > > > Hi Paul, > > > > a couple of comments on the patch: > > > > - memoryService.hpp: > > + 150 bool countCollection, > > + 151 bool allMemoryPoolsAffected = true); > > > > There is no need to use a default value for the parameter > > allMemoryPoolsAffected here. Skipping the default value also allows > > you to put allMemoryPoolsAffected to TraceMemoryManager::initialize > > in the same relative position as for the constructor parameter (this > > will make the code more uniform and easier to follow). > > > > - memoryManager.cpp > > > > Instead of adding a default parameter, maybe add a new method? > > Something like GCMemoryManager::add_not_always_affected_pool() > > (I couldn't come up with a shorter name at the moment). > > > > - TestMixedOldGenCollectionUsage.java > > > > The test is too strict about how and when collections should > > occur. Tests written this way often become very brittle, they might > > e.g. fail to finish a concurrent mark on time on a very slow, single > > core, machine. It is better to either force collections by using the > > WhiteBox API or make the test more lenient. > > > > Thanks, > > Erik > > > > On 02/22/2018 09:54 PM, Hohensee, Paul wrote: > > > Ping for a review please. 
> > > > > > Thanks, > > > > > > Paul > > > > > > On 2/16/18, 12:26 PM, "serviceability-dev on behalf of Hohensee, Paul" wrote: > > > > > > The CSR https://bugs.openjdk.java.net/browse/JDK-8196719 for the original fix has been approved, so I?m back to requesting a code review, please. > > > > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8195115 > > > Webrev: http://cr.openjdk.java.net/~phh/8195115/webrev.hs.01/ > > > > > > Passed a submit repo run, passes its jtreg test, and a JDK8 version is in production use at Amazon. > > > > > > From the original RR: > > > > > > > The bug is that from the JMX point of view, G1?s incremental collector > > > > (misnamed as the ?G1 Young Generation? collector) only affects G1?s > > > > survivor and eden spaces. In fact, mixed collections run by this > > > > collector also affect the G1 old generation. > > > > > > > > This proposed fix is to record, for each of a JMX garbage collector's > > > > memory pools, whether that memory pool is affected by all collections > > > > using that collector. And, for each collection, record whether or not > > > > all the collector's memory pools are affected. After each collection, > > > > for each memory pool, if either all the collector's memory pools were > > > > affected or the memory pool is affected for all collections, record > > > > CollectionUsage for that pool. > > > > > > > > For collectors other than G1 Young Generation, all pools are recorded as > > > > affected by all collections and every collection is recorded as > > > > affecting all the collector?s memory pools. For the G1 Young Generation > > > > collector, the G1 Old Gen pool is recorded as not being affected by all > > > > collections, and non-mixed collections are recorded as not affecting all > > > > memory pools. 
The result is that for non-mixed collections, > > > > CollectionUsage is recorded after a collection only for the G1 Eden Space > > > > and G1 Survivor Space pools, while for mixed collections CollectionUsage > > > > is recorded for G1 Old Gen as well. > > > > > > > > Other than the effect of the fix on G1 Old Gen MemoryPool. > > > > CollectionUsage, the only external behavior change is that > > > > GarbageCollectorMXBean.getMemoryPoolNames will now return 3 pool names > > > > rather than 2. > > > > > > > > With this fix, a collector's memory pools can be divided into two > > > > disjoint subsets, one that participates in all collections and one that > > > > doesn't. This is a bit more general than the minimum necessary to fix > > > > G1, but not by much. Because I expect it to apply to other incremental > > > > region-based collectors, I went with the more general solution. I > > > > minimized the amount of code I had to touch by using default parameters > > > > for GCMemoryManager::add_pool and the TraceMemoryManagerStats constructors. > > > > > > > > > > > > > > > > > > > From thomas.stuefe at gmail.com Sat Jun 16 19:00:56 2018 From: thomas.stuefe at gmail.com (Thomas Stüfe) Date: Sat, 16 Jun 2018 21:00:56 +0200 Subject: RFR(M): 8204908: NVDIMM for POGC and G1GC - ReserveSpace.cpp changes are mostly eliminated/no collector specific code. In-Reply-To: References: Message-ID: Hi Vinay! this is the third thread you opened for this issue; it would be helpful if you would not change subjects, because it splinters discussions on the mailing list. For reference, this is the first mail thread with our first exchange: http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2018-June/022342.html --- Thank you for your perseverance and patience. But unfortunately, my concerns have not been alleviated a lot by the current patch. I still think this stretches an already partly-ill-defined interface further.
About os::commit_memory(): as I wrote in my first mail: "So far, for the most part, the os::{reserve|commit}_memory APIs have been agnostic to the underlying implementation. You pretty much tie it to mmap() now. This adds implicit restrictions to the API we did not have before (e.g. will not work if platform uses SysV shm APIs to implement these APIs)." In your response http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2018-June/022356.html you explain why you do this. I understand your motives. I still dislike this, for the reasons I gave before: it adds a generic API to a set of generic APIs which IMHO breaks the implicit contract they all have with each other. Your patch also blurs the difference between runtime "os::" layer and GC. "os" is a general OS-wrapping API layer, and should not know nor care about GC internals: + static inline int nvdimm_fd() { + // ParallelOldGC adaptive sizing requires nvdimm fd. + return _nvdimm_fd; + } + static inline address dram_heapbase() { + return _dram_heap_base; + } + static inline address nvdimm_heapbase() { + return _nvdimm_heap_base; + } + static inline uint nvdimm_regionlength() { + return _nvdimm_region_length; + } IMHO, currently the memory management is ill prepared for your patch; yes, one could shove it in, but at the expense of maintainability and code clearness. This expense would have to be carried by all JVM developers, regardless whether they work on your hardware and benefit from this feature. So I think this would work better with some preparatory refactoring done in the VM. Red Hat and Oracle did similar efforts by refactoring the GC interface before adding new GCs: see https://bugs.openjdk.java.net/browse/JDK-8163329. Maybe we could think about how to do this. It certainly would be a worthy goal. Kind Regards, Thomas (BTW, I really do not like the fact that in os_posix.cpp, os::map_memory_to_file() and os::allocate_file() do a vm_exit_during_initialization() in case of an error!
These are (supposed to be) general purpose APIs and under no circumstances should they end the process. This is already in the hotspot now, added as part of JDK-8190308. This should be fixed.) On Fri, Jun 15, 2018 at 7:59 PM, Awasthi, Vinay K wrote: > HI Thomas, > > Thanks for your input.. > > Now there is *no* change in virtualspace.cpp... > > I moved reserve and commit (this is how memory backed by file is handled) from reserve space to commit places in respective gcs... All changes are again localized and isolated with os::has_nvdimm()/AllocateOldGenAT. > > There are also fixes (1 line changes) added related to alignment and there is no un-mapping etc.. before mapping nvdimm backed dax file. > > Full Patch patch is here.. > http://cr.openjdk.java.net/~kkharbas/8204908/webrev.04 > > Any input is welcome. > > Thanks, > Vinay > > -----Original Message----- > From: Thomas Schatzl [mailto:thomas.schatzl at oracle.com] > Sent: Friday, June 15, 2018 6:53 AM > To: Awasthi, Vinay K ; 'Paul Su' ; 'hotspot-gc-dev at openjdk.java.net' ; 'Hotspot dev runtime' > Cc: Kharbas, Kishor ; Aundhe, Shirish ; Viswanathan, Sandhya > Subject: Re: RFR(M): 8204908: NVDIMM for POGC and G1GC - ReserveSpace.cpp changes are mostly eliminated/no collector specific code. > > Hi Vinay, > > On Thu, 2018-06-14 at 20:49 +0000, Awasthi, Vinay K wrote: >> Now ReservedSpace.cpp has logic to only open NVDIMM File (as it was >> done for AllocateheapAt).. if successful, set up 3 flags >> (base/nvdimm_present/file handle) at the end. There is *NO* collector >> specific code. >> >> All work has been moved to g1PagebasedVirtualSpace.cpp.. I am >> committing memory here and setting dram_heapbase used by g1 here. 
>> >> JEP to support allocating Old generation on NV-DIMM - https://bugs.op >> enjdk.java.net/browse/JDK-8202286 >> Here is the implementation bug link: https://bugs.openjdk.java.net/br >> owse/JDK-8204908 >> >> >> Patch is Uploaded at (full patch/incremental patch) >> >> http://cr.openjdk.java.net/~kkharbas/8204908/webrev.02/ >> http://cr.openjdk.java.net/~kkharbas/8204908/webrev.02_to_01/ >> Tested default setup (i.e. no file is being passed for heap) and >> AllocateHeapAt/AllocateOldGenAt with POGC and G1GC.. all passing? Any >> and all comments are welcome! >> > > looking briefly through the changes, I think they look much better already to move the G1 specific stuff into G1 code; however I would like to think about how we could reduce the complexity further and solve the case of allowing multiple mapping sources (tmpfs file, nvram, different "types" of RAM) for different parts of the heap in an even cleaner way. > > Thanks, > Thomas > From mandy.chung at oracle.com Mon Jun 18 03:41:33 2018 From: mandy.chung at oracle.com (mandy chung) Date: Sun, 17 Jun 2018 20:41:33 -0700 Subject: RFR (S): 8195115: G1 Old Gen MemoryPool CollectionUsage.used values don't reflect mixed GC results In-Reply-To: <9F239517-8707-4786-B7E9-7F88CC0D4C6E@amazon.com> References: <958dd448-efde-2414-9c88-a0f1c869dd56@oracle.com> <9F239517-8707-4786-B7E9-7F88CC0D4C6E@amazon.com> Message-ID: <49f2eabc-3d3e-46cd-00b7-d52e11d74dab@oracle.com> Looks fine to me. Mandy On 6/16/18 12:00 PM, Hohensee, Paul wrote: > Thanks for the re-review, Erik. New webrev with your fixes: > > http://cr.openjdk.java.net/~phh/8195115/webrev.04/ > > Need another reviewer, please. > > Thanks, > > Paul From david.holmes at oracle.com Mon Jun 18 05:18:19 2018 From: david.holmes at oracle.com (David Holmes) Date: Mon, 18 Jun 2018 15:18:19 +1000 Subject: RFR(M): 8204908: Allocation of Old generation of Java Heap on alternate memory devices. 
In-Reply-To: References: Message-ID: <3df9861a-cc52-56f1-dc34-dfaee098f278@oracle.com> On 13/06/2018 4:11 PM, Thomas St?fe wrote: > Hi, > > (adding hs-runtime) Thanks for that Thomas. I also had a cursory glance. This is not really a Request for Review (RFR) as suggested by the email subject but more a "request for comment". The JEP is still a draft and this is a long way from being in a form that could be integrated. As a POC/prototype this may be functional but it is far too intrusive to me to be considered as the final form for this kind of change. The level of abstraction seems wrong; the use of nvdimm is everywhere. Ideally I would have expected the use of an alternate memory device to be hidden behind the address range - ie you just point the old gen to the alternate memory device, mapping into a given address range for the process and it "just works". Perhaps that is naive thinking, but I would not have expected to see so much code that needs to know about the nature of the underlying memory. And if there are key differences in the nature of this memory that requires changes to the existing code then I'd expect to see suitable abstractions introduced for that. Cheers, David > had a cursory glance at the proposed changes. I am taken aback by the > amount of complexity added to ReservedSpace for what I understand is a > narrow experimental feature only benefiting 1-2 Operating Systems and > - I guess, the JEP is not really clear there - only x86 with certain > hardware configurations? > > e.g. http://cr.openjdk.java.net/~kkharbas/8202286/webrev.00/share/memory/virtualspace.hpp.udiff.html > > The source having zero comments does not really help either. > > "The motivation behind this JEP is to provide an experimental feature > to facilitate exploration of different use cases for these non-DRAM > memories." > > Ok, but does this really have to be upstream, in this form, to > experiment with it? > > I am not objecting against this feature in general.
But I am unhappy > about the monster ReservedSpace is turning into. IMHO before > increasing complexity even further this should be revamped, otherwise > it becomes too unwieldy to do anything with it. It already somehow > takes care of a number of huge pages ("_special" and "_alignment"), > implicit-null-checks-when-space-happens-to-be-used-as-part-of-java-heap > ("_noaccess_prefix"), allocation at alternate file locations > ("_fd_for_heap", introduced by you with 8190308). > > You also added a new variant to os::commit_memory() which directly > goes down to mmap(). So far, for the most part, the > os::{reserve|commit}_memory APIs have been agnostic to the underlying > implementation. You pretty much tie it to mmap() now. This adds > implicit restrictions to the API we did not have before (e.g. will not > work if platform uses SysV shm APIs to implement these APIs). > > Best Regards, Thomas > > > On Tue, Jun 12, 2018 at 9:32 PM, Awasthi, Vinay K > wrote: >> Hello, >> >> I am requesting comments on POGC/G1GC supporting NVDIMM/DRAM heaps. When >> user supplies AllocateOldGenAt=, JVM divides heap into 2 >> parts. First part is on NVDIMM where long living objects go (OldGen) and >> other part is on DRAM where short living objects reside(YoungGen). This is >> ONLY supported for G1GC and POGC collectors on Linux and Windows. >> >> On Windows, OldGen resizing is NOT supported. On Linux, for G1GC, OldGen >> resizing is not supported however for POGC it is. Heap residing on DRAM is >> supported for Windows and Linux for POGC and G1GC. >> >> JEP to support allocating Old generation on NV-DIMM - >> https://bugs.openjdk.java.net/browse/JDK-8202286 >> >> Patch is at http://cr.openjdk.java.net/~kkharbas/8202286/webrev.00/ >> >> SpecJbb2005/SpecJbb2015 etc. are passing with this patch and one can test >> this by simply mounting tmpfs of certain size and pass that as an argument >> to AllocateOldGenAt. 
>> For G1GC, G1MaxNewSizePercent controls how much of total heap will reside on >> DRAM. Rest of the heap then goes to NVDIMM. >> >> For POGC, MaxNewSize decides the DRAM residing young gen size. Rest is >> mounted on NVDIMM. >> >> In all these implementations, JVM ends up reserving more than initial size >> determined by ergonomics (never more than Xmx). JVM displays these messages >> and shows NVDIMM and DRAM reserved bytes. >> >> Thanks, >> >> Vinay >> >> >> >> >> >> From david.holmes at oracle.com Mon Jun 18 05:35:45 2018 From: david.holmes at oracle.com (David Holmes) Date: Mon, 18 Jun 2018 15:35:45 +1000 Subject: RFR(M): 8204908: NVDIMM for POGC and G1GC - ReserveSpace.cpp changes are mostly eliminated/no collector specific code. In-Reply-To: References: Message-ID: <1c7a266e-8de9-6037-bab8-3d76ca477dd9@oracle.com> On 17/06/2018 5:00 AM, Thomas Stüfe wrote: > Hi Vinay! > > this is the third thread you opened for this issue; it would be helpful > if you would not change subjects, because it splinters discussions on > the mailing list. +100 on that! At this stage the JEP should be being discussed more than the prototype implementation. I see: http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2018-May/022092.html but zero discussion. David ----- > For reference, this is the first mail thread with our first exchange: > > http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2018-June/022342.html > > --- > > Thank you for your perseverance and patience. > > But unfortunately, my concerns have not been alleviated a lot by the > current patch. I still think this stretches an already > partly-ill-defined interface further. > > About os::commit_memory(): as I wrote in my first mail: > > "So far, for the most part, the os::{reserve|commit}_memory APIs have > been agnostic to the underlying implementation. You pretty much tie it > to mmap() now. This adds > implicit restrictions to the API we did not > have before (e.g.
will not work if platform uses SysV shm APIs to > implement these APIs)." > > In your response > http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2018-June/022356.html > you explain why you do this. I understand your motives. I still > dislike this, for the reasons I gave before: it adds a generic API to > a set of generic APIs which IMHO breaks the implicit contract they all > have with each other. > > Your patch also blurs the difference between runtime "os::" layer and > GC. "os" is a general OS-wrapping API layer, and should not know nor > care about GC internals: > > + static inline int nvdimm_fd() { > + // ParallelOldGC adaptive sizing requires nvdimm fd. > + return _nvdimm_fd; > + } > + static inline address dram_heapbase() { > + return _dram_heap_base; > + } > + static inline address nvdimm_heapbase() { > + return _nvdimm_heap_base; > + } > + static inline uint nvdimm_regionlength() { > + return _nvdimm_region_length; > + } > > IMHO, currently the memory management is ill prepared for your patch; > yes, one could shove it in, but at the expense of maintainability and > code clearness. This expense would have to be carried by all JVM > developers, regardless whether they work on your hardware and benefit > from this feature. > > So I think this would work better with some preparatory refactoring > done in the VM. Red Hat and Oracle did similar efforts by refactoring > the GC interface before adding new GCs: see > https://bugs.openjdk.java.net/browse/JDK-8163329. > > Maybe we could think about how to do this. It certainly would be a worthy goal. > > Kind Regards, Thomas > > (BTW, I really do not like the fact that in os_posix.cpp, > os::map_memory_to_file() and os::allocate_file() do a > vm_exit_during_initialization() in case of an error! These are > (supposed to be) general purpose APIs and under no circumstances > should they end the process. This is already in the hotspot now, added > as part of JDK-8190308. This should be fixed.)
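The mechanism under dispute — committing one heap region at a time against a backing file at an fd + offset — can be sketched in plain Java with FileChannel.map. This is only an analogy to the hotspot code being discussed, not the patch itself; the file name and region size are made up:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class MapRegion {
    // Maps one fixed-size "region" of a backing file at a given offset,
    // analogous to committing a heap region against the NVDIMM file
    // (fd + offset) instead of against anonymous memory.
    static MappedByteBuffer mapRegion(Path backing, long offset, int length) {
        try (FileChannel ch = FileChannel.open(backing,
                StandardOpenOption.CREATE,
                StandardOpenOption.READ,
                StandardOpenOption.WRITE)) {
            // A READ_WRITE mapping grows the file as needed, and the
            // mapping stays valid after the channel is closed.
            return ch.map(FileChannel.MapMode.READ_WRITE, offset, length);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Maps the second 4 KiB "region" of a scratch file and round-trips a byte.
    static int demo() {
        try {
            Path f = Files.createTempFile("heap", ".bin");
            MappedByteBuffer region = mapRegion(f, 4096, 4096);
            region.put(0, (byte) 42);
            return region.get(0);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The sketch also makes Thomas's point visible: this commit path is inherently mmap-shaped, which is why folding it into a generic commit API constrains platforms that implement commit differently (e.g. via SysV shm).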
> > > > On Fri, Jun 15, 2018 at 7:59 PM, Awasthi, Vinay K > wrote: >> HI Thomas, >> >> Thanks for your input.. >> >> Now there is *no* change in virtualspace.cpp... >> >> I moved reserve and commit (this is how memory backed by file is handled) from reserve space to commit places in respective gcs... All changes are again localized and isolated with os::has_nvdimm()/AllocateOldGenAT. >> >> There are also fixes (1 line changes) added related to alignment and there is no un-mapping etc.. before mapping nvdimm backed dax file. >> >> Full Patch patch is here.. >> http://cr.openjdk.java.net/~kkharbas/8204908/webrev.04 >> >> Any input is welcome. >> >> Thanks, >> Vinay >> >> -----Original Message----- >> From: Thomas Schatzl [mailto:thomas.schatzl at oracle.com] >> Sent: Friday, June 15, 2018 6:53 AM >> To: Awasthi, Vinay K ; 'Paul Su' ; 'hotspot-gc-dev at openjdk.java.net' ; 'Hotspot dev runtime' >> Cc: Kharbas, Kishor ; Aundhe, Shirish ; Viswanathan, Sandhya >> Subject: Re: RFR(M): 8204908: NVDIMM for POGC and G1GC - ReserveSpace.cpp changes are mostly eliminated/no collector specific code. >> >> Hi Vinay, >> >> On Thu, 2018-06-14 at 20:49 +0000, Awasthi, Vinay K wrote: >>> Now ReservedSpace.cpp has logic to only open NVDIMM File (as it was >>> done for AllocateheapAt).. if successful, set up 3 flags >>> (base/nvdimm_present/file handle) at the end. There is *NO* collector >>> specific code. >>> >>> All work has been moved to g1PagebasedVirtualSpace.cpp.. I am >>> committing memory here and setting dram_heapbase used by g1 here. >>> >>> JEP to support allocating Old generation on NV-DIMM - https://bugs.op >>> enjdk.java.net/browse/JDK-8202286 >>> Here is the implementation bug link: https://bugs.openjdk.java.net/br >>> owse/JDK-8204908 >>> >>> >>> Patch is Uploaded at (full patch/incremental patch) >>> >>> http://cr.openjdk.java.net/~kkharbas/8204908/webrev.02/ >>> http://cr.openjdk.java.net/~kkharbas/8204908/webrev.02_to_01/ >>> Tested default setup (i.e. 
no file is being passed for heap) and >>> AllocateHeapAt/AllocateOldGenAt with POGC and G1GC.. all passing. Any >>> and all comments are welcome! >>> >> >> looking briefly through the changes, I think they look much better already to move the G1 specific stuff into G1 code; however I would like to think about how we could reduce the complexity further and solve the case of allowing multiple mapping sources (tmpfs file, nvram, different "types" of RAM) for different parts of the heap in an even cleaner way. >> >> Thanks, >> Thomas >> From kirk at kodewerk.com Mon Jun 18 06:23:48 2018 From: kirk at kodewerk.com (Kirk Pepperdine) Date: Mon, 18 Jun 2018 09:23:48 +0300 Subject: RFR(M): 8204908: Allocation of Old generation of Java Heap on alternate memory devices. In-Reply-To: <3df9861a-cc52-56f1-dc34-dfaee098f278@oracle.com> References: <3df9861a-cc52-56f1-dc34-dfaee098f278@oracle.com> Message-ID: > On Jun 18, 2018, at 8:18 AM, David Holmes wrote: > > On 13/06/2018 4:11 PM, Thomas Stüfe wrote: >> Hi, >> (adding hs-runtime) > > Thanks for that Thomas. > > I also had a cursory glance. This is not really a Request for Review (RFR) as suggested by the email subject but more a "request for comment". The JEP is still a draft and this is a long way from being in a form that could be integrated. As a POC/prototype this may be functional but it is far too intrusive to me to be considered as the final form for this kind of change. The level of abstraction seems wrong; the use of nvdimm is everywhere. Ideally I would have expected the use of an alternate memory device to be hidden behind the address range - i.e. you just point the old gen to the alternate memory device, mapping into a given address range for the process and it "just works". +1, this feels like it should be managed by device drivers mapped to an address range. That said, I'm not sure that the JVM could take advantage of this in a completely transparent way.
Kind regards, Kirk From stefan.karlsson at oracle.com Mon Jun 18 09:54:14 2018 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Mon, 18 Jun 2018 11:54:14 +0200 Subject: RFR: 8205163: ZGC: Keeps finalizable marked PhantomReference referents strongly alive Message-ID: <16513927-8c0e-1582-6ea9-c0cfd4b0cbcc@oracle.com> Hi all, Please review this patch to fix a bug where PhantomReferences can cause otherwise finalizable objects to become considered as strongly reachable. http://cr.openjdk.java.net/~stefank/8205163/webrev.01/ https://bugs.openjdk.java.net/browse/JDK-8205163 ZGC has a concept of finalizable marked objects, for objects that are only reachable through the referents of Finalizers. When objects that have been finalizable marked are later found to also be strongly reachable, the marking strength is updated to strongly marked. When "processing" SoftReferences, WeakReferences, and Finalizers, the GC checks if the referents are strongly marked, and if not, later pushes the References out to the pending list. For PhantomReferences, the GC also checks if the referents are finalizable marked, and if not, later pushes the PhantomReferences out to the pending list. Before the References are processed, the GC "discovers" References that it "encounters" during the marking phase. The GC keeps track of discovered References until the processing phase. If a Reference with a live referent is encountered, it is not discovered, and instead the fields (referent, discovered) are treated as normal fields. The bug happens when the GC tries to discover a PhantomReference with a finalizable marked referent. Since the referent is already live according to the PhantomReference's definition of live (finalizable or strongly marked), the PhantomReference is not discovered, and the fields are treated as normal fields.
This means that the ordinary strong marking closure is applied to both the referent and the discovered field, causing the marking strength of the referent to be promoted from finalizable marked to strongly marked. This increase in strength isn't problematic for the PhantomReference, since it already considered the referent to be alive. However, any Finalizers that depended on the referent to only be finalizable marked, will now be dropped instead of pushed to the pending list, and therefore the referent will not be finalized. The proposed patch prevents this problem by always discovering PhantomReferences with finalizable marked referents. By doing this, the marking code will not mark the referent as strongly reachable, the processing code will still drop the PhantomReference. And the finalizable objects that were incorrectly kept alive, can now be finalized. This was found in a closed test in higher-tier testing. Thanks, StefanK From per.liden at oracle.com Mon Jun 18 10:42:13 2018 From: per.liden at oracle.com (Per Liden) Date: Mon, 18 Jun 2018 12:42:13 +0200 Subject: RFR: 8205163: ZGC: Keeps finalizable marked PhantomReference referents strongly alive In-Reply-To: <16513927-8c0e-1582-6ea9-c0cfd4b0cbcc@oracle.com> References: <16513927-8c0e-1582-6ea9-c0cfd4b0cbcc@oracle.com> Message-ID: <12bbb6ee-755f-5964-fc64-81bf899a11f4@oracle.com> Looks good! /Per On 06/18/2018 11:54 AM, Stefan Karlsson wrote: > Hi all, > > Please review this patch to fix a bug where PhantomReferences can cause > otherwise finalizable objects to become considered as strongly reachable. > > http://cr.openjdk.java.net/~stefank/8205163/webrev.01/ > https://bugs.openjdk.java.net/browse/JDK-8205163 > > ZGC has a concept of finalizable marked objects, for objects that are > only reachable through the referents of Finalizers. > > When objects that have been finalizable marked, are later found to also > be strongly reachable, the marking strength is updated to strongly marked. 
> > When "processing" SoftReferences, WeakReferences, and Finalizers, the GC > checks if the referents are strongly marked, and if not, later pushes > the References out to the pending list. > > For PhantomReferences, the GC also checks if the referents are > finalizable marked, and if not, later pushes the PhantomReferences out > to the pending list. > > Before the References are processed, the GC "discovers" References that > it "encounters" during the marking phase. The GC keeps track of > discovered References until the processing phase. If a Reference with a > live referent is encountered, it is not discovered, and instead the > fields (referent, discovered) are treated as normal fields. > > The bug happens when the GC tries to discover a PhantomReference with a > finalizable marked referent. Since the referent is already live > according to the PhantomReferences's definition of live (finalizable or > strongly marked), the PhantomReference is not discovered, and the fields > are treated as normal fields. This means that the ordinary strong > marking closure is applied to both the referent and the discovered > field, causing the marking strength of the referent to be promoted from > finalizable marked to strongly marked. This increase in strength isn't > problematic for the PhantomReference, since it already considered the > referent to be alive. However, any Finalizers that depended on the > referent to only be finalizable marked, will now be dropped instead of > pushed to the pending list, and therefore the referent will not be > finalized. > > The proposed patch prevents this problem by always discovering > PhantomReferences with finalizable marked referents. By doing this, the > marking code will not mark the referent as strongly reachable, the > processing code will still drop the PhantomReference. And the > finalizable objects that were incorrectly kept alive, can now be finalized. > > This was found in a closed test in higher-tier testing. 
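For readers less familiar with the reference-strength ladder Stefan relies on, the Java-level contract is that phantom reachability is the weakest strength: a PhantomReference never exposes its referent, and it is enqueued only after the referent has been finalized — which is exactly the ordering the bug broke. A minimal illustration of that Java-level contract follows (it cannot, of course, observe ZGC's internal finalizable marking):

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;

public class PhantomContract {
    // Returns true iff the PhantomReference hides its referent and has
    // not been enqueued while the referent is still strongly reachable.
    static boolean check() {
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        Object referent = new Object();
        PhantomReference<Object> ref = new PhantomReference<>(referent, queue);

        boolean hidden = (ref.get() == null);         // always null for phantom refs
        boolean notEnqueued = (queue.poll() == null); // referent is strongly held here

        Reference.reachabilityFence(referent);        // keep referent live up to this point
        return hidden && notEnqueued;
    }

    public static void main(String[] args) {
        System.out.println(check());
    }
}
```

The enqueue happens only once the referent is phantom reachable, i.e. after finalization has run; Finalizers being dropped instead of enqueued is therefore observable to applications as objects that are never finalized.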
> > Thanks, > StefanK From erik.helin at oracle.com Mon Jun 18 14:05:13 2018 From: erik.helin at oracle.com (Erik Helin) Date: Mon, 18 Jun 2018 16:05:13 +0200 Subject: RFR (S): 8195115: G1 Old Gen MemoryPool CollectionUsage.used values don't reflect mixed GC results In-Reply-To: <9F239517-8707-4786-B7E9-7F88CC0D4C6E@amazon.com> References: <958dd448-efde-2414-9c88-a0f1c869dd56@oracle.com> <9F239517-8707-4786-B7E9-7F88CC0D4C6E@amazon.com> Message-ID: On 06/16/2018 09:00 PM, Hohensee, Paul wrote: > Thanks for the re-review, Erik. New webrev with your fixes: > > http://cr.openjdk.java.net/~phh/8195115/webrev.04/ The patch is good to go now, Reviewed. Thanks, Erik > Need another reviewer, please. > > Thanks, > > Paul > > On 6/16/18, 1:25 AM, "Erik Helin" wrote: > > On 06/15/2018 10:21 PM, Hohensee, Paul wrote: > > After some difficulty with the submit cluster, with which Erik helped me out, the patch passes. It also passed fastdebug hotspot tier 1 testing on my Mac laptop, which includes the new test. > > > > I had to increase -Xmx and -Xms to 12m in order to get TestOldGenCollectionUsage to pass on the submit cluster, though the old 10m works fine on my Mac. New webrev: > > Thanks, the change of -Xmx and -Xms to 12m now also makes the test pass > on my workstation. > > > http://cr.openjdk.java.net/~phh/8195115/webrev.03/ > > There seems to be some trailing whitespace in the patch, have you run > jcheck (or `hg diff` which highlights trailing whitespace in red)?
> Please see > > + TraceMemoryManagerStats tms(&_memory_manager, gc_cause(), > + collector_state()->yc_type() == Mixed > /* allMemoryPoolsAffected */); > + > ^---- whitespace > > and > > +int MemoryManager::add_pool(MemoryPool* pool) { > + int index = _num_pools; > ^---- whitespace > > Another small comment, I would have written > > +void GCMemoryManager::add_pool(MemoryPool* pool) { > + int index = MemoryManager::add_pool(pool); > + _pool_always_affected_by_gc[index] = true; > +} > + > +void GCMemoryManager::add_pool(MemoryPool* pool, bool > always_affected_by_gc) { > + int index = MemoryManager::add_pool(pool); > + _pool_always_affected_by_gc[index] = always_affected_by_gc; > +} > + > > as > > +void GCMemoryManager::add_pool(MemoryPool* pool) { > + add_pool(pool, true); > +} > + > +void GCMemoryManager::add_pool(MemoryPool* pool, bool > always_affected_by_gc) { > + int index = MemoryManager::add_pool(pool); > + _pool_always_affected_by_gc[index] = always_affected_by_gc; > +} > + > > to not have to two duplicate implementations of > GCMemoryManager::add_pool. Would you mind updating the patch with this > change (and remove the trailing whitespace)? > > Thanks, > Erik > > > Thanks, > > > > Paul > > > > On 6/12/18, 6:52 AM, "Erik Helin" wrote: > > > > (adding back serviceability-dev, please keep both hotspot-gc-dev and > > serviceability-dev) > > > > Hi Paul, > > > > before I start re-reviewing, did you test the new version of the patch > > via the jdk/submit repository [0]? > > > > Thanks, > > Erik > > > > [0]: http://hg.openjdk.java.net/jdk/submit > > > > On 06/09/2018 03:29 PM, Hohensee, Paul wrote: > > > Didn't seem to make it to hotspot-gc-dev... > > > > > > On 6/8/18, 10:14 AM, "serviceability-dev on behalf of Hohensee, Paul" wrote: > > > > > > Back after a long hiatus... > > > > > > Thanks, Eric, for your review. Here's a new webrev incorporating your recommendations. 
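Erik's suggestion in the quoted C++ is a standard delegation pattern: the convenience overload forwards to the general one, so the index bookkeeping exists in exactly one place. Transposed to a small Java sketch (names are illustrative, not hotspot's actual classes):

```java
import java.util.ArrayList;
import java.util.List;

class PoolRegistry {
    // Parallel to GCMemoryManager: one flag per registered pool,
    // recording whether every collection affects that pool.
    private final List<Boolean> alwaysAffectedByGc = new ArrayList<>();

    // Convenience overload delegates instead of duplicating the body.
    int addPool(String name) {
        return addPool(name, true);
    }

    int addPool(String name, boolean alwaysAffected) {
        alwaysAffectedByGc.add(alwaysAffected);
        return alwaysAffectedByGc.size() - 1; // index of the new pool
    }

    boolean isAlwaysAffected(int index) {
        return alwaysAffectedByGc.get(index);
    }
}
```

The payoff is the one Erik names: if the bookkeeping ever changes (say, the index computation), there is a single method to update, and the two overloads cannot drift apart.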
> > > > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8195115 > > > Webrev: http://cr.openjdk.java.net/~phh/8195115/webrev.02/ > > > > > > TIA for your re-review. Plus, may I have another reviewer look at it please? > > > > > > Paul > > > > > > On 2/26/18, 8:47 AM, "Erik Helin" wrote: > > > > > > Hi Paul, > > > > > > a couple of comments on the patch: > > > > > > - memoryService.hpp: > > > + 150 bool countCollection, > > > + 151 bool allMemoryPoolsAffected = true); > > > > > > There is no need to use a default value for the parameter > > > allMemoryPoolsAffected here. Skipping the default value also allows > > > you to put allMemoryPoolsAffected to TraceMemoryManager::initialize > > > in the same relative position as for the constructor parameter (this > > > will make the code more uniform and easier to follow). > > > > > > - memoryManager.cpp > > > > > > Instead of adding a default parameter, maybe add a new method? > > > Something like GCMemoryManager::add_not_always_affected_pool() > > > (I couldn't come up with a shorter name at the moment). > > > > > > - TestMixedOldGenCollectionUsage.java > > > > > > The test is too strict about how and when collections should > > > occur. Tests written this way often become very brittle, they might > > > e.g. fail to finish a concurrent mark on time on a very slow, single > > > core, machine. It is better to either force collections by using the > > > WhiteBox API or make the test more lenient. > > > > > > Thanks, > > > Erik > > > > > > On 02/22/2018 09:54 PM, Hohensee, Paul wrote: > > > > Ping for a review please. > > > > > > > > Thanks, > > > > > > > > Paul > > > > > > > > On 2/16/18, 12:26 PM, "serviceability-dev on behalf of Hohensee, Paul" wrote: > > > > > > > > The CSR https://bugs.openjdk.java.net/browse/JDK-8196719 for the original fix has been approved, so I?m back to requesting a code review, please. 
> > > > > > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8195115 > > > > Webrev: http://cr.openjdk.java.net/~phh/8195115/webrev.hs.01/ > > > > > > > > Passed a submit repo run, passes its jtreg test, and a JDK8 version is in production use at Amazon. > > > > > > > > From the original RR: > > > > > > > > > The bug is that from the JMX point of view, G1's incremental collector > > > > > (misnamed as the "G1 Young Generation" collector) only affects G1's > > > > > survivor and eden spaces. In fact, mixed collections run by this > > > > > collector also affect the G1 old generation. > > > > > > > > > > This proposed fix is to record, for each of a JMX garbage collector's > > > > > memory pools, whether that memory pool is affected by all collections > > > > > using that collector. And, for each collection, record whether or not > > > > > all the collector's memory pools are affected. After each collection, > > > > > for each memory pool, if either all the collector's memory pools were > > > > > affected or the memory pool is affected for all collections, record > > > > > CollectionUsage for that pool. > > > > > > > > > > For collectors other than G1 Young Generation, all pools are recorded as > > > > > affected by all collections and every collection is recorded as > > > > > affecting all the collector's memory pools. For the G1 Young Generation > > > > > collector, the G1 Old Gen pool is recorded as not being affected by all > > > > > collections, and non-mixed collections are recorded as not affecting all > > > > > memory pools. The result is that for non-mixed collections, > > > > > CollectionUsage is recorded after a collection only for the G1 Eden Space > > > > > and G1 Survivor Space pools, while for mixed collections CollectionUsage > > > > > is recorded for G1 Old Gen as well. > > > > > > > > > > Other than the effect of the fix on G1 Old Gen MemoryPool.
> > > > > CollectionUsage, the only external behavior change is that > > > > > GarbageCollectorMXBean.getMemoryPoolNames will now return 3 pool names > > > > > rather than 2. > > > > > > > > > > With this fix, a collector's memory pools can be divided into two > > > > > disjoint subsets, one that participates in all collections and one that > > > > > doesn't. This is a bit more general than the minimum necessary to fix > > > > > G1, but not by much. Because I expect it to apply to other incremental > > > > > region-based collectors, I went with the more general solution. I > > > > > minimized the amount of code I had to touch by using default parameters > > > > > for GCMemoryManager::add_pool and the TraceMemoryManagerStats constructors. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > From hohensee at amazon.com Mon Jun 18 15:47:10 2018 From: hohensee at amazon.com (Hohensee, Paul) Date: Mon, 18 Jun 2018 15:47:10 +0000 Subject: RFR (S): 8195115: G1 Old Gen MemoryPool CollectionUsage.used values don't reflect mixed GC results In-Reply-To: <49f2eabc-3d3e-46cd-00b7-d52e11d74dab@oracle.com> References: <958dd448-efde-2414-9c88-a0f1c869dd56@oracle.com> <9F239517-8707-4786-B7E9-7F88CC0D4C6E@amazon.com> <49f2eabc-3d3e-46cd-00b7-d52e11d74dab@oracle.com> Message-ID: <83241C02-BDBB-4B81-994C-108B6559F986@amazon.com> Thanks for the review, Mandy. :) On 6/17/18, 8:43 PM, "mandy chung" wrote: Looks fine to me. Mandy On 6/16/18 12:00 PM, Hohensee, Paul wrote: > Thanks for the re-review, Erik. New webrev with your fixes: > > http://cr.openjdk.java.net/~phh/8195115/webrev.04/ > > Need another reviewer, please.
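The externally visible effect Paul describes — the G1 incremental collector reporting three pool names instead of two — can be checked from any client through the standard management API:

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

public class PoolNames {
    // Maps each collector name to the memory pools it reports as managing.
    static Map<String, String[]> poolsByCollector() {
        Map<String, String[]> pools = new LinkedHashMap<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            pools.put(gc.getName(), gc.getMemoryPoolNames());
        }
        return pools;
    }

    public static void main(String[] args) {
        // With the fix, running under -XX:+UseG1GC, the "G1 Young Generation"
        // entry lists G1 Old Gen alongside the eden and survivor pools.
        poolsByCollector().forEach((gc, names) ->
                System.out.println(gc + " -> " + Arrays.toString(names)));
    }
}
```

The exact pool names printed depend on the collector the JVM was started with, so the snippet makes no assumption beyond at least one collector bean being present.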
> > Thanks, > > Paul From hohensee at amazon.com Mon Jun 18 16:14:31 2018 From: hohensee at amazon.com (Hohensee, Paul) Date: Mon, 18 Jun 2018 16:14:31 +0000 Subject: RFR (S): 8195115: G1 Old Gen MemoryPool CollectionUsage.used values don't reflect mixed GC results In-Reply-To: References: <958dd448-efde-2414-9c88-a0f1c869dd56@oracle.com> <9F239517-8707-4786-B7E9-7F88CC0D4C6E@amazon.com> Message-ID: <63187C09-2278-4346-8E98-90D04F731D59@amazon.com> Thanks, Eric! I'd push, but I don't seem to have permission at the moment. Who should I contact to get that fixed? Thanks, Paul On 6/18/18, 7:09 AM, "Erik Helin" wrote: On 06/16/2018 09:00 PM, Hohensee, Paul wrote: > Thanks for the re-review, Erik. New webrev with your fixes: > > http://cr.openjdk.java.net/~phh/8195115/webrev.04/ The patch is good to go now, Reviewed. Thanks, Erik > Need another reviewer, please. > > Thanks, > > Paul > > On 6/16/18, 1:25 AM, "Erik Helin" wrote: > > On 06/15/2018 10:21 PM, Hohensee, Paul wrote: > > After some difficulty with the submit cluster, with which Erik helped me out, the patch passes. It also passed fastdebug hotspot tier 1 testing on my Mac laptop, which includes the new test. > > > > I had to increase -Xmx and -Xms to 12m in order to get TestOldGenCollectionUsage to pass on the submit cluster, though the old 10m works fine on my Mac. New webrev: > > Thanks, the change of -Xmx and -Xms to 12m now also makes the test pass > on my workstation. > > > http://cr.openjdk.java.net/~phh/8195115/webrev.03/ > > There seems to be some trailing whitespace in the patch, have you run > jcheck (or `hg diff` which highlights trailing whitespace in red)?
> Please see > > + TraceMemoryManagerStats tms(&_memory_manager, gc_cause(), > + collector_state()->yc_type() == Mixed > /* allMemoryPoolsAffected */); > + > ^---- whitespace > > and > > +int MemoryManager::add_pool(MemoryPool* pool) { > + int index = _num_pools; > ^---- whitespace > > Another small comment, I would have written > > +void GCMemoryManager::add_pool(MemoryPool* pool) { > + int index = MemoryManager::add_pool(pool); > + _pool_always_affected_by_gc[index] = true; > +} > + > +void GCMemoryManager::add_pool(MemoryPool* pool, bool > always_affected_by_gc) { > + int index = MemoryManager::add_pool(pool); > + _pool_always_affected_by_gc[index] = always_affected_by_gc; > +} > + > > as > > +void GCMemoryManager::add_pool(MemoryPool* pool) { > + add_pool(pool, true); > +} > + > +void GCMemoryManager::add_pool(MemoryPool* pool, bool > always_affected_by_gc) { > + int index = MemoryManager::add_pool(pool); > + _pool_always_affected_by_gc[index] = always_affected_by_gc; > +} > + > > to not have to two duplicate implementations of > GCMemoryManager::add_pool. Would you mind updating the patch with this > change (and remove the trailing whitespace)? > > Thanks, > Erik > > > Thanks, > > > > Paul > > > > On 6/12/18, 6:52 AM, "Erik Helin" wrote: > > > > (adding back serviceability-dev, please keep both hotspot-gc-dev and > > serviceability-dev) > > > > Hi Paul, > > > > before I start re-reviewing, did you test the new version of the patch > > via the jdk/submit repository [0]? > > > > Thanks, > > Erik > > > > [0]: http://hg.openjdk.java.net/jdk/submit > > > > On 06/09/2018 03:29 PM, Hohensee, Paul wrote: > > > Didn't seem to make it to hotspot-gc-dev... > > > > > > On 6/8/18, 10:14 AM, "serviceability-dev on behalf of Hohensee, Paul" wrote: > > > > > > Back after a long hiatus... > > > > > > Thanks, Eric, for your review. Here's a new webrev incorporating your recommendations. 
> > > > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8195115 > > > Webrev: http://cr.openjdk.java.net/~phh/8195115/webrev.02/ > > > > > > TIA for your re-review. Plus, may I have another reviewer look at it please? > > > > > > Paul > > > > > > On 2/26/18, 8:47 AM, "Erik Helin" wrote: > > > > > > Hi Paul, > > > > > > a couple of comments on the patch: > > > > > > - memoryService.hpp: > > > + 150 bool countCollection, > > > + 151 bool allMemoryPoolsAffected = true); > > > > > > There is no need to use a default value for the parameter > > > allMemoryPoolsAffected here. Skipping the default value also allows > > > you to put allMemoryPoolsAffected to TraceMemoryManager::initialize > > > in the same relative position as for the constructor parameter (this > > > will make the code more uniform and easier to follow). > > > > > > - memoryManager.cpp > > > > > > Instead of adding a default parameter, maybe add a new method? > > > Something like GCMemoryManager::add_not_always_affected_pool() > > > (I couldn't come up with a shorter name at the moment). > > > > > > - TestMixedOldGenCollectionUsage.java > > > > > > The test is too strict about how and when collections should > > > occur. Tests written this way often become very brittle, they might > > > e.g. fail to finish a concurrent mark on time on a very slow, single > > > core, machine. It is better to either force collections by using the > > > WhiteBox API or make the test more lenient. > > > > > > Thanks, > > > Erik > > > > > > On 02/22/2018 09:54 PM, Hohensee, Paul wrote: > > > > Ping for a review please. > > > > > > > > Thanks, > > > > > > > > Paul > > > > > > > > On 2/16/18, 12:26 PM, "serviceability-dev on behalf of Hohensee, Paul" wrote: > > > > > > > > The CSR https://bugs.openjdk.java.net/browse/JDK-8196719 for the original fix has been approved, so I?m back to requesting a code review, please. 
> > > > > > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8195115 > > > > Webrev: http://cr.openjdk.java.net/~phh/8195115/webrev.hs.01/ > > > > > > > > Passed a submit repo run, passes its jtreg test, and a JDK8 version is in production use at Amazon. > > > > > > > > From the original RR: > > > > > > > > > The bug is that from the JMX point of view, G1's incremental collector > > > > > (misnamed as the "G1 Young Generation" collector) only affects G1's > > > > > survivor and eden spaces. In fact, mixed collections run by this > > > > > collector also affect the G1 old generation. > > > > > > > > > > This proposed fix is to record, for each of a JMX garbage collector's > > > > > memory pools, whether that memory pool is affected by all collections > > > > > using that collector. And, for each collection, record whether or not > > > > > all the collector's memory pools are affected. After each collection, > > > > > for each memory pool, if either all the collector's memory pools were > > > > > affected or the memory pool is affected for all collections, record > > > > > CollectionUsage for that pool. > > > > > > > > > > For collectors other than G1 Young Generation, all pools are recorded as > > > > > affected by all collections and every collection is recorded as > > > > > affecting all the collector's memory pools. For the G1 Young Generation > > > > > collector, the G1 Old Gen pool is recorded as not being affected by all > > > > > collections, and non-mixed collections are recorded as not affecting all > > > > > memory pools. The result is that for non-mixed collections, > > > > > CollectionUsage is recorded after a collection only for the G1 Eden Space > > > > > and G1 Survivor Space pools, while for mixed collections CollectionUsage > > > > > is recorded for G1 Old Gen as well. > > > > > > > > > > Other than the effect of the fix on G1 Old Gen MemoryPool.
> > > > > CollectionUsage, the only external behavior change is that > > > > > GarbageCollectorMXBean.getMemoryPoolNames will now return 3 pool names > > > > > rather than 2. > > > > > > > > > > With this fix, a collector's memory pools can be divided into two > > > > > disjoint subsets, one that participates in all collections and one that > > > > > doesn't. This is a bit more general than the minimum necessary to fix > > > > > G1, but not by much. Because I expect it to apply to other incremental > > > > > region-based collectors, I went with the more general solution. I > > > > > minimized the amount of code I had to touch by using default parameters > > > > > for GCMemoryManager::add_pool and the TraceMemoryManagerStats constructors. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > From vinay.k.awasthi at intel.com Mon Jun 18 16:47:15 2018 From: vinay.k.awasthi at intel.com (Awasthi, Vinay K) Date: Mon, 18 Jun 2018 16:47:15 +0000 Subject: RFR(M): 8204908: NVDIMM for POGC and G1GC - ReserveSpace.cpp changes are mostly eliminated/no collector specific code. In-Reply-To: References: Message-ID: Hi Thomas, Os::commit_memory calls map_memory_to_file which is the same as os::reserve_memory. I am failing to see why os::reserve_memory can call map_memory_to_file (i.e. tie it to mmap) but commit_memory can't... Before this patch, commit_memory never dealt with incrementally committing pages to device so there has to be a way to pass file descriptor and offset. Windows has no such capability to manage incremental commits. All other OSes do and that is why map_memory_to_file is used (which by the way also works on Windows).
Thanks, Vinay -----Original Message----- From: Thomas Stüfe [mailto:thomas.stuefe at gmail.com] Sent: Saturday, June 16, 2018 12:01 PM To: Awasthi, Vinay K Cc: Thomas Schatzl ; Paul Su ; hotspot-gc-dev at openjdk.java.net; Hotspot dev runtime ; Viswanathan, Sandhya ; Aundhe, Shirish ; Kharbas, Kishor Subject: Re: RFR(M): 8204908: NVDIMM for POGC and G1GC - ReserveSpace.cpp changes are mostly eliminated/no collector specific code. Hi Vinay! this is the third thread you opened for this issue; it would be helpful if you would not change subjects, because it splinters discussions on the mailing list. For reference, this is the first mail thread with our first exchange: http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2018-June/022342.html --- Thank you for your perseverance and patience. But unfortunately, my concerns have not been alleviated a lot by the current patch. I still think this stretches an already partly-ill-defined interface further. About os::commit_memory(): as I wrote in my first mail: "So far, for the most part, the os::{reserve|commit}_memory APIs have been agnostic to the underlying implementation. You pretty much tie it to mmap() now. This adds implicit restrictions to the API we did not have before (e.g. will not work if platform uses SysV shm APIs to implement these APIs)." In your response http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2018-June/022356.html you explain why you do this. I understand your motives. I still dislike this, for the reasons I gave before: it adds a generic API to a set of generic APIs which IMHO breaks the implicit contract they all have with each other. Your patch also blurs the difference between runtime "os::" layer and GC. "os" is a general OS-wrapping API layer, and should not know nor care about GC internals: + static inline int nvdimm_fd() { + // ParallelOldGC adaptive sizing requires nvdimm fd.
+ return _nvdimm_fd; + } + static inline address dram_heapbase() { + return _dram_heap_base; + } + static inline address nvdimm_heapbase() { + return _nvdimm_heap_base; + } + static inline uint nvdimm_regionlength() { + return _nvdimm_region_length; + } IMHO, currently the memory management is ill prepared for your patch; yes, one could shove it in, but at the expense of maintainability and code clearness. This expense would have to be carried by all JVM developers, regardless whether they work on your hardware and benefit from this feature. So I think this would work better with some preparatory refactoring done in the VM. Red Hat and Oracle did similar efforts by refactoring the GC interface before adding new GCs: see https://bugs.openjdk.java.net/browse/JDK-8163329. Maybe we could think about how to do this. It certainly would be a worthy goal. Kind Regards, Thomas (BTW, I really do not like the fact that in os_posix.cpp, os::map_memory_to_file() and os::allocate_file() do an vm_exit_during_initialization() in case of an error! These are (supposed to be) general purpose APIs and under no circumstances should they end the process. This is already in the hotspot now, added as part of JDK-8190308. This should be fixed.) On Fri, Jun 15, 2018 at 7:59 PM, Awasthi, Vinay K wrote: > HI Thomas, > > Thanks for your input.. > > Now there is *no* change in virtualspace.cpp... > > I moved reserve and commit (this is how memory backed by file is handled) from reserve space to commit places in respective gcs... All changes are again localized and isolated with os::has_nvdimm()/AllocateOldGenAT. > > There are also fixes (1 line changes) added related to alignment and there is no un-mapping etc.. before mapping nvdimm backed dax file. > > Full Patch patch is here.. > http://cr.openjdk.java.net/~kkharbas/8204908/webrev.04 > > Any input is welcome. 
> > Thanks, > Vinay > > -----Original Message----- > From: Thomas Schatzl [mailto:thomas.schatzl at oracle.com] > Sent: Friday, June 15, 2018 6:53 AM > To: Awasthi, Vinay K ; 'Paul Su' > ; 'hotspot-gc-dev at openjdk.java.net' > ; 'Hotspot dev runtime' > > Cc: Kharbas, Kishor ; Aundhe, Shirish > ; Viswanathan, Sandhya > > Subject: Re: RFR(M): 8204908: NVDIMM for POGC and G1GC - ReserveSpace.cpp changes are mostly eliminated/no collector specific code. > > Hi Vinay, > > On Thu, 2018-06-14 at 20:49 +0000, Awasthi, Vinay K wrote: >> Now ReservedSpace.cpp has logic to only open NVDIMM File (as it was >> done for AllocateheapAt).. if successful, set up 3 flags >> (base/nvdimm_present/file handle) at the end. There is *NO* collector >> specific code. >> >> All work has been moved to g1PagebasedVirtualSpace.cpp.. I am >> committing memory here and setting dram_heapbase used by g1 here. >> >> JEP to support allocating Old generation on NV-DIMM - https://bugs.openjdk.java.net/browse/JDK-8202286 >> Here is the implementation bug link: https://bugs.openjdk.java.net/browse/JDK-8204908 >> >> >> Patch is Uploaded at (full patch/incremental patch) >> >> http://cr.openjdk.java.net/~kkharbas/8204908/webrev.02/ >> http://cr.openjdk.java.net/~kkharbas/8204908/webrev.02_to_01/ >> Tested default setup (i.e. no file is being passed for heap) and >> AllocateHeapAt/AllocateOldGenAt with POGC and G1GC.. all passing. Any >> and all comments are welcome! >> > > looking briefly through the changes, I think they look much better already to move the G1 specific stuff into G1 code; however I would like to think about how we could reduce the complexity further and solve the case of allowing multiple mapping sources (tmpfs file, nvram, different "types" of RAM) for different parts of the heap in an even cleaner way.
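Schatzl's closing remark — cleanly supporting multiple mapping sources (tmpfs file, nvram, different "types" of RAM) for different parts of the heap — can be sketched in code. The following is not HotSpot source and none of these class names exist in the JDK; it is only one possible shape for such an abstraction, assuming POSIX mmap semantics: each backing source is a self-contained object, so the generic os:: layer never has to learn about NVDIMM files or file descriptors.

```cpp
// Hypothetical sketch: pluggable backing sources for heap sub-ranges.
// A collector reserves one contiguous PROT_NONE range up front, then
// commits parts of it through whichever backing applies to that range.
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>   // ftruncate (used by callers with file backings)
#include <cstdio>     // tmpfile/fileno (demo helpers)
#include <cstddef>
#include <cassert>

class MemoryBacking {                       // hypothetical interface
 public:
  virtual ~MemoryBacking() = default;
  virtual bool commit(char* addr, size_t bytes) = 0;
  virtual bool uncommit(char* addr, size_t bytes) = 0;
};

class AnonymousBacking : public MemoryBacking {   // plain DRAM
 public:
  bool commit(char* addr, size_t bytes) override {
    return mmap(addr, bytes, PROT_READ | PROT_WRITE,
                MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0) != MAP_FAILED;
  }
  bool uncommit(char* addr, size_t bytes) override {
    // Re-map PROT_NONE over the range, releasing the backing pages.
    return mmap(addr, bytes, PROT_NONE,
                MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED | MAP_NORESERVE,
                -1, 0) != MAP_FAILED;
  }
};

class FileBacking : public MemoryBacking {        // tmpfs/dax/NVDIMM file
 public:
  explicit FileBacking(int fd) : _fd(fd), _next(0) {}
  bool commit(char* addr, size_t bytes) override {
    // Committing from a file needs fd *and* offset -- the coupling that
    // the thread above argues should not leak into a generic commit API.
    void* p = mmap(addr, bytes, PROT_READ | PROT_WRITE,
                   MAP_SHARED | MAP_FIXED, _fd, _next);
    if (p == MAP_FAILED) return false;
    _next += static_cast<off_t>(bytes);   // naive bump allocation of offsets
    return true;
  }
  bool uncommit(char* addr, size_t bytes) override {
    return mmap(addr, bytes, PROT_NONE,
                MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED | MAP_NORESERVE,
                -1, 0) != MAP_FAILED;
  }
 private:
  int _fd;
  off_t _next;
};
```

With this shape, a collector that wants its old generation on an NVDIMM file would construct a FileBacking for that address range only, and everything else stays on AnonymousBacking; the os:: layer sees neither.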
> > Thanks, > Thomas > From erik.helin at oracle.com Mon Jun 18 17:25:07 2018 From: erik.helin at oracle.com (Erik Helin) Date: Mon, 18 Jun 2018 19:25:07 +0200 Subject: RFR (S): 8195115: G1 Old Gen MemoryPool CollectionUsage.used values don't reflect mixed GC results In-Reply-To: <63187C09-2278-4346-8E98-90D04F731D59@amazon.com> References: <958dd448-efde-2414-9c88-a0f1c869dd56@oracle.com> <9F239517-8707-4786-B7E9-7F88CC0D4C6E@amazon.com> <63187C09-2278-4346-8E98-90D04F731D59@amazon.com> Message-ID: <43d0ba3d-3006-254f-2927-5ffef51e3e4d@oracle.com> On 06/18/2018 06:14 PM, Hohensee, Paul wrote: > Thanks, Eric! > > I'd push, but it seems I don't seem to have permission at the moment. Who should I contact to get that fixed? That would be ops at openjdk.java.net. Thanks, Erik > Thanks, > > Paul > > ?On 6/18/18, 7:09 AM, "Erik Helin" wrote: > > On 06/16/2018 09:00 PM, Hohensee, Paul wrote: > > Thanks for the re-review, Erik. New webrev with your fixes: > > > > http://cr.openjdk.java.net/~phh/8195115/webrev.04/ > > The patch is good to go now, Reviewed. > > Thanks, > Erik > > > Need another reviewer, please. > > > > Thanks, > > > > Paul > > > > On 6/16/18, 1:25 AM, "Erik Helin" wrote: > > > > On 06/15/2018 10:21 PM, Hohensee, Paul wrote: > > > After some difficulty with the submit cluster, with which Erik helped me out, the patch passes. It also passed fastdebug hotspot tier 1 testing on my Mac laptop, which former includes the new test. > > > > > > I had to increase -Xmx and -Xms to 12m in order to get TestOldGenCollectionUsage to pass on the submit cluster, though the old 10m works fine on my Mac. New webrev: > > > > Thanks, the change of -Xmx and -Xms to 12m now also makes the test pass > > on my workstation. > > > > > http://cr.openjdk.java.net/~phh/8195115/webrev.03/ > > > > There seems to be some trailing whitespace in the patch, have you run > > jcheck (or `hg diff` which highlights trailing whitespace in red)? 
> > Please see > > > > + TraceMemoryManagerStats tms(&_memory_manager, gc_cause(), > > + collector_state()->yc_type() == Mixed > > /* allMemoryPoolsAffected */); > > + > > ^---- whitespace > > > > and > > > > +int MemoryManager::add_pool(MemoryPool* pool) { > > + int index = _num_pools; > > ^---- whitespace > > > > Another small comment, I would have written > > > > +void GCMemoryManager::add_pool(MemoryPool* pool) { > > + int index = MemoryManager::add_pool(pool); > > + _pool_always_affected_by_gc[index] = true; > > +} > > + > > +void GCMemoryManager::add_pool(MemoryPool* pool, bool > > always_affected_by_gc) { > > + int index = MemoryManager::add_pool(pool); > > + _pool_always_affected_by_gc[index] = always_affected_by_gc; > > +} > > + > > > > as > > > > +void GCMemoryManager::add_pool(MemoryPool* pool) { > > + add_pool(pool, true); > > +} > > + > > +void GCMemoryManager::add_pool(MemoryPool* pool, bool > > always_affected_by_gc) { > > + int index = MemoryManager::add_pool(pool); > > + _pool_always_affected_by_gc[index] = always_affected_by_gc; > > +} > > + > > > > to not have to two duplicate implementations of > > GCMemoryManager::add_pool. Would you mind updating the patch with this > > change (and remove the trailing whitespace)? > > > > Thanks, > > Erik > > > > > Thanks, > > > > > > Paul > > > > > > On 6/12/18, 6:52 AM, "Erik Helin" wrote: > > > > > > (adding back serviceability-dev, please keep both hotspot-gc-dev and > > > serviceability-dev) > > > > > > Hi Paul, > > > > > > before I start re-reviewing, did you test the new version of the patch > > > via the jdk/submit repository [0]? > > > > > > Thanks, > > > Erik > > > > > > [0]: http://hg.openjdk.java.net/jdk/submit > > > > > > On 06/09/2018 03:29 PM, Hohensee, Paul wrote: > > > > Didn't seem to make it to hotspot-gc-dev... > > > > > > > > On 6/8/18, 10:14 AM, "serviceability-dev on behalf of Hohensee, Paul" wrote: > > > > > > > > Back after a long hiatus... 
> > > > > > > > Thanks, Eric, for your review. Here's a new webrev incorporating your recommendations. > > > > > > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8195115 > > > > Webrev: http://cr.openjdk.java.net/~phh/8195115/webrev.02/ > > > > > > > > TIA for your re-review. Plus, may I have another reviewer look at it please? > > > > > > > > Paul > > > > > > > > On 2/26/18, 8:47 AM, "Erik Helin" wrote: > > > > > > > > Hi Paul, > > > > > > > > a couple of comments on the patch: > > > > > > > > - memoryService.hpp: > > > > + 150 bool countCollection, > > > > + 151 bool allMemoryPoolsAffected = true); > > > > > > > > There is no need to use a default value for the parameter > > > > allMemoryPoolsAffected here. Skipping the default value also allows > > > > you to put allMemoryPoolsAffected to TraceMemoryManager::initialize > > > > in the same relative position as for the constructor parameter (this > > > > will make the code more uniform and easier to follow). > > > > > > > > - memoryManager.cpp > > > > > > > > Instead of adding a default parameter, maybe add a new method? > > > > Something like GCMemoryManager::add_not_always_affected_pool() > > > > (I couldn't come up with a shorter name at the moment). > > > > > > > > - TestMixedOldGenCollectionUsage.java > > > > > > > > The test is too strict about how and when collections should > > > > occur. Tests written this way often become very brittle, they might > > > > e.g. fail to finish a concurrent mark on time on a very slow, single > > > > core, machine. It is better to either force collections by using the > > > > WhiteBox API or make the test more lenient. > > > > > > > > Thanks, > > > > Erik > > > > > > > > On 02/22/2018 09:54 PM, Hohensee, Paul wrote: > > > > > Ping for a review please. 
> > > > > > > > > > Thanks, > > > > > > > > > > Paul > > > > > > > > > > On 2/16/18, 12:26 PM, "serviceability-dev on behalf of Hohensee, Paul" wrote: > > > > > > > > > > The CSR https://bugs.openjdk.java.net/browse/JDK-8196719 for the original fix has been approved, so I'm back to requesting a code review, please. > > > > > > > > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8195115 > > > > > Webrev: http://cr.openjdk.java.net/~phh/8195115/webrev.hs.01/ > > > > > > > > > > Passed a submit repo run, passes its jtreg test, and a JDK8 version is in production use at Amazon. > > > > > > > > > > From the original RR: > > > > > > > > > > > The bug is that from the JMX point of view, G1's incremental collector > > > > > > (misnamed as the "G1 Young Generation" collector) only affects G1's > > > > > > survivor and eden spaces. In fact, mixed collections run by this > > > > > > collector also affect the G1 old generation. > > > > > > > > > > > > This proposed fix is to record, for each of a JMX garbage collector's > > > > > > memory pools, whether that memory pool is affected by all collections > > > > > > using that collector. And, for each collection, record whether or not > > > > > > all the collector's memory pools are affected. After each collection, > > > > > > for each memory pool, if either all the collector's memory pools were > > > > > > affected or the memory pool is affected for all collections, record > > > > > > CollectionUsage for that pool. > > > > > > > > > > > > For collectors other than G1 Young Generation, all pools are recorded as > > > > > > affected by all collections and every collection is recorded as > > > > > > affecting all the collector's memory pools. For the G1 Young Generation > > > > > > collector, the G1 Old Gen pool is recorded as not being affected by all > > > > > > collections, and non-mixed collections are recorded as not affecting all > > > > > > memory pools.
The result is that for non-mixed collections, > > > > > > CollectionUsage is recorded after a collection only for the G1 Eden Space > > > > > > and G1 Survivor Space pools, while for mixed collections CollectionUsage > > > > > > is recorded for G1 Old Gen as well. > > > > > > > > > > > > Other than the effect of the fix on G1 Old Gen MemoryPool. > > > > > > CollectionUsage, the only external behavior change is that > > > > > > GarbageCollectorMXBean.getMemoryPoolNames will now return 3 pool names > > > > > > rather than 2. > > > > > > > > > > > > With this fix, a collector's memory pools can be divided into two > > > > > > disjoint subsets, one that participates in all collections and one that > > > > > > doesn't. This is a bit more general than the minimum necessary to fix > > > > > > G1, but not by much. Because I expect it to apply to other incremental > > > > > > region-based collectors, I went with the more general solution. I > > > > > > minimized the amount of code I had to touch by using default parameters > > > > > > for GCMemoryManager::add_pool and the TraceMemoryManagerStats constructors. From hohensee at amazon.com Mon Jun 18 18:17:43 2018 From: hohensee at amazon.com (Hohensee, Paul) Date: Mon, 18 Jun 2018 18:17:43 +0000 Subject: RFR (S): 8195115: G1 Old Gen MemoryPool CollectionUsage.used values don't reflect mixed GC results In-Reply-To: <43d0ba3d-3006-254f-2927-5ffef51e3e4d@oracle.com> References: <958dd448-efde-2414-9c88-a0f1c869dd56@oracle.com> <9F239517-8707-4786-B7E9-7F88CC0D4C6E@amazon.com> <63187C09-2278-4346-8E98-90D04F731D59@amazon.com> <43d0ba3d-3006-254f-2927-5ffef51e3e4d@oracle.com> Message-ID: Thanks, Erik. On 6/18/18, 10:26 AM, "Erik Helin" wrote: On 06/18/2018 06:14 PM, Hohensee, Paul wrote: > Thanks, Eric! > > I'd push, but it seems I don't seem to have permission at the moment. Who should I contact to get that fixed?
That would be ops at openjdk.java.net. Thanks, Erik From Derek.White at cavium.com Mon Jun 18 21:57:52 2018 From: Derek.White at cavium.com (White, Derek) Date: Mon, 18 Jun 2018 21:57:52 +0000 Subject: RFR(M): 8204524: Unnecessary memory barriers in G1ParScanThreadState::copy_to_survivor_space In-Reply-To: References: <82571A7F-6113-4C0F-A4DB-C18C2E48D8E2@oracle.com> Message-ID: Hi Michihiro, Further testing is showing a moderate performance gain with your G1 patch in SPECjbb on AArch64. We haven't completely pinned the earlier regression on user error on our part, but it's looking likely. The new code on AArch64 is looking as relaxed as can be :-) Thank you for working on this, and letting us investigate!
- Derek From: Michihiro Horie [mailto:HORIE at jp.ibm.com] Sent: Friday, June 15, 2018 11:44 PM To: White, Derek Cc: david.holmes at oracle.com; Gustavo Bueno Romero ; hotspot-gc-dev at openjdk.java.net; Kim Barrett ; Doerr, Martin Subject: RE: RFR(M): 8204524: Unnecessary memory barriers in G1ParScanThreadState::copy_to_survivor_space Hi Derek, Thank you for sharing your status of still having inconsistent results with the patch. I would wait for your updates. Thanks again, Best regards, -- Michihiro, IBM Research - Tokyo From: "White, Derek" To: Kim Barrett , Michihiro Horie Cc: Gustavo Bueno Romero , "david.holmes at oracle.com" , "hotspot-gc-dev at openjdk.java.net" Date: 2018/06/15 17:55 Subject: RE: RFR(M): 8204524: Unnecessary memory barriers in G1ParScanThreadState::copy_to_survivor_space ________________________________ Hi Michihiro, Status update: My colleague and I are getting inconsistent results with this patch: -23% to +7% on SPECjbb, so we're trying to verify what's going on. On an unrelated note, the aarch64 port relies on GCC's __atomic_compare_exchange to implement the relaxed case of Atomic::PlatformCmpxchg, and gcc 6 and earlier sometimes do a poor job on it. Not enough to account for the numbers we saw though. I hope to have an answer by Monday.
- Derek > -----Original Message----- > From: hotspot-gc-dev [mailto:hotspot-gc-dev-bounces at openjdk.java.net] > On Behalf Of Kim Barrett > Sent: Wednesday, June 13, 2018 5:16 PM > To: Michihiro Horie > Cc: Gustavo Bueno Romero ; > david.holmes at oracle.com; hotspot-gc-dev at openjdk.java.net > Subject: Re: RFR(M): 8204524: Unnecessary memory barriers in > G1ParScanThreadState::copy_to_survivor_space > > External Email > > > On Jun 7, 2018, at 2:01 AM, Michihiro Horie wrote: > > > > Dear all, > > > > Would you please review the following change? > > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8204524 > > Webrev: http://cr.openjdk.java.net/~mhorie/8204524/webrev.00 > > I was going to say that this looks good to me. > > But then I saw Derek White's reply about an unexpected performance > regression. > I'd like to wait until he reports back. -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From HORIE at jp.ibm.com Mon Jun 18 22:19:08 2018 From: HORIE at jp.ibm.com (Michihiro Horie) Date: Mon, 18 Jun 2018 18:19:08 -0400 Subject: RFR(M): 8204524: Unnecessary memory barriers in G1ParScanThreadState::copy_to_survivor_space In-Reply-To: References: <82571A7F-6113-4C0F-A4DB-C18C2E48D8E2@oracle.com> Message-ID: Hi Derek, Thank you for the investigation on AArch64. It would be very helpful and I am glad to know that you had a moderate performance gain with this G1 change.
Best regards, -- Michihiro, IBM Research - Tokyo -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed...
Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From kim.barrett at oracle.com Tue Jun 19 02:49:14 2018 From: kim.barrett at oracle.com (Kim Barrett) Date: Mon, 18 Jun 2018 22:49:14 -0400 Subject: RFR: 8205163: ZGC: Keeps finalizable marked PhantomReference referents strongly alive In-Reply-To: <16513927-8c0e-1582-6ea9-c0cfd4b0cbcc@oracle.com> References: <16513927-8c0e-1582-6ea9-c0cfd4b0cbcc@oracle.com> Message-ID: <72E20CE3-AFD3-45D3-A049-3DEF9A119C6D@oracle.com> > On Jun 18, 2018, at 5:54 AM, Stefan Karlsson wrote: > > Hi all, > > Please review this patch to fix a bug where PhantomReferences can cause otherwise finalizable objects to become considered as strongly reachable. > > http://cr.openjdk.java.net/~stefank/8205163/webrev.01/ > https://bugs.openjdk.java.net/browse/JDK-8205163 Looks good. From thomas.schatzl at oracle.com Tue Jun 19 07:36:14 2018 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Tue, 19 Jun 2018 09:36:14 +0200 Subject: RFR (XS): 8205043: Make parallel reference processing default for G1 In-Reply-To: References: <7daf58dfd770761768a743a6fca330cf8dfb76ba.camel@oracle.com> Message-ID: <4cff6b055017ead6bbe8b0cc9f19aad2bcbc5a4e.camel@oracle.com> Hi Kim, Stefan, On Fri, 2018-06-15 at 12:45 -0400, Kim Barrett wrote: > > On Jun 14, 2018, at 8:08 AM, Thomas Schatzl > com> wrote: > > > > Hi all, > > > > can I have reviews for this split-off of JDK-8043575 to make > > parallel reference processing in conjunction with dynamic number of > > thread sizing default for G1? > > [..] > > There is also a linked CSR for that change. > > > > CR: > > https://bugs.openjdk.java.net/browse/JDK-8205043 > > Webrev: > > http://cr.openjdk.java.net/~tschatzl/8205043/webrev/ > > Testing: > > new included test case > > > > Thanks, > > Thomas > > Looks good. > thanks for your reviews. 
As I got the okay for the CSR too, it has been pushed :) Thanks, Thomas From martin.doerr at sap.com Tue Jun 19 09:27:53 2018 From: martin.doerr at sap.com (Doerr, Martin) Date: Tue, 19 Jun 2018 09:27:53 +0000 Subject: RFR(M): 8204524: Unnecessary memory barriers in G1ParScanThreadState::copy_to_survivor_space In-Reply-To: References: <82571A7F-6113-4C0F-A4DB-C18C2E48D8E2@oracle.com> Message-ID: Hi, I assume webrev.00 was used for reviews and tests as Thomas has emphasized that evacuation failures may be performance critical, too. It looks correct to me, too. I can sponsor the change if needed. Please let me know when I can consider it reviewed. Best regards, Martin -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.gif Type: image/gif Size: 105 bytes Desc: image001.gif URL: From thomas.stuefe at gmail.com Tue Jun 19 11:40:22 2018 From: thomas.stuefe at gmail.com (Thomas Stüfe) Date: Tue, 19 Jun 2018 13:40:22 +0200 Subject: RFR(M): 8204908: NVDIMM for POGC and G1GC - ReserveSpace.cpp changes are mostly eliminated/no collector specific code. In-Reply-To: References: Message-ID: Hi Vinay, On Mon, Jun 18, 2018 at 6:47 PM, Awasthi, Vinay K wrote: > Hi Thomas, > > Os::commit_memory calls map_memory_to_file which is same as os::reserve_memory. > > I am failing to see why os::reserve_memory can call map_memory_to_file (i.e. tie it to mmap) but commit_memory can't... Before this patch, commit_memory never dealt with incrementally committing pages to device so there has to be a way to pass file descriptor and offset. Windows has no such capability to manage incremental commits.
All other OSes do and that is why map_memory_to_file is used (which by the way also works on Windows). AIX uses System V shared memory by default, which follows a different allocation scheme (semantics more like Windows VirtualAlloc... calls). But my doubts are not limited to that one, see my earlier replies and those of others. It really makes sense to step back one step and discuss the JEP first. BTW, I opened https://bugs.openjdk.java.net/browse/JDK-8205335, could you please take a look? Thanks, Thomas > > Thanks, > Vinay > > -----Original Message----- > From: Thomas Stüfe [mailto:thomas.stuefe at gmail.com] > Sent: Saturday, June 16, 2018 12:01 PM > To: Awasthi, Vinay K > Cc: Thomas Schatzl ; Paul Su ; hotspot-gc-dev at openjdk.java.net; Hotspot dev runtime ; Viswanathan, Sandhya ; Aundhe, Shirish ; Kharbas, Kishor > Subject: Re: RFR(M): 8204908: NVDIMM for POGC and G1GC - ReserveSpace.cpp changes are mostly eliminated/no collector specific code. > > Hi Vinay! > > this is the third thread you opened for this issue; it would be helpful if you would not change subjects, because it splinters discussions on the mailing list. > > For reference, this is the first mail thread with our first exchange: > > http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2018-June/022342.html > > --- > > Thank you for your perseverance and patience. > > But unfortunately, my concerns have not been alleviated a lot by the current patch. I still think this stretches an already partly-ill-defined interface further. > > About os::commit_memory(): as I wrote in my first mail: > > "So far, for the most part, the os::{reserve|commit}_memory APIs have been agnostic to the underlying implementation. You pretty much tie it to mmap() now. This adds implicit restrictions to the API we did not have before (e.g. will not work if platform uses SysV shm APIs to implement these APIs)."
> > In your response > http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2018-June/022356.html > you explain why you do this. I understand your motives. I still dislike this, for the reasons I gave before: it adds a generic API to a set of generic APIs which IMHO breaks the implicit contract they all have with each other. > > Your patch also blurs the difference between runtime "os::" layer and GC. "os" is a general OS-wrapping API layer, and should not know nor care about GC internals: > > + static inline int nvdimm_fd() { > + // ParallelOldGC adaptive sizing requires nvdimm fd. > + return _nvdimm_fd; > + } > + static inline address dram_heapbase() { > + return _dram_heap_base; > + } > + static inline address nvdimm_heapbase() { > + return _nvdimm_heap_base; > + } > + static inline uint nvdimm_regionlength() { > + return _nvdimm_region_length; > + } > > IMHO, currently the memory management is ill prepared for your patch; yes, one could shove it in, but at the expense of maintainability and code clearness. This expense would have to be carried by all JVM developers, regardless of whether they work on your hardware and benefit from this feature. > > So I think this would work better with some preparatory refactoring done in the VM. Red Hat and Oracle did similar efforts by refactoring the GC interface before adding new GCs: see https://bugs.openjdk.java.net/browse/JDK-8163329. > > Maybe we could think about how to do this. It certainly would be a worthy goal. > > Kind Regards, Thomas > > (BTW, I really do not like the fact that in os_posix.cpp, > os::map_memory_to_file() and os::allocate_file() do a > vm_exit_during_initialization() in case of an error! These are (supposed to be) general purpose APIs and under no circumstances should they end the process. This is already in the hotspot now, added as part of JDK-8190308. This should be fixed.) > > > On Fri, Jun 15, 2018 at 7:59 PM, Awasthi, Vinay K wrote: >> Hi Thomas, >> >> Thanks for your input..
>> >> Now there is *no* change in virtualspace.cpp... >> >> I moved reserve and commit (this is how memory backed by file is handled) from reserve space to commit places in respective gcs... All changes are again localized and isolated with os::has_nvdimm()/AllocateOldGenAT. >> >> There are also fixes (1 line changes) added related to alignment and there is no un-mapping etc.. before mapping nvdimm backed dax file. >> >> Full Patch patch is here.. >> http://cr.openjdk.java.net/~kkharbas/8204908/webrev.04 >> >> Any input is welcome. >> >> Thanks, >> Vinay >> >> -----Original Message----- >> From: Thomas Schatzl [mailto:thomas.schatzl at oracle.com] >> Sent: Friday, June 15, 2018 6:53 AM >> To: Awasthi, Vinay K ; 'Paul Su' >> ; 'hotspot-gc-dev at openjdk.java.net' >> ; 'Hotspot dev runtime' >> >> Cc: Kharbas, Kishor ; Aundhe, Shirish >> ; Viswanathan, Sandhya >> >> Subject: Re: RFR(M): 8204908: NVDIMM for POGC and G1GC - ReserveSpace.cpp changes are mostly eliminated/no collector specific code. >> >> Hi Vinay, >> >> On Thu, 2018-06-14 at 20:49 +0000, Awasthi, Vinay K wrote: >>> Now ReservedSpace.cpp has logic to only open NVDIMM File (as it was >>> done for AllocateheapAt).. if successful, set up 3 flags >>> (base/nvdimm_present/file handle) at the end. There is *NO* collector >>> specific code. >>> >>> All work has been moved to g1PagebasedVirtualSpace.cpp.. I am >>> committing memory here and setting dram_heapbase used by g1 here. >>> >>> JEP to support allocating Old generation on NV-DIMM - https://bugs.op >>> enjdk.java.net/browse/JDK-8202286 >>> Here is the implementation bug link: https://bugs.openjdk.java.net/br >>> owse/JDK-8204908 >>> >>> >>> Patch is Uploaded at (full patch/incremental patch) >>> >>> http://cr.openjdk.java.net/~kkharbas/8204908/webrev.02/ >>> http://cr.openjdk.java.net/~kkharbas/8204908/webrev.02_to_01/ >>> Tested default setup (i.e. no file is being passed for heap) and >>> AllocateHeapAt/AllocateOldGenAt with POGC and G1GC.. all passing? 
Any >>> and all comments are welcome! >>> >> >> looking briefly through the changes, I think they look much better already to move the G1 specific stuff into G1 code; however I would like to think about how we could reduce the complexity further and solve the case of allowing multiple mapping sources (tmpfs file, nvram, different "types" of RAM) for different parts of the heap in an even cleaner way. >> >> Thanks, >> Thomas >> From per.liden at oracle.com Tue Jun 19 12:30:55 2018 From: per.liden at oracle.com (Per Liden) Date: Tue, 19 Jun 2018 14:30:55 +0200 Subject: RFR: 8205339: ZGC: VerifyBeforeIteration not yet supported Message-ID: <9d91b81c-bcb0-c786-960f-7f5427abc212@oracle.com> Always disable the diagnostic flag VerifyBeforeIteration. This option is not yet supported for the same reason ZGC don't yet support VerifyStack, i.e. this code makes assumptions about the state of oops, which aren't true in a ZGC context. Bug: https://bugs.openjdk.java.net/browse/JDK-8205339 Webrev: http://cr.openjdk.java.net/~pliden/8205339/webrev.0 /Per From stefan.karlsson at oracle.com Tue Jun 19 15:00:27 2018 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Tue, 19 Jun 2018 17:00:27 +0200 Subject: RFR: 8205339: ZGC: VerifyBeforeIteration not yet supported In-Reply-To: <9d91b81c-bcb0-c786-960f-7f5427abc212@oracle.com> References: <9d91b81c-bcb0-c786-960f-7f5427abc212@oracle.com> Message-ID: <2739590e-8941-b771-6b1a-d4b5a01410e8@oracle.com> Looks good. StefanK On 2018-06-19 14:30, Per Liden wrote: > Always disable the diagnostic flag VerifyBeforeIteration. This option is > not yet supported for the same reason ZGC don't yet support VerifyStack, > i.e. this code makes assumptions about the state of oops, which aren't > true in a ZGC context. 
> > Bug: https://bugs.openjdk.java.net/browse/JDK-8205339 > Webrev: http://cr.openjdk.java.net/~pliden/8205339/webrev.0 > > /Per From erik.osterlund at oracle.com Tue Jun 19 15:06:57 2018 From: erik.osterlund at oracle.com (=?UTF-8?Q?Erik_=c3=96sterlund?=) Date: Tue, 19 Jun 2018 17:06:57 +0200 Subject: RFR: 8205339: ZGC: VerifyBeforeIteration not yet supported In-Reply-To: <9d91b81c-bcb0-c786-960f-7f5427abc212@oracle.com> References: <9d91b81c-bcb0-c786-960f-7f5427abc212@oracle.com> Message-ID: <5B291C11.3090703@oracle.com> Hi Per, Looks good. Thanks, /Erik On 2018-06-19 14:30, Per Liden wrote: > Always disable the diagnostic flag VerifyBeforeIteration. This option > is not yet supported for the same reason ZGC don't yet support > VerifyStack, i.e. this code makes assumptions about the state of oops, > which aren't true in a ZGC context. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8205339 > Webrev: http://cr.openjdk.java.net/~pliden/8205339/webrev.0 > > /Per From per.liden at oracle.com Tue Jun 19 15:13:07 2018 From: per.liden at oracle.com (Per Liden) Date: Tue, 19 Jun 2018 17:13:07 +0200 Subject: RFR: 8205339: ZGC: VerifyBeforeIteration not yet supported In-Reply-To: <5B291C11.3090703@oracle.com> References: <9d91b81c-bcb0-c786-960f-7f5427abc212@oracle.com> <5B291C11.3090703@oracle.com> Message-ID: Thanks Stefan and Erik! /Per On 06/19/2018 05:06 PM, Erik Österlund wrote: > Hi Per, > > Looks good. > > Thanks, > /Erik > > On 2018-06-19 14:30, Per Liden wrote: >> Always disable the diagnostic flag VerifyBeforeIteration. This option >> is not yet supported for the same reason ZGC don't yet support >> VerifyStack, i.e. this code makes assumptions about the state of oops, >> which aren't true in a ZGC context.
>> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8205339 >> Webrev: http://cr.openjdk.java.net/~pliden/8205339/webrev.0 >> >> /Per > From erik.gahlin at oracle.com Tue Jun 19 15:57:42 2018 From: erik.gahlin at oracle.com (Erik Gahlin) Date: Tue, 19 Jun 2018 17:57:42 +0200 Subject: Liveset information for Old Object sample event Message-ID: <5B2927F6.1090804@oracle.com> Hi, Could I have a review of an enhancement that adds heap usage after GC to the Old Object Sample (OOS) event. This is useful for spotting an increasing liveset. Recordings typically only contain data for a limited period, i.e. the last 30 minutes, but the OOS event contains samples from when JFR/JVM was started, potentially several days back. The liveset trend is useful for tools such as Mission Control to detect if there is a memory leak. If that is the case, information in the OOS event can be used to pinpoint where the leak occurred and what is keeping it alive. Presenting this information when there is no memory leak is confusing. Bug: https://bugs.openjdk.java.net/browse/JDK-8197425 Webrev: http://cr.openjdk.java.net/~egahlin/8197425/ Testing: Tests in test/jdk/jdk/jfr Thanks Erik From per.liden at oracle.com Tue Jun 19 16:19:12 2018 From: per.liden at oracle.com (Per Liden) Date: Tue, 19 Jun 2018 18:19:12 +0200 Subject: RFR: 8205344: TraceMemoryManagerStats changes in JDK-8195115 broke ZGC Message-ID: <385c3b8e-4864-2e0c-3a74-bf4c2af9b705@oracle.com> JDK-8195115 changed the TraceMemoryManagerStats constructor interface but failed to take ZGC into account, causing failure in Tier2 testing. This patch should restore the previous behavior for ZGC. Bug: https://bugs.openjdk.java.net/browse/JDK-8205344 Webrev: http://cr.openjdk.java.net/~pliden/8205344/webrev.0 Testing: With this patch I can no longer reproduce the failure locally. Running in mach5 t{1,2,3} right now.
/Per From shade at redhat.com Tue Jun 19 16:23:49 2018 From: shade at redhat.com (Aleksey Shipilev) Date: Tue, 19 Jun 2018 18:23:49 +0200 Subject: RFR: 8205344: TraceMemoryManagerStats changes in JDK-8195115 broke ZGC In-Reply-To: <385c3b8e-4864-2e0c-3a74-bf4c2af9b705@oracle.com> References: <385c3b8e-4864-2e0c-3a74-bf4c2af9b705@oracle.com> Message-ID: <1cebdf49-9ab8-68a4-0cde-3bb5a74bd070@redhat.com> On 06/19/2018 06:19 PM, Per Liden wrote: > JDK-8195115 changed the TraceMemoryManagerStats constructor interface but failed to take ZGC into > account, causing failure in Tier2 testing. This patch should restore the previous behavior for ZGC. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8205344 > Webrev: http://cr.openjdk.java.net/~pliden/8205344/webrev.0 Looks good. -Aleksey P.S. (in grandpa voice) See, this is why experimental GCs should be built by default in OpenJDK upstream: we would not have to play catch-up with simple breaking changes. ;) -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: OpenPGP digital signature URL: From per.liden at oracle.com Tue Jun 19 16:31:48 2018 From: per.liden at oracle.com (Per Liden) Date: Tue, 19 Jun 2018 18:31:48 +0200 Subject: RFR: 8205344: TraceMemoryManagerStats changes in JDK-8195115 broke ZGC In-Reply-To: <1cebdf49-9ab8-68a4-0cde-3bb5a74bd070@redhat.com> References: <385c3b8e-4864-2e0c-3a74-bf4c2af9b705@oracle.com> <1cebdf49-9ab8-68a4-0cde-3bb5a74bd070@redhat.com> Message-ID: <076b0c8a-e1ee-9aa5-c28d-8ceb7bf0681c@oracle.com> On 06/19/2018 06:23 PM, Aleksey Shipilev wrote: > On 06/19/2018 06:19 PM, Per Liden wrote: >> JDK-8195115 changed the TraceMemoryManagerStats constructor interface but failed to take ZGC into >> account, causing failure in Tier2 testing. This patch should restore the previous behavior for ZGC.
>> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8205344 >> Webrev: http://cr.openjdk.java.net/~pliden/8205344/webrev.0 > > Looks good. Thanks! > > -Aleksey > > P.S. (in grandpa voice) See, this is why experimental GCs should built by default in OpenJDK > upstream: we would not have to play catch-up with simple breaking changes. ;) In this case, I don't think it would have been caught anyway since it's not a build error (thanks to those damn default values on arguments). You actually need to run the specific test with -XX:+UseZGC to see it. cheers, Per From erik.osterlund at oracle.com Tue Jun 19 17:07:39 2018 From: erik.osterlund at oracle.com (Erik Osterlund) Date: Tue, 19 Jun 2018 19:07:39 +0200 Subject: RFR: 8205344: TraceMemoryManagerStats changes in JDK-8195115 broke ZGC In-Reply-To: <385c3b8e-4864-2e0c-3a74-bf4c2af9b705@oracle.com> References: <385c3b8e-4864-2e0c-3a74-bf4c2af9b705@oracle.com> Message-ID: Hi Per, Looks good. Thanks, /Erik > On 19 Jun 2018, at 18:19, Per Liden wrote: > > JDK-8195115 changed the TraceMemoryManagerStats constructor interface but failed to take ZGC into account, causing failure in Tier2 testing. This patch should restore the previous behavior for ZGC. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8205344 > Webrev: http://cr.openjdk.java.net/~pliden/8205344/webrev.0 > > Testing: With this patch I can no longer reproduce locally. Running in mach5 t{1,2,3} right now. > > /Per From per.liden at oracle.com Tue Jun 19 17:27:47 2018 From: per.liden at oracle.com (Per Liden) Date: Tue, 19 Jun 2018 19:27:47 +0200 Subject: RFR: 8205344: TraceMemoryManagerStats changes in JDK-8195115 broke ZGC In-Reply-To: References: <385c3b8e-4864-2e0c-3a74-bf4c2af9b705@oracle.com> Message-ID: <60c11510-3ce9-6401-8d48-9aa7c7dd462d@oracle.com> Thanks Erik! Pushed immediately since this broke tier2. /Per On 06/19/2018 07:07 PM, Erik Osterlund wrote: > Hi Per, > > Looks good. 
> > Thanks, > /Erik > >> On 19 Jun 2018, at 18:19, Per Liden wrote: >> >> JDK-8195115 changed the TraceMemoryManagerStats constructor interface but failed to take ZGC into account, causing failure in Tier2 testing. This patch should restore the previous behavior for ZGC. >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8205344 >> Webrev: http://cr.openjdk.java.net/~pliden/8205344/webrev.0 >> >> Testing: With this patch I can no longer reproduce locally. Running in mach5 t{1,2,3} right now. >> >> /Per > From markus.gronlund at oracle.com Tue Jun 19 18:29:02 2018 From: markus.gronlund at oracle.com (Markus Gronlund) Date: Tue, 19 Jun 2018 11:29:02 -0700 (PDT) Subject: Liveset information for Old Object sample event In-Reply-To: <5B2927F6.1090804@oracle.com> References: <5B2927F6.1090804@oracle.com> Message-ID: Hi Erik, Looks good. Thanks Markus -----Original Message----- From: Erik Gahlin Sent: den 19 juni 2018 17:58 To: hotspot-jfr-dev at openjdk.java.net Cc: hotspot-gc-dev at openjdk.java.net Subject: Liveset information for Old Object sample event Hi, Could I have a review of an enhancement that adds heap usage after GC to the Old Object Sample (OOS) event. This is useful for spotting an increasing liveset. Recordings typically only contain data for a limited period, i.e. the last 30 minutes, but the OOS event contains samples from when JFR/JVM was started, potentially several days back. The liveset trend is useful for tools such as Mission Control to detect if there is a memory leak. If that is the case, information in the OOS event can be used to pinpoint where the leak occurred and what is keeping it alive. Presenting this information when there is no memory leak is confusing.
Bug: https://bugs.openjdk.java.net/browse/JDK-8197425 Webrev: http://cr.openjdk.java.net/~egahlin/8197425/ Testing: Tests in test/jdk/jdk/jfr Thanks Erik From rbruno at gsd.inesc-id.pt Tue Jun 19 18:46:46 2018 From: rbruno at gsd.inesc-id.pt (Rodrigo Bruno) Date: Tue, 19 Jun 2018 20:46:46 +0200 Subject: RFR: 8204088: Dynamic Max Memory Limit Message-ID: Hi all, here is the first version of our contribution for draft JEP-8204088. More details at the CR. CR: https://bugs.openjdk.java.net/browse/JDK-8204088 Webrev: http://cr.openjdk.java.net/~tschatzl/jelastic/cmx/ Thanks, Rodrigo -------------- next part -------------- An HTML attachment was scrubbed... URL: From rbruno at gsd.inesc-id.pt Tue Jun 19 18:46:44 2018 From: rbruno at gsd.inesc-id.pt (Rodrigo Bruno) Date: Tue, 19 Jun 2018 20:46:44 +0200 Subject: RFR: 8204089: Timely Reducing Unused Committed Memory Message-ID: Hi all, here is the first version of our contribution for draft JEP-8204089. More details at the CR. CR: https://bugs.openjdk.java.net/browse/JDK-8204089 Webrev: http://cr.openjdk.java.net/~tschatzl/jelastic/pgc/ Thanks, Rodrigo -------------- next part -------------- An HTML attachment was scrubbed... URL: From tom.rodriguez at oracle.com Tue Jun 19 21:34:46 2018 From: tom.rodriguez at oracle.com (Tom Rodriguez) Date: Tue, 19 Jun 2018 14:34:46 -0700 Subject: RFR (S): 8198909: [Graal] compiler/codecache/stress/UnexpectedDeoptimizationTest.java crashed with SIGSEGV In-Reply-To: References: <64c722cd-14e3-e0db-6a16-08ca69605205@oracle.com> <5B1E3C35.3090506@oracle.com> Message-ID: <7809c59d-5f2f-30aa-44e6-7cf4db03df05@oracle.com> I've generated a webrev with a new KlassRefHandle protecting questionable uses in JVMCI. http://cr.openjdk.java.net/~never/8198909.1/webrev One outstanding question is whether ObjArrayKlass also needs a working holder_phantom method. It would seem so to me but maybe there's some reason not?
tom Tom Rodriguez wrote on 6/11/18 10:04 AM: > > > Erik Österlund wrote on 6/11/18 2:09 AM: >> Hi Tom, >> >> Could you please call InstanceKlass::holder_phantom() instead to keep >> the class alive? That is the more general mechanism that is also used >> by ciInstanceKlass. We don't want to use explicit G1 enqueue calls >> anymore. > > Ok. I guess the same fix in JDK8 will have to use the explicit enqueue > though or is it not required in JDK8? > >> Also, you must not perform any thread transition between loading the >> weak klass from the MDO until you call holder_phantom, otherwise it >> might have been unloaded before you get to call holder_phantom(). Is >> this guaranteed somehow in this scenario? I looked through all >> callsites and could not find where the Klass pointer is read in the >> MDO and subsequently passed into the CompilerToVM::get_jvmci_type API, >> and therefore I do not know if this is guaranteed. > > The obviously problematic path is at > http://hg.openjdk.java.net/jdk/jdk/file/50469fb301c4/src/hotspot/share/jvmci/jvmciCompilerToVM.cpp#l334 > when either base_address is a Klass* or base_object is NULL which is > where we are reading from non-heap memory. There are other paths which > are reading Klasses through more standard APIs from the ConstantPool for > instance. > > There isn't an easy way to ensure no safepoint occurs in between so > maybe we require the caller of get_jvmci_type to pass in the > phantom_holder() as a way of forcing the caller to call holder_phantom() > at the appropriate places? Or is it the case that getResolvedType is > the only place where special effort is required? All the other paths > are fairly normal HotSpot code but the place that uses > klass->implementor() for instance seems like it could be considered to > be weak by G1.
> > http://hg.openjdk.java.net/jdk/jdk/file/50469fb301c4/src/hotspot/share/jvmci/jvmciCompilerToVM.cpp#l368 > > > The lack of a properly working KlassHandle seems like an oversight in > the API to me. > > tom > >> >> Thanks, >> /Erik >> >> On 2018-06-08 22:46, Tom Rodriguez wrote: >>> The JVMCI API may read Klass* and java.lang.Class instances from >>> locations which G1 would consider to be weakly referenced. This can >>> result in HotSpotResolvedObjectTypeImpl instances with references to >>> Classes that have been unloaded. In this crash, JVMCI was >>> reading a Klass* from the profile in an MDO and building a wrapper >>> around it. The MDO reference is weak and was the only remaining >>> reference to the type so it could be dropped resulting in an eventual >>> crash. >>> >>> I've added an explicit G1 enqueue before we call out to create the >>> wrapper object but is there a more recommended way of doing this? >>> Dean had pointed out the oddly named InstanceKlass::holder_phantom >>> which is used by the CI. Should I be using that? The G1 barrier is >>> only really needed when reading from non-Java heap memory but since the >>> get_jvmci_type method is the main entry point for this logic it's >>> safest to always perform it in this path. >>> >>> https://bugs.openjdk.java.net/browse/JDK-8198909 >>> http://cr.openjdk.java.net/~never/8198909/webrev >> From per.liden at oracle.com Wed Jun 20 09:27:20 2018 From: per.liden at oracle.com (Per Liden) Date: Wed, 20 Jun 2018 11:27:20 +0200 Subject: RFR: 8205405: ZGC: Decouple JFR type registration Message-ID: <1aac0820-6ff0-f695-dee9-40f72d8a9d1d@oracle.com> When JDK-8205053 is completed, it becomes possible to register subsystem-specific JFR types from outside of JFR. The code that registers ZGC's two types is currently embedded into JFR itself. This code can now be moved into ZGC itself.
Bug: https://bugs.openjdk.java.net/browse/JDK-8205405 Webrev: http://cr.openjdk.java.net/~pliden/8205405/webrev.0 Testing: Manually generated and verified a test recording /Per From markus.gronlund at oracle.com Wed Jun 20 09:56:49 2018 From: markus.gronlund at oracle.com (Markus Gronlund) Date: Wed, 20 Jun 2018 02:56:49 -0700 (PDT) Subject: RFR: 8205405: ZGC: Decouple JFR type registration In-Reply-To: <1aac0820-6ff0-f695-dee9-40f72d8a9d1d@oracle.com> References: <1aac0820-6ff0-f695-dee9-40f72d8a9d1d@oracle.com> Message-ID: Hi Per, Looks good. Thanks Markus -----Original Message----- From: Per Liden Sent: den 20 juni 2018 11:27 To: hotspot-gc-dev at openjdk.java.net; hotspot-jfr-dev at openjdk.java.net Subject: RFR: 8205405: ZGC: Decouple JFR type registration When JDK-8205053 is completed, it becomes possible to register subsystem-specific JFR types from outside of JFR. The code that registers ZGC's two types is currently embedded into JFR itself. This code can now be moved into ZGC itself. Bug: https://bugs.openjdk.java.net/browse/JDK-8205405 Webrev: http://cr.openjdk.java.net/~pliden/8205405/webrev.0 Testing: Manually generated and verified a test recording /Per From per.liden at oracle.com Wed Jun 20 10:33:42 2018 From: per.liden at oracle.com (Per Liden) Date: Wed, 20 Jun 2018 12:33:42 +0200 Subject: RFR: 8205405: ZGC: Decouple JFR type registration In-Reply-To: References: <1aac0820-6ff0-f695-dee9-40f72d8a9d1d@oracle.com> Message-ID: Thanks Markus! /Per On 06/20/2018 11:56 AM, Markus Gronlund wrote: > Hi Per, > > Looks good. > > Thanks > Markus > > -----Original Message----- > From: Per Liden > Sent: den 20 juni 2018 11:27 > To: hotspot-gc-dev at openjdk.java.net; hotspot-jfr-dev at openjdk.java.net > Subject: RFR: 8205405: ZGC: Decouple JFR type registration > > When JDK-8205053 is completed, it becomes possible to register subsystem-specific JFR types from outside of JFR. The code that registers ZGC's two types is currently embedded into JFR itself. 
This code can now be moved into ZGC itself. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8205405 > Webrev: http://cr.openjdk.java.net/~pliden/8205405/webrev.0 > > Testing: Manually generated and verified a test recording > > /Per > From thomas.schatzl at oracle.com Wed Jun 20 11:27:19 2018 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Wed, 20 Jun 2018 13:27:19 +0200 Subject: RFR (S): 8204082: Make names of Young GCs more uniform in logs In-Reply-To: References: <360d423eb720e93a1921e66dae71ee2e794439ec.camel@oracle.com> <01cf559a1c1762d66d26c1ac9ad7e473bb8cb1ef.camel@oracle.com> Message-ID: Hi, some changes need to be done for the gtests too, so here are new webrevs: http://cr.openjdk.java.net/~tschatzl/8204082/webrev.3_to_4/ http://cr.openjdk.java.net/~tschatzl/8204082/webrev.4/ Thanks, Thomas On Fri, 2018-06-15 at 15:47 +0200, Thomas Schatzl wrote: > Hi, > > during some more discussion about the messages there was the > concern > that the "Concurrent End" message is not really an indication of the > concurrent marking end (this happens asynchronously), also that pause > may not occur (if no mixed gc starts after marking), so we came up > with > "Prepare Mixed" for it. > "Concurrent End" would probably fit better for the current "Cleanup" > pause. > > I.e. > > Pause Young (Normal) ... > Pause Young (Concurrent Start) ... > Pause Young (Prepare Mixed) ... > Pause Young (Mixed) ... > > > http://cr.openjdk.java.net/~tschatzl/8204082/webrev.2_to_3 (diff) > http://cr.openjdk.java.net/~tschatzl/8204082/webrev.3 (full) > > Thanks, > Thomas > > On Thu, 2018-06-14 at 15:02 +0200, Thomas Schatzl wrote: > > Hi all, > > > > another round of reviews after some more internal remarks :P > > > > I also changed the title of the CR. > > > > The set of "final" tags would be: > > > > Pause Young (Normal) ... > > Pause Young (Concurrent Start) ... > > Pause Young (Concurrent End) ... > > Pause Young (Mixed) ...
> > > > I also adapted the strings in the GCVerifyType functionality. > > > > http://cr.openjdk.java.net/~tschatzl/8204082/webrev.1_to_2 (diff) > > http://cr.openjdk.java.net/~tschatzl/8204082/webrev.2 (full) > > > > Testing: > > running through all gc tests locally > > > > Thanks, > > Thomas > > > > From HORIE at jp.ibm.com Wed Jun 20 11:41:44 2018 From: HORIE at jp.ibm.com (Michihiro Horie) Date: Wed, 20 Jun 2018 07:41:44 -0400 Subject: RFR(M): 8204524: Unnecessary memory barriers in G1ParScanThreadState::copy_to_survivor_space In-Reply-To: References: <82571A7F-6113-4C0F-A4DB-C18C2E48D8E2@oracle.com> Message-ID: Hi Martin, Kim, all, > I assume webrev.00 was used for reviews and tests as Thomas has emphasized that evacuation failures may be performance critical, too. It looks correct to me, too. > > I can sponsor the change if needed. Please let me know when I can consider it reviewed. Thanks a lot for sponsoring the change, Martin. Yes, webrev.00 is the one used for the review: http://cr.openjdk.java.net/~mhorie/8204524/webrev.00/ I think Kim would review the change because Derek concluded there is a moderate performance gain in SPECjbb on AArch64. Kim, would you agree with this change? > I was going to say that this looks good to me. > > But then I saw Derek White's reply about an unexpected performance regression. > I'd like to wait until he reports back. Best regards, -- Michihiro, IBM Research - Tokyo From: "Doerr, Martin" To: Michihiro Horie , "White, Derek" Cc: "david.holmes at oracle.com" , Gustavo Bueno Romero , "hotspot-gc-dev at openjdk.java.net" , Kim Barrett Date: 2018/06/19 05:28 Subject: RE: RFR(M): 8204524: Unnecessary memory barriers in G1ParScanThreadState::copy_to_survivor_space Hi, I assume webrev.00 was used for reviews and tests as Thomas has emphasized that evacuation failures may be performance critical, too. It looks correct to me, too. I can sponsor the change if needed. Please let me know when I can consider it reviewed.
Best regards, Martin From: Michihiro Horie [mailto:HORIE at jp.ibm.com] Sent: Dienstag, 19. Juni 2018 00:19 To: White, Derek Cc: david.holmes at oracle.com; Gustavo Bueno Romero ; hotspot-gc-dev at openjdk.java.net; Kim Barrett ; Doerr, Martin Subject: RE: RFR(M): 8204524: Unnecessary memory barriers in G1ParScanThreadState::copy_to_survivor_space Hi Derek, Thank you for the investigation on AArch64. It would be very helpful and I am glad to know that you had a moderate performance gain with this G1 change. Best regards, -- Michihiro, IBM Research - Tokyo From: "White, Derek" To: Michihiro Horie Cc: "david.holmes at oracle.com" , Gustavo Bueno Romero , "hotspot-gc-dev at openjdk.java.net" < hotspot-gc-dev at openjdk.java.net>, Kim Barrett , "Doerr, Martin" Date: 2018/06/18 17:57 Subject: RE: RFR(M): 8204524: Unnecessary memory barriers in G1ParScanThreadState::copy_to_survivor_space Hi Michihiro, Further testing is showing a moderate performance gain with your G1 patch in SPECjbb on AArch64. We haven't completely pinned the earlier regression on user error on our part, but it's looking likely. The new code on AArch64 is looking as relaxed as can be. Thank you for working on this, and letting us investigate! Derek From: Michihiro Horie [mailto:HORIE at jp.ibm.com] Sent: Friday, June 15, 2018 11:44 PM To: White, Derek Cc: david.holmes at oracle.com; Gustavo Bueno Romero ; hotspot-gc-dev at openjdk.java.net; Kim Barrett ; Doerr, Martin Subject: RE: RFR(M): 8204524: Unnecessary memory barriers in G1ParScanThreadState::copy_to_survivor_space Hi Derek, Thank you for sharing your status of still having inconsistent results with the patch. I would wait for your updates.
Thanks again, Best regards, -- Michihiro, IBM Research - Tokyo From: "White, Derek" To: Kim Barrett , Michihiro Horie Cc: Gustavo Bueno Romero , "david.holmes at oracle.com" < david.holmes at oracle.com>, "hotspot-gc-dev at openjdk.java.net" < hotspot-gc-dev at openjdk.java.net> Date: 2018/06/15 17:55 Subject: RE: RFR(M): 8204524: Unnecessary memory barriers in G1ParScanThreadState::copy_to_survivor_space Hi Michihiro, Status update: My colleague and I are getting inconsistent results with this patch: -23% to +7% on SPECjbb, so we're trying to verify what's going on. On an unrelated note, the aarch64 port relies on GCC's __atomic_compare_exchange to implement the relaxed case of Atomic::PlatformCmpxchg, and gcc 6 and earlier sometimes do a poor job on it. Not enough to account for the numbers we saw though. I hope to have an answer by Monday. - Derek > -----Original Message----- > From: hotspot-gc-dev [mailto:hotspot-gc-dev-bounces at openjdk.java.net] > On Behalf Of Kim Barrett > Sent: Wednesday, June 13, 2018 5:16 PM > To: Michihiro Horie > Cc: Gustavo Bueno Romero ; > david.holmes at oracle.com; hotspot-gc-dev at openjdk.java.net > Subject: Re: RFR(M): 8204524: Unnecessary memory barriers in > G1ParScanThreadState::copy_to_survivor_space > > External Email > > > On Jun 7, 2018, at 2:01 AM, Michihiro Horie wrote: > > > > Dear all, > > > > Would you please review the following change? > > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8204524 > > Webrev: http://cr.openjdk.java.net/~mhorie/8204524/webrev.00 > > I was going to say that this looks good to me. > > But then I saw Derek White's reply about an unexpected performance > regression. > I'd like to wait until he reports back. -------------- next part -------------- An HTML attachment was scrubbed...
URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From thomas.schatzl at oracle.com Wed Jun 20 12:15:03 2018 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Wed, 20 Jun 2018 14:15:03 +0200 Subject: RFR(M): 8204524: Unnecessary memory barriers in G1ParScanThreadState::copy_to_survivor_space In-Reply-To: References: <82571A7F-6113-4C0F-A4DB-C18C2E48D8E2@oracle.com> Message-ID: Hi, On Wed, 2018-06-20 at 07:41 -0400, Michihiro Horie wrote: > Hi Martin, Kim, all, > > > I assume webrev.00 was used for reviews and tests as Thomas has > > emphasized that evacuation failures may be performance critical, > > too. It looks correct to me, too. > > > > I can sponsor the change if needed. Please let me know when I can > > consider it reviewed. > > Thanks a lot for sponsoring the change, Martin. Yes, webrev.00 is the > one used for the review: > http://cr.openjdk.java.net/~mhorie/8204524/webrev.00/ > > I think Kim would review the change because Derek concluded there is > a moderate performance gain in SPECjbb on AArch64. Kim, would you > agree with this change? Kim is on vacation, but from the context of his review he looked at the 00 change, not the later, reduced 01 one (http://mail.openjdk.java.net/ pipermail/hotspot-gc-dev/2018-June/022358.html). So I guess it can be considered as reviewed from our POV. Thanks, Thomas From per.liden at oracle.com Wed Jun 20 16:32:56 2018 From: per.liden at oracle.com (Per Liden) Date: Wed, 20 Jun 2018 18:32:56 +0200 Subject: RFR: 8204088: Dynamic Max Memory Limit In-Reply-To: References: Message-ID: Hi Rodrigo, A general comment, which applies to both JDK-8204088 and JDK-8204089, is that I would like to see this code move into G1 itself. We should strive to avoid introducing global concepts and options that are not applicable to other GCs. 
I can't see anything obvious in either JDK-8204088 or JDK-8204089 that would stop you from doing that. cheers, Per On 06/19/2018 08:46 PM, Rodrigo Bruno wrote: > Hi all, > > here is the first version of our contribution for draft JEP-8204088. > > More details at the CR. > > CR: > https://bugs.openjdk.java.net/browse/JDK-8204088 > > Webrev: > http://cr.openjdk.java.net/~tschatzl/jelastic/cmx/ > > > Thanks, > Rodrigo From stefan.karlsson at oracle.com Thu Jun 21 09:42:49 2018 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Thu, 21 Jun 2018 11:42:49 +0200 Subject: RFR: 8205405: ZGC: Decouple JFR type registration In-Reply-To: <1aac0820-6ff0-f695-dee9-40f72d8a9d1d@oracle.com> References: <1aac0820-6ff0-f695-dee9-40f72d8a9d1d@oracle.com> Message-ID: <50076a3c-255e-26eb-3b6a-97b04015fee5@oracle.com> Looks good. StefanK On 2018-06-20 11:27, Per Liden wrote: > When JDK-8205053 is completed, it becomes possible to register > subsystem-specific JFR types from outside of JFR. The code that > registers ZGC's two types is currently embedded into JFR itself. This > code can now be moved into ZGC itself. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8205405 > Webrev: http://cr.openjdk.java.net/~pliden/8205405/webrev.0 > > Testing: Manually generated and verified a test recording > > /Per From per.liden at oracle.com Thu Jun 21 09:47:51 2018 From: per.liden at oracle.com (Per Liden) Date: Thu, 21 Jun 2018 11:47:51 +0200 Subject: RFR: 8205405: ZGC: Decouple JFR type registration In-Reply-To: <50076a3c-255e-26eb-3b6a-97b04015fee5@oracle.com> References: <1aac0820-6ff0-f695-dee9-40f72d8a9d1d@oracle.com> <50076a3c-255e-26eb-3b6a-97b04015fee5@oracle.com> Message-ID: <7f02e67a-aa1c-9964-a1c1-bbc35e86f437@oracle.com> Thanks Stefan! /Per On 06/21/2018 11:42 AM, Stefan Karlsson wrote: > Looks good. > > StefanK > > On 2018-06-20 11:27, Per Liden wrote: >> When JDK-8205053 is completed, it becomes possible to register >> subsystem-specific JFR types from outside of JFR. 
The code that >> registers ZGC's two types is currently embedded into JFR itself. This >> code can now be moved into ZGC itself. >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8205405 >> Webrev: http://cr.openjdk.java.net/~pliden/8205405/webrev.0 >> >> Testing: Manually generated and verified a test recording >> >> /Per From erik.osterlund at oracle.com Thu Jun 21 13:26:16 2018 From: erik.osterlund at oracle.com (=?UTF-8?Q?Erik_=c3=96sterlund?=) Date: Thu, 21 Jun 2018 15:26:16 +0200 Subject: RFR (S): 8198909: [Graal] compiler/codecache/stress/UnexpectedDeoptimizationTest.java crashed with SIGSEGV In-Reply-To: <7809c59d-5f2f-30aa-44e6-7cf4db03df05@oracle.com> References: <64c722cd-14e3-e0db-6a16-08ca69605205@oracle.com> <5B1E3C35.3090506@oracle.com> <7809c59d-5f2f-30aa-44e6-7cf4db03df05@oracle.com> Message-ID: <5B2BA778.70508@oracle.com> Hi, In src/hotspot/share/jvmci/jvmciCompilerToVM.cpp Please remove #include "gc/g1/g1BarrierSet.hpp" In src/hotspot/share/jvmci/jvmciCompilerToVM.hpp: I would prefer if KlassRefHandle was called JVMCIKlassHandle, because it is very specific to JVMCI. On that note it is unfortunate that we can not simply reuse ciInstanceKlass, which is the klass handle used by the other compilers. Klass* _value; should be called _klass Handle _phantom; should be called _holder Klass* obj() should be called klass() Otherwise, this looks good, and I don't need another webrev for this. Thanks, /Erik On 2018-06-19 23:34, Tom Rodriguez wrote: > I've generated a webrev with a new KlassRefHandle protecting > questionable uses in JVMCI. > http://cr.openjdk.java.net/~never/8198909.1/webrev > > One outstanding question is whether ObjArrayKlass also needs a working > holder_phantom method. It would seem so to me but maybe there's some > reason not? 
> > tom > > Tom Rodriguez wrote on 6/11/18 10:04 AM: >> >> >> Erik Österlund wrote on 6/11/18 2:09 AM: >>> Hi Tom, >>> >>> Could you please call InstanceKlass::holder_phantom() instead to >>> keep the class alive? That is the more general mechanism that is >>> also used by ciInstanceKlass. We don't want to use explicit G1 >>> enqueue calls anymore. >> >> Ok. I guess the same fix in JDK8 will have to use the explicit >> enqueue though, or is it not required in JDK8? >> >>> Also, you must not perform any thread transition between loading the >>> weak klass from the MDO until you call holder_phantom, otherwise it >>> might have been unloaded before you get to call holder_phantom(). Is >>> this guaranteed somehow in this scenario? I looked through all >>> callsites and could not find where the Klass pointer is read in the >>> MDO and subsequently passed into the >>> CompilerToVM::get_jvmci_type >>> API, and therefore I do not know if >>> this is guaranteed. >> >> The obviously problematic path is at >> http://hg.openjdk.java.net/jdk/jdk/file/50469fb301c4/src/hotspot/share/jvmci/jvmciCompilerToVM.cpp#l334 >> when either base_address is a Klass* or base_object is NULL, which is >> where we are reading from non-heap memory. There are other paths >> which are reading Klasses through more standard APIs, from the >> ConstantPool for instance. 
>> >> http://hg.openjdk.java.net/jdk/jdk/file/50469fb301c4/src/hotspot/share/jvmci/jvmciCompilerToVM.cpp#l368 >> >> >> The lack of a properly working KlassHandle seems like an oversight in >> the API to me. >> >> tom >> >>> >>> Thanks, >>> /Erik >>> >>> On 2018-06-08 22:46, Tom Rodriguez wrote: >>>> The JVMCI API may read Klass* and java.lang.Class instances from >>>> locations which G1 would consider to be weakly referenced. This >>>> can result in HotSpotResolvedObjectTypeImpl instances with >>>> references to Classes that have been unloaded. In the this crash, >>>> JVMCI was reading a Klass* from the profile in an MDO and building >>>> a wrapper around it. The MDO reference is weak and was the only >>>> remaining reference to the type so it could be dropped resulting in >>>> an eventual crash. >>>> >>>> I've added an explicit G1 enqueue before we call out to create the >>>> wrapper object but is there a more recommended way of doing this? >>>> Dean had pointed out the oddly named InstanceKlass::holder_phantom >>>> which is used by the CI. Should I be using that? The G1 barrier is >>>> only really need when reading from non-Java heap memory but since >>>> the get_jvmci_type method is the main entry point for this logic it >>>> safest to always perform it in this path. 
>>>> >>>> https://bugs.openjdk.java.net/browse/JDK-8198909 >>>> http://cr.openjdk.java.net/~never/8198909/webrev >>> From tom.rodriguez at oracle.com Thu Jun 21 20:46:53 2018 From: tom.rodriguez at oracle.com (Tom Rodriguez) Date: Thu, 21 Jun 2018 13:46:53 -0700 Subject: RFR (S): 8198909: [Graal] compiler/codecache/stress/UnexpectedDeoptimizationTest.java crashed with SIGSEGV In-Reply-To: <5B2BA778.70508@oracle.com> References: <64c722cd-14e3-e0db-6a16-08ca69605205@oracle.com> <5B1E3C35.3090506@oracle.com> <7809c59d-5f2f-30aa-44e6-7cf4db03df05@oracle.com> <5B2BA778.70508@oracle.com> Message-ID: Erik ?sterlund wrote on 6/21/18 6:26 AM: > Hi, > > > In src/hotspot/share/jvmci/jvmciCompilerToVM.cpp > > Please remove > #include "gc/g1/g1BarrierSet.hpp" > > In src/hotspot/share/jvmci/jvmciCompilerToVM.hpp: > > I would prefer if KlassRefHandle was called JVMCIKlassHandle, because it > is very specific to JVMCI. On that note it is unfortunate that we can > not simply reuse ciInstanceKlass, which is the klass handle used by the > other compilers. > > Klass* _value; > should be called _klass > > Handle _phantom; > should be called _holder > > Klass* obj() > should be called klass() > > Otherwise, this looks good, and I don't need another webrev for this. I've made all the requested edits. Additionally I never really got an answer to my question about handling of ObjArrayKlass but concluded that it must be handled, so I've moved phantom_holder from InstanceKlass to Klass so it can be used in a uniform way. I guess the CI handles it implicitly under the assumption that klass->class_loader_data() == ObjArrayKlass::cast(klass)->bottom_klass()->class_loader_data() which should presumably be true. The new webrev is http://cr.openjdk.java.net/~never/8198909.2/webrev. I'll consider the movement of phantom_holder to be acceptable unless I hear an objection soon. 
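[Editor's note: the JVMCIKlassHandle idea discussed above can be sketched as follows. This is illustrative stand-in code, not HotSpot's implementation: a raw Klass* read from weakly-referenced storage (e.g. an MDO profile) is paired with a strong reference to its class-loader holder, obtained via holder_phantom(), so the class cannot be unloaded while the handle is live. The Oop/Klass types here are stubs; in HotSpot, holder_phantom() also applies the barrier needed to resurrect a potentially-dead weak reference.]

```cpp
#include <memory>
#include <string>

struct Oop { std::string name; };      // stands in for the holder oop

struct Klass {
  std::shared_ptr<Oop> holder;         // strong reference keeps the class alive
  // Stub: the real holder_phantom() performs a phantom-strength load barrier.
  std::shared_ptr<Oop> holder_phantom() const { return holder; }
};

// RAII wrapper: holding this handle pins the class's holder, so the
// wrapped Klass* stays valid even if all other references are weak.
class JVMCIKlassHandle {
  Klass* _klass;
  std::shared_ptr<Oop> _holder;        // the strong "Handle"
public:
  explicit JVMCIKlassHandle(Klass* k)
    : _klass(k), _holder(k != nullptr ? k->holder_phantom() : nullptr) {}
  Klass* klass() const { return _klass; }
  bool is_null() const { return _klass == nullptr; }
};
```

The crucial constraint from the review still applies to the sketch: the handle must be constructed before any point where the class could be unloaded (in HotSpot terms, with no safepoint between reading the Klass* and calling holder_phantom()).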
tom > > Thanks, > /Erik > > On 2018-06-19 23:34, Tom Rodriguez wrote: >> I've generated a webrev with a new KlassRefHandle protecting >> questionable uses in JVMCI. >> http://cr.openjdk.java.net/~never/8198909.1/webrev >> >> One outstanding question is whether ObjArrayKlass also needs a working >> holder_phantom method. It would seem so to me but maybe there's some >> reason not? >> >> tom >> >> Tom Rodriguez wrote on 6/11/18 10:04 AM: >>> >>> >>> Erik ?sterlund wrote on 6/11/18 2:09 AM: >>>> Hi Tom, >>>> >>>> Could you please call InstanceKlass::holder_phantom() instead to >>>> keep the class alive? That is the more general mechanism that is >>>> also used by ciInstanceKlass. We don't want to use explicit G1 >>>> enqueue calls anymore. >>> >>> Ok. I guess the same fix in JDK8 will have the use the explicit >>> enqueue though or is it not required in JDK8? >>> >>>> Also, you must not perform any thread transition between loading the >>>> weak klass from the MDO until you call holder_phantom, otherwise it >>>> might have been unloaded before you get to call holder_phantom(). Is >>>> this guaranteed somehow in this scenario? I looked through all >>>> callsites and could not find where the Klass pointer is read in the >>>> MDO and subsequently passed into the CompilerToVM::get_jvmci_type >>>> API, and therefore I do not know if this is guaranteed. >>> >>> The obviously problematic path is at >>> http://hg.openjdk.java.net/jdk/jdk/file/50469fb301c4/src/hotspot/share/jvmci/jvmciCompilerToVM.cpp#l334 >>> when either base_address is a Klass* or base_object is NULL which is >>> where we are reading from non-heap memory. There are other paths >>> which are reading Klasses through more standard APIs from the >>> ConstantPool for instance. 
>>> >>> There isn't an easy way to ensure no safepoint occurs in between so >>> maybe we require the caller of get_jvmci_type to pass in the >>> phantom_holder() as a way of forcing the caller to call >>> holder_phantom() at the appropriate places? Or is it the case that >>> getResolvedType is the only place where special effort is required? >>> All the other paths are fairly normal HotSpot code but though place >>> that uses klass->implementor() for instance seems like it could be >>> considered to be weak by G1. >>> >>> http://hg.openjdk.java.net/jdk/jdk/file/50469fb301c4/src/hotspot/share/jvmci/jvmciCompilerToVM.cpp#l368 >>> >>> >>> The lack of a properly working KlassHandle seems like an oversight in >>> the API to me. >>> >>> tom >>> >>>> >>>> Thanks, >>>> /Erik >>>> >>>> On 2018-06-08 22:46, Tom Rodriguez wrote: >>>>> The JVMCI API may read Klass* and java.lang.Class instances from >>>>> locations which G1 would consider to be weakly referenced. This >>>>> can result in HotSpotResolvedObjectTypeImpl instances with >>>>> references to Classes that have been unloaded. In the this crash, >>>>> JVMCI was reading a Klass* from the profile in an MDO and building >>>>> a wrapper around it. The MDO reference is weak and was the only >>>>> remaining reference to the type so it could be dropped resulting in >>>>> an eventual crash. >>>>> >>>>> I've added an explicit G1 enqueue before we call out to create the >>>>> wrapper object but is there a more recommended way of doing this? >>>>> Dean had pointed out the oddly named InstanceKlass::holder_phantom >>>>> which is used by the CI. Should I be using that? The G1 barrier is >>>>> only really need when reading from non-Java heap memory but since >>>>> the get_jvmci_type method is the main entry point for this logic it >>>>> safest to always perform it in this path. 
>>>>> >>>>>> https://bugs.openjdk.java.net/browse/JDK-8198909 >>>>>> http://cr.openjdk.java.net/~never/8198909/webrev >>>> > From erik.osterlund at oracle.com Thu Jun 21 22:03:38 2018 From: erik.osterlund at oracle.com (=?UTF-8?Q?Erik_=c3=96sterlund?=) Date: Fri, 22 Jun 2018 00:03:38 +0200 Subject: RFR (S): 8198909: [Graal] compiler/codecache/stress/UnexpectedDeoptimizationTest.java crashed with SIGSEGV In-Reply-To: References: <64c722cd-14e3-e0db-6a16-08ca69605205@oracle.com> <5B1E3C35.3090506@oracle.com> <7809c59d-5f2f-30aa-44e6-7cf4db03df05@oracle.com> <5B2BA778.70508@oracle.com> Message-ID: <881e3c84-5835-e6ba-b1aa-1976c5054112@oracle.com> Hi Tom, I approve of having holder_phantom() on Klass. I tried to introduce it there a long time ago but got some push back at the time. But I think it really ought to be on Klass. Thanks, /Erik On 2018-06-21 22:46, Tom Rodriguez wrote: > > > Erik Österlund wrote on 6/21/18 6:26 AM: >> Hi, >> >> >> In src/hotspot/share/jvmci/jvmciCompilerToVM.cpp >> >> Please remove >> #include "gc/g1/g1BarrierSet.hpp" >> >> In src/hotspot/share/jvmci/jvmciCompilerToVM.hpp: >> >> I would prefer if KlassRefHandle was called JVMCIKlassHandle, because >> it is very specific to JVMCI. On that note it is unfortunate that we >> can not simply reuse ciInstanceKlass, which is the klass handle used >> by the other compilers. >> >> Klass* _value; >> should be called _klass >> >> Handle _phantom; >> should be called _holder >> >> Klass* obj() >> should be called klass() >> >> Otherwise, this looks good, and I don't need another webrev for this. > > I've made all the requested edits. Additionally I never really got an > answer to my question about handling of ObjArrayKlass but concluded > that it must be handled, so I've moved phantom_holder from > InstanceKlass to Klass so it can be used in a uniform way. 
I guess > the CI handles it implicitly under the assumption that > klass->class_loader_data() == > ObjArrayKlass::cast(klass)->bottom_klass()->class_loader_data() which > should presumably be true.? The new webrev is > http://cr.openjdk.java.net/~never/8198909.2/webrev.? I'll consider the > movement of phantom_holder to be acceptable unless I hear an objection > soon. > > tom > >> >> Thanks, >> /Erik >> >> On 2018-06-19 23:34, Tom Rodriguez wrote: >>> I've generated a webrev with a new KlassRefHandle protecting >>> questionable uses in JVMCI. >>> http://cr.openjdk.java.net/~never/8198909.1/webrev >>> >>> One outstanding question is whether ObjArrayKlass also needs a >>> working holder_phantom method.? It would seem so to me but maybe >>> there's some reason not? >>> >>> tom >>> >>> Tom Rodriguez wrote on 6/11/18 10:04 AM: >>>> >>>> >>>> Erik ?sterlund wrote on 6/11/18 2:09 AM: >>>>> Hi Tom, >>>>> >>>>> Could you please call InstanceKlass::holder_phantom() instead to >>>>> keep the class alive? That is the more general mechanism that is >>>>> also used by ciInstanceKlass. We don't want to use explicit G1 >>>>> enqueue calls anymore. >>>> >>>> Ok.? I guess the same fix in JDK8 will have the use the explicit >>>> enqueue though or is it not required in JDK8? >>>> >>>>> Also, you must not perform any thread transition between loading >>>>> the weak klass from the MDO until you call holder_phantom, >>>>> otherwise it might have been unloaded before you get to call >>>>> holder_phantom(). Is this guaranteed somehow in this scenario? I >>>>> looked through all callsites and could not find where the Klass >>>>> pointer is read in the MDO and subsequently passed into the >>>>> CompilerToVM::get_jvmci_type API, and therefore I do not know if >>>>> this is guaranteed. 
>>>> >>>> The obviously problematic path is at >>>> http://hg.openjdk.java.net/jdk/jdk/file/50469fb301c4/src/hotspot/share/jvmci/jvmciCompilerToVM.cpp#l334 >>>> when either base_address is a Klass* or base_object is NULL which >>>> is where we are reading from non-heap memory.? There are other >>>> paths which are reading Klasses through more standard APIs from the >>>> ConstantPool for instance. >>>> >>>> There isn't an easy way to ensure no safepoint occurs in between so >>>> maybe we require the caller of get_jvmci_type to pass in the >>>> phantom_holder() as a way of forcing the caller to call >>>> holder_phantom() at the appropriate places?? Or is it the case that >>>> getResolvedType is the only place where special effort is required? >>>> All the other paths are fairly normal HotSpot code but though place >>>> that uses klass->implementor() for instance seems like it could be >>>> considered to be weak by G1. >>>> >>>> http://hg.openjdk.java.net/jdk/jdk/file/50469fb301c4/src/hotspot/share/jvmci/jvmciCompilerToVM.cpp#l368 >>>> >>>> >>>> The lack of a properly working KlassHandle seems like an oversight >>>> in the API to me. >>>> >>>> tom >>>> >>>>> >>>>> Thanks, >>>>> /Erik >>>>> >>>>> On 2018-06-08 22:46, Tom Rodriguez wrote: >>>>>> The JVMCI API may read Klass* and java.lang.Class instances from >>>>>> locations which G1 would consider to be weakly referenced.? This >>>>>> can result in HotSpotResolvedObjectTypeImpl instances with >>>>>> references to Classes that have been unloaded.? In the this >>>>>> crash, JVMCI was reading a Klass* from the profile in an MDO and >>>>>> building a wrapper around it. The MDO reference is weak and was >>>>>> the only remaining reference to the type so it could be dropped >>>>>> resulting in an eventual crash. >>>>>> >>>>>> I've added an explicit G1 enqueue before we call out to create >>>>>> the wrapper object but is there a more recommended way of doing >>>>>> this? 
Dean had pointed out the oddly named >>>>>> InstanceKlass::holder_phantom which is used by the CI. Should I >>>>>> be using that?? The G1 barrier is only really need when reading >>>>>> from non-Java heap memory but since the get_jvmci_type method is >>>>>> the main entry point for this logic it safest to always perform >>>>>> it in this path. >>>>>> >>>>>> https://bugs.openjdk.java.net/browse/JDK-8198909 >>>>>> http://cr.openjdk.java.net/~never/8198909/webrev >>>>> >> From igor.veresov at oracle.com Fri Jun 22 04:37:45 2018 From: igor.veresov at oracle.com (Igor Veresov) Date: Thu, 21 Jun 2018 21:37:45 -0700 Subject: RFR (S): 8198909: [Graal] compiler/codecache/stress/UnexpectedDeoptimizationTest.java crashed with SIGSEGV In-Reply-To: References: <64c722cd-14e3-e0db-6a16-08ca69605205@oracle.com> <5B1E3C35.3090506@oracle.com> <7809c59d-5f2f-30aa-44e6-7cf4db03df05@oracle.com> <5B2BA778.70508@oracle.com> Message-ID: <53FDF0AD-330F-4363-9C8B-4DDFC1517888@oracle.com> Looks good to me. igor > On Jun 21, 2018, at 1:46 PM, Tom Rodriguez wrote: > > > > Erik ?sterlund wrote on 6/21/18 6:26 AM: >> Hi, >> In src/hotspot/share/jvmci/jvmciCompilerToVM.cpp >> Please remove >> #include "gc/g1/g1BarrierSet.hpp" >> In src/hotspot/share/jvmci/jvmciCompilerToVM.hpp: >> I would prefer if KlassRefHandle was called JVMCIKlassHandle, because it is very specific to JVMCI. On that note it is unfortunate that we can not simply reuse ciInstanceKlass, which is the klass handle used by the other compilers. >> Klass* _value; >> should be called _klass >> Handle _phantom; >> should be called _holder >> Klass* obj() >> should be called klass() >> Otherwise, this looks good, and I don't need another webrev for this. > > I've made all the requested edits. Additionally I never really got an answer to my question about handling of ObjArrayKlass but concluded that it must be handled, so I've moved phantom_holder from InstanceKlass to Klass so it can be used in a uniform way. 
I guess the CI handles it implicitly under the assumption that klass->class_loader_data() == ObjArrayKlass::cast(klass)->bottom_klass()->class_loader_data() which should presumably be true. The new webrev is http://cr.openjdk.java.net/~never/8198909.2/webrev. I'll consider the movement of phantom_holder to be acceptable unless I hear an objection soon. > > tom > >> Thanks, >> /Erik >> On 2018-06-19 23:34, Tom Rodriguez wrote: >>> I've generated a webrev with a new KlassRefHandle protecting questionable uses in JVMCI. http://cr.openjdk.java.net/~never/8198909.1/webrev >>> >>> One outstanding question is whether ObjArrayKlass also needs a working holder_phantom method. It would seem so to me but maybe there's some reason not? >>> >>> tom >>> >>> Tom Rodriguez wrote on 6/11/18 10:04 AM: >>>> >>>> >>>> Erik ?sterlund wrote on 6/11/18 2:09 AM: >>>>> Hi Tom, >>>>> >>>>> Could you please call InstanceKlass::holder_phantom() instead to keep the class alive? That is the more general mechanism that is also used by ciInstanceKlass. We don't want to use explicit G1 enqueue calls anymore. >>>> >>>> Ok. I guess the same fix in JDK8 will have the use the explicit enqueue though or is it not required in JDK8? >>>> >>>>> Also, you must not perform any thread transition between loading the weak klass from the MDO until you call holder_phantom, otherwise it might have been unloaded before you get to call holder_phantom(). Is this guaranteed somehow in this scenario? I looked through all callsites and could not find where the Klass pointer is read in the MDO and subsequently passed into the CompilerToVM::get_jvmci_type API, and therefore I do not know if this is guaranteed. >>>> >>>> The obviously problematic path is at http://hg.openjdk.java.net/jdk/jdk/file/50469fb301c4/src/hotspot/share/jvmci/jvmciCompilerToVM.cpp#l334 when either base_address is a Klass* or base_object is NULL which is where we are reading from non-heap memory. 
There are other paths which are reading Klasses through more standard APIs from the ConstantPool for instance. >>>> >>>> There isn't an easy way to ensure no safepoint occurs in between so maybe we require the caller of get_jvmci_type to pass in the phantom_holder() as a way of forcing the caller to call holder_phantom() at the appropriate places? Or is it the case that getResolvedType is the only place where special effort is required? All the other paths are fairly normal HotSpot code but though place that uses klass->implementor() for instance seems like it could be considered to be weak by G1. >>>> >>>> http://hg.openjdk.java.net/jdk/jdk/file/50469fb301c4/src/hotspot/share/jvmci/jvmciCompilerToVM.cpp#l368 >>>> >>>> The lack of a properly working KlassHandle seems like an oversight in the API to me. >>>> >>>> tom >>>> >>>>> >>>>> Thanks, >>>>> /Erik >>>>> >>>>> On 2018-06-08 22:46, Tom Rodriguez wrote: >>>>>> The JVMCI API may read Klass* and java.lang.Class instances from locations which G1 would consider to be weakly referenced. This can result in HotSpotResolvedObjectTypeImpl instances with references to Classes that have been unloaded. In the this crash, JVMCI was reading a Klass* from the profile in an MDO and building a wrapper around it. The MDO reference is weak and was the only remaining reference to the type so it could be dropped resulting in an eventual crash. >>>>>> >>>>>> I've added an explicit G1 enqueue before we call out to create the wrapper object but is there a more recommended way of doing this? Dean had pointed out the oddly named InstanceKlass::holder_phantom which is used by the CI. Should I be using that? The G1 barrier is only really need when reading from non-Java heap memory but since the get_jvmci_type method is the main entry point for this logic it safest to always perform it in this path. 
>>>>>> >>>>>> https://bugs.openjdk.java.net/browse/JDK-8198909 >>>>>> http://cr.openjdk.java.net/~never/8198909/webrev >>>>> From tom.rodriguez at oracle.com Fri Jun 22 05:19:33 2018 From: tom.rodriguez at oracle.com (Tom Rodriguez) Date: Thu, 21 Jun 2018 22:19:33 -0700 Subject: RFR (S): 8198909: [Graal] compiler/codecache/stress/UnexpectedDeoptimizationTest.java crashed with SIGSEGV In-Reply-To: <53FDF0AD-330F-4363-9C8B-4DDFC1517888@oracle.com> References: <64c722cd-14e3-e0db-6a16-08ca69605205@oracle.com> <5B1E3C35.3090506@oracle.com> <7809c59d-5f2f-30aa-44e6-7cf4db03df05@oracle.com> <5B2BA778.70508@oracle.com> <53FDF0AD-330F-4363-9C8B-4DDFC1517888@oracle.com> Message-ID: Thanks! tom Igor Veresov wrote on 6/21/18 9:37 PM: > Looks good to me. > > igor > >> On Jun 21, 2018, at 1:46 PM, Tom Rodriguez wrote: >> >> >> >> Erik ?sterlund wrote on 6/21/18 6:26 AM: >>> Hi, >>> In src/hotspot/share/jvmci/jvmciCompilerToVM.cpp >>> Please remove >>> #include "gc/g1/g1BarrierSet.hpp" >>> In src/hotspot/share/jvmci/jvmciCompilerToVM.hpp: >>> I would prefer if KlassRefHandle was called JVMCIKlassHandle, because it is very specific to JVMCI. On that note it is unfortunate that we can not simply reuse ciInstanceKlass, which is the klass handle used by the other compilers. >>> Klass* _value; >>> should be called _klass >>> Handle _phantom; >>> should be called _holder >>> Klass* obj() >>> should be called klass() >>> Otherwise, this looks good, and I don't need another webrev for this. >> >> I've made all the requested edits. Additionally I never really got an answer to my question about handling of ObjArrayKlass but concluded that it must be handled, so I've moved phantom_holder from InstanceKlass to Klass so it can be used in a uniform way. I guess the CI handles it implicitly under the assumption that klass->class_loader_data() == ObjArrayKlass::cast(klass)->bottom_klass()->class_loader_data() which should presumably be true. 
The new webrev is http://cr.openjdk.java.net/~never/8198909.2/webrev. I'll consider the movement of phantom_holder to be acceptable unless I hear an objection soon. >> >> tom >> >>> Thanks, >>> /Erik >>> On 2018-06-19 23:34, Tom Rodriguez wrote: >>>> I've generated a webrev with a new KlassRefHandle protecting questionable uses in JVMCI. http://cr.openjdk.java.net/~never/8198909.1/webrev >>>> >>>> One outstanding question is whether ObjArrayKlass also needs a working holder_phantom method. It would seem so to me but maybe there's some reason not? >>>> >>>> tom >>>> >>>> Tom Rodriguez wrote on 6/11/18 10:04 AM: >>>>> >>>>> >>>>> Erik ?sterlund wrote on 6/11/18 2:09 AM: >>>>>> Hi Tom, >>>>>> >>>>>> Could you please call InstanceKlass::holder_phantom() instead to keep the class alive? That is the more general mechanism that is also used by ciInstanceKlass. We don't want to use explicit G1 enqueue calls anymore. >>>>> >>>>> Ok. I guess the same fix in JDK8 will have the use the explicit enqueue though or is it not required in JDK8? >>>>> >>>>>> Also, you must not perform any thread transition between loading the weak klass from the MDO until you call holder_phantom, otherwise it might have been unloaded before you get to call holder_phantom(). Is this guaranteed somehow in this scenario? I looked through all callsites and could not find where the Klass pointer is read in the MDO and subsequently passed into the CompilerToVM::get_jvmci_type API, and therefore I do not know if this is guaranteed. >>>>> >>>>> The obviously problematic path is at http://hg.openjdk.java.net/jdk/jdk/file/50469fb301c4/src/hotspot/share/jvmci/jvmciCompilerToVM.cpp#l334 when either base_address is a Klass* or base_object is NULL which is where we are reading from non-heap memory. There are other paths which are reading Klasses through more standard APIs from the ConstantPool for instance. 
>>>>> >>>>> There isn't an easy way to ensure no safepoint occurs in between so maybe we require the caller of get_jvmci_type to pass in the phantom_holder() as a way of forcing the caller to call holder_phantom() at the appropriate places? Or is it the case that getResolvedType is the only place where special effort is required? All the other paths are fairly normal HotSpot code but though place that uses klass->implementor() for instance seems like it could be considered to be weak by G1. >>>>> >>>>> http://hg.openjdk.java.net/jdk/jdk/file/50469fb301c4/src/hotspot/share/jvmci/jvmciCompilerToVM.cpp#l368 >>>>> >>>>> The lack of a properly working KlassHandle seems like an oversight in the API to me. >>>>> >>>>> tom >>>>> >>>>>> >>>>>> Thanks, >>>>>> /Erik >>>>>> >>>>>> On 2018-06-08 22:46, Tom Rodriguez wrote: >>>>>>> The JVMCI API may read Klass* and java.lang.Class instances from locations which G1 would consider to be weakly referenced. This can result in HotSpotResolvedObjectTypeImpl instances with references to Classes that have been unloaded. In the this crash, JVMCI was reading a Klass* from the profile in an MDO and building a wrapper around it. The MDO reference is weak and was the only remaining reference to the type so it could be dropped resulting in an eventual crash. >>>>>>> >>>>>>> I've added an explicit G1 enqueue before we call out to create the wrapper object but is there a more recommended way of doing this? Dean had pointed out the oddly named InstanceKlass::holder_phantom which is used by the CI. Should I be using that? The G1 barrier is only really need when reading from non-Java heap memory but since the get_jvmci_type method is the main entry point for this logic it safest to always perform it in this path. 
>>>>>>> >>>>>>> https://bugs.openjdk.java.net/browse/JDK-8198909 >>>>>>> http://cr.openjdk.java.net/~never/8198909/webrev >>>>>> > From rbruno at gsd.inesc-id.pt Fri Jun 22 07:54:45 2018 From: rbruno at gsd.inesc-id.pt (Rodrigo Bruno) Date: Fri, 22 Jun 2018 09:54:45 +0200 Subject: RFR: 8204088: Dynamic Max Memory Limit In-Reply-To: References: Message-ID: Hi Per, yes, I think we can definitely do that. Thank you for your comment. cheers, Rodrigo 2018-06-20 18:32 GMT+02:00 Per Liden : > Hi Rodrigo, > > A general comment, which applies to both JDK-8204088 and JDK-8204089, is > that I would like to see this code move into G1 itself. We should strive to > avoid introducing global concepts and options that are not applicable to > other GCs. I can't see anything obvious in neither JDK-8204088 nor > JDK-8204089 that would stop you from doing that. > > cheers, > Per > > On 06/19/2018 08:46 PM, Rodrigo Bruno wrote: > >> Hi all, >> >> here is the first version of our contribution for draft JEP-8204088. >> >> More details at the CR. >> >> CR: >> https://bugs.openjdk.java.net/browse/JDK-8204088 < >> https://bugs.openjdk.java.net/browse/JDK-8204088> >> Webrev: >> http://cr.openjdk.java.net/~tschatzl/jelastic/cmx/ < >> http://cr.openjdk.java.net/~tschatzl/jelastic/cmx/> >> >> Thanks, >> Rodrigo >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From thomas.schatzl at oracle.com Fri Jun 22 12:57:07 2018 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Fri, 22 Jun 2018 14:57:07 +0200 Subject: RFR: 8204088: Dynamic Max Memory Limit In-Reply-To: References: Message-ID: <000ae6a51e5d6f71d36b8765ea64005e0d8187c9.camel@oracle.com> Hi, On Tue, 2018-06-19 at 20:46 +0200, Rodrigo Bruno wrote: > Hi all, > > here is the first version of our contribution for draft JEP-8204088. > > More details at the CR. 
> > CR: > https://bugs.openjdk.java.net/browse/JDK-8204088 > Webrev: > http://cr.openjdk.java.net/~tschatzl/jelastic/cmx/ > > Thanks, > Rodrigo this looks like even less change than I imagined :) Some thoughts on the changes: - the flag is called CurrentMaxCapacity and the corresponding getter in the code is called "max_current_capacity()". It would be nice to be uniform in where the "current" should be :) - please move the checks to prevent expansion into G1's actual heap memory manager, i.e. HeapRegionManager. - the change in G1CollectedHeap::humongous_obj_allocate() would be automatically subsumed by HeapRegionManager::find_contiguous_empty_or_unavailable() if it checked the amount of current regions + requested regions <= current max regions first. I am actually not sure what the change prevents - both capacity() and max_capacity() do not check CurrentMaxCapacity. Maybe this should be s/max_capacity()/max_current_capacity()/ ? - G1CollectedHeap::resize_if_necessary_after_full_collection() should probably just use G1CollectedHeap::max_current_capacity(). - the change in G1CollectedHeap::expand() will disappear because is_maximal_no_gc() will automatically adapt (need to fix the implementation of HeapRegionManager::is_available()). - the existing implementation of G1CollectedHeap::max_current_capacity() should probably just defer to the HeapRegionManager then. - the problem is that now there is an inconsistency with G1CollectedHeap::is_maximal_no_gc() and CurrentMaxHeapSize. I.e. is_maximal_no_gc() uses the number of available regions compared to the absolute maximum number of regions, and not the current maximum number of regions. - contrary to Per I am not sure this feature should be done as a collector specific change. It would also be a pity to have the flag have a "G1" prefix only to see other collectors pick this up fairly quickly; the usefulness of such an option beyond just G1 has already been shown: other VMs (e.g. 
J9) already have the same feature. This would just mean having two options that do exactly the same thing, and getting rid of manageable flags is not easy. - there is some issue with the java.lang.Runtime.maxMemory() API which may need adjustment: with the change, that method returns the current maximum amount of memory the JVM will use, not the absolute maximum. Applications which use this function may be confused by getting different values depending on when it's called, and the specification might actually need an update in that regard. Does somebody happen to know what J9 reports here if you change the current heap size? - I need to think more about potential issues with changing and using the flag (and dependent methods) without other synchronization. Consider changing this and different threads picking up different values of the current max heap size (in the VM; the Java application is another problem). Very conservatively, I would suggest using a safepoint for that. Also I think the entry point for setting manageable flags in the VM is the SetVMFlagDCmd, which may need to be adapted to allow more code to be run (more below). Otoh from what I saw the current uses of the CurrentMaxHeapSize flag and HeapRegionManager are already under the Heap_lock, but that needs more careful further review. - I am not sure how the current implementation satisfies this condition from the JEP: "If it is not possible to increase or decrease the amount of memory available to the application, the operation should fail. The user should be informed of the result of the operation." at this time. As far as I understand, it reports back that the VM could write the flag to the new value, but not that the memory management (GC) accepted that value. Options could be to have the flags data structure enhanced with a VM callback like the constraint function (or one could maybe somehow misuse the constraints function for that). 
I guess the runtime team (in the hotspot-runtime-dev mailing list) would be the people to ask how to best do that if they do not read this :) An alternative to the manageable flag would be to use a separate jcmd ("GC.set_current_max_heapsize") command, which could already do anything. - Also I think at the moment any collector would accept changing CurrentMaxHeapSize at runtime, happily obliging I guess. - in G1 at least heap sizes must be multiples of region size. This needs to be checked every time it changes (playing into the custom callback issues). (The existing flags will be automatically aligned in CollectorPolicy::initialize_flags()). I think overriding CollectorPolicy::initialize_flags() in G1CollectorPolicy can handle that. - please fix the indentation in the constraint function for CurrentMaxHeapSize. We use two-space indents and it's all over the place (particularly closing braces) there :) - the CurrentMaxHeapSize flag should move to the other *HeapSize flags, at least into gc_globals.hpp, not globals.hpp. (May need to move to g1_globals.hpp pending the discussion whether this should be a G1-specific flag or not). - there is no test of this functionality. There should be a test which programmatically tries to update the current heap size. This value could be checked (crudely) by extending WB_PrintHeapSizes() in whitebox.cpp and then checked like the tests in test/hotspot/jtreg/gc/arguments/Test*HeapSizeFlags.java do. That's all for my initial comments. 
Thanks, Thomas From thomas.schatzl at oracle.com Fri Jun 22 13:05:26 2018 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Fri, 22 Jun 2018 15:05:26 +0200 Subject: RFR: 8204088: Dynamic Max Memory Limit In-Reply-To: References: Message-ID: <76abc82f552ecfbcdc2eb6a2e11b10594457a7c7.camel@oracle.com> Hi, On Wed, 2018-06-20 at 18:32 +0200, Per Liden wrote: > Hi Rodrigo, > > A general comment, which applies to both JDK-8204088 and JDK-8204089, > is that I would like to see this code move into G1 itself. We should > strive to avoid introducing global concepts and options that are not > applicable to other GCs. I can't see anything obvious in neither JDK- > 8204088 nor JDK-8204089 that would stop you from doing that. only talking about the option changes for JDK-8204088 right now, but I think it would be useful for the CurrentMaxHeapSize flag to be GC-global - it is a known useful feature that is also already available in J9. It would be a waste of time to first implement it G1-specific (and potentially use a G1 prefix for it, e.g. G1CurrentMaxHeapSize) only to have it picked up by others later. I am mostly concerned about maintaining and then removing both G1CurrentMaxHeapSize and CurrentMaxHeapSize for some time, which I would like to avoid (given that other VMs already implemented it, and it does seem more generally useful than many other "generic" GC options there are). Thanks, Thomas From martin.doerr at sap.com Fri Jun 22 14:29:43 2018 From: martin.doerr at sap.com (Doerr, Martin) Date: Fri, 22 Jun 2018 14:29:43 +0000 Subject: RFR(M): 8204524: Unnecessary memory barriers in G1ParScanThreadState::copy_to_survivor_space In-Reply-To: References: <82571A7F-6113-4C0F-A4DB-C18C2E48D8E2@oracle.com> Message-ID: Hi Thomas, submission repo testing has passed and it was "Reviewed-by: kbarrett, mdoerr, drwhite, tschatzl" if I see it correctly. I'll push it next week if I hear no objections. 
Thanks, Martin -----Original Message----- From: Thomas Schatzl [mailto:thomas.schatzl at oracle.com] Sent: Wednesday, June 20, 2018 14:15 To: Michihiro Horie ; Doerr, Martin ; Kim Barrett Cc: Gustavo Bueno Romero ; david.holmes at oracle.com; hotspot-gc-dev at openjdk.java.net Subject: Re: RFR(M): 8204524: Unnecessary memory barriers in G1ParScanThreadState::copy_to_survivor_space Hi, On Wed, 2018-06-20 at 07:41 -0400, Michihiro Horie wrote: > Hi Martin, Kim, all, > > > I assume webrev.00 was used for reviews and tests as Thomas has > > emphasized that evacuation failures may be performance critical, > > too. It looks correct to me, too. > > > > I can sponsor the change if needed. Please let me know when I can > > consider it reviewed. > > Thanks a lot for sponsoring the change, Martin. Yes, webrev.00 is the > one used for the review: > http://cr.openjdk.java.net/~mhorie/8204524/webrev.00/ > > I think Kim would review the change because Derek concluded there is > a moderate performance gain in SPECjbb on AArch64. Kim, would you > agree with this change? Kim is on vacation, but from the context of his review he looked at the 00 change, not the later, reduced 01 one (http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2018-June/022358.html). So I guess it can be considered as reviewed from our POV. Thanks, Thomas From thomas.schatzl at oracle.com Fri Jun 22 14:30:48 2018 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Fri, 22 Jun 2018 16:30:48 +0200 Subject: RFR(M): 8204908: NVDIMM for POGC and G1GC - ReserveSpace.cpp changes are mostly eliminated/no collector specific code. In-Reply-To: References: Message-ID: <306156e383e97e21fb6007220e0e2534203a4361.camel@oracle.com> Hi, On Sat, 2018-06-16 at 21:00 +0200, Thomas Stüfe wrote: > Hi Vinay! > > this is the third thread you opened for this issue; it would be helpful > if you would not change subjects, because it splinters discussions on > the mailing list. I agree. 
It is very hard to follow who mentioned what suggestion already. It would also help to answer inline, as now it is very hard to see who answered what to which issue, but I think you repeated all points below. > --- > > Thank you for your perseverance and patience. > > But unfortunately, my concerns have not been alleviated a lot by the > current patch. I still think this stretches an already > partly-ill-defined interface further. > > About os::commit_memory(): as I wrote in my first mail: > > "So far, for the most part, the os::{reserve|commit}_memory APIs have > been agnostic to the underlying implementation. You pretty much tie > it to mmap() now. This adds implicit restrictions to the API we did > not have before (e.g. will not work if platform uses SysV shm APIs > to implement these APIs)." > > In your response > http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2018-June/022356.html > you explain why you do this. I understand your motives. I still > dislike this, for the reasons I gave before: it adds a generic API to > a set of generic APIs which IMHO breaks the implicit contract they > all have with each other. > > Your patch also blurs the difference between runtime "os::" layer > and GC. "os" is a general OS-wrapping API layer, and should not know > nor care about GC internals: > > + static inline int nvdimm_fd() { > + // ParallelOldGC adaptive sizing requires nvdimm fd. > + return _nvdimm_fd; > + } > + static inline address dram_heapbase() { > + return _dram_heap_base; > + } > + static inline address nvdimm_heapbase() { > + return _nvdimm_heap_base; > + } > + static inline uint nvdimm_regionlength() { > + return _nvdimm_region_length; > + } > > IMHO, currently the memory management is ill-prepared for your patch; > yes, one could shove it in, but at the expense of maintainability and > code clearness. This expense would have to be carried by all JVM > developers, regardless of whether they work on your hardware and benefit > from this feature. 
> > So I think this would work better with some preparatory refactoring > done in the VM. Red Hat and Oracle did similar efforts by refactoring > the GC interface before adding new GCs: see > https://bugs.openjdk.java.net/browse/JDK-8163329. > > Maybe we could think about how to do this. It certainly would be a > worthy goal. I generally agree with all these sentiments: much of the NVDIMM-specific parts should not be in OS nor in the base VirtualSpace class. Having looked a bit at the code (currently it is a bit hard to get spare time, this is the reason for the late answer) I also think that the existing VirtualSpace (and ReservedSpace) code is too cluttered. Part of this is that the existing change to have the entire heap backed by one file used the _special flag in ReservedSpace (which *probably* should be called something like _reservation_already_commits) for its purposes. One suggestion that I had when reading the code was that maybe it would be useful to have VirtualSpace subclasses that handle one particular way of committing memory within: e.g. one doing nothing (for cases when the reserved space is precommitted, e.g. SHM page based memory), one using a regular backing file (I agree that piggy-backing on _special in ReservedSpace was not a good idea; it would probably be very similar to the no-op one except for constructor/destructor), and another one handling nvdimm (because apparently compared to a regular backing file it needs special exit code handling) and one that roughly does what the current one does, i.e. allowing commit of "lower", "middle" and "high" part using different page sizes. Further, then let the user of a ReservedSpace (e.g. the collector) select what VirtualSpace(s) it uses (via e.g. a factory method that takes the ReservedSpace so that it can return the no-op VirtualSpace in case everything is already committed). 
In case of collectors, they can do more sophisticated selection of the type of VirtualSpace they need; and apart from G1 they already use separate VirtualSpaces per generation, so there should be minimal tweaking needed for them to support different VirtualSpaces. Looking at Parallel GC, it already owns two different PSVirtualSpaces that have a very similar interface to VirtualSpace; the PSVirtualSpace used may already be easily replaced by this "NVDIMMVirtualSpace" implementation holding all NVDIMM-specific information (at the moment I am not sure why there is a PSVirtualSpace actually; from a functionality POV it looks very similar to the existing VirtualSpace). G1 still needs to be changed to properly handle two VirtualSpaces; for G1 I suggest using a subclass of HeapRegionManager that simply uses two VirtualSpaces instead of one; the NVDIMMVirtualSpace where you internally in HeapRegionManager ignore commits and uncommits, and the existing one for the remainder. Then the actual changes to existing code would be limited to, depending on the command line flag, selecting one or the other VirtualSpace (Parallel) or HeapRegionManager (G1). I do not see that incorporating the noaccess prefix or the executable property of ReservedSpace gives too much complication: the former is to be ignored for the VirtualSpace and completely transparent to them, and the latter is just some flag to pass through. Forgive me if the above is completely outrageously stupid, but that would probably be what I would start with when standing before this problem. Thanks, Thomas From thomas.schatzl at oracle.com Fri Jun 22 14:44:12 2018 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Fri, 22 Jun 2018 16:44:12 +0200 Subject: RFR(M): 8204908: NVDIMM for POGC and G1GC - ReserveSpace.cpp changes are mostly eliminated/no collector specific code. 
In-Reply-To: References: Message-ID: <5df6d080894cfad5e6486a00f28b6ccfc5ca633f.camel@oracle.com> Hi Thomas, On Tue, 2018-06-19 at 13:40 +0200, Thomas Stüfe wrote: > Hi Vinay, > > On Mon, Jun 18, 2018 at 6:47 PM, Awasthi, Vinay K > wrote: > > Hi Thomas, > > > > Os::commit_memory calls map_memory_to_file which is same as > > os::reserve_memory. > > > > I am failing to see why os::reserve_memory can call > > map_memory_to_file (i.e. tie it to mmap) but commit_memory can't... > > Before this patch, commit_memory never dealt with incrementally > > committing pages to device so there has to be a way to pass file > > descriptor and offset. Windows has no such capability to manage > > incremental commits. All other OSes do and that is why > > map_memory_to_file is used (which by the way also works on > > Windows). > > AIX uses System V shared memory by default, which follows a different > allocation scheme (semantics more like Windows VirtualAlloc... > calls). > > But my doubts are not limited to that one, see my earlier replies and > those of others. It really makes sense to step back one step and > discuss the JEP first. > There is already a discussion thread, as David mentioned (http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2018-May/022092.html), that so far nobody has answered. I believe discussion about the contents of the JEP and the implementation should be separate. So far what I gathered from the responses to the implementation, the proposed idea itself is not the issue here (allowing the use of NVDIMM memory for parts of the heap to enable larger heaps and improve overall performance; I am not saying that the text doesn't need a bit more work :) ), but rather how an implementation of this JEP should proceed. Let's discuss the non-implementation stuff in that thread. Vinay or I can repost the proposal email if you do not have it any more so that answers will be in-thread. 
Thanks, Thomas From thomas.stuefe at gmail.com Fri Jun 22 16:25:20 2018 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Fri, 22 Jun 2018 18:25:20 +0200 Subject: RFR(M): 8204908: NVDIMM for POGC and G1GC - ReserveSpace.cpp changes are mostly eliminated/no collector specific code. In-Reply-To: <5df6d080894cfad5e6486a00f28b6ccfc5ca633f.camel@oracle.com> References: <5df6d080894cfad5e6486a00f28b6ccfc5ca633f.camel@oracle.com> Message-ID: Hi Thomas, On Fri, Jun 22, 2018 at 4:44 PM, Thomas Schatzl wrote: > Hi Thomas, > > On Tue, 2018-06-19 at 13:40 +0200, Thomas Stüfe wrote: >> Hi Vinay, >> >> On Mon, Jun 18, 2018 at 6:47 PM, Awasthi, Vinay K >> wrote: >> > Hi Thomas, >> > >> > Os::commit_memory calls map_memory_to_file which is same as >> > os::reserve_memory. >> > >> > I am failing to see why os::reserve_memory can call >> > map_memory_to_file (i.e. tie it to mmap) but commit_memory can't... >> > Before this patch, commit_memory never dealt with incrementally >> > committing pages to device so there has to be a way to pass file >> > descriptor and offset. Windows has no such capability to manage >> > incremental commits. All other OSes do and that is why >> > map_memory_to_file is used (which by the way also works on >> > Windows). >> >> AIX uses System V shared memory by default, which follows a different >> allocation scheme (semantics more like Windows VirtualAlloc... >> calls). >> >> But my doubts are not limited to that one, see my earlier replies and >> those of others. It really makes sense to step back one step and >> discuss the JEP first. >> > > There is already a discussion thread as David mentioned > (http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2018-May/022092.html) that so > far nobody answered to. > Ah, I thought he wanted to have the JEP discussed in the comments section of the JEP itself. > I believe discussion about contents the JEP and the implementation > should be separate. > makes sense. 
> So far what I gathered from the responses to the implementation, the > proposed idea itself is not the issue here (allowing the use of NVDIMM > memory for parts of the heap for allowing the use of larger heaps to > improve overall performance; I am not saying that the text doesn't need > a bit more work :) ), but rather how an implementation of this JEP > should proceed. I have no problem with adding NVDIMM support. I think it is a cool feature. Also, the awkwardness of the memory management abstraction layer in hotspot has always been a sore point with me (I originally implemented the AIX mm layer in os_aix.cpp, and those are painful memories). So, I have a lot of sympathy for Vinay's struggles. Unfortunately not much time atm, but I will respond to your mail. > > Let's discuss the non-implementation stuff in that thread. Vinay or I > can repost the proposal email if you do not have it any more so that > answers will be in-thread. > Okay, sounds good. Thanks, Thomas (one of us should change his name to make this less confusing :-) > Thanks, > Thomas > From david.holmes at oracle.com Mon Jun 25 05:35:59 2018 From: david.holmes at oracle.com (David Holmes) Date: Mon, 25 Jun 2018 15:35:59 +1000 Subject: RFR(M): 8204908: NVDIMM for POGC and G1GC - ReserveSpace.cpp changes are mostly eliminated/no collector specific code. In-Reply-To: <5df6d080894cfad5e6486a00f28b6ccfc5ca633f.camel@oracle.com> References: <5df6d080894cfad5e6486a00f28b6ccfc5ca633f.camel@oracle.com> Message-ID: <62719308-890a-a7d1-c311-51a5c39e4d8f@oracle.com> Hi Thomas Schatzl (I was going to use Thomas S. but even that doesn't disambiguate in this case :) ) On 23/06/2018 12:44 AM, Thomas Schatzl wrote: > Hi Thomas, > > On Tue, 2018-06-19 at 13:40 +0200, Thomas Stüfe wrote: >> Hi Vinay, >> >> On Mon, Jun 18, 2018 at 6:47 PM, Awasthi, Vinay K >> wrote: >>> Hi Thomas, >>> >>> Os::commit_memory calls map_memory_to_file which is same as >>> os::reserve_memory. 
>>> >>> I am failing to see why os::reserve_memory can call >>> map_memory_to_file (i.e. tie it to mmap) but commit_memory can't... >>> Before this patch, commit_memory never dealt with incrementally >>> committing pages to device so there has to be a way to pass file >>> descriptor and offset. Windows has no such capability to manage >>> incremental commits. All other OSes do and that is why >>> map_memory_to_file is used (which by the way also works on >>> Windows). >> >> AIX uses System V shared memory by default, which follows a different >> allocation scheme (semantics more like Windows VirtualAlloc... >> calls). >> >> But my doubts are not limited to that one, see my earlier replies and >> those of others. It really makes sense to step back one step and >> discuss the JEP first. >> > > There is already a discussion thread as David mentioned > (http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2018-May/022092.html) that so > far nobody answered to. > > I believe discussion about contents the JEP and the implementation > should be separate. Eventually, but personally I think discussion of a prototype/POC implementation is extremely premature when the JEP itself has not been discussed. I expect the JEP discussion to cover higher-level design issues (i.e. the right abstraction layer) which would then guide the implementation. Cheers, David ----- > > So far what I gathered from the responses to the implementation, the > proposed idea itself is not the issue here (allowing the use of NVDIMM > memory for parts of the heap for allowing the use of larger heaps to > improve overall performance; I am not saying that the text doesn't need > a bit more work :) ), but rather how an implementation of this JEP > should proceed. > > Let's discuss the non-implementation stuff in that thread. Vinay or I > can repost the proposal email if you do not have it any more so that > answers will be in-thread. 
> > Thanks, > Thomas > From stefan.karlsson at oracle.com Mon Jun 25 12:25:21 2018 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Mon, 25 Jun 2018 14:25:21 +0200 Subject: RFR: 8205607: Use oop_iterate instead of oop_iterate_no_header Message-ID: Hi all, Please review this patch to use oop_iterate instead of oop_iterate_no_headers. http://cr.openjdk.java.net/~stefank/8205607/webrev.01/ https://bugs.openjdk.java.net/browse/JDK-8205607 oop_iterate_no_header is a convenience function that allows developers to pass in subclasses of OopClosure, instead of subclasses of OopIterateClosure, to the oop_iterate machinery. I propose that we remove this function, and change the few closures where this is used, to inherit from OopIterateClosure instead of OopClosure. Thanks, StefanK From stefan.karlsson at oracle.com Mon Jun 25 13:58:04 2018 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Mon, 25 Jun 2018 15:58:04 +0200 Subject: RFR: 8144992: Remove OopIterateClosure::idempotent Message-ID: <51ffe58b-791f-6218-49dc-127cef909dc5@oracle.com> Hi all, Please review this patch to remove the OopIterateClosure::idempotent() function. http://cr.openjdk.java.net/~stefank/8144992/webrev.01/ https://bugs.openjdk.java.net/browse/JDK-8144992 There's no closure that overrides idempotent() anymore, so this patch removes the function and all usages. Thanks, StefanK From kim.barrett at oracle.com Mon Jun 25 16:25:02 2018 From: kim.barrett at oracle.com (Kim Barrett) Date: Mon, 25 Jun 2018 12:25:02 -0400 Subject: RFR: 8144992: Remove OopIterateClosure::idempotent In-Reply-To: <51ffe58b-791f-6218-49dc-127cef909dc5@oracle.com> References: <51ffe58b-791f-6218-49dc-127cef909dc5@oracle.com> Message-ID: > On Jun 25, 2018, at 9:58 AM, Stefan Karlsson wrote: > > Hi all, > > Please review this patch to remove the OopIterateClosure::idempotent() function. 
> > http://cr.openjdk.java.net/~stefank/8144992/webrev.01/ > https://bugs.openjdk.java.net/browse/JDK-8144992 > > There's no closure that overrides idempotent() anymore, so this patch removes the function an all usages. > > Thanks, > StefanK Looks good. From stefan.karlsson at oracle.com Mon Jun 25 19:00:29 2018 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Mon, 25 Jun 2018 21:00:29 +0200 Subject: RFR: 8144992: Remove OopIterateClosure::idempotent In-Reply-To: References: <51ffe58b-791f-6218-49dc-127cef909dc5@oracle.com> Message-ID: <4f69e000-902f-c019-066c-516a36ee2f66@oracle.com> Thanks, Kim. StefanK On 2018-06-25 18:25, Kim Barrett wrote: >> On Jun 25, 2018, at 9:58 AM, Stefan Karlsson wrote: >> >> Hi all, >> >> Please review this patch to remove the OopIterateClosure::idempotent() function. >> >> http://cr.openjdk.java.net/~stefank/8144992/webrev.01/ >> https://bugs.openjdk.java.net/browse/JDK-8144992 >> >> There's no closure that overrides idempotent() anymore, so this patch removes the function an all usages. >> >> Thanks, >> StefanK > Looks good. > From per.liden at oracle.com Mon Jun 25 21:01:28 2018 From: per.liden at oracle.com (Per Liden) Date: Mon, 25 Jun 2018 23:01:28 +0200 Subject: RFR: 8144992: Remove OopIterateClosure::idempotent In-Reply-To: <51ffe58b-791f-6218-49dc-127cef909dc5@oracle.com> References: <51ffe58b-791f-6218-49dc-127cef909dc5@oracle.com> Message-ID: <00cb731b-21ad-1514-d735-fe80b93caf9b@oracle.com> Looks good! /Per On 06/25/2018 03:58 PM, Stefan Karlsson wrote: > Hi all, > > Please review this patch to remove the OopIterateClosure::idempotent() > function. > > http://cr.openjdk.java.net/~stefank/8144992/webrev.01/ > https://bugs.openjdk.java.net/browse/JDK-8144992 > > There's no closure that overrides idempotent() anymore, so this patch > removes the function an all usages. 
> > Thanks, > StefanK From stefan.karlsson at oracle.com Mon Jun 25 21:02:22 2018 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Mon, 25 Jun 2018 23:02:22 +0200 Subject: RFR: 8144992: Remove OopIterateClosure::idempotent In-Reply-To: <00cb731b-21ad-1514-d735-fe80b93caf9b@oracle.com> References: <51ffe58b-791f-6218-49dc-127cef909dc5@oracle.com> <00cb731b-21ad-1514-d735-fe80b93caf9b@oracle.com> Message-ID: Thanks, Per. StefanK On 2018-06-25 23:01, Per Liden wrote: > Looks good! > > /Per > > On 06/25/2018 03:58 PM, Stefan Karlsson wrote: >> Hi all, >> >> Please review this patch to remove the >> OopIterateClosure::idempotent() function. >> >> http://cr.openjdk.java.net/~stefank/8144992/webrev.01/ >> https://bugs.openjdk.java.net/browse/JDK-8144992 >> >> There's no closure that overrides idempotent() anymore, so this patch >> removes the function an all usages. >> >> Thanks, >> StefanK From per.liden at oracle.com Mon Jun 25 22:10:06 2018 From: per.liden at oracle.com (Per Liden) Date: Tue, 26 Jun 2018 00:10:06 +0200 Subject: RFR: 8205607: Use oop_iterate instead of oop_iterate_no_header In-Reply-To: References: Message-ID: Looks good! /Per On 06/25/2018 02:25 PM, Stefan Karlsson wrote: > Hi all, > > Please review this patch to use oop_iterate instead of > oop_iterate_no_headers. > > http://cr.openjdk.java.net/~stefank/8205607/webrev.01/ > https://bugs.openjdk.java.net/browse/JDK-8205607 > > oop_iterate_no_header is a convenience function that allows developers > to pass in sub classes of OopClosures, instead of sub classes of > OopIterateClosure, to the oop_iterate machinery. > > I propose that we remove this function, and change the few closures > where this is used, to inherit from OopIterateClosure instead of > OopClosure. 
> > Thanks, > StefanK From stefan.karlsson at oracle.com Mon Jun 25 22:10:48 2018 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Tue, 26 Jun 2018 00:10:48 +0200 Subject: RFR: 8205607: Use oop_iterate instead of oop_iterate_no_header In-Reply-To: References: Message-ID: <4eaa171f-3ea6-9cd4-7201-4f23aa531458@oracle.com> Thanks, Per! StefanK On 2018-06-26 00:10, Per Liden wrote: > Looks good! > > /Per > > On 06/25/2018 02:25 PM, Stefan Karlsson wrote: >> Hi all, >> >> Please review this patch to use oop_iterate instead of >> oop_iterate_no_headers. >> >> http://cr.openjdk.java.net/~stefank/8205607/webrev.01/ >> https://bugs.openjdk.java.net/browse/JDK-8205607 >> >> oop_iterate_no_header is a convenience function that allows >> developers to pass in sub classes of OopClosures, instead of sub >> classes of OopIterateClosure, to the oop_iterate machinery. >> >> I propose that we remove this function, and change the few closures >> where this is used, to inherit from OopIterateClosure instead of >> OopClosure. >> >> Thanks, >> StefanK From kim.barrett at oracle.com Mon Jun 25 22:11:53 2018 From: kim.barrett at oracle.com (Kim Barrett) Date: Mon, 25 Jun 2018 18:11:53 -0400 Subject: RFR: 8205607: Use oop_iterate instead of oop_iterate_no_header In-Reply-To: References: Message-ID: > On Jun 25, 2018, at 8:25 AM, Stefan Karlsson wrote: > > Hi all, > > Please review this patch to use oop_iterate instead of oop_iterate_no_headers. > > http://cr.openjdk.java.net/~stefank/8205607/webrev.01/ > https://bugs.openjdk.java.net/browse/JDK-8205607 > > oop_iterate_no_header is a convenience function that allows developers to pass in sub classes of OopClosures, instead of sub classes of OopIterateClosure, to the oop_iterate machinery. > > I propose that we remove this function, and change the few closures where this is used, to inherit from OopIterateClosure instead of OopClosure. 
> > Thanks, > StefanK ------------------------------------------------------------------------------ src/hotspot/share/gc/cms/compactibleFreeListSpace.cpp 2438 class VerifyAllOopsClosure: public BasicOopIterateClosure { Seeing this here and elsewhere, I wish BasicOopIterateClosure were called something like OopIterateNoMetadataClosure or NoMetadataOopIterateClosure. "Basic" doesn't really tell me much. If you agree, such a name change can be another RFE. ------------------------------------------------------------------------------ src/hotspot/share/gc/cms/compactibleFreeListSpace.cpp 2527 // Iterate over all oops in the heap. Uses the _no_header version 2528 // since we are not interested in following the klass pointers. 2529 CMSHeap::heap()->oop_iterate(&cl); Comment needs updating; there isn't a _no_header version anymore. ------------------------------------------------------------------------------ Some places that used to do oop_iterate_no_header now do oop_iterate using some arbitrary (based on signature type) OopIterateClosure. MutableSpace GenCollectedHeap PsOldSpace It's only by knowing the closure is a BasicOopIterateClosure that we know the metadata won't be processed. And that may require tracing through several levels of calls. [There may also be a (small) performance cost associated with that in some cases, since we'll be calling the virtual do_metadata function for each of the objects being processed to discover the metadata should be skipped. But I'm mostly concerned about knowing that we're dealing with a no-header iteration.] It might be clearer if the oop_iterate_no_header names were retained here, with the argument type being changed to BasicOopIterateClosure*. 
------------------------------------------------------------------------------ From kim.barrett at oracle.com Mon Jun 25 22:18:59 2018 From: kim.barrett at oracle.com (Kim Barrett) Date: Mon, 25 Jun 2018 18:18:59 -0400 Subject: RFR: 8205607: Use oop_iterate instead of oop_iterate_no_header In-Reply-To: References: Message-ID: <835C8711-5EBC-431B-A3F5-8CDB2D594DDB@oracle.com> > On Jun 25, 2018, at 6:11 PM, Kim Barrett wrote: > Some places that used to do oop_iterate_no_header now do oop_iterate > using some arbitrary (based on signature type) OopIterateClosure. > > MutableSpace > GenCollectedHeap > PsOldSpace > > It's only by knowing the closure is a BasicOopIterateClosure that we > know the metadata won't be processed. And that may require tracing > through several levels of calls. [There may also be a (small) > performance cost associated with that in some cases, since we'll be > calling the virtual do_metadata function for each of the objects being > processed to discover the metadata should be skipped. But I'm mostly > concerned about knowing that we're dealing with a no-header iteration.] > > It might be clearer if the oop_iterate_no_header names were retained > here, with the argument type being changed to BasicOopIterateClosure*. > To be more explicit, I think the above named classes were providing a more restrictive API than a full oop_iterate, and I'm not sure that should change. From stefan.karlsson at oracle.com Mon Jun 25 23:25:51 2018 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Tue, 26 Jun 2018 01:25:51 +0200 Subject: RFR: 8205607: Use oop_iterate instead of oop_iterate_no_header In-Reply-To: References: Message-ID: On 2018-06-26 00:11, Kim Barrett wrote: >> On Jun 25, 2018, at 8:25 AM, Stefan Karlsson wrote: >> >> Hi all, >> >> Please review this patch to use oop_iterate instead of oop_iterate_no_headers. 
>> >> http://cr.openjdk.java.net/~stefank/8205607/webrev.01/ >> https://bugs.openjdk.java.net/browse/JDK-8205607 >> >> oop_iterate_no_header is a convenience function that allows developers to pass in subclasses of OopClosure, instead of subclasses of OopIterateClosure, to the oop_iterate machinery. >> >> I propose that we remove this function, and change the few closures where this is used, to inherit from OopIterateClosure instead of OopClosure. >> >> Thanks, >> StefanK > ------------------------------------------------------------------------------ > src/hotspot/share/gc/cms/compactibleFreeListSpace.cpp > 2438 class VerifyAllOopsClosure: public BasicOopIterateClosure { > > Seeing this here and elsewhere, I wish BasicOopIterateClosure were > called something like OopIterateNoMetadataClosure or > NoMetadataOopIterateClosure. "Basic" doesn't really tell me much. > > If you agree, such a name change can be another RFE. Per and I went over a few alternatives. We wanted a short name and didn't like the negation in NoMetadata. We couldn't find a perfect name, and BasicOopIterateClosure was the least disliked name that we came up with. It's not an optimal name, but I hope that anyone reading this code will end up reading the class comment: // An OopIterateClosure that can be used when there's no need to visit the Metadata. class BasicOopIterateClosure : public OopIterateClosure { But, yes, we can change the name if we can agree on a good name. > > ------------------------------------------------------------------------------ > src/hotspot/share/gc/cms/compactibleFreeListSpace.cpp > 2527 // Iterate over all oops in the heap. Uses the _no_header version > 2528 // since we are not interested in following the klass pointers. > 2529 CMSHeap::heap()->oop_iterate(&cl); > > Comment needs updating; there isn't a _no_header version anymore. Will fix.
> > ------------------------------------------------------------------------------ > > Some places that used to do oop_iterate_no_header now do oop_iterate > using some arbitrary (based on signature type) OopIterateClosure. > > MutableSpace > GenCollectedHeap > PsOldSpace > > It's only by knowing the closure is a BasicOopIterateClosure that we > know the metadata won't be processed. And that may require tracing > through several levels of calls. Yes, this is exactly what I want. I don't think the oop_iterate functions of the Space, Gen, or Heap classes should have to know, or care, if the passed in closure visits the metadata or not. I want that decision to be made by passing in the appropriate closure, which means that you need to know what your closure does. > [There may also be a (small) > performance cost associated with that in some cases, since we'll be > calling the virtual do_metadata function for each of the objects being > processed to discover the metadata should be skipped. But I'm mostly > concerned about knowing that we're dealing with a no-header iteration.] There shouldn't be a difference here. The old code also took a virtual call for the do_metadata function. > > It might be clearer if the oop_iterate_no_header names were retained > here, with the argument type being changed to BasicOopIterateClosure*. I would prefer to not do that. This patch provides generic oop_iterate functions that can be used by both metadata visiting closures, and closures that don't visit metadata, and we have moved all responsibility to determine if metadata should be visited to the closures. I don't think we need to be extra restrictive here. 
Thanks, StefanK > > ------------------------------------------------------------------------------ > From kim.barrett at oracle.com Tue Jun 26 07:12:25 2018 From: kim.barrett at oracle.com (Kim Barrett) Date: Tue, 26 Jun 2018 03:12:25 -0400 Subject: RFR: 8205607: Use oop_iterate instead of oop_iterate_no_header In-Reply-To: References: Message-ID: > On Jun 25, 2018, at 7:25 PM, Stefan Karlsson wrote: > > On 2018-06-26 00:11, Kim Barrett wrote: >>> On Jun 25, 2018, at 8:25 AM, Stefan Karlsson wrote: >>> >>> Hi all, >>> >>> Please review this patch to use oop_iterate instead of oop_iterate_no_headers. >>> >>> http://cr.openjdk.java.net/~stefank/8205607/webrev.01/ >>> https://bugs.openjdk.java.net/browse/JDK-8205607 >>> >>> oop_iterate_no_header is a convenience function that allows developers to pass in sub classes of OopClosures, instead of sub classes of OopIterateClosure, to the oop_iterate machinery. >>> >>> I propose that we remove this function, and change the few closures where this is used, to inherit from OopIterateClosure instead of OopClosure. >>> >>> Thanks, >>> StefanK >> ------------------------------------------------------------------------------ >> src/hotspot/share/gc/cms/compactibleFreeListSpace.cpp >> 2438 class VerifyAllOopsClosure: public BasicOopIterateClosure { >> >> Seeing this here and elsewhere, I wish BasicOopIterateClosure were >> called something like OopIterateNoMetadataClosure or >> NoMetadataOopIterateClosure. "Basic" doesn't really tell me much. >> >> If you agree, such a name change can be another RFE. > > Me and Per went over a few alternatives. We wanted a short name and didn't like the negation in NoMetadata. We couldn't find a perfect name, and BasicOopIterateClosure was the least disliked name that we came up with. It's not an optimal name, but I hope that anyone reading this code will end up reading the class comment: > > // An OopIterateClosure that can be used when there's no need to visit the Metadata. 
> class BasicOopIterateClosure : public OopIterateClosure { > > But, yes, we can change the name if we can agree on a good name. I don't know what you might have tried. Here are a few ideas, after grovelling around with a thesaurus: MetadataObliviousOopIterateClosure MetadataIgnoringOopIterateClosure OopIterateVisitingMetadataClosure (replacing MetadataVisitingOopIterateClosure) OopIterateIgnoringMetadataClosure >> Some places that used to do oop_iterate_no_header now do oop_iterate >> using some arbitrary (based on signature type) OopIterateClosure. >> >> MutableSpace >> GenCollectedHeap >> PsOldSpace >> >> It's only by knowing the closure is a BasicOopIterateClosure that we >> know the metadata won't be processed. And that may require tracing >> through several levels of calls. > > Yes, this is exactly what I want. I don't think the oop_iterate functions of the Space, Gen, or Heap classes should have to know, or care, if the passed in closure visits the metadata or not. I want that decision to be made by passing in the appropriate closure, which means that you need to know what your closure does. > >> [There may also be a (small) >> performance cost associated with that in some cases, since we'll be >> calling the virtual do_metadata function for each of the objects being >> processed to discover the metadata should be skipped. But I'm mostly >> concerned about knowing that we're dealing with a no-header iteration.] > > There shouldn't be a difference here. The old code also took a virtual call for the do_metadata function. The old code completely ignored do_metadata and the like, didn't it? Else what was the point of _no_header? >> It might be clearer if the oop_iterate_no_header names were retained >> here, with the argument type being changed to BasicOopIterateClosure*. > > I would prefer to not do that.
This patch provides generic oop_iterate functions that can be used by both metadata visiting closures, and closures that don't visit metadata, and we have moved all responsibility to determine if metadata should be visited to the closures. I don't think we need to be extra restrictive here. It's not obvious whether these oop_iterate_no_header were intended to be a semantic distinction or simply an optimization. But since all the uses seem to be verification-like, it becomes harder to use an optimization justification for the unusual API. So I'm willing to buy your argument that the semantic distinction isn't needed. Looks good. From per.liden at oracle.com Tue Jun 26 07:49:29 2018 From: per.liden at oracle.com (Per Liden) Date: Tue, 26 Jun 2018 09:49:29 +0200 Subject: RFR: 8205663: ZGC: Log metaspace used/capacity/committed/reserved Message-ID: <5c71ad6c-fa23-49b9-aab4-fd6f8828d1b0@oracle.com> In ZGC, we currently don't log metaspace used/capacity/committed/reserved. This is useful information, which should be logged. Bug: https://bugs.openjdk.java.net/browse/JDK-8205663 Webrev: http://cr.openjdk.java.net/~pliden/8205663/webrev.0 /Per From stefan.karlsson at oracle.com Tue Jun 26 07:48:03 2018 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Tue, 26 Jun 2018 09:48:03 +0200 Subject: RFR: 8205607: Use oop_iterate instead of oop_iterate_no_header In-Reply-To: References: Message-ID: On 2018-06-26 09:12, Kim Barrett wrote: >> On Jun 25, 2018, at 7:25 PM, Stefan Karlsson wrote: >> >> On 2018-06-26 00:11, Kim Barrett wrote: >>>> On Jun 25, 2018, at 8:25 AM, Stefan Karlsson wrote: >>>> >>>> Hi all, >>>> >>>> Please review this patch to use oop_iterate instead of oop_iterate_no_headers. 
>>>> >>>> http://cr.openjdk.java.net/~stefank/8205607/webrev.01/ >>>> https://bugs.openjdk.java.net/browse/JDK-8205607 >>>> >>>> oop_iterate_no_header is a convenience function that allows developers to pass in sub classes of OopClosures, instead of sub classes of OopIterateClosure, to the oop_iterate machinery. >>>> >>>> I propose that we remove this function, and change the few closures where this is used, to inherit from OopIterateClosure instead of OopClosure. >>>> >>>> Thanks, >>>> StefanK >>> ------------------------------------------------------------------------------ >>> src/hotspot/share/gc/cms/compactibleFreeListSpace.cpp >>> 2438 class VerifyAllOopsClosure: public BasicOopIterateClosure { >>> >>> Seeing this here and elsewhere, I wish BasicOopIterateClosure were >>> called something like OopIterateNoMetadataClosure or >>> NoMetadataOopIterateClosure. "Basic" doesn't really tell me much. >>> >>> If you agree, such a name change can be another RFE. >> >> Me and Per went over a few alternatives. We wanted a short name and didn't like the negation in NoMetadata. We couldn't find a perfect name, and BasicOopIterateClosure was the least disliked name that we came up with. It's not an optimal name, but I hope that anyone reading this code will end up reading the class comment: >> >> // An OopIterateClosure that can be used when there's no need to visit the Metadata. >> class BasicOopIterateClosure : public OopIterateClosure { >> >> But, yes, we can change the name if we can agree on a good name. > > I don't know what you might have tried. 
Here are a few ideas, after > grovelling around with a thesaraus: > > MetadataObliviousOopIterateClosure > MetadataIgnoringOopIterateClosure > > OopIterateVisitingMetadataClosure (replacing MetadataVisitingOopIterateClosure) > OopIterateIgnoringMetadataClosure Here are some of the names on our whiteboard: PureOopIterateClosure BasicOopIterateClosure PlainOopIterateClosure StandardOopIterateClosure NoMetadataOopIterateClosure WithoutMetadataOopIterateClosure MetadataAgnosticOopIterateClosure MetadataIgnoringOopIterateClosure MetadataOopIterateClosure MetadataAwareOopIterateClosure MetadataVisitingOopIterateClosure OopIterateClosureWithMetadata OopIterateClosureWithoutMetadata OopIterateWithMetadataClosure OopIterateWithoutMetadataClosure OopIterateMetadataClosure OopIterateNoMetadataClosure > >>> Some places that used to do oop_iterate_no_header now do oop_iterate >>> using some arbitrary (based on signature type) OopIterateClosure. >>> >>> MutableSpace >>> GenCollectedHeap >>> PsOldSpace >>> >>> It's only by knowing the closure is a BasicOopIterateClosure that we >>> know the metadata won't be processed. And that may require tracing >>> through several levels of calls. >> >> Yes, this is exactly what I want. I don't think the oop_iterate functions of the Space, Gen, or Heap classes should have to know, or care, if the passed in closure visits the metadata or not. I want that decision to be made by passing in the appropriate closure, which means that you need to know what your closure does. >> >>> [There may also be a (small) >>> performance cost associated with that in some cases, since we'll be >>> calling the virtual do_metadata function for each of the objects being >>> processed to discover the metadata should be skipped. But I'm mostly >>> concerned about knowing that we're dealing with a no-header iteration.] >> >> There shouldn't be a difference here. The old code also took a virtual call for the do_metadata function. 
> > The old code completely ignored do_metadata and the like, didn't it? Else what was the point of _no_header? No. See: -int oopDesc::oop_iterate_no_header(OopClosure* blk) { - // The NoHeaderExtendedOopClosure wraps the OopClosure and proxies all - // the do_oop calls, but turns off all other features in OopIterateClosure. - NoHeaderExtendedOopClosure cl(blk); - return oop_iterate_size(&cl); -} The old code wrapped OopClosure in another closure (NoHeaderExtendedOopClosure) that turned off the metadata visiting parts. The oop_iterate framework still called the virtual do_metadata that NoHeaderExtendedOopClosure inherited from ExtendedOopClosure. I added them during the permgen removal so that we wouldn't have to change all OopClosures into ExtendedOopClosure, and thereby didn't have to deal with all the baggage that ExtendedOopClosure brought. Now that we are creating a leaner OopIterateClosure, I'd like to get rid of this. > >>> It might be clearer if the oop_iterate_no_header names were retained >>> here, with the argument type being changed to BasicOopIterateClosure*. >> >> I would prefer to not do that. This patch provides generic oop_iterate functions that can be used by both metadata visiting closures, and closures that don't visit metadata, and we have moved all responsibility to determine if metadata should be visited to the closures. I don't think we need to be extra restrictive here. > > It's not obvious whether these oop_iterate_no_header were intended to > be a semantic distinction or simply an optimization. But since all > the uses seem to be verification-like, it becomes harder to use an > optimization justification for the unusual API. So I'm willing to buy > your argument that the semantic distinction isn't needed.
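The wrapper mechanism described above can be sketched in isolation. The following is a simplified, hypothetical reconstruction — the class names echo the deleted HotSpot code, but every declaration is an illustrative stand-in:

```cpp
#include <cassert>

// Illustrative stand-in for an object reference.
struct oop_obj { int payload; };
typedef oop_obj* oop;

// Stand-in for the old minimal OopClosure interface.
class OopClosure {
public:
  virtual ~OopClosure() {}
  virtual void do_oop(oop* p) = 0;
};

// Stand-in for ExtendedOopClosure, which added the metadata hook.
class ExtendedOopClosure : public OopClosure {
public:
  virtual bool do_metadata() = 0;
};

// Analogue of the removed NoHeaderExtendedOopClosure: it adapts a
// plain OopClosure to the extended interface by proxying do_oop() and
// hard-wiring do_metadata() to false.  The iteration framework still
// made a virtual do_metadata() call per object, which is why removing
// the wrapper does not add a new virtual call.
class NoHeaderWrapper : public ExtendedOopClosure {
  OopClosure* _cl;
public:
  NoHeaderWrapper(OopClosure* cl) : _cl(cl) {}
  virtual void do_oop(oop* p) { _cl->do_oop(p); }
  virtual bool do_metadata() { return false; }
};

// A plain closure a developer might have written against the old API.
class TouchClosure : public OopClosure {
public:
  int calls;
  TouchClosure() : calls(0) {}
  virtual void do_oop(oop*) { calls++; }
};
```

The wrapped closure never learns about metadata at all; the adapter answers the framework's do_metadata() query on its behalf.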
Regarding the name, do_metadata used to be named do_header when it only guarded the klass pointer in objects: http://hg.openjdk.java.net/jdk6/jdk6/hotspot/file/abf71d5a0e42/src/share/vm/oops/instanceKlass.cpp // closure's do_header() method dicates whether the given closure should be // applied to the klass ptr in the object header. #define InstanceKlass_OOP_OOP_ITERATE_DEFN(OopClosureType, nv_suffix) \ \ int instanceKlass::oop_oop_iterate##nv_suffix(oop obj, OopClosureType* closure) { \ SpecializationStats::record_iterate_call##nv_suffix(SpecializationStats::ik);\ /* header */ \ if (closure->do_header()) { \ obj->oop_iterate_header(closure); \ } \ InstanceKlass_OOP_MAP_ITERATE( \ obj, \ SpecializationStats:: \ record_do_oop_call##nv_suffix(SpecializationStats::ik); \ (closure)->do_oop##nv_suffix(p), \ assert_is_in_closed_subset) \ return size_helper(); \ } > > Looks good. Thanks! StefanK > > > From stefan.karlsson at oracle.com Tue Jun 26 07:53:22 2018 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Tue, 26 Jun 2018 09:53:22 +0200 Subject: RFR: 8205663: ZGC: Log metaspace used/capacity/committed/reserved In-Reply-To: <5c71ad6c-fa23-49b9-aab4-fd6f8828d1b0@oracle.com> References: <5c71ad6c-fa23-49b9-aab4-fd6f8828d1b0@oracle.com> Message-ID: Looks good. StefanK On 2018-06-26 09:49, Per Liden wrote: > In ZGC, we currently don't log metaspace > used/capacity/committed/reserved. This is useful information, which > should be logged. 
> > Bug: https://bugs.openjdk.java.net/browse/JDK-8205663 > Webrev: http://cr.openjdk.java.net/~pliden/8205663/webrev.0 > > /Per From erik.helin at oracle.com Tue Jun 26 08:00:51 2018 From: erik.helin at oracle.com (Erik Helin) Date: Tue, 26 Jun 2018 10:00:51 +0200 Subject: RFR: 8205663: ZGC: Log metaspace used/capacity/committed/reserved In-Reply-To: <5c71ad6c-fa23-49b9-aab4-fd6f8828d1b0@oracle.com> References: <5c71ad6c-fa23-49b9-aab4-fd6f8828d1b0@oracle.com> Message-ID: On 06/26/2018 09:49 AM, Per Liden wrote: > In ZGC, we currently don't log metaspace > used/capacity/committed/reserved. This is useful information, which > should be logged. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8205663 > Webrev: http://cr.openjdk.java.net/~pliden/8205663/webrev.0 Looks good, Reviewed. Thanks, Erik From per.liden at oracle.com Tue Jun 26 08:08:48 2018 From: per.liden at oracle.com (Per Liden) Date: Tue, 26 Jun 2018 10:08:48 +0200 Subject: RFR: 8205663: ZGC: Log metaspace used/capacity/committed/reserved In-Reply-To: References: <5c71ad6c-fa23-49b9-aab4-fd6f8828d1b0@oracle.com> Message-ID: <9f3381ec-332f-9642-a278-da61936c5958@oracle.com> Thanks for reviewing, Stefan and Erik! /Per On 06/26/2018 10:00 AM, Erik Helin wrote: > On 06/26/2018 09:49 AM, Per Liden wrote: >> In ZGC, we currently don't log metaspace >> used/capacity/committed/reserved. This is useful information, which >> should be logged. >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8205663 >> Webrev: http://cr.openjdk.java.net/~pliden/8205663/webrev.0 > > Looks good, Reviewed. 
> > Thanks, > Erik From thomas.schatzl at oracle.com Tue Jun 26 10:25:40 2018 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Tue, 26 Jun 2018 12:25:40 +0200 Subject: RFC: Patch for 8203848: Missing remembered set entry in j.l.ref.references after JDK-8203028 In-Reply-To: <9d7be491c8d2445d92b4ff00aeaf735c405ff9cb.camel@oracle.com> References: <9d7be491c8d2445d92b4ff00aeaf735c405ff9cb.camel@oracle.com> Message-ID: <1afe1980057950f4df1cb0159f03039eda5154cf.camel@oracle.com> Hi, On Thu, 2018-06-07 at 11:47 +0200, Thomas Schatzl wrote: any comments on this issue? Thanks, Thomas > Hi all, > > I would like to ask for comments on the fix for the issue handled > in > JDK-8203848. > > In particular, the problem is that currently the "discovered" field > of > j.l.ref.References is managed completely opaque to the rest of the > VM, > which causes the described error: remembered set entries are missing > for that reference when doing *concurrent* reference discovery. > > There is no problem with liveness of objects referenced by that > because > a) a j.l.ref.Reference object found by reference discovery will be > automatically kept alive and > b) no GC in Hotspot at this time evacuates old gen objects during > marking (and Z does not use the reference processing framework at > all), > so that reference in the "discovered" field will never be outdated. > > However in the future, G1 might want to move objects in old gen at > any > time for e.g. defragmentation purposes, and I am a bit unclear about > Shenandoah tbh :) > > I see two solutions for this issue: > - improve the access modifier so that at least the post-barrier that > is > responsible for adding remembered set entries is invoked on this > field. > E.g. 
in ReferenceProcessor::add_to_discovered_list_mt(), instead of > > oop retest = RawAccess<>::oop_atomic_cmpxchg(next_discovered, > discovered_addr, oop(NULL)); > > do a > > oop retest = > HeapAccess::oop_atomic_cmpxchg(next_discovered, > discovered_addr, oop(NULL)); > > Note that I am almost confident that this only makes G1 work as far > as > I understand the access framework; since the previous value is NULL > when we cmpxchg, G1 can skip the pre-barrier; maybe more is needed > for > Shenandoah, but I hope that Shenandoah devs can chime in here. > > I tested this, and with this change the problem does not occur after > 2000 iterations of the test. > > (see the preliminary webrev at http://cr.openjdk.java.net/~tschatzl/8203848/webrev/ ; only the change to referenceProcessor.cpp is relevant > here). > > - the other "solution" is to fix the remembered set verification to > ignore this field, and try to fix this again in the future when/if G1 > evacuates old regions during marking. > > Any comments? > > Thanks, > Thomas > From per.liden at oracle.com Tue Jun 26 10:45:07 2018 From: per.liden at oracle.com (Per Liden) Date: Tue, 26 Jun 2018 12:45:07 +0200 Subject: RFR: 8205676: ZGC: Remove TLAB allocations in relocation path Message-ID: <62139932-68af-ae87-335f-c8973c3ad892@oracle.com> During Java-thread aided relocation, ZGC tries to use the TLAB to allocate the new object. However, this interacts badly with JEP 331: Low-Overhead Heap Profiling, as it distorts the profiling statistics. I propose we remove the use of the TLAB in the relocation path, essentially changing a thread-local pointer bump to an uncontended CPU-local CAS. Bug: https://bugs.openjdk.java.net/browse/JDK-8205676 Webrev: http://cr.openjdk.java.net/~pliden/8205676/webrev.0 Testing: Currently running benchmarks to verify that this has no unexpected performance hit.
I've filed two follow up RFEs (also out for review now), to clean up some functions that are now unused: https://bugs.openjdk.java.net/browse/JDK-8205678 https://bugs.openjdk.java.net/browse/JDK-8205679 /Per From per.liden at oracle.com Tue Jun 26 10:45:10 2018 From: per.liden at oracle.com (Per Liden) Date: Tue, 26 Jun 2018 12:45:10 +0200 Subject: RFR: 8205678: ZGC: Remove unused ZAllocationFlags::java_thread() Message-ID: <0a3eba94-4c99-eff6-067a-17bf8bbb789c@oracle.com> After JDK-8205676, the java_thread field in the ZAllocationFlags is unused and can be removed. Bug: https://bugs.openjdk.java.net/browse/JDK-8205678 Webrev: http://cr.openjdk.java.net/~pliden/8205678/webrev.0 /Per From per.liden at oracle.com Tue Jun 26 10:45:12 2018 From: per.liden at oracle.com (Per Liden) Date: Tue, 26 Jun 2018 12:45:12 +0200 Subject: RFR: 8205679: Remove unused ThreadLocalAllocBuffer::undo_allocate() Message-ID: After JDK-8205676, the function ThreadLocalAllocBuffer::undo_allocate() is unused and can be removed. Bug: https://bugs.openjdk.java.net/browse/JDK-8205679 Webrev: http://cr.openjdk.java.net/~pliden/8205679/webrev.0 /Per From kim.barrett at oracle.com Tue Jun 26 12:48:42 2018 From: kim.barrett at oracle.com (Kim Barrett) Date: Tue, 26 Jun 2018 08:48:42 -0400 Subject: RFR: 8205607: Use oop_iterate instead of oop_iterate_no_header In-Reply-To: References: Message-ID: <23894442-BF35-4269-BC05-6659589EF78F@oracle.com> > On Jun 26, 2018, at 3:48 AM, Stefan Karlsson wrote: > On 2018-06-26 09:12, Kim Barrett wrote: >>> On Jun 25, 2018, at 7:25 PM, Stefan Karlsson wrote: >>> >>> On 2018-06-26 00:11, Kim Barrett wrote: >>>> src/hotspot/share/gc/cms/compactibleFreeListSpace.cpp >>>> 2438 class VerifyAllOopsClosure: public BasicOopIterateClosure { >>>> >>>> Seeing this here and elsewhere, I wish BasicOopIterateClosure were >>>> called something like OopIterateNoMetadataClosure or >>>> NoMetadataOopIterateClosure. "Basic" doesn't really tell me much. 
>>>> >>>> If you agree, such a name change can be another RFE. > [...] > Here are some of the names on our whiteboard: > [...] Looks like there is some overlap in thinking. We can discuss this off-line later. >>>> Some places that used to do oop_iterate_no_header now do oop_iterate >>>> using some arbitrary (based on signature type) OopIterateClosure. >>>> [...] >>> >>> There shouldn't be a difference here. The old code also took a virtual call for the do_metadata function. >> The old code completely ignored do_metadata and the like, didn't it? Else what was the point of _no_header? > > No. See: > > -int oopDesc::oop_iterate_no_header(OopClosure* blk) { > - // The NoHeaderExtendedOopClosure wraps the OopClosure and proxies all > - // the do_oop calls, but turns off all other features in OopIterateClosure. > - NoHeaderExtendedOopClosure cl(blk); > - return oop_iterate_size(&cl); > -} Oh, right, I forgot about that. Happy to see that gone. > I added them during the permgen removal so that we wouldn't have to change all OopClosures into ExtendedOopClosure, and thereby didn't have to deal with all the baggage that ExtendedOopClosure brought. Now that we are creating a leaner OopIterateClosure, I'd like to get rid of this. > > [...] > It was added as a convenience wrapper, so that closures that didn't need to care about metadata could remain inheriting from OopClosures instead of ExtendedOopClosure. That's all before my time. Thanks for the history lesson; that clears things up for me. From erik.osterlund at oracle.com Tue Jun 26 15:13:56 2018 From: erik.osterlund at oracle.com (Erik Österlund) Date: Tue, 26 Jun 2018 17:13:56 +0200 Subject: RFR: 8205683: Refactor heap allocation to separate concerns Message-ID: Hi, After a bunch of stuff was added to the CollectedHeap object allocation path, it is starting to look quite ugly.
It has function pointers passed around with arguments to be sent to them, mixing up allocation, initialization, tracing, sampling and exception handling all over the place. I propose to improve the situation by having an abstract MemAllocator stack object own the allocation path, with suitable subclasses, ObjAllocator, ClassAllocator and ObjArrayAllocator for allocating the 3 types we support allocating on the heap. Variation points now simply use virtual calls, and most of the tracing/sampling code can be moved out from the allocation path. This should make it easier to understand what is going on here, and the different concerns are more separated. A collector can override the virtual member functions for allocating objects, arrays, and classes, and inject their own variation points into this framework. For example, for installing a forwarding pointer in the cell header required by Brooks pointers, if you are into that kind of stuff. Bug: https://bugs.openjdk.java.net/browse/JDK-8205683 Webrev: http://cr.openjdk.java.net/~eosterlund/8205683/webrev.02/ Thanks, /Erik From erik.osterlund at oracle.com Tue Jun 26 16:05:57 2018 From: erik.osterlund at oracle.com (Erik Österlund) Date: Tue, 26 Jun 2018 18:05:57 +0200 Subject: RFR: 8205676: ZGC: Remove TLAB allocations in relocation path In-Reply-To: <62139932-68af-ae87-335f-c8973c3ad892@oracle.com> References: <62139932-68af-ae87-335f-c8973c3ad892@oracle.com> Message-ID: <54ff68ba-7082-a49a-530f-9d1a5be64ab1@oracle.com> Hi Per, Looks good. Thanks, /Erik On 2018-06-26 12:45, Per Liden wrote: > During Java-thread aided relocation, ZGC tries to use the TLAB to > allocate the new object. However, this interacts badly with JEP 331: > Low-Overhead Heap Profiling, as it distorts the profiling statistics. > I propose we remove the use of the TLAB in the relocation path, > essentially changing a thread-local pointer bump to an uncontended > CPU-local CAS.
> > Bug: https://bugs.openjdk.java.net/browse/JDK-8205676 > Webrev: http://cr.openjdk.java.net/~pliden/8205676/webrev.0 > > Testing: Currently running benchmarks to verify that this has no > unexpected performance hit. > > I've filed two follow up RFEs (also out for review now), to clean up > some functions that are now unused: > https://bugs.openjdk.java.net/browse/JDK-8205678 > https://bugs.openjdk.java.net/browse/JDK-8205679 > > /Per From shade at redhat.com Tue Jun 26 16:09:04 2018 From: shade at redhat.com (Aleksey Shipilev) Date: Tue, 26 Jun 2018 18:09:04 +0200 Subject: RFR: 8205679: Remove unused ThreadLocalAllocBuffer::undo_allocate() In-Reply-To: References: Message-ID: <1815fb83-9bd0-423e-bb36-af14b1033e58@redhat.com> On 06/26/2018 12:45 PM, Per Liden wrote: > After JDK-8205676, the function ThreadLocalAllocBuffer::undo_allocate() is unused and can be removed. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8205679 > Webrev: http://cr.openjdk.java.net/~pliden/8205679/webrev.0 The patch looks good, but the method itself looks generic enough to keep around. At some point, Shenandoah may switch to using this instead of PLABs. -Aleksey -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: OpenPGP digital signature URL: From erik.osterlund at oracle.com Tue Jun 26 16:10:55 2018 From: erik.osterlund at oracle.com (=?UTF-8?Q?Erik_=c3=96sterlund?=) Date: Tue, 26 Jun 2018 18:10:55 +0200 Subject: RFC: Patch for 8203848: Missing remembered set entry in j.l.ref.references after JDK-8203028 In-Reply-To: <9d7be491c8d2445d92b4ff00aeaf735c405ff9cb.camel@oracle.com> References: <9d7be491c8d2445d92b4ff00aeaf735c405ff9cb.camel@oracle.com> Message-ID: <5ca9113a-0c9e-e745-f66c-56faab49ab95@oracle.com> Hi Thomas, A HeapAccess::oop_atomic_cmpxchg as you proposed sounds good to me. 
Thanks, /Erik On 2018-06-07 11:47, Thomas Schatzl wrote: > Hi all, > > I would like to ask for comments on the fix for the issue handled in > JDK-8203848. > > In particular, the problem is that currently the "discovered" field of > j.l.ref.References is managed completely opaque to the rest of the VM, > which causes the described error: remembered set entries are missing > for that reference when doing *concurrent* reference discovery. > > There is no problem with liveness of objects referenced by that because > a) a j.l.ref.Reference object found by reference discovery will be > automatically kept alive and > b) no GC in Hotspot at this time evacuates old gen objects during > marking (and Z does not use the reference processing framework at all), > so that reference in the "discovered" field will never be outdated. > > However in the future, G1 might want to move objects in old gen at any > time for e.g. defragmentation purposes, and I am a bit unclear about > Shenandoah tbh :) > > I see two solutions for this issue: > - improve the access modifier so that at least the post-barrier that is > responsible for adding remembered set entries is invoked on this field. > E.g. in ReferenceProcessor::add_to_discovered_list_mt(), instead of > > oop retest = RawAccess<>::oop_atomic_cmpxchg(next_discovered, > discovered_addr, oop(NULL)); > > do a > > oop retest = > HeapAccess::oop_atomic_cmpxchg(next_discovered, > discovered_addr, oop(NULL)); > > Note that I am almost confident that this only makes G1 work as far as > I understand the access framework; since the previous value is NULL > when we cmpxchg, G1 can skip the pre-barrier; maybe more is needed for > Shenandoah, but I hope that Shenandoah devs can chime in here. > > I tested this, and with this change the problem does not occur after > 2000 iterations of the test. > > (see the preliminary webrev at http://cr.openjdk.java.net/~tschatzl/8203848/webrev/ ; only the change to referenceProcessor.cpp is relevant > here).
> > - the other "solution" is to fix the remembered set verification to > ignore this field, and try to fix this again in the future when/if G1 > evacuates old regions during marking. > > Any comments? > > Thanks, > Thomas > From erik.osterlund at oracle.com Tue Jun 26 16:12:27 2018 From: erik.osterlund at oracle.com (Erik Österlund) Date: Tue, 26 Jun 2018 18:12:27 +0200 Subject: RFR: 8205678: ZGC: Remove unused ZAllocationFlags::java_thread() In-Reply-To: <0a3eba94-4c99-eff6-067a-17bf8bbb789c@oracle.com> References: <0a3eba94-4c99-eff6-067a-17bf8bbb789c@oracle.com> Message-ID: <82543e43-d629-cd82-10ee-0540afd0826a@oracle.com> Hi Per, Looks good. Thanks, /Erik On 2018-06-26 12:45, Per Liden wrote: > After JDK-8205676, the java_thread field in the ZAllocationFlags is > unused and can be removed. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8205678 > Webrev: http://cr.openjdk.java.net/~pliden/8205678/webrev.0 > > /Per From thomas.schatzl at oracle.com Tue Jun 26 17:15:14 2018 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Tue, 26 Jun 2018 19:15:14 +0200 Subject: RFR (S): 8205426: Humongous continues remembered set does not match humongous start region one in Kitchensink Message-ID: <69519db2cd7fe431357a5a05b89bba17cdd0eaaa.camel@oracle.com> Hi all, can I have reviews for this fix that keeps remembered sets consistent between HC (humongous continues) and HS (humongous start) regions? The inconsistency causes crashes during verification. The problem occurs while updating the remembered sets during the Remark pause. This process is parallel; it uses liveness information from marking to set the new remembered set states. However, during marking G1 attributes all liveness information of a humongous object to the HS region; if that liveness information has not been updated yet for HC regions, and another thread is responsible for determining that HC region's remembered set state, the HC region may end up with a different remembered set state than its HS region.
The fix is to, for HC regions, just pass the liveness data of the HS region into the method that determines the new remembered set state. Further, in that latter method, make sure that the predicate for determining whether a region gets a remembered set assigned is completely disjoint for humongous and non-humongous regions. CR: https://bugs.openjdk.java.net/browse/JDK-8205426 Webrev: http://cr.openjdk.java.net/~tschatzl/8205426/webrev/ Testing: new test case, hs-tier1-4,jdk-tier1-3 Thanks, Thomas From coleen.phillimore at oracle.com Tue Jun 26 21:13:45 2018 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Tue, 26 Jun 2018 17:13:45 -0400 Subject: RFR (XS) 8205702: assert(UseCompressedClassPointers) failed in universe.hpp Message-ID: Summary: Disable CDS with ZGC Tested with: java -XX:+UnlockExperimentalVMOptions -XX:+UseZGC -Xshare:dump java -XX:+UnlockExperimentalVMOptions -XX:+UseZGC -Xshare:on -version open webrev at http://cr.openjdk.java.net/~coleenp/8205702.01/webrev bug link https://bugs.openjdk.java.net/browse/JDK-8205702 Thanks, Coleen From calvin.cheung at oracle.com Tue Jun 26 21:42:36 2018 From: calvin.cheung at oracle.com (Calvin Cheung) Date: Tue, 26 Jun 2018 14:42:36 -0700 Subject: RFR (XS) 8205702: assert(UseCompressedClassPointers) failed in universe.hpp In-Reply-To: References: Message-ID: <5B32B34C.2060506@oracle.com> Hi Coleen, The code changes look good.
Since there's a new error message, I'd suggest adding a test to runtime/SharedArchiveFile/SharedArchiveFile.java as follows: diff --git a/test/hotspot/jtreg/runtime/SharedArchiveFile/SharedArchiveFile.java b/test/hotspot/jtreg/runtime/SharedArchiveFile/SharedArchiveFile.java --- a/test/hotspot/jtreg/runtime/SharedArchiveFile/SharedArchiveFile.java +++ b/test/hotspot/jtreg/runtime/SharedArchiveFile/SharedArchiveFile.java @@ -52,5 +52,13 @@ "-Xshare:on", "-version"); out = CDSTestUtils.executeAndLog(pb, "SharedArchiveFile"); CDSTestUtils.checkExec(out); + + // CDS dumping doesn't work with ZGC + ProcessBuilder pb = ProcessTools.createJavaProcessBuilder(true, + "-XX:SharedArchiveFile=./SharedArchiveFile.jsa", + "-XX:+UseZGC", + "-Xshare:dump"); + out = CDSTestUtils.executeAndLog(pb, "SharedArchiveFile"); + CDSTestUtils.checkExecExpectError(out, 1, "DumpSharedSpaces (-Xshare:dump) is not supported with ZGC."); } } (I haven't tested the above) Also, I think the new error message should be included in the release notes. thanks, Calvin On 6/26/18, 2:13 PM, coleen.phillimore at oracle.com wrote: > Summary: Disable CDS with ZGC > > Tested with: > java -XX:+UnlockExperimentalVMOptions -XX:+UseZGC -Xshare:dump > java -XX:+UnlockExperimentalOptions -XX:+UseZGC -Xshare:on -version > > open webrev at http://cr.openjdk.java.net/~coleenp/8205702.01/webrev > bug link https://bugs.openjdk.java.net/browse/JDK-8205702 > > Thanks, > Coleen From jiangli.zhou at oracle.com Tue Jun 26 21:57:10 2018 From: jiangli.zhou at oracle.com (Jiangli Zhou) Date: Tue, 26 Jun 2018 14:57:10 -0700 Subject: RFR (XS) 8205702: assert(UseCompressedClassPointers) failed in universe.hpp In-Reply-To: References: Message-ID: <81FCAF82-25AB-4C46-86AA-3D5F1A236C58@oracle.com> Hi Coleen, This looks good. Should we also disable UseSharedSpaces at runtime for ZGC in case an archive was dumped using a different GC algorithm? 
I ran into a tmpfs error when trying to run with ZGC, so I couldn't double-check that case... Thanks, Jiangli > On Jun 26, 2018, at 2:13 PM, coleen.phillimore at oracle.com wrote: > > Summary: Disable CDS with ZGC > > Tested with: > java -XX:+UnlockExperimentalVMOptions -XX:+UseZGC -Xshare:dump > java -XX:+UnlockExperimentalVMOptions -XX:+UseZGC -Xshare:on -version > > open webrev at http://cr.openjdk.java.net/~coleenp/8205702.01/webrev > bug link https://bugs.openjdk.java.net/browse/JDK-8205702 > > Thanks, > Coleen From coleen.phillimore at oracle.com Tue Jun 26 23:00:24 2018 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Tue, 26 Jun 2018 19:00:24 -0400 Subject: RFR (XS) 8205702: assert(UseCompressedClassPointers) failed in universe.hpp In-Reply-To: <81FCAF82-25AB-4C46-86AA-3D5F1A236C58@oracle.com> References: <81FCAF82-25AB-4C46-86AA-3D5F1A236C58@oracle.com> Message-ID: On 6/26/18 5:57 PM, Jiangli Zhou wrote: > Hi Coleen, > > This looks good. > > Should we also disable UseSharedSpaces at runtime for ZGC in case an archive was dumped using a different GC algorithm? I ran into a tmpfs error when trying to run with ZGC, so I couldn't double-check that case... Yes, this change does this also.
Thanks, Coleen > > Thanks, > Jiangli > >> On Jun 26, 2018, at 2:13 PM, coleen.phillimore at oracle.com wrote: >> >> Summary: Disable CDS with ZGC >> >> Tested with: >> java -XX:+UnlockExperimentalVMOptions -XX:+UseZGC -Xshare:dump >> java -XX:+UnlockExperimentalVMOptions -XX:+UseZGC -Xshare:on -version >> >> open webrev at http://cr.openjdk.java.net/~coleenp/8205702.01/webrev >> bug link https://bugs.openjdk.java.net/browse/JDK-8205702 >> >> Thanks, >> Coleen From coleen.phillimore at oracle.com Tue Jun 26 23:09:42 2018 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Tue, 26 Jun 2018 19:09:42 -0400 Subject: RFR (XS) 8205702: assert(UseCompressedClassPointers) failed in universe.hpp In-Reply-To: <5B32B34C.2060506@oracle.com> References: <5B32B34C.2060506@oracle.com> Message-ID: <1246989c-e8e8-22e9-cdc8-1253c344aac6@oracle.com> Hi Calvin, thank you for reporting the bug and the code review and test code. On 6/26/18 5:42 PM, Calvin Cheung wrote: > Hi Coleen, > > The code changes look good. > > Since there's a new error message, I'd suggest adding a test to > runtime/SharedArchiveFile/SharedArchiveFile.java as follows: > > diff --git > a/test/hotspot/jtreg/runtime/SharedArchiveFile/SharedArchiveFile.java > b/test/hotspot/jtreg/runtime/SharedArchiveFile/SharedArchiveFile.java > --- a/test/hotspot/jtreg/runtime/SharedArchiveFile/SharedArchiveFile.java > +++ b/test/hotspot/jtreg/runtime/SharedArchiveFile/SharedArchiveFile.java > @@ -52,5 +52,13 @@ >                                "-Xshare:on", "-version"); >         out = CDSTestUtils.executeAndLog(pb, "SharedArchiveFile"); >         CDSTestUtils.checkExec(out); > + > +        // CDS dumping doesn't work with ZGC > +        ProcessBuilder pb = ProcessTools.createJavaProcessBuilder(true, > + "-XX:SharedArchiveFile=./SharedArchiveFile.jsa", > +                                "-XX:+UseZGC", > +                                "-Xshare:dump"); > +       
out = CDSTestUtils.executeAndLog(pb, "SharedArchiveFile"); > +        CDSTestUtils.checkExecExpectError(out, 1, "DumpSharedSpaces > (-Xshare:dump) is not supported with ZGC."); >     } > } > > (I haven't tested the above) It needed an -XX:+UnlockExperimentalVMOptions as well, and to not redeclare pb. open webrev at http://cr.openjdk.java.net/~coleenp/8205702.02/webrev > > Also, I think the new error message should be included in the release > notes. > I added the test case and it passes. I don't think having a release note for something that nobody would ever do with an experimental option is worth having. But I can look into the ZGC release notes and see if there's something that says CDS is not supported. Thanks, Coleen > thanks, > Calvin > > On 6/26/18, 2:13 PM, coleen.phillimore at oracle.com wrote: >> Summary: Disable CDS with ZGC >> >> Tested with: >> java -XX:+UnlockExperimentalVMOptions -XX:+UseZGC -Xshare:dump >> java -XX:+UnlockExperimentalVMOptions -XX:+UseZGC -Xshare:on -version >> >> open webrev at http://cr.openjdk.java.net/~coleenp/8205702.01/webrev >> bug link https://bugs.openjdk.java.net/browse/JDK-8205702 >> >> Thanks, >> Coleen From jiangli.zhou at oracle.com Tue Jun 26 23:24:47 2018 From: jiangli.zhou at oracle.com (Jiangli Zhou) Date: Tue, 26 Jun 2018 16:24:47 -0700 Subject: RFR (XS) 8205702: assert(UseCompressedClassPointers) failed in universe.hpp In-Reply-To: References: <81FCAF82-25AB-4C46-86AA-3D5F1A236C58@oracle.com> Message-ID: <2D4CBC4C-6831-4774-8C02-2070CE5B53ED@oracle.com> > On Jun 26, 2018, at 4:00 PM, coleen.phillimore at oracle.com wrote: > > > > On 6/26/18 5:57 PM, Jiangli Zhou wrote: >> Hi Coleen, >> >> This looks good. >> >> Should we also disable UseSharedSpaces at runtime for ZGC in case an archive was dumped using a different GC algorithm? I ran into a tmpfs error when trying to run with ZGC, so I couldn't double-check that case... > > Yes, this change does this also. You are right!
Thanks, Jiangli > Thanks, > Coleen > >> >> Thanks, >> Jiangli >> >>> On Jun 26, 2018, at 2:13 PM, coleen.phillimore at oracle.com wrote: >>> >>> Summary: Disable CDS with ZGC >>> >>> Tested with: >>> java -XX:+UnlockExperimentalVMOptions -XX:+UseZGC -Xshare:dump >>> java -XX:+UnlockExperimentalOptions -XX:+UseZGC -Xshare:on -version >>> >>> open webrev at http://cr.openjdk.java.net/~coleenp/8205702.01/webrev >>> bug link https://bugs.openjdk.java.net/browse/JDK-8205702 >>> >>> Thanks, >>> Coleen > From calvin.cheung at oracle.com Tue Jun 26 23:50:24 2018 From: calvin.cheung at oracle.com (Calvin Cheung) Date: Tue, 26 Jun 2018 16:50:24 -0700 Subject: RFR (XS) 8205702: assert(UseCompressedClassPointers) failed in universe.hpp In-Reply-To: <1246989c-e8e8-22e9-cdc8-1253c344aac6@oracle.com> References: <5B32B34C.2060506@oracle.com> <1246989c-e8e8-22e9-cdc8-1253c344aac6@oracle.com> Message-ID: <5B32D140.9000807@oracle.com> On 6/26/18, 4:09 PM, coleen.phillimore at oracle.com wrote: > > Hi Calvin, thank you for reporting the bug and the code review and > test code. > > On 6/26/18 5:42 PM, Calvin Cheung wrote: >> Hi Coleen, >> >> The code changes look good. 
>> >> Since there's a new error message, I'd suggest adding a test to >> runtime/SharedArchiveFile/SharedArchiveFile.java as follows: >> >> diff --git >> a/test/hotspot/jtreg/runtime/SharedArchiveFile/SharedArchiveFile.java >> b/test/hotspot/jtreg/runtime/SharedArchiveFile/SharedArchiveFile.java >> --- >> a/test/hotspot/jtreg/runtime/SharedArchiveFile/SharedArchiveFile.java >> +++ >> b/test/hotspot/jtreg/runtime/SharedArchiveFile/SharedArchiveFile.java >> @@ -52,5 +52,13 @@ >> "-Xshare:on", "-version"); >> out = CDSTestUtils.executeAndLog(pb, "SharedArchiveFile"); >> CDSTestUtils.checkExec(out); >> + >> + // CDS dumping doesn't work with ZGC >> + ProcessBuilder pb = ProcessTools.createJavaProcessBuilder(true, >> + "-XX:SharedArchiveFile=./SharedArchiveFile.jsa", >> + "-XX:+UseZGC", >> + "-Xshare:dump"); >> + out = CDSTestUtils.executeAndLog(pb, "SharedArchiveFile"); >> + CDSTestUtils.checkExecExpectError(out, 1, "DumpSharedSpaces >> (-Xshare:dump) is not supported with ZGC."); >> } >> } >> >> (I haven't tested the above) > > It needed an -XX:+UnlockExperimentalVMOptions as well, and not reclare > pb. > > open webrev at http://cr.openjdk.java.net/~coleenp/8205702.02/webrev Looks good. > >> >> Also, I think the new error message should be included in the release >> notes. >> > > I added the test case and it passes. I don't think having a release > note for something that nobody would ever do for an experimental > option is worth having. But I can look into the ZGC release notes > and see if there's something that says CDS is not supported. Perhaps you can add something to https://bugs.openjdk.java.net/browse/JDK-8205334? 
thanks, Calvin > > Thanks, > Coleen >> thanks, >> Calvin >> >> On 6/26/18, 2:13 PM, coleen.phillimore at oracle.com wrote: >>> Summary: Disable CDS with ZGC >>> >>> Tested with: >>> java -XX:+UnlockExperimentalVMOptions -XX:+UseZGC -Xshare:dump >>> java -XX:+UnlockExperimentalOptions -XX:+UseZGC -Xshare:on -version >>> >>> open webrev at http://cr.openjdk.java.net/~coleenp/8205702.01/webrev >>> bug link https://bugs.openjdk.java.net/browse/JDK-8205702 >>> >>> Thanks, >>> Coleen > From jiangli.zhou at oracle.com Wed Jun 27 00:22:05 2018 From: jiangli.zhou at oracle.com (Jiangli Zhou) Date: Tue, 26 Jun 2018 17:22:05 -0700 Subject: RFR (XS) 8205702: assert(UseCompressedClassPointers) failed in universe.hpp In-Reply-To: <5B32D140.9000807@oracle.com> References: <5B32B34C.2060506@oracle.com> <1246989c-e8e8-22e9-cdc8-1253c344aac6@oracle.com> <5B32D140.9000807@oracle.com> Message-ID: +1 Thanks, Jiangli > On Jun 26, 2018, at 4:50 PM, Calvin Cheung wrote: > > > > On 6/26/18, 4:09 PM, coleen.phillimore at oracle.com wrote: >> >> Hi Calvin, thank you for reporting the bug and the code review and test code. >> >> On 6/26/18 5:42 PM, Calvin Cheung wrote: >>> Hi Coleen, >>> >>> The code changes look good. 
>>> >>> Since there's a new error message, I'd suggest adding a test to runtime/SharedArchiveFile/SharedArchiveFile.java as follows: >>> >>> diff --git a/test/hotspot/jtreg/runtime/SharedArchiveFile/SharedArchiveFile.java b/test/hotspot/jtreg/runtime/SharedArchiveFile/SharedArchiveFile.java >>> --- a/test/hotspot/jtreg/runtime/SharedArchiveFile/SharedArchiveFile.java >>> +++ b/test/hotspot/jtreg/runtime/SharedArchiveFile/SharedArchiveFile.java >>> @@ -52,5 +52,13 @@ >>> "-Xshare:on", "-version"); >>> out = CDSTestUtils.executeAndLog(pb, "SharedArchiveFile"); >>> CDSTestUtils.checkExec(out); >>> + >>> + // CDS dumping doesn't work with ZGC >>> + ProcessBuilder pb = ProcessTools.createJavaProcessBuilder(true, >>> + "-XX:SharedArchiveFile=./SharedArchiveFile.jsa", >>> + "-XX:+UseZGC", >>> + "-Xshare:dump"); >>> + out = CDSTestUtils.executeAndLog(pb, "SharedArchiveFile"); >>> + CDSTestUtils.checkExecExpectError(out, 1, "DumpSharedSpaces (-Xshare:dump) is not supported with ZGC."); >>> } >>> } >>> >>> (I haven't tested the above) >> >> It needed an -XX:+UnlockExperimentalVMOptions as well, and not reclare pb. >> >> open webrev at http://cr.openjdk.java.net/~coleenp/8205702.02/webrev > Looks good. >> >>> >>> Also, I think the new error message should be included in the release notes. >>> >> >> I added the test case and it passes. I don't think having a release note for something that nobody would ever do for an experimental option is worth having. But I can look into the ZGC release notes and see if there's something that says CDS is not supported. > Perhaps you can add something to https://bugs.openjdk.java.net/browse/JDK-8205334? 
> > thanks, > Calvin >> >> Thanks, >> Coleen >>> thanks, >>> Calvin >>> >>> On 6/26/18, 2:13 PM, coleen.phillimore at oracle.com wrote: >>>> Summary: Disable CDS with ZGC >>>> >>>> Tested with: >>>> java -XX:+UnlockExperimentalVMOptions -XX:+UseZGC -Xshare:dump >>>> java -XX:+UnlockExperimentalOptions -XX:+UseZGC -Xshare:on -version >>>> >>>> open webrev at http://cr.openjdk.java.net/~coleenp/8205702.01/webrev >>>> bug link https://bugs.openjdk.java.net/browse/JDK-8205702 >>>> >>>> Thanks, >>>> Coleen >> From HORIE at jp.ibm.com Wed Jun 27 00:22:39 2018 From: HORIE at jp.ibm.com (Michihiro Horie) Date: Wed, 27 Jun 2018 09:22:39 +0900 Subject: RFR: 8205908: Unnecessarily strong memory barriers in ParNewGeneration::copy_to_survivor_space Message-ID: Dear all, Would you please review the following change? Bug: https://bugs.openjdk.java.net/browse/JDK-8205908 Webrev: http://cr.openjdk.java.net/~mhorie/8205908/webrev.00/ [Current implementation] ParNewGeneration::copy_to_survivor_space tries to move live objects to a different location. There are two patterns on how to copy an object depending on whether there is space to allocate new_obj in to-space or not. If a thread cannot find space to allocate new_obj in to-space, the thread first executes the CAS with a dummy forwarding pointer "ClaimedForwardPtr", which is a sentinel to mark an object as claimed. After succeeding in the CAS, a thread can copy the new_obj in the old space. Here, suppose thread A succeeds in the CAS, while thread B fails in the CAS. When thread A finishes the copy, it replaces the dummy forwarding pointer with a real forwarding pointer. After thread B fails in the CAS, thread B returns the forwardee after waiting for the copy of the forwardee is completed. This is observable by checking the dummy forwarding pointer is replaced with a real forwarding pointer by thread A. 
In contrast, if a thread can find space to allocate new_obj in to-space, the thread first copies the new_obj and then executes the CAS with the new_obj. If a thread fails in the CAS, it deallocates the copied new_obj and returns the forwardee.

Procedure of ParNewGeneration::copy_to_survivor_space ([L****] represents the line number in src/hotspot/share/gc/cms/parNewGeneration.cpp):

1. Try to allocate space for new_obj in to-space [L1110]
2. If the allocation in to-space fails [L1117]
  2.1. Execute the CAS with the dummy forwarding pointer [L1122] ... (A)
  2.2. If the CAS fails, return the forwardee via real_forwardee() [L1123]
  2.3. If the CAS succeeds [L1128]
    2.3.1. If promotion is allowed, copy new_obj into the old area [L1129]
    2.3.2. If promotion is not allowed, forward to obj itself [L1133]
  2.4. Set new_obj as forwardee [L1142]
3. If the allocation in to-space succeeds [L1144]
  3.1. Copy new_obj [L1146]
  3.2. Execute the CAS with new_obj [L1148] ... (B)
4. Dereference the new_obj for logging. Each new_obj copied by each thread at step 3.1 is used instead of forwardee() [L1159]
5. If either CAS (A) or CAS (B) succeeds, return new_obj [L1163]
6. If CAS (B) fails, get the forwardee via real_forwardee(). Unallocate new_obj in to-space [L1193]
7. Return forwardee [L1203]

For reference, real_forwardee() is as shown below:

oop ParNewGeneration::real_forwardee(oop obj) {
  oop forward_ptr = obj->forwardee();
  if (forward_ptr != ClaimedForwardPtr) {
    return forward_ptr;
  } else {
    // manually inlined for readability.
    oop forward_ptr = obj->forwardee();
    while (forward_ptr == ClaimedForwardPtr) {
      waste_some_time();
      forward_ptr = obj->forwardee();
    }
    return forward_ptr;
  }
}

Regarding CAS (A): there is no copy before the CAS. Dereferencing the forwardee must only happen after obtaining the forwardee. Regarding CAS (B): there is a copy before the CAS. Dereferencing the forwardee must only happen after obtaining the forwardee.
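The ordering requirements analyzed above can be sketched with plain C++11 atomics (hypothetical names, not the HotSpot code): a CAS of kind (A) installs a sentinel before any data exists to publish, so it needs no surrounding fence, while the kind-(B) publication of an already-copied object needs release semantics, paired with an acquire load on the reader side, the role a forwardee_acquire() would play in real_forwardee().

```cpp
#include <atomic>
#include <cassert>

struct Obj { int payload; };

// Sentinel marking an object as claimed, like ClaimedForwardPtr.
static Obj* const kClaimed = reinterpret_cast<Obj*>(0x1);

struct Header {
    std::atomic<Obj*> forwardee{nullptr};
};

// Pattern (A): claim first with a sentinel, copy afterwards. The CAS
// publishes no data, so relaxed ordering suffices for the CAS itself.
bool claim(Header* h) {
    Obj* expected = nullptr;
    return h->forwardee.compare_exchange_strong(
        expected, kClaimed, std::memory_order_relaxed);
}

// Once the copy is complete, publish the real forwardee with release
// semantics so the copied payload is visible to acquiring readers.
void publish(Header* h, Obj* copy) {
    h->forwardee.store(copy, std::memory_order_release);
}

// Reader side of real_forwardee(): spin past the sentinel; the acquire
// load pairs with the release store in publish().
Obj* real_forwardee(Header* h) {
    Obj* fwd = h->forwardee.load(std::memory_order_acquire);
    while (fwd == kClaimed) {
        fwd = h->forwardee.load(std::memory_order_acquire);
    }
    return fwd;
}
```

The design point mirrors the mail: only the copy-then-publish path pays for a release barrier; the claim CAS and the failed-CAS path get by without full fences.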
[Observation on the current implementation] No fence is necessary before or after CAS (A). A release barrier is necessary before CAS (B). forwardee_acquire() must be used instead of forwardee() in real_forwardee(). [Performance measurement] The critical-jOPS of SPECjbb2015 improved by 12% with this change. Best regards, -- Michihiro, IBM Research - Tokyo -------------- next part -------------- An HTML attachment was scrubbed... URL: From per.liden at oracle.com Wed Jun 27 07:15:04 2018 From: per.liden at oracle.com (Per Liden) Date: Wed, 27 Jun 2018 09:15:04 +0200 Subject: RFR (XS) 8205702: assert(UseCompressedClassPointers) failed in universe.hpp In-Reply-To: <1246989c-e8e8-22e9-cdc8-1253c344aac6@oracle.com> References: <5B32B34C.2060506@oracle.com> <1246989c-e8e8-22e9-cdc8-1253c344aac6@oracle.com> Message-ID: <92de1a95-e09a-82ef-e2f4-9c2a8979fd26@oracle.com> Hi Coleen, This doesn't look quite right to me. ZGC already disables UseCompressedOops and UseCompressedClassPointers, which should be the indicators that we can't use CDS. The problem is that CDS checks those flags _before_ the heap has had a chance to say what it supports. So if we just move the call to set_shared_spaces_flags() after the call to GCConfig::arguments()->initialize() (which should be safe), then we're all good and you'll get the usual: $ ./build/fastdebug/images/jdk/bin/java -Xshare:dump -XX:+UnlockExperimentalVMOptions -XX:+UseZGC Error occurred during initialization of VM Cannot dump shared archive when UseCompressedOops or UseCompressedClassPointers is off. Here's a proposed patch for this, which also adjusts the appropriate tests for this: http://cr.openjdk.java.net/~pliden/8205702/webrev.0 cheers, Per On 06/27/2018 01:09 AM, coleen.phillimore at oracle.com wrote: > > Hi Calvin, thank you for reporting the bug and the code review and test > code. > > On 6/26/18 5:42 PM, Calvin Cheung wrote: >> Hi Coleen, >> >> The code changes look good.
>> >> Since there's a new error message, I'd suggest adding a test to >> runtime/SharedArchiveFile/SharedArchiveFile.java as follows: >> >> diff --git >> a/test/hotspot/jtreg/runtime/SharedArchiveFile/SharedArchiveFile.java >> b/test/hotspot/jtreg/runtime/SharedArchiveFile/SharedArchiveFile.java >> --- a/test/hotspot/jtreg/runtime/SharedArchiveFile/SharedArchiveFile.java >> +++ b/test/hotspot/jtreg/runtime/SharedArchiveFile/SharedArchiveFile.java >> @@ -52,5 +52,13 @@ >> ?????????????????????????????? "-Xshare:on", "-version"); >> ???????? out = CDSTestUtils.executeAndLog(pb, "SharedArchiveFile"); >> ???????? CDSTestUtils.checkExec(out); >> + >> +??????? // CDS dumping doesn't work with ZGC >> +??????? ProcessBuilder pb = ProcessTools.createJavaProcessBuilder(true, >> + "-XX:SharedArchiveFile=./SharedArchiveFile.jsa", >> +??????????????????????????????? "-XX:+UseZGC", >> +??????????????????????????????? "-Xshare:dump"); >> +??????? out = CDSTestUtils.executeAndLog(pb, "SharedArchiveFile"); >> +??????? CDSTestUtils.checkExecExpectError(out, 1, "DumpSharedSpaces >> (-Xshare:dump) is not supported with ZGC."); >> ???? } >> ?} >> >> (I haven't tested the above) > > It needed an -XX:+UnlockExperimentalVMOptions as well, and not reclare pb. > > open webrev at http://cr.openjdk.java.net/~coleenp/8205702.02/webrev > >> >> Also, I think the new error message should be included in the release >> notes. >> > > I added the test case and it passes.? I don't think having a release > note for something that nobody would ever do for an experimental option > is worth having.?? But I can look into the ZGC release notes and see if > there's something that says CDS is not supported. 
> > Thanks, > Coleen >> thanks, >> Calvin >> >> On 6/26/18, 2:13 PM, coleen.phillimore at oracle.com wrote: >>> Summary: Disable CDS with ZGC >>> >>> Tested with: >>> java -XX:+UnlockExperimentalVMOptions -XX:+UseZGC -Xshare:dump >>> java -XX:+UnlockExperimentalOptions -XX:+UseZGC -Xshare:on -version >>> >>> open webrev at http://cr.openjdk.java.net/~coleenp/8205702.01/webrev >>> bug link https://bugs.openjdk.java.net/browse/JDK-8205702 >>> >>> Thanks, >>> Coleen > From per.liden at oracle.com Wed Jun 27 07:33:55 2018 From: per.liden at oracle.com (Per Liden) Date: Wed, 27 Jun 2018 09:33:55 +0200 Subject: RFR (XS) 8205702: assert(UseCompressedClassPointers) failed in universe.hpp In-Reply-To: <81FCAF82-25AB-4C46-86AA-3D5F1A236C58@oracle.com> References: <81FCAF82-25AB-4C46-86AA-3D5F1A236C58@oracle.com> Message-ID: <78b4517e-df7d-1e01-19e7-eff5ec79e1ef@oracle.com> On 06/26/2018 11:57 PM, Jiangli Zhou wrote: > Hi Coleen, > > This looks good. > > Should we also disable UseSharedSpaces at runtime for ZGC in case an archive was dumped using a different GC algorithm? I ran into tmpfs error when trying to run with ZGC, so I couldn?t double check for that case... What kind of tmpfs errors? I would like to know if we have a bug somewhere or if it's user error. 
/Per > > Thanks, > Jiangli > >> On Jun 26, 2018, at 2:13 PM, coleen.phillimore at oracle.com wrote: >> >> Summary: Disable CDS with ZGC >> >> Tested with: >> java -XX:+UnlockExperimentalVMOptions -XX:+UseZGC -Xshare:dump >> java -XX:+UnlockExperimentalVMOptions -XX:+UseZGC -Xshare:on -version >> >> open webrev at http://cr.openjdk.java.net/~coleenp/8205702.01/webrev >> bug link https://bugs.openjdk.java.net/browse/JDK-8205702 >> >> Thanks, >> Coleen > From per.liden at oracle.com Wed Jun 27 07:37:44 2018 From: per.liden at oracle.com (Per Liden) Date: Wed, 27 Jun 2018 09:37:44 +0200 Subject: RFR: 8205678: ZGC: Remove unused ZAllocationFlags::java_thread() In-Reply-To: <82543e43-d629-cd82-10ee-0540afd0826a@oracle.com> References: <0a3eba94-4c99-eff6-067a-17bf8bbb789c@oracle.com> <82543e43-d629-cd82-10ee-0540afd0826a@oracle.com> Message-ID: <556c2a4b-c225-b64a-cc10-4a1c9066028a@oracle.com> Thanks Erik! /Per On 06/26/2018 06:12 PM, Erik Österlund wrote: > Hi Per, > > Looks good. > > Thanks, > /Erik > > On 2018-06-26 12:45, Per Liden wrote: >> After JDK-8205676, the java_thread field in the ZAllocationFlags is >> unused and can be removed. >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8205678 >> Webrev: http://cr.openjdk.java.net/~pliden/8205678/webrev.0 >> >> /Per > From erik.gahlin at oracle.com Wed Jun 27 07:49:13 2018 From: erik.gahlin at oracle.com (Erik Gahlin) Date: Wed, 27 Jun 2018 09:49:13 +0200 Subject: RFR: 8197425: Liveset information for Old Object sample event In-Reply-To: <5B2927F6.1090804@oracle.com> References: <5B2927F6.1090804@oracle.com> Message-ID: <5B334179.7070309@oracle.com> Here is an updated webrev. http://cr.openjdk.java.net/~egahlin/8197425_2 Erik > Hi, > > Could I have a review of an enhancement that adds heap usage after GC > to the Old Object Sample (OOS) event. This is useful for spotting an > increasing liveset. > > Recordings typically only contain data for a limited period, i.e.
> last 30 minutes, but the OOS event contains samples from when JFR/JVM > was started, potentially several days back. > > The liveset trend is useful for tools such as Mission Control to > detect if there is a memory leak. If that is the case, information in > the OOS event can be used to pinpoint where the leak occurred and what > is keeping it alive. Presenting this information when there is no > memory leak is confusing. > > Bug: > https://bugs.openjdk.java.net/browse/JDK-8197425 > > Webrev: > http://cr.openjdk.java.net/~egahlin/8197425/ > > Testing: > Tests in test/jdk/jdk/jfr > > Thanks > Erik From markus.gronlund at oracle.com Wed Jun 27 08:35:34 2018 From: markus.gronlund at oracle.com (Markus Gronlund) Date: Wed, 27 Jun 2018 01:35:34 -0700 (PDT) Subject: RFR: 8197425: Liveset information for Old Object sample event In-Reply-To: <5B334179.7070309@oracle.com> References: <5B2927F6.1090804@oracle.com> <5B334179.7070309@oracle.com> Message-ID: Hi Erik, Looks good. Thanks Markus -----Original Message----- From: Erik Gahlin Sent: 27 June 2018 09:49 To: hotspot-jfr-dev at openjdk.java.net Cc: hotspot-gc-dev at openjdk.java.net Subject: RFR: 8197425: Liveset information for Old Object sample event Here is an updated webrev. http://cr.openjdk.java.net/~egahlin/8197425_2 Erik > Hi, > > Could I have a review of an enhancement that adds heap usage after GC > to the Old Object Sample (OOS) event. This is useful for spotting an > increasing liveset. > > Recordings typically only contain data for a limited period, i.e.
> > Bug: > https://bugs.openjdk.java.net/browse/JDK-8197425 > > Webrev: > http://cr.openjdk.java.net/~egahlin/8197425/ > > Testings: > Tests in test/jdk/jdk/jfr > > Thanks > Erik From per.liden at oracle.com Wed Jun 27 08:37:13 2018 From: per.liden at oracle.com (Per Liden) Date: Wed, 27 Jun 2018 10:37:13 +0200 Subject: RFR (XS) 8205702: assert(UseCompressedClassPointers) failed in universe.hpp In-Reply-To: <92de1a95-e09a-82ef-e2f4-9c2a8979fd26@oracle.com> References: <5B32B34C.2060506@oracle.com> <1246989c-e8e8-22e9-cdc8-1253c344aac6@oracle.com> <92de1a95-e09a-82ef-e2f4-9c2a8979fd26@oracle.com> Message-ID: Updated webrev, which adjusts the @requires tag, from: @requires vm.cds & vm.gc != "Z" to: @requires vm.cds.archived.java.heap which I believe is more correct in this case. http://cr.openjdk.java.net/~pliden/8205702/webrev.1 cheers, Per On 06/27/2018 09:15 AM, Per Liden wrote: > Hi Coleen, > > This doesn't look quite right to me. ZGC already disables > UseCompressedOop and UseCompressedClassPointers, which should be the > indicators that we can't use CDS. The problem is that CDS checks those > flags _before_ the heap has had a change to say they it supports. So if > we just move the call to set_shared_spaces_flags() after the call to > GCConfig::arguments()->initialize() (which should be safe), then we're > all good and you'll get the usual: > > $ ./build/fastdebug/images/jdk/bin/java -Xshare:dump > -XX:+UnlockExperimentalVMOptions -XX:+UseZGC > Error occurred during initialization of VM > Cannot dump shared archive when UseCompressedOops or > UseCompressedClassPointers is off. > > Here's a proposed patch for this, which also adjusts the appropriate > tests for this: > > http://cr.openjdk.java.net/~pliden/8205702/webrev.0 > > cheers, > Per > > On 06/27/2018 01:09 AM, coleen.phillimore at oracle.com wrote: >> >> Hi Calvin, thank you for reporting the bug and the code review and >> test code. 
>> >> On 6/26/18 5:42 PM, Calvin Cheung wrote: >>> Hi Coleen, >>> >>> The code changes look good. >>> >>> Since there's a new error message, I'd suggest adding a test to >>> runtime/SharedArchiveFile/SharedArchiveFile.java as follows: >>> >>> diff --git >>> a/test/hotspot/jtreg/runtime/SharedArchiveFile/SharedArchiveFile.java >>> b/test/hotspot/jtreg/runtime/SharedArchiveFile/SharedArchiveFile.java >>> --- >>> a/test/hotspot/jtreg/runtime/SharedArchiveFile/SharedArchiveFile.java >>> +++ >>> b/test/hotspot/jtreg/runtime/SharedArchiveFile/SharedArchiveFile.java >>> @@ -52,5 +52,13 @@ >>> ?????????????????????????????? "-Xshare:on", "-version"); >>> ???????? out = CDSTestUtils.executeAndLog(pb, "SharedArchiveFile"); >>> ???????? CDSTestUtils.checkExec(out); >>> + >>> +??????? // CDS dumping doesn't work with ZGC >>> +??????? ProcessBuilder pb = ProcessTools.createJavaProcessBuilder(true, >>> + "-XX:SharedArchiveFile=./SharedArchiveFile.jsa", >>> +??????????????????????????????? "-XX:+UseZGC", >>> +??????????????????????????????? "-Xshare:dump"); >>> +??????? out = CDSTestUtils.executeAndLog(pb, "SharedArchiveFile"); >>> +??????? CDSTestUtils.checkExecExpectError(out, 1, "DumpSharedSpaces >>> (-Xshare:dump) is not supported with ZGC."); >>> ???? } >>> ?} >>> >>> (I haven't tested the above) >> >> It needed an -XX:+UnlockExperimentalVMOptions as well, and not reclare >> pb. >> >> open webrev at http://cr.openjdk.java.net/~coleenp/8205702.02/webrev >> >>> >>> Also, I think the new error message should be included in the release >>> notes. >>> >> >> I added the test case and it passes.? I don't think having a release >> note for something that nobody would ever do for an experimental >> option is worth having.?? But I can look into the ZGC release notes >> and see if there's something that says CDS is not supported. 
>> >> Thanks, >> Coleen >>> thanks, >>> Calvin >>> >>> On 6/26/18, 2:13 PM, coleen.phillimore at oracle.com wrote: >>>> Summary: Disable CDS with ZGC >>>> >>>> Tested with: >>>> java -XX:+UnlockExperimentalVMOptions -XX:+UseZGC -Xshare:dump >>>> java -XX:+UnlockExperimentalOptions -XX:+UseZGC -Xshare:on -version >>>> >>>> open webrev at http://cr.openjdk.java.net/~coleenp/8205702.01/webrev >>>> bug link https://bugs.openjdk.java.net/browse/JDK-8205702 >>>> >>>> Thanks, >>>> Coleen >> From per.liden at oracle.com Wed Jun 27 08:41:24 2018 From: per.liden at oracle.com (Per Liden) Date: Wed, 27 Jun 2018 10:41:24 +0200 Subject: RFR (XS) 8205702: assert(UseCompressedClassPointers) failed in universe.hpp In-Reply-To: <5B32D140.9000807@oracle.com> References: <5B32B34C.2060506@oracle.com> <1246989c-e8e8-22e9-cdc8-1253c344aac6@oracle.com> <5B32D140.9000807@oracle.com> Message-ID: Hi, On 06/27/2018 01:50 AM, Calvin Cheung wrote: [...] >>> Also, I think the new error message should be included in the release >>> notes. >>> >> >> I added the test case and it passes.? I don't think having a release >> note for something that nobody would ever do for an experimental >> option is worth having.?? But I can look into the ZGC release notes >> and see if there's something that says CDS is not supported. > Perhaps you can add something to > https://bugs.openjdk.java.net/browse/JDK-8205334? I'll add a note saying ZGC doesn't support compressed oops. 
/Per > > thanks, > Calvin >> >> Thanks, >> Coleen >>> thanks, >>> Calvin >>> >>> On 6/26/18, 2:13 PM, coleen.phillimore at oracle.com wrote: >>>> Summary: Disable CDS with ZGC >>>> >>>> Tested with: >>>> java -XX:+UnlockExperimentalVMOptions -XX:+UseZGC -Xshare:dump >>>> java -XX:+UnlockExperimentalOptions -XX:+UseZGC -Xshare:on -version >>>> >>>> open webrev at http://cr.openjdk.java.net/~coleenp/8205702.01/webrev >>>> bug link https://bugs.openjdk.java.net/browse/JDK-8205702 >>>> >>>> Thanks, >>>> Coleen >> From per.liden at oracle.com Wed Jun 27 08:44:53 2018 From: per.liden at oracle.com (Per Liden) Date: Wed, 27 Jun 2018 10:44:53 +0200 Subject: RFR (XS) 8205702: assert(UseCompressedClassPointers) failed in universe.hpp In-Reply-To: References: <5B32B34C.2060506@oracle.com> <1246989c-e8e8-22e9-cdc8-1253c344aac6@oracle.com> <5B32D140.9000807@oracle.com> Message-ID: <5d9d2c7f-82ef-21d8-f94d-0f6ffd5e898e@oracle.com> On 06/27/2018 10:41 AM, Per Liden wrote: > Hi, > > On 06/27/2018 01:50 AM, Calvin Cheung wrote: > [...] >>>> Also, I think the new error message should be included in the >>>> release notes. >>>> >>> >>> I added the test case and it passes. I don't think having a release >>> note for something that nobody would ever do for an experimental >>> option is worth having. But I can look into the ZGC release notes >>> and see if there's something that says CDS is not supported. >> Perhaps you can add something to >> https://bugs.openjdk.java.net/browse/JDK-8205334? > > I'll add a note saying ZGC doesn't support compressed oops. Done! 
/Per From per.liden at oracle.com Wed Jun 27 08:53:50 2018 From: per.liden at oracle.com (Per Liden) Date: Wed, 27 Jun 2018 10:53:50 +0200 Subject: RFR: 8205679: Remove unused ThreadLocalAllocBuffer::undo_allocate() In-Reply-To: <1815fb83-9bd0-423e-bb36-af14b1033e58@redhat.com> References: <1815fb83-9bd0-423e-bb36-af14b1033e58@redhat.com> Message-ID: <0c186b36-db83-9f8a-2f32-c6415daeba2a@oracle.com> Hi Aleksey, Thanks for reviewing. On 06/26/2018 06:09 PM, Aleksey Shipilev wrote: > On 06/26/2018 12:45 PM, Per Liden wrote: >> After JDK-8205676, the function ThreadLocalAllocBuffer::undo_allocate() is unused and can be removed. >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8205679 >> Webrev: http://cr.openjdk.java.net/~pliden/8205679/webrev.0 > > The patch looks good, but the method itself looks generic enough to keep around. At some point, > Shenandoah may switch to using this instead of PLABs. We typically don't leave unused code around (as it tends to rot quickly), unless we know it will be used in the near/mid-term future. So, I guess the question is, are you already using this in the Shenandoah repo, or will soon-ish? [1] If not, I'd prefer to remove it. The hg history will preserve it, so it's trivial to bring it back if you find that you really do need it in the future. cheers, Per [1] Btw, if you do, will you not also have potentially bad interactions with JEP 331: Low-Overhead Heap Profiling, which is the reason ZGC isn't using the TLAB anymore? 
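[For context, a minimal sketch of what a TLAB-style undo_allocate() does. This is a toy bump-pointer buffer for illustration only, not HotSpot's actual ThreadLocalAllocBuffer; all names here are made up.]

```cpp
#include <cassert>
#include <cstddef>

// Toy bump-pointer buffer illustrating undo_allocate(): allocation bumps a
// "top" offset forward, and an undo can only retract the most recent
// allocation, by moving "top" back again.
class ToyTlab {
  char*  _start;
  size_t _top;   // offset of the next free byte
  size_t _end;   // capacity in bytes
public:
  ToyTlab(char* buf, size_t size) : _start(buf), _top(0), _end(size) {}

  void* allocate(size_t size) {
    if (_top + size > _end) {
      return nullptr;  // buffer exhausted; caller would refill or allocate outside
    }
    void* mem = _start + _top;
    _top += size;
    return mem;
  }

  // Retract an allocation; only valid for the most recent one handed out.
  void undo_allocate(void* mem, size_t size) {
    assert(static_cast<char*>(mem) + size == _start + _top);
    _top -= size;
  }

  size_t used() const { return _top; }
};
```

The interesting property is that the undo is a pure pointer decrement, which is why such a helper is cheap to keep around but also easy to reintroduce later if removed.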
From per.liden at oracle.com Wed Jun 27 09:33:45 2018 From: per.liden at oracle.com (Per Liden) Date: Wed, 27 Jun 2018 11:33:45 +0200 Subject: RFR (XS) 8205702: assert(UseCompressedClassPointers) failed in universe.hpp In-Reply-To: References: <5B32B34C.2060506@oracle.com> <1246989c-e8e8-22e9-cdc8-1253c344aac6@oracle.com> <92de1a95-e09a-82ef-e2f4-9c2a8979fd26@oracle.com> Message-ID: <88cd8e8b-3692-ae23-7475-0fbd44692ba9@oracle.com> Actually, that seems a bit too restrictive, as vm.cds.archived.java.heap is only true when G1 is enabled. So, this is probably even better: * @requires vm.cds * @requires vm.opt.final.UseCompressedOops * @requires vm.opt.final.UseCompressedClassPointers Updated webrev: http://cr.openjdk.java.net/~pliden/8205702/webrev.2 /Per On 06/27/2018 10:37 AM, Per Liden wrote: > Updated webrev, which adjusts the @requires tag, from: > > @requires vm.cds & vm.gc != "Z" > > to: > > @requires vm.cds.archived.java.heap > > which I believe is more correct in this case. > > http://cr.openjdk.java.net/~pliden/8205702/webrev.1 > > cheers, > Per > > > On 06/27/2018 09:15 AM, Per Liden wrote: >> Hi Coleen, >> >> This doesn't look quite right to me. ZGC already disables >> UseCompressedOops and UseCompressedClassPointers, which should be the >> indicators that we can't use CDS. The problem is that CDS checks those >> flags _before_ the heap has had a chance to say what it supports. So >> if we just move the call to set_shared_spaces_flags() after the call >> to GCConfig::arguments()->initialize() (which should be safe), then >> we're all good and you'll get the usual: >> >> $ ./build/fastdebug/images/jdk/bin/java -Xshare:dump >> -XX:+UnlockExperimentalVMOptions -XX:+UseZGC >> Error occurred during initialization of VM >> Cannot dump shared archive when UseCompressedOops or >> UseCompressedClassPointers is off. 
>> >> Here's a proposed patch for this, which also adjusts the appropriate >> tests for this: >> >> http://cr.openjdk.java.net/~pliden/8205702/webrev.0 >> >> cheers, >> Per >> >> On 06/27/2018 01:09 AM, coleen.phillimore at oracle.com wrote: >>> >>> Hi Calvin, thank you for reporting the bug and the code review and >>> test code. >>> >>> On 6/26/18 5:42 PM, Calvin Cheung wrote: >>>> Hi Coleen, >>>> >>>> The code changes look good. >>>> >>>> Since there's a new error message, I'd suggest adding a test to >>>> runtime/SharedArchiveFile/SharedArchiveFile.java as follows: >>>> >>>> diff --git >>>> a/test/hotspot/jtreg/runtime/SharedArchiveFile/SharedArchiveFile.java b/test/hotspot/jtreg/runtime/SharedArchiveFile/SharedArchiveFile.java >>>> --- >>>> a/test/hotspot/jtreg/runtime/SharedArchiveFile/SharedArchiveFile.java >>>> +++ >>>> b/test/hotspot/jtreg/runtime/SharedArchiveFile/SharedArchiveFile.java >>>> @@ -52,5 +52,13 @@ >>>> "-Xshare:on", "-version"); >>>> out = CDSTestUtils.executeAndLog(pb, "SharedArchiveFile"); >>>> CDSTestUtils.checkExec(out); >>>> + >>>> + // CDS dumping doesn't work with ZGC >>>> + ProcessBuilder pb = >>>> ProcessTools.createJavaProcessBuilder(true, >>>> + "-XX:SharedArchiveFile=./SharedArchiveFile.jsa", >>>> + "-XX:+UseZGC", >>>> + "-Xshare:dump"); >>>> + out = CDSTestUtils.executeAndLog(pb, "SharedArchiveFile"); >>>> + CDSTestUtils.checkExecExpectError(out, 1, "DumpSharedSpaces >>>> (-Xshare:dump) is not supported with ZGC."); >>>> } >>>> } >>>> >>>> (I haven't tested the above) >>> >>> It needed an -XX:+UnlockExperimentalVMOptions as well, and not >>> redeclaring pb. >>> >>> open webrev at http://cr.openjdk.java.net/~coleenp/8205702.02/webrev >>> >>>> >>>> Also, I think the new error message should be included in the >>>> release notes. >>>> >>> >>> I added the test case and it passes. 
I don't think having a release >>> note for something that nobody would ever do for an experimental >>> option is worth having. But I can look into the ZGC release notes >>> and see if there's something that says CDS is not supported. >>> >>> Thanks, >>> Coleen >>>> thanks, >>>> Calvin >>>> >>>> On 6/26/18, 2:13 PM, coleen.phillimore at oracle.com wrote: >>>>> Summary: Disable CDS with ZGC >>>>> >>>>> Tested with: >>>>> java -XX:+UnlockExperimentalVMOptions -XX:+UseZGC -Xshare:dump >>>>> java -XX:+UnlockExperimentalOptions -XX:+UseZGC -Xshare:on -version >>>>> >>>>> open webrev at http://cr.openjdk.java.net/~coleenp/8205702.01/webrev >>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8205702 >>>>> >>>>> Thanks, >>>>> Coleen >>> From per.liden at oracle.com Wed Jun 27 10:43:38 2018 From: per.liden at oracle.com (Per Liden) Date: Wed, 27 Jun 2018 12:43:38 +0200 Subject: RFR: 8205683: Refactor heap allocation to separate concerns In-Reply-To: References: Message-ID: Hi Erik, On 06/26/2018 05:13 PM, Erik Österlund wrote: > Hi, > > After a bunch of stuff was added to the CollectedHeap object allocation > path, it is starting to look quite ugly. It has function pointers passed > around with arguments to be sent to them, mixing up allocation, > initialization, tracing, sampling and exception handling all over the > place. > > I propose to improve the situation by having an abstract MemAllocator > stack object own the allocation path, with suitable subclasses, > ObjAllocator, ClassAllocator and ObjArrayAllocator for allocating the 3 > types we support allocating on the heap. Variation points now simply use > virtual calls, and most of the tracing/sampling code can be moved out > from the allocation path. This should make it easier to understand what > is going on here, and the different concerns are more separated. 
> > A collector can override the virtual member functions for allocating > objects, arrays, and classes, and inject their own variation points into > this framework. For example for installing a forwarding pointer in the > cell header required by Brooks pointers, if you are into that kind of > stuff. > > Bug: > https://bugs.openjdk.java.net/browse/JDK-8205683 > > Webrev: > http://cr.openjdk.java.net/~eosterlund/8205683/webrev.02/ Awesome cleanup Erik! A few comments/questions/suggestions: src/hotspot/share/gc/shared/collectedHeap.cpp --------------------------------------------- 449 oop CollectedHeap::obj_allocate(Klass* klass, int size, TRAPS) { 450 ObjAllocator allocator(klass, size); 451 return allocator.allocate(); 452 } 453 454 oop CollectedHeap::array_allocate(Klass* klass, int size, int length, bool do_zero, TRAPS) { 455 ObjArrayAllocator allocator(klass, size, length, do_zero); 456 return allocator.allocate(); 457 } 458 459 oop CollectedHeap::class_allocate(Klass* klass, int size, TRAPS) { 460 ClassAllocator allocator(klass, size); 461 return allocator.allocate(); 462 } It doesn't look like the above functions need to take in TRAPS, right? src/hotspot/share/gc/shared/memAllocator.hpp -------------------------------------------- 47 HeapWord* allocate_from_tlab(Allocation& allocation) const; 48 HeapWord* allocate_from_tlab_slow(Allocation& allocation) const; 49 HeapWord* allocate_outside_tlab(Allocation& allocation) const; Can we call these allocate_inside_tlab/allocate_outside_tlab? 73 virtual oop init_obj(HeapWord* mem) const; Can we call this initialize() instead? To better line up with allocate(). Further, I'd also like to see initialize() being a pure virtual function and instead do what that function does today in a different function, protected and non-virtual, called finish(). 
So in the end we'll have: allocate(); initialize(); finish(); I believe such naming/structure would make code like this easier to understand: src/hotspot/share/gc/shared/memAllocator.cpp: 415 mem_clear(mem); 416 java_lang_Class::set_oop_size(mem, (int)_word_size); 417 oop obj = finish(mem); Otherwise, calling init_obj()/initialize() as the last thing here seems a bit odd and suggests that we're re-initializing the object completely. src/hotspot/share/gc/shared/memAllocator.cpp -------------------------------------------- ... 391 oop obj = MemAllocator::init_obj(mem); 392 assert(Universe::is_bootstrapping() || !obj->is_array(), "must not be an array"); 393 return obj; 394 } ... 405 oop obj = MemAllocator::init_obj(mem); 406 assert(obj->is_array(), "must be an array"); 407 return obj; 408 } ... 417 oop obj = MemAllocator::init_obj(mem); 418 assert(Universe::is_bootstrapping() || !obj->is_array(), "must not be an array"); 419 return obj; 420 } The above asserts look unnecessary. If so, it would be nice to just convert this to: ... 391 return finish(mem); 392 } ... 405 return finish(mem); 406 } ... 417 return finish(mem); 418 } 200 HeapWord* mem = (HeapWord*)obj; 201 size_t size_in_bytes = _allocator._word_size * HeapWordSize; 202 ThreadLocalAllocBuffer& tlab = _thread->tlab(); 203 size_t bytes_since_last = _allocated_outside_tlab ? 0 : tlab.bytes_since_last_sample_point(); 204 _thread->heap_sampler().check_for_sampling(mem, size_in_bytes, bytes_since_last); Looks like we can now change the first arg in check_for_sampling() from HeapWord* to oop and avoid the casting above and inside check_for_sampling() itself. src/hotspot/share/gc/shared/threadLocalAllocBuffer.hpp ------------------------------------------------------ 143 HeapWord* allocate_sampled_object(size_t size); Can we call this function allocate_sampled() to better match with allocate()? 
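[A hypothetical, self-contained sketch of the allocate()/initialize()/finish() split discussed above. The names and layout below are illustrative stand-ins, not the actual HotSpot MemAllocator classes, and malloc stands in for the real TLAB/heap path.]

```cpp
#include <cassert>   // used by the usage checks below
#include <cstdlib>
#include <cstring>
#include <cstddef>

// The base class owns the common allocation path; each subclass overrides the
// pure virtual initialize() for type-specific setup and ends by calling
// finish(), the shared, non-virtual tail of initialization (in HotSpot terms:
// publishing the mark word and Klass pointer).
class ToyAllocator {
protected:
  size_t _byte_size;

  void* finish(void* mem) const {
    // shared finalization step would go here
    return mem;
  }

public:
  explicit ToyAllocator(size_t byte_size) : _byte_size(byte_size) {}
  virtual ~ToyAllocator() {}

  virtual void* initialize(void* mem) const = 0;

  void* allocate() const {
    void* mem = std::malloc(_byte_size);  // stand-in for the heap/TLAB path
    return mem != nullptr ? initialize(mem) : nullptr;
  }
};

// "Ordinary object" flavor: zero the body, then run the shared tail.
class ToyObjAllocator : public ToyAllocator {
public:
  using ToyAllocator::ToyAllocator;
  void* initialize(void* mem) const override {
    std::memset(mem, 0, _byte_size);
    return finish(mem);
  }
};
```

The design point being debated in the thread is exactly this shape: initialize() is the per-type variation point, while finish() names the common tail, so nothing reads as if the object were initialized twice.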
cheers, Per From erik.helin at oracle.com Wed Jun 27 11:31:17 2018 From: erik.helin at oracle.com (Erik Helin) Date: Wed, 27 Jun 2018 13:31:17 +0200 Subject: RFR: 8197425: Liveset information for Old Object sample event In-Reply-To: <5B334179.7070309@oracle.com> References: <5B2927F6.1090804@oracle.com> <5B334179.7070309@oracle.com> Message-ID: Hi Erik, On 06/27/2018 09:49 AM, Erik Gahlin wrote: > Here is an updated webrev. > > http://cr.openjdk.java.net/~egahlin/8197425_2 I can't comment on the LeakProfiler and/or JFR parts of the patch (essentially the whole patch), but I can say that the patch now uses a much better API from the GC's point of view :) So the usage of Universe::get_heap_used_at_last_gc() seems correct; unfortunately I lack the knowledge to review the rest of the patch :( Thanks, Erik > Erik > >> Hi, >> >> Could I have a review of an enhancement that adds heap usage after GC >> to the Old Object Sample (OOS) event? This is useful for spotting an >> increasing liveset. >> >> Recordings typically only contain data for a limited period, i.e. the >> last 30 minutes, but the OOS event contains samples from when JFR/JVM >> was started, potentially several days back. >> >> The liveset trend is useful for tools such as Mission Control to >> detect if there is a memory leak. If that is the case, information in >> the OOS event can be used to pinpoint where the leak occurred and what >> is keeping it alive. Presenting this information when there is no memory leak is confusing. 
>> >> Bug: >> https://bugs.openjdk.java.net/browse/JDK-8197425 >> >> Webrev: >> http://cr.openjdk.java.net/~egahlin/8197425/ >> >> Testings: >> Tests in test/jdk/jdk/jfr >> >> Thanks >> Erik > From coleen.phillimore at oracle.com Wed Jun 27 11:39:10 2018 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Wed, 27 Jun 2018 07:39:10 -0400 Subject: RFR (XS) 8205702: assert(UseCompressedClassPointers) failed in universe.hpp In-Reply-To: <88cd8e8b-3692-ae23-7475-0fbd44692ba9@oracle.com> References: <5B32B34C.2060506@oracle.com> <1246989c-e8e8-22e9-cdc8-1253c344aac6@oracle.com> <92de1a95-e09a-82ef-e2f4-9c2a8979fd26@oracle.com> <88cd8e8b-3692-ae23-7475-0fbd44692ba9@oracle.com> Message-ID: Yes, this patch looks better and more general. Thank you for fixing this! Coleen On 6/27/18 5:33 AM, Per Liden wrote: > Actually, that seems a bit too restrictive as > vm.cds.archived.java.heap is only true when G1 is enabled. > > So, this is probably even better: > > ?* @requires vm.cds > ?* @requires vm.opt.final.UseCompressedOops > ?* @requires vm.opt.final.UseCompressedClassPointers > > Updated webrev: http://cr.openjdk.java.net/~pliden/8205702/webrev.2 > > /Per > > On 06/27/2018 10:37 AM, Per Liden wrote: >> Updated webrev, which adjusts the @requires tag, from: >> >> ?? @requires vm.cds & vm.gc != "Z" >> >> to: >> >> ?? @requires vm.cds.archived.java.heap >> >> which I believe is more correct in this case. >> >> http://cr.openjdk.java.net/~pliden/8205702/webrev.1 >> >> cheers, >> Per >> >> >> On 06/27/2018 09:15 AM, Per Liden wrote: >>> Hi Coleen, >>> >>> This doesn't look quite right to me. ZGC already disables >>> UseCompressedOop and UseCompressedClassPointers, which should be the >>> indicators that we can't use CDS. The problem is that CDS checks >>> those flags _before_ the heap has had a change to say they it >>> supports. 
So if we just move the call to set_shared_spaces_flags() >>> after the call to GCConfig::arguments()->initialize() (which should >>> be safe), then we're all good and you'll get the usual: >>> >>> $ ./build/fastdebug/images/jdk/bin/java -Xshare:dump >>> -XX:+UnlockExperimentalVMOptions -XX:+UseZGC >>> Error occurred during initialization of VM >>> Cannot dump shared archive when UseCompressedOops or >>> UseCompressedClassPointers is off. >>> >>> Here's a proposed patch for this, which also adjusts the appropriate >>> tests for this: >>> >>> http://cr.openjdk.java.net/~pliden/8205702/webrev.0 >>> >>> cheers, >>> Per >>> >>> On 06/27/2018 01:09 AM, coleen.phillimore at oracle.com wrote: >>>> >>>> Hi Calvin, thank you for reporting the bug and the code review and >>>> test code. >>>> >>>> On 6/26/18 5:42 PM, Calvin Cheung wrote: >>>>> Hi Coleen, >>>>> >>>>> The code changes look good. >>>>> >>>>> Since there's a new error message, I'd suggest adding a test to >>>>> runtime/SharedArchiveFile/SharedArchiveFile.java as follows: >>>>> >>>>> diff --git >>>>> a/test/hotspot/jtreg/runtime/SharedArchiveFile/SharedArchiveFile.java >>>>> b/test/hotspot/jtreg/runtime/SharedArchiveFile/SharedArchiveFile.java >>>>> --- >>>>> a/test/hotspot/jtreg/runtime/SharedArchiveFile/SharedArchiveFile.java >>>>> +++ >>>>> b/test/hotspot/jtreg/runtime/SharedArchiveFile/SharedArchiveFile.java >>>>> @@ -52,5 +52,13 @@ >>>>> ?????????????????????????????? "-Xshare:on", "-version"); >>>>> ???????? out = CDSTestUtils.executeAndLog(pb, "SharedArchiveFile"); >>>>> ???????? CDSTestUtils.checkExec(out); >>>>> + >>>>> +??????? // CDS dumping doesn't work with ZGC >>>>> +??????? ProcessBuilder pb = >>>>> ProcessTools.createJavaProcessBuilder(true, >>>>> + "-XX:SharedArchiveFile=./SharedArchiveFile.jsa", >>>>> +??????????????????????????????? "-XX:+UseZGC", >>>>> +??????????????????????????????? "-Xshare:dump"); >>>>> +??????? out = CDSTestUtils.executeAndLog(pb, "SharedArchiveFile"); >>>>> +??????? 
CDSTestUtils.checkExecExpectError(out, 1, >>>>> "DumpSharedSpaces (-Xshare:dump) is not supported with ZGC."); >>>>> ???? } >>>>> ?} >>>>> >>>>> (I haven't tested the above) >>>> >>>> It needed an -XX:+UnlockExperimentalVMOptions as well, and not >>>> reclare pb. >>>> >>>> open webrev at http://cr.openjdk.java.net/~coleenp/8205702.02/webrev >>>> >>>>> >>>>> Also, I think the new error message should be included in the >>>>> release notes. >>>>> >>>> >>>> I added the test case and it passes.? I don't think having a >>>> release note for something that nobody would ever do for an >>>> experimental option is worth having.?? But I can look into the ZGC >>>> release notes and see if there's something that says CDS is not >>>> supported. >>>> >>>> Thanks, >>>> Coleen >>>>> thanks, >>>>> Calvin >>>>> >>>>> On 6/26/18, 2:13 PM, coleen.phillimore at oracle.com wrote: >>>>>> Summary: Disable CDS with ZGC >>>>>> >>>>>> Tested with: >>>>>> java -XX:+UnlockExperimentalVMOptions -XX:+UseZGC -Xshare:dump >>>>>> java -XX:+UnlockExperimentalOptions -XX:+UseZGC -Xshare:on -version >>>>>> >>>>>> open webrev at http://cr.openjdk.java.net/~coleenp/8205702.01/webrev >>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8205702 >>>>>> >>>>>> Thanks, >>>>>> Coleen >>>> From per.liden at oracle.com Wed Jun 27 11:46:13 2018 From: per.liden at oracle.com (Per Liden) Date: Wed, 27 Jun 2018 13:46:13 +0200 Subject: RFR (XS) 8205702: assert(UseCompressedClassPointers) failed in universe.hpp In-Reply-To: References: <5B32B34C.2060506@oracle.com> <1246989c-e8e8-22e9-cdc8-1253c344aac6@oracle.com> <92de1a95-e09a-82ef-e2f4-9c2a8979fd26@oracle.com> <88cd8e8b-3692-ae23-7475-0fbd44692ba9@oracle.com> Message-ID: Cool! Thanks for reviewing Coleen! It's going through tier1,2,3 right now. cheers, Per On 06/27/2018 01:39 PM, coleen.phillimore at oracle.com wrote: > > Yes, this patch looks better and more general. > Thank you for fixing this! 
> Coleen > > On 6/27/18 5:33 AM, Per Liden wrote: >> Actually, that seems a bit too restrictive as >> vm.cds.archived.java.heap is only true when G1 is enabled. >> >> So, this is probably even better: >> >> ?* @requires vm.cds >> ?* @requires vm.opt.final.UseCompressedOops >> ?* @requires vm.opt.final.UseCompressedClassPointers >> >> Updated webrev: http://cr.openjdk.java.net/~pliden/8205702/webrev.2 >> >> /Per >> >> On 06/27/2018 10:37 AM, Per Liden wrote: >>> Updated webrev, which adjusts the @requires tag, from: >>> >>> ?? @requires vm.cds & vm.gc != "Z" >>> >>> to: >>> >>> ?? @requires vm.cds.archived.java.heap >>> >>> which I believe is more correct in this case. >>> >>> http://cr.openjdk.java.net/~pliden/8205702/webrev.1 >>> >>> cheers, >>> Per >>> >>> >>> On 06/27/2018 09:15 AM, Per Liden wrote: >>>> Hi Coleen, >>>> >>>> This doesn't look quite right to me. ZGC already disables >>>> UseCompressedOop and UseCompressedClassPointers, which should be the >>>> indicators that we can't use CDS. The problem is that CDS checks >>>> those flags _before_ the heap has had a change to say they it >>>> supports. So if we just move the call to set_shared_spaces_flags() >>>> after the call to GCConfig::arguments()->initialize() (which should >>>> be safe), then we're all good and you'll get the usual: >>>> >>>> $ ./build/fastdebug/images/jdk/bin/java -Xshare:dump >>>> -XX:+UnlockExperimentalVMOptions -XX:+UseZGC >>>> Error occurred during initialization of VM >>>> Cannot dump shared archive when UseCompressedOops or >>>> UseCompressedClassPointers is off. >>>> >>>> Here's a proposed patch for this, which also adjusts the appropriate >>>> tests for this: >>>> >>>> http://cr.openjdk.java.net/~pliden/8205702/webrev.0 >>>> >>>> cheers, >>>> Per >>>> >>>> On 06/27/2018 01:09 AM, coleen.phillimore at oracle.com wrote: >>>>> >>>>> Hi Calvin, thank you for reporting the bug and the code review and >>>>> test code. 
>>>>> >>>>> On 6/26/18 5:42 PM, Calvin Cheung wrote: >>>>>> Hi Coleen, >>>>>> >>>>>> The code changes look good. >>>>>> >>>>>> Since there's a new error message, I'd suggest adding a test to >>>>>> runtime/SharedArchiveFile/SharedArchiveFile.java as follows: >>>>>> >>>>>> diff --git >>>>>> a/test/hotspot/jtreg/runtime/SharedArchiveFile/SharedArchiveFile.java >>>>>> b/test/hotspot/jtreg/runtime/SharedArchiveFile/SharedArchiveFile.java >>>>>> --- >>>>>> a/test/hotspot/jtreg/runtime/SharedArchiveFile/SharedArchiveFile.java >>>>>> +++ >>>>>> b/test/hotspot/jtreg/runtime/SharedArchiveFile/SharedArchiveFile.java >>>>>> @@ -52,5 +52,13 @@ >>>>>> ?????????????????????????????? "-Xshare:on", "-version"); >>>>>> ???????? out = CDSTestUtils.executeAndLog(pb, "SharedArchiveFile"); >>>>>> ???????? CDSTestUtils.checkExec(out); >>>>>> + >>>>>> +??????? // CDS dumping doesn't work with ZGC >>>>>> +??????? ProcessBuilder pb = >>>>>> ProcessTools.createJavaProcessBuilder(true, >>>>>> + "-XX:SharedArchiveFile=./SharedArchiveFile.jsa", >>>>>> +??????????????????????????????? "-XX:+UseZGC", >>>>>> +??????????????????????????????? "-Xshare:dump"); >>>>>> +??????? out = CDSTestUtils.executeAndLog(pb, "SharedArchiveFile"); >>>>>> +??????? CDSTestUtils.checkExecExpectError(out, 1, >>>>>> "DumpSharedSpaces (-Xshare:dump) is not supported with ZGC."); >>>>>> ???? } >>>>>> ?} >>>>>> >>>>>> (I haven't tested the above) >>>>> >>>>> It needed an -XX:+UnlockExperimentalVMOptions as well, and not >>>>> reclare pb. >>>>> >>>>> open webrev at http://cr.openjdk.java.net/~coleenp/8205702.02/webrev >>>>> >>>>>> >>>>>> Also, I think the new error message should be included in the >>>>>> release notes. >>>>>> >>>>> >>>>> I added the test case and it passes.? I don't think having a >>>>> release note for something that nobody would ever do for an >>>>> experimental option is worth having.?? But I can look into the ZGC >>>>> release notes and see if there's something that says CDS is not >>>>> supported. 
>>>>> >>>>> Thanks, >>>>> Coleen >>>>>> thanks, >>>>>> Calvin >>>>>> >>>>>> On 6/26/18, 2:13 PM, coleen.phillimore at oracle.com wrote: >>>>>>> Summary: Disable CDS with ZGC >>>>>>> >>>>>>> Tested with: >>>>>>> java -XX:+UnlockExperimentalVMOptions -XX:+UseZGC -Xshare:dump >>>>>>> java -XX:+UnlockExperimentalOptions -XX:+UseZGC -Xshare:on -version >>>>>>> >>>>>>> open webrev at http://cr.openjdk.java.net/~coleenp/8205702.01/webrev >>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8205702 >>>>>>> >>>>>>> Thanks, >>>>>>> Coleen >>>>> > From rkennke at redhat.com Wed Jun 27 11:49:31 2018 From: rkennke at redhat.com (Roman Kennke) Date: Wed, 27 Jun 2018 13:49:31 +0200 Subject: RFR: 8205683: Refactor heap allocation to separate concerns In-Reply-To: References: Message-ID: Hi Erik, why is this change needed: -void java_lang_Class::set_oop_size(oop java_class, int size) { +void java_lang_Class::set_oop_size(HeapWord* java_class, int size) { assert(_oop_size_offset != 0, "must be set"); assert(size > 0, "Oop size must be greater than zero, not %d", size); - java_class->int_field_put(_oop_size_offset, size); + *(int*)(((char*)java_class) + _oop_size_offset) = size; } It seems to subvert any barriers that GCs might want to employ? Roman > Hi, > > After a bunch of stuff was added to the CollectedHeap object allocation > path, it is starting to look quite ugly. It has function pointers passed > around with arguments to be sent to them, mixing up allocation, > initialization, tracing, sampling and exception handling all over the > place. > > I propose to improve the situation by having an abstract MemAllocator > stack object own the allocation path, with suitable subclasses, > ObjAllocator, ClassAllocator and ObjArrayAllocator for allocating the 3 > types we support allocating on the heap. Variation points now simply use > virtual calls, and most of the tracing/sampling code can be moved out > from the allocation path. 
This should make it easier to understand what > is going on here, and the different concerns are more separated. > > A collector can override the virtual member functions for allocating > objects, arrays, and classes, and inject their own variation points into > this framework. For example for installing a forwarding pointer in the > cell header required by Brooks pointers, if you are into that kind of > stuff. > > Bug: > https://bugs.openjdk.java.net/browse/JDK-8205683 > > Webrev: > http://cr.openjdk.java.net/~eosterlund/8205683/webrev.02/ > > Thanks, > /Erik From per.liden at oracle.com Wed Jun 27 11:57:05 2018 From: per.liden at oracle.com (Per Liden) Date: Wed, 27 Jun 2018 13:57:05 +0200 Subject: RFR: 8205683: Refactor heap allocation to separate concerns In-Reply-To: References: Message-ID: <8f96e32e-645c-e6f2-cf34-848191660316@oracle.com> Hi Roman, On 06/27/2018 01:49 PM, Roman Kennke wrote: > Hi Erik, > > why is this change needed: > > -void java_lang_Class::set_oop_size(oop java_class, int size) { > +void java_lang_Class::set_oop_size(HeapWord* java_class, int size) { > assert(_oop_size_offset != 0, "must be set"); > assert(size > 0, "Oop size must be greater than zero, not %d", size); > - java_class->int_field_put(_oop_size_offset, size); > + *(int*)(((char*)java_class) + _oop_size_offset) = size; > } > > It seems to subvert any barriers that GCs might want to employ? This was changed because we don't have an oop here yet. In other words, the memory behind mem is not a properly constructed object at this point (hence a valid oop can't point to it yet). I think set_oop_size() is a bit special in that sense, and I don't think you'll ever want a barrier there, right? /Per > > Roman > > >> Hi, >> >> After a bunch of stuff was added to the CollectedHeap object allocation >> path, it is starting to look quite ugly. 
It has function pointers passed >> around with arguments to be sent to them, mixing up allocation, >> initialization, tracing, sampling and exception handling all over the >> place. >> >> I propose to improve the situation by having an abstract MemAllocator >> stack object own the allocation path, with suitable subclasses, >> ObjAllocator, ClassAllocator and ObjArrayAllocator for allocating the 3 >> types we support allocating on the heap. Variation points now simply use >> virtual calls, and most of the tracing/sampling code can be moved out >> from the allocation path. This should make it easier to understand what >> is going on here, and the different concerns are more separated. >> >> A collector can override the virtual member functions for allocating >> objects, arrays, and classes, and inject their own variation points into >> this framework. For example for installing a forwarding pointer in the >> cell header required by Brooks pointers, if you are into that kind of >> stuff. >> >> Bug: >> https://bugs.openjdk.java.net/browse/JDK-8205683 >> >> Webrev: >> http://cr.openjdk.java.net/~eosterlund/8205683/webrev.02/ >> >> Thanks, >> /Erik > > > From erik.osterlund at oracle.com Wed Jun 27 12:01:22 2018 From: erik.osterlund at oracle.com (Erik Osterlund) Date: Wed, 27 Jun 2018 14:01:22 +0200 Subject: RFR: 8205683: Refactor heap allocation to separate concerns In-Reply-To: <8f96e32e-645c-e6f2-cf34-848191660316@oracle.com> References: <8f96e32e-645c-e6f2-cf34-848191660316@oracle.com> Message-ID: <04C9B66B-AEF9-45AD-976C-246342442E71@oracle.com> Hi Per and Roman, Right - I specifically did not want to send invalid oops that have yet to be constructed through the Access API. Only after the Klass pointer has been set do we have an oop. 
Thanks, /Erik > On 27 Jun 2018, at 13:57, Per Liden wrote: > > Hi Roamn, > >> On 06/27/2018 01:49 PM, Roman Kennke wrote: >> Hi Erik, >> why is this change needed: >> -void java_lang_Class::set_oop_size(oop java_class, int size) { >> +void java_lang_Class::set_oop_size(HeapWord* java_class, int size) { >> assert(_oop_size_offset != 0, "must be set"); >> assert(size > 0, "Oop size must be greater than zero, not %d", size); >> - java_class->int_field_put(_oop_size_offset, size); >> + *(int*)(((char*)java_class) + _oop_size_offset) = size; >> } >> It seems to subvert any barriers that GCs might want to employ? > > This was changed because we don't have an oop here yet. In other words, the memory behind mem is not a properly constructed object at this point (hence a valid oop can't point to it yet). I think set_oop_size() is a bit special in that sense, and I don't think you'll ever want a barrier there, right? > > /Per > >> Roman >>> Hi, >>> >>> After a bunch of stuff was added to the CollectedHeap object allocation >>> path, it is starting to look quite ugly. It has function pointers passed >>> around with arguments to be sent to them, mixing up allocation, >>> initialization, tracing, sampling and exception handling all over the >>> place. >>> >>> I propose to improve the situation by having an abstract MemAllocator >>> stack object own the allocation path, with suitable subclasses, >>> ObjAllocator, ClassAllocator and ObjArrayAllocator for allocating the 3 >>> types we support allocating on the heap. Variation points now simply use >>> virtual calls, and most of the tracing/sampling code can be moved out >>> from the allocation path. This should make it easier to understand what >>> is going on here, and the different concerns are more separated. >>> >>> A collector can override the virtual member functions for allocating >>> objects, arrays, and classes, and inject their own variation points into >>> this framework. 
For example for installing a forwarding pointer in the >>> cell header required by Brooks pointers, if you are into that kind of >>> stuff. >>> >>> Bug: >>> https://bugs.openjdk.java.net/browse/JDK-8205683 >>> >>> Webrev: >>> http://cr.openjdk.java.net/~eosterlund/8205683/webrev.02/ >>> >>> Thanks, >>> /Erik From rkennke at redhat.com Wed Jun 27 12:15:32 2018 From: rkennke at redhat.com (Roman Kennke) Date: Wed, 27 Jun 2018 14:15:32 +0200 Subject: RFR: 8205683: Refactor heap allocation to separate concerns In-Reply-To: <04C9B66B-AEF9-45AD-976C-246342442E71@oracle.com> References: <8f96e32e-645c-e6f2-cf34-848191660316@oracle.com> <04C9B66B-AEF9-45AD-976C-246342442E71@oracle.com> Message-ID: <943f6d35-83f9-8691-e21d-82c775d36f25@redhat.com> Ah yes, that seems fine. No, we wouldn't need any barriers there. I guess, should any GC ever require barriers on this, we'd add it then (via special decorators or such). Roman > Hi Per and Roman, > > Right - I specifically did not want to send invalid oops that have yet to be constructed through the Access API. Only after the Klass pointer has been set do we have an oop. > > Thanks, > /Erik > >> On 27 Jun 2018, at 13:57, Per Liden wrote: >> >> Hi Roman, >> >>> On 06/27/2018 01:49 PM, Roman Kennke wrote: >>> Hi Erik, >>> why is this change needed: >>> -void java_lang_Class::set_oop_size(oop java_class, int size) { >>> +void java_lang_Class::set_oop_size(HeapWord* java_class, int size) { >>> assert(_oop_size_offset != 0, "must be set"); >>> assert(size > 0, "Oop size must be greater than zero, not %d", size); >>> - java_class->int_field_put(_oop_size_offset, size); >>> + *(int*)(((char*)java_class) + _oop_size_offset) = size; >>> } >>> It seems to subvert any barriers that GCs might want to employ? >> >> This was changed because we don't have an oop here yet. In other words, the memory behind mem is not a properly constructed object at this point (hence a valid oop can't point to it yet). 
I think set_oop_size() is a bit special in that sense, and I don't think you'll ever want a barrier there, right? >> >> /Per >> >>> Roman >>>> Hi, >>>> >>>> After a bunch of stuff was added to the CollectedHeap object allocation >>>> path, it is starting to look quite ugly. It has function pointers passed >>>> around with arguments to be sent to them, mixing up allocation, >>>> initialization, tracing, sampling and exception handling all over the >>>> place. >>>> >>>> I propose to improve the situation by having an abstract MemAllocator >>>> stack object own the allocation path, with suitable subclasses, >>>> ObjAllocator, ClassAllocator and ObjArrayAllocator for allocating the 3 >>>> types we support allocating on the heap. Variation points now simply use >>>> virtual calls, and most of the tracing/sampling code can be moved out >>>> from the allocation path. This should make it easier to understand what >>>> is going on here, and the different concerns are more separated. >>>> >>>> A collector can override the virtual member functions for allocating >>>> objects, arrays, and classes, and inject their own variation points into >>>> this framework. For example for installing a forwarding pointer in the >>>> cell header required by Brooks pointers, if you are into that kind of >>>> stuff. >>>> >>>> Bug: >>>> https://bugs.openjdk.java.net/browse/JDK-8205683 >>>> >>>> Webrev: >>>> http://cr.openjdk.java.net/~eosterlund/8205683/webrev.02/ >>>> >>>> Thanks, >>>> /Erik > -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: OpenPGP digital signature URL: From stefan.karlsson at oracle.com Wed Jun 27 13:10:25 2018 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Wed, 27 Jun 2018 15:10:25 +0200 Subject: RFR: 8205922: Add reference iteration mode that skips visiting the referents Message-ID: <2f68cfde-0ebb-9658-3e9c-36303afc07fb@oracle.com> Hi all, Please review this patch to allow us to skip visiting the j.l.Reference::referent during oop_iteration. http://cr.openjdk.java.net/~stefank/8205922/webrev.01/ https://bugs.openjdk.java.net/browse/JDK-8205922 One use-case for this is when ZGC does verification at the end of the Mark End pause. At that point we have complete marking information; roots and fields are good, but the processing of j.l.Reference::referents is deferred to the concurrent phase after this pause. In this case we would like to skip applying the verification closure to the referents. Thanks, StefanK From stefan.karlsson at oracle.com Wed Jun 27 13:15:42 2018 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Wed, 27 Jun 2018 15:15:42 +0200 Subject: RFR: 8205923: ZGC: Verification applies load barriers before verification Message-ID: Hi all, Please review this patch to stop applying load barriers in the Mark End verification. http://cr.openjdk.java.net/~stefank/8205923/webrev.01/ https://bugs.openjdk.java.net/browse/JDK-8205923 The recent IN_CONCURRENT_ROOT changes introduced load barriers to our Mark End verification. These load barriers are unnecessary because all roots should have been fixed at that point. The object oop field verification applied a load barrier to be able to load stale j.l.Reference::referents, and as a side effect applied load barriers to all fields. This patch skips visiting the referents and uses RawAccess loads for all other fields. This patch builds upon JDK-8205922, which introduces the mechanism to skip visiting referents in oop_iterate calls.
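The iteration mode Stefan describes can be modeled with a small standalone sketch. This is not the actual HotSpot oop_iterate code; the Field/Obj types and the ReferentMode name are invented for illustration. The idea is simply a field visitor that applies a closure to every reference field except, when requested, the referent slot of a Reference object:

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical object model: each field has a name and a payload.
struct Field { std::string name; int value; };
struct Obj { bool is_reference; std::vector<Field> fields; };

// Iteration mode, mirroring the idea of an oop_iterate variant that
// can skip j.l.Reference::referent during verification.
enum class ReferentMode { VisitReferent, SkipReferent };

// Apply 'closure' to every field, except the "referent" field of a
// Reference object when SkipReferent is requested.
void oop_iterate(const Obj& obj, ReferentMode mode,
                 const std::function<void(const Field&)>& closure) {
  for (const Field& f : obj.fields) {
    if (mode == ReferentMode::SkipReferent &&
        obj.is_reference && f.name == "referent") {
      continue;  // left for concurrent reference processing
    }
    closure(f);
  }
}
```

In the verification use-case above, the closure would be the verification closure, and SkipReferent corresponds to deferring the stale referent to the concurrent reference-processing phase.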
Thanks, StefanK From stefan.karlsson at oracle.com Wed Jun 27 13:18:35 2018 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Wed, 27 Jun 2018 15:18:35 +0200 Subject: RFR: 8205676: ZGC: Remove TLAB allocations in relocation path In-Reply-To: <62139932-68af-ae87-335f-c8973c3ad892@oracle.com> References: <62139932-68af-ae87-335f-c8973c3ad892@oracle.com> Message-ID: <907092ee-ca31-7d99-b2c7-7690dd3175c1@oracle.com> Looks good. StefanK On 2018-06-26 12:45, Per Liden wrote: > During Java-thread aided relocation, ZGC tries to use the TLAB to > allocate the new object. However, this interacts badly with JEP 331: > Low-Overhead Heap Profiling, as it distorts the profiling statistics. I > propose we remove the use of the TLAB in the relocation path, > essentially changing a thread-local pointer bump to an uncontended > CPU-local CAS. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8205676 > Webrev: http://cr.openjdk.java.net/~pliden/8205676/webrev.0 > > Testing: Currently running benchmarks to verify that this has no > unexpected performance hit. > > I've filed two follow up RFEs (also out for review now), to clean up > some functions that are now unused: > https://bugs.openjdk.java.net/browse/JDK-8205678 > https://bugs.openjdk.java.net/browse/JDK-8205679 > > /Per From erik.osterlund at oracle.com Wed Jun 27 13:23:04 2018 From: erik.osterlund at oracle.com (=?UTF-8?Q?Erik_=c3=96sterlund?=) Date: Wed, 27 Jun 2018 15:23:04 +0200 Subject: RFR: 8205683: Refactor heap allocation to separate concerns In-Reply-To: References: Message-ID: <0017cfb1-4067-6670-4fb7-6ad504e0adb1@oracle.com> Hi Per, Thank you for reviewing.
Incremental: http://cr.openjdk.java.net/~eosterlund/8205683/webrev.02_03/ Full: http://cr.openjdk.java.net/~eosterlund/8205683/webrev.03/ On 2018-06-27 12:43, Per Liden wrote: > Hi Erik, > > On 06/26/2018 05:13 PM, Erik Österlund wrote: >> Hi, >> >> After a bunch of stuff was added to the CollectedHeap object >> allocation path, it is starting to look quite ugly. It has function >> pointers passed around with arguments to be sent to them, mixing up >> allocation, initialization, tracing, sampling and exception handling >> all over the place. >> >> I propose to improve the situation by having an abstract MemAllocator >> stack object own the allocation path, with suitable subclasses, >> ObjAllocator, ClassAllocator and ObjArrayAllocator for allocating the >> 3 types we support allocating on the heap. Variation points now >> simply use virtual calls, and most of the tracing/sampling code can >> be moved out from the allocation path. This should make it easier to >> understand what is going on here, and the different concerns are more >> separated. >> >> A collector can override the virtual member functions for allocating >> objects, arrays, and classes, and inject their own variation points >> into this framework. For example for installing a forwarding pointer >> in the cell header required by Brooks pointers, if you are into that >> kind of stuff. >> >> Bug: >> https://bugs.openjdk.java.net/browse/JDK-8205683 >> >> Webrev: >> http://cr.openjdk.java.net/~eosterlund/8205683/webrev.02/ > > Awesome cleanup Erik! A few comments/questions/suggestions: > > src/hotspot/share/gc/shared/collectedHeap.cpp > --------------------------------------------- > > 449 oop CollectedHeap::obj_allocate(Klass* klass, int size, TRAPS) { > 450 ObjAllocator allocator(klass, size); > 451 return allocator.allocate(); > 452 } > 453 > 454 oop CollectedHeap::array_allocate(Klass* klass, int size, int > length, bool do_zero, TRAPS) { > 455
ObjArrayAllocator allocator(klass, size, length, do_zero); > 456 return allocator.allocate(); > 457 } > 458 > 459 oop CollectedHeap::class_allocate(Klass* klass, int size, TRAPS) { > 460 ClassAllocator allocator(klass, size); > 461 return allocator.allocate(); > 462 } > > It doesn't look like the above functions need to take in TRAPS, right? It is used by callers so they can use the CHECK_NULL macro on the callsite. But inside, I don't need it. > src/hotspot/share/gc/shared/memAllocator.hpp > -------------------------------------------- > > 47 HeapWord* allocate_from_tlab(Allocation& allocation) const; > 48 HeapWord* allocate_from_tlab_slow(Allocation& allocation) const; > 49 HeapWord* allocate_outside_tlab(Allocation& allocation) const; > > Can we call these allocate_inside_tlab/allocate_outside_tlab? Fixed. > 73 virtual oop init_obj(HeapWord* mem) const; > > Can we call this initialize() instead? To better line up with allocate(). Fixed. > Further, I'd also like to see initialize() being a pure virtual > function and instead do what that function does today in a different > function, protected and non-virtual, called finish(). So in the end > we'll have: > > allocate(); > initialize(); > finish(); > > I believe such naming/structure would make code like this easier to > understand: > src/hotspot/share/gc/shared/memAllocator.cpp: > 415 mem_clear(mem); > 416 java_lang_Class::set_oop_size(mem, (int)_word_size); > 417 oop obj = finish(mem); > > Otherwise, calling init_obj()/initialize() as the last thing here > seems a bit odd and suggests that we're re-initializing the object > completely. Fixed. > src/hotspot/share/gc/shared/memAllocator.cpp > -------------------------------------------- > ... > 391 oop obj = MemAllocator::init_obj(mem); > 392 assert(Universe::is_bootstrapping() || !obj->is_array(), "must > not be an array"); > 393 return obj; > 394 } > ... > 405 oop obj = MemAllocator::init_obj(mem); > 406
assert(obj->is_array(), "must be an array"); > 407 return obj; > 408 } > ... > 417 oop obj = MemAllocator::init_obj(mem); > 418 assert(Universe::is_bootstrapping() || !obj->is_array(), "must > not be an array"); > 419 return obj; > 420 } > > The above asserts look unnecessary. If so, it would be nice to just > convert this to: > > ... > 391 return finish(mem); > 392 } > ... > 405 return finish(mem); > 406 } > ... > 417 return finish(mem); > 418 } Fixed. > 200 HeapWord* mem = (HeapWord*)obj; > 201 size_t size_in_bytes = _allocator._word_size * HeapWordSize; > 202 ThreadLocalAllocBuffer& tlab = _thread->tlab(); > 203 size_t bytes_since_last = _allocated_outside_tlab ? 0 : > tlab.bytes_since_last_sample_point(); > 204 _thread->heap_sampler().check_for_sampling(mem, size_in_bytes, > bytes_since_last); > > Looks like we can now change the first arg in check_for_sampling() > from HeapWord* to oop and avoid the casting above and inside > check_for_sampling() itself. Fixed. > src/hotspot/share/gc/shared/threadLocalAllocBuffer.hpp > ------------------------------------------------------ > > 143 HeapWord* allocate_sampled_object(size_t size); > > Can we call this function allocate_sampled() to better match with > allocate()? Not sure where you found this function; I have removed it. The logic was instead moved into the allocate_inside_tlab_slow function. One alternative I considered is to pass in a bool* to ThreadLocalAllocBuffer::allocate() that if not NULL also tries to bump the end to its actual end, and returns back whether the end was bumped by the sampling logic or not inside of allocate, and have that pointer point straight into Allocation::_tlab_end_reset_for_sample. If you prefer that style, I can change it to do that instead.
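The allocate()/initialize()/finish() structure agreed on in this review can be sketched in isolation. This is an illustrative model only, not the real MemAllocator code: the vector-based "heap words", the header tag, and the length slot are stand-ins. The point is that the base class owns the allocation path and calls a subclass initialize() hook before the common finish() step publishes the header:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Simplified stand-in for a heap word.
using HeapWord = size_t;

// Abstract allocator owning the allocation path: allocate raw memory,
// let a subclass initialize the type-specific parts, then finish by
// publishing the object header (here just a tag written last).
class MemAllocator {
 public:
  explicit MemAllocator(size_t word_size) : _word_size(word_size) {}
  virtual ~MemAllocator() = default;

  std::vector<HeapWord> allocate() {
    std::vector<HeapWord> mem(_word_size, 0);  // raw, zeroed storage
    initialize(mem);                           // subclass variation point
    finish(mem);                               // common epilogue
    return mem;
  }

 protected:
  virtual void initialize(std::vector<HeapWord>& mem) const = 0;
  void finish(std::vector<HeapWord>& mem) const {
    mem[0] = 0xDEAD;  // stand-in for installing the header last
  }
  const size_t _word_size;
};

// Array variant: records its length before the header is published,
// loosely mirroring how an ObjArrayAllocator sets the length field.
class ObjArrayAllocator : public MemAllocator {
 public:
  ObjArrayAllocator(size_t word_size, size_t length)
      : MemAllocator(word_size), _length(length) {}

 protected:
  void initialize(std::vector<HeapWord>& mem) const override {
    mem[1] = _length;  // hypothetical length slot
  }

 private:
  const size_t _length;
};
```

A collector-specific allocator would override initialize() (or add further hooks) to inject its own variation points, which is the extensibility argument made in the original RFR.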
Thanks, /Erik > > cheers, > Per From stefan.karlsson at oracle.com Wed Jun 27 13:20:42 2018 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Wed, 27 Jun 2018 15:20:42 +0200 Subject: RFR: 8205678: ZGC: Remove unused ZAllocationFlags::java_thread() In-Reply-To: <0a3eba94-4c99-eff6-067a-17bf8bbb789c@oracle.com> References: <0a3eba94-4c99-eff6-067a-17bf8bbb789c@oracle.com> Message-ID: <4d9a3f1a-f554-72c4-6634-21f21caf70d5@oracle.com> Looks good, but maybe update this comment: // * 7-5 Unused (3-bits) StefanK On 2018-06-26 12:45, Per Liden wrote: > After JDK-8205676, the java_thread field in the ZAllocationFlags is > unused and can be removed. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8205678 > Webrev: http://cr.openjdk.java.net/~pliden/8205678/webrev.0 > > /Per From stefan.karlsson at oracle.com Wed Jun 27 13:21:38 2018 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Wed, 27 Jun 2018 15:21:38 +0200 Subject: RFR: 8205679: Remove unused ThreadLocalAllocBuffer::undo_allocate() In-Reply-To: References: Message-ID: I'm fine with this removal. StefanK On 2018-06-26 12:45, Per Liden wrote: > After JDK-8205676, the function ThreadLocalAllocBuffer::undo_allocate() > is unused and can be removed. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8205679 > Webrev: http://cr.openjdk.java.net/~pliden/8205679/webrev.0 > > /Per From erik.osterlund at oracle.com Wed Jun 27 14:07:50 2018 From: erik.osterlund at oracle.com (Erik Osterlund) Date: Wed, 27 Jun 2018 16:07:50 +0200 Subject: RFR: 8205923: ZGC: Verification applies load barriers before verification In-Reply-To: References: Message-ID: <03A2822C-B880-4A3A-8276-317E36CB16E4@oracle.com> Hi Stefan, Looks good. Thanks, /Erik > On 27 Jun 2018, at 15:15, Stefan Karlsson wrote: > > Hi all, > > Please review this patch to stop applying load barriers in the Mark End verification. 
> > http://cr.openjdk.java.net/~stefank/8205923/webrev.01/ > https://bugs.openjdk.java.net/browse/JDK-8205923 > > The recent IN_CONCURRENT_ROOT changes introduced load barriers to our Mark End verification. These load barriers are unnecessary because all roots should have been fixed at that point. > > The object oop field verification applied a load barrier to be able to load stale j.l.Reference::referents, and as an effect applied load barriers to all fields. This patch skips visiting the referents and uses RawAccess loads for all other fields. > > This patch builds upon the JDK-8205922, which introduces the mechanism to skip visiting referents in oop_iterate calls. > > Thanks, > StefanK From erik.osterlund at oracle.com Wed Jun 27 14:09:40 2018 From: erik.osterlund at oracle.com (Erik Osterlund) Date: Wed, 27 Jun 2018 16:09:40 +0200 Subject: RFR: 8205922: Add reference iteration mode that skips visiting the referents In-Reply-To: <2f68cfde-0ebb-9658-3e9c-36303afc07fb@oracle.com> References: <2f68cfde-0ebb-9658-3e9c-36303afc07fb@oracle.com> Message-ID: <83B990D3-D93B-4D1B-8FE3-3FE54917906C@oracle.com> Hi Stefan, Looks good. Thanks, /Erik > On 27 Jun 2018, at 15:10, Stefan Karlsson wrote: > > Hi all, > > Please review this patch to allow us to skip visiting the j.l.Reference::referent during oop_iteration. > > http://cr.openjdk.java.net/~stefank/8205922/webrev.01/ > https://bugs.openjdk.java.net/browse/JDK-8205922 > > One use-case for this is when ZGC does verification at the end of the Mark End pause. At that point we have complete marking information roots and fields are good, but the processing of j.l.Reference::referents are deferred to the concurrent phase after this pause. In this case we would like to skip applying the verification closure to the referents. 
> > Thanks, > StefanK > From per.liden at oracle.com Wed Jun 27 14:26:59 2018 From: per.liden at oracle.com (Per Liden) Date: Wed, 27 Jun 2018 16:26:59 +0200 Subject: RFR: 8205683: Refactor heap allocation to separate concerns In-Reply-To: <0017cfb1-4067-6670-4fb7-6ad504e0adb1@oracle.com> References: <0017cfb1-4067-6670-4fb7-6ad504e0adb1@oracle.com> Message-ID: <39039f93-5250-8e8e-ea8b-77b885c2c1f9@oracle.com> Hi Erik, On 2018-06-27 15:23, Erik Österlund wrote: > Hi Per, > > Thank you for reviewing. > > Incremental: > http://cr.openjdk.java.net/~eosterlund/8205683/webrev.02_03/ > > Full: > http://cr.openjdk.java.net/~eosterlund/8205683/webrev.03/ >[...] >> It doesn't look like the above functions need to take in TRAPS, right? > > It is used by callers so they can use the CHECK_NULL macro on the > callsite. But inside, I don't need it. Ah, I see. I tend to think that we're only using that style when something inside throws an exception, but anyway, keep it as is. [...] >> src/hotspot/share/gc/shared/threadLocalAllocBuffer.hpp >> ------------------------------------------------------ >> >> 143 HeapWord* allocate_sampled_object(size_t size); >> >> Can we call this function allocate_sampled() to better match with >> allocate()? > > Not sure where you found this function; I have removed it. The logic was You're right of course, ignore me. > instead moved into the allocate_inside_tlab_slow function. One > alternative I considered is to pass in a bool* to > ThreadLocalAllocBuffer::allocate() that if not NULL also tries to bump > the end to its actual end, and returns back whether the end was bumped > by the sampling logic or not inside of allocate, and have that pointer > point straight into Allocation::_tlab_end_reset_for_sample. If you > prefer that style, I can change it to do that instead. I like what you have now, so I'd say keep it as is. In summary, looks good, ship it!
cheers, Per > > Thanks, > /Erik > >> >> cheers, >> Per > From per.liden at oracle.com Wed Jun 27 14:27:45 2018 From: per.liden at oracle.com (Per Liden) Date: Wed, 27 Jun 2018 16:27:45 +0200 Subject: RFR: 8205676: ZGC: Remove TLAB allocations in relocation path In-Reply-To: <907092ee-ca31-7d99-b2c7-7690dd3175c1@oracle.com> References: <62139932-68af-ae87-335f-c8973c3ad892@oracle.com> <907092ee-ca31-7d99-b2c7-7690dd3175c1@oracle.com> Message-ID: Thanks Stefan! /Per On 2018-06-27 15:18, Stefan Karlsson wrote: > Looks good. > > StefanK > > On 2018-06-26 12:45, Per Liden wrote: >> During Java-thread aided relocation, ZGC tries to use the TLAB to >> allocate the new object. However, this interacts badly with JEP 331: >> Low-Overhead Heap Profiling, as it distorts the profiling statistics. >> I propose we remove the use of the TALB in the relocation path, >> essentially changing a thread-local pointer bump to a uncontended >> CPU-local CAS. >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8205676 >> Webrev: http://cr.openjdk.java.net/~pliden/8205676/webrev.0 >> >> Testing: Currently running benchmarks to verify that this has no >> unexpected performance hit. >> >> I've filed two follow up RFEs (also out for review now), to clean up >> some functions that are now unused: >> https://bugs.openjdk.java.net/browse/JDK-8205678 >> https://bugs.openjdk.java.net/browse/JDK-8205679 >> >> /Per From per.liden at oracle.com Wed Jun 27 14:28:13 2018 From: per.liden at oracle.com (Per Liden) Date: Wed, 27 Jun 2018 16:28:13 +0200 Subject: RFR: 8205679: Remove unused ThreadLocalAllocBuffer::undo_allocate() In-Reply-To: References: Message-ID: <992dcef0-48b0-52da-33fc-2027b662a4d5@oracle.com> Thanks for reviewing, Stefan! /Per On 2018-06-27 15:21, Stefan Karlsson wrote: > I'm fine with this removal. > > StefanK > > On 2018-06-26 12:45, Per Liden wrote: >> After JDK-8205676, the function >> ThreadLocalAllocBuffer::undo_allocate() is unused and can be removed. 
>> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8205679 >> Webrev: http://cr.openjdk.java.net/~pliden/8205679/webrev.0 >> >> /Per From per.liden at oracle.com Wed Jun 27 14:28:53 2018 From: per.liden at oracle.com (Per Liden) Date: Wed, 27 Jun 2018 16:28:53 +0200 Subject: RFR: 8205678: ZGC: Remove unused ZAllocationFlags::java_thread() In-Reply-To: <4d9a3f1a-f554-72c4-6634-21f21caf70d5@oracle.com> References: <0a3eba94-4c99-eff6-067a-17bf8bbb789c@oracle.com> <4d9a3f1a-f554-72c4-6634-21f21caf70d5@oracle.com> Message-ID: <9ec2b9e1-3a38-7817-d6b1-cc3b75b5ed42@oracle.com> Ah, good catch, will update that before I push. Thanks for reviewing! /Per On 2018-06-27 15:20, Stefan Karlsson wrote: > Looks good, but maybe update this comment: > // * 7-5 Unused (3-bits) > > StefanK > > On 2018-06-26 12:45, Per Liden wrote: >> After JDK-8205676, the java_thread field in the ZAllocationFlags is >> unused and can be removed. >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8205678 >> Webrev: http://cr.openjdk.java.net/~pliden/8205678/webrev.0 >> >> /Per From per.liden at oracle.com Wed Jun 27 14:42:41 2018 From: per.liden at oracle.com (Per Liden) Date: Wed, 27 Jun 2018 16:42:41 +0200 Subject: RFR: 8205923: ZGC: Verification applies load barriers before verification In-Reply-To: References: Message-ID: <27b210a3-8b2d-5dff-9ee2-3101edb4bc6f@oracle.com> Looks good! /Per On 2018-06-27 15:15, Stefan Karlsson wrote: > Hi all, > > Please review this patch to stop applying load barriers in the Mark End > verification. > > http://cr.openjdk.java.net/~stefank/8205923/webrev.01/ > https://bugs.openjdk.java.net/browse/JDK-8205923 > > The recent IN_CONCURRENT_ROOT changes introduced load barriers to our > Mark End verification. These load barriers are unnecessary because all > roots should have been fixed at that point.
> > The object oop field verification applied a load barrier to be able to > load stale j.l.Reference::referents, and as an effect applied load > barriers to all fields. This patch skips visiting the referents and uses > RawAccess loads for all other fields. > > This patch builds upon the JDK-8205922, which introduces the mechanism > to skip visiting referents in oop_iterate calls. > > Thanks, > StefanK From per.liden at oracle.com Wed Jun 27 14:42:59 2018 From: per.liden at oracle.com (Per Liden) Date: Wed, 27 Jun 2018 16:42:59 +0200 Subject: RFR: 8205922: Add reference iteration mode that skips visiting the referents In-Reply-To: <2f68cfde-0ebb-9658-3e9c-36303afc07fb@oracle.com> References: <2f68cfde-0ebb-9658-3e9c-36303afc07fb@oracle.com> Message-ID: Looks good! /Per On 2018-06-27 15:10, Stefan Karlsson wrote: > Hi all, > > Please review this patch to allow us to skip visiting the > j.l.Reference::referent during oop_iteration. > > http://cr.openjdk.java.net/~stefank/8205922/webrev.01/ > https://bugs.openjdk.java.net/browse/JDK-8205922 > > One use-case for this is when ZGC does verification at the end of the > Mark End pause. At that point we have complete marking information roots > and fields are good, but the processing of j.l.Reference::referents are > deferred to the concurrent phase after this pause. In this case we would > like to skip applying the verification closure to the referents. > > Thanks, > StefanK > From per.liden at oracle.com Wed Jun 27 14:43:27 2018 From: per.liden at oracle.com (Per Liden) Date: Wed, 27 Jun 2018 16:43:27 +0200 Subject: RFR: 8205676: ZGC: Remove TLAB allocations in relocation path In-Reply-To: <54ff68ba-7082-a49a-530f-9d1a5be64ab1@oracle.com> References: <62139932-68af-ae87-335f-c8973c3ad892@oracle.com> <54ff68ba-7082-a49a-530f-9d1a5be64ab1@oracle.com> Message-ID: <23eadb7e-4074-f728-01b6-1e1e72571465@oracle.com> Thanks Erik! /Per On 2018-06-26 18:05, Erik ?sterlund wrote: > Hi Per, > > Looks good. 
> > Thanks, > /Erik > > On 2018-06-26 12:45, Per Liden wrote: >> During Java-thread aided relocation, ZGC tries to use the TLAB to >> allocate the new object. However, this interacts badly with JEP 331: >> Low-Overhead Heap Profiling, as it distorts the profiling statistics. >> I propose we remove the use of the TALB in the relocation path, >> essentially changing a thread-local pointer bump to a uncontended >> CPU-local CAS. >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8205676 >> Webrev: http://cr.openjdk.java.net/~pliden/8205676/webrev.0 >> >> Testing: Currently running benchmarks to verify that this has no >> unexpected performance hit. >> >> I've filed two follow up RFEs (also out for review now), to clean up >> some functions that are now unused: >> https://bugs.openjdk.java.net/browse/JDK-8205678 >> https://bugs.openjdk.java.net/browse/JDK-8205679 >> >> /Per > From erik.osterlund at oracle.com Wed Jun 27 14:58:53 2018 From: erik.osterlund at oracle.com (=?UTF-8?Q?Erik_=c3=96sterlund?=) Date: Wed, 27 Jun 2018 16:58:53 +0200 Subject: RFR: 8205683: Refactor heap allocation to separate concerns In-Reply-To: <39039f93-5250-8e8e-ea8b-77b885c2c1f9@oracle.com> References: <0017cfb1-4067-6670-4fb7-6ad504e0adb1@oracle.com> <39039f93-5250-8e8e-ea8b-77b885c2c1f9@oracle.com> Message-ID: Hi Per, Thank you for the review. /Erik On 2018-06-27 16:26, Per Liden wrote: > Hi Erik, > > On 2018-06-27 15:23, Erik ?sterlund wrote: >> Hi Per, >> >> Thank you for reviewing. >> >> Incremental: >> http://cr.openjdk.java.net/~eosterlund/8205683/webrev.02_03/ >> >> Full: >> http://cr.openjdk.java.net/~eosterlund/8205683/webrev.03/ >> [...] >>> I doesn't look like the above functions need to take in TRAPS, right? >> >> It is used by callers so they can use the CHECK_NULL macro on the >> callsite. But inside, I don't need it. > > Ah, I see. I tend to think that we're only using that style when > something inside throws and exception, but anyway, keep it as is. > > [...] 
>>> src/hotspot/share/gc/shared/threadLocalAllocBuffer.hpp >>> ------------------------------------------------------ >>> >>> 143?? HeapWord* allocate_sampled_object(size_t size); >>> >>> Can we call this function allocate_sampled() to better match with >>> allocate()? >> >> Not sure where you found this function; I have removed it. The logic was > > You're right of course, ignore me. > >> instead moved into the allocate_inside_tlab_slow function. One >> alternative I considered is to pass in a bool* to >> ThreadLocalAllocBuffer::allocate() that if not NULL also tries to >> bump the end to its actual end, and returns back whether the end was >> bumped by the sampling logic or not inside of allocate, and have that >> pointer point straight into Allocation::_tlab_end_reset_for_sample. >> If you prefer that style, I can change it to do that instead. > > I like what you have now, so I'd say keep it as is. > > In summary, looks good, ship it! > > cheers, > Per > >> >> Thanks, >> /Erik >> >>> >>> cheers, >>> Per >> From jiangli.zhou at oracle.com Wed Jun 27 16:16:15 2018 From: jiangli.zhou at oracle.com (Jiangli Zhou) Date: Wed, 27 Jun 2018 09:16:15 -0700 Subject: RFR (XS) 8205702: assert(UseCompressedClassPointers) failed in universe.hpp In-Reply-To: <78b4517e-df7d-1e01-19e7-eff5ec79e1ef@oracle.com> References: <81FCAF82-25AB-4C46-86AA-3D5F1A236C58@oracle.com> <78b4517e-df7d-1e01-19e7-eff5ec79e1ef@oracle.com> Message-ID: Hi Per, Here is the tmpfs I ran into. Setting -XX:ZPath=/run/shm seems to work. [0.013s][error][gc,init] More than one tmpfs filesystem found: [0.013s][error][gc,init] /run/lock [0.013s][error][gc,init] /run/shm [0.013s][error][gc,init] Use -XX:ZPath to specify the path to a tmpfs filesystem Error: Could not create the Java Virtual Machine. Error: A fatal exception has occurred. Program will exit. Thanks, Jiangli > On Jun 27, 2018, at 12:33 AM, Per Liden wrote: > > On 06/26/2018 11:57 PM, Jiangli Zhou wrote: >> Hi Coleen, >> This looks good. 
>> Should we also disable UseSharedSpaces at runtime for ZGC in case an archive was dumped using a different GC algorithm? I ran into tmpfs error when trying to run with ZGC, so I couldn?t double check for that case... > > What kind of tmpfs errors? I would like to know if we have a bug somewhere or if it's user error. > > /Per > >> Thanks, >> Jiangli >>> On Jun 26, 2018, at 2:13 PM, coleen.phillimore at oracle.com wrote: >>> >>> Summary: Disable CDS with ZGC >>> >>> Tested with: >>> java -XX:+UnlockExperimentalVMOptions -XX:+UseZGC -Xshare:dump >>> java -XX:+UnlockExperimentalOptions -XX:+UseZGC -Xshare:on -version >>> >>> open webrev at http://cr.openjdk.java.net/~coleenp/8205702.01/webrev >>> bug link https://bugs.openjdk.java.net/browse/JDK-8205702 >>> >>> Thanks, >>> Coleen -------------- next part -------------- An HTML attachment was scrubbed... URL: From calvin.cheung at oracle.com Wed Jun 27 17:39:49 2018 From: calvin.cheung at oracle.com (Calvin Cheung) Date: Wed, 27 Jun 2018 10:39:49 -0700 Subject: RFR (XS) 8205702: assert(UseCompressedClassPointers) failed in universe.hpp In-Reply-To: <88cd8e8b-3692-ae23-7475-0fbd44692ba9@oracle.com> References: <5B32B34C.2060506@oracle.com> <1246989c-e8e8-22e9-cdc8-1253c344aac6@oracle.com> <92de1a95-e09a-82ef-e2f4-9c2a8979fd26@oracle.com> <88cd8e8b-3692-ae23-7475-0fbd44692ba9@oracle.com> Message-ID: <5B33CBE5.80502@oracle.com> Hi Per, Thanks for coming up with a simpler fix. It looks good. Just one comment below. On 6/27/18, 2:33 AM, Per Liden wrote: > Actually, that seems a bit too restrictive as > vm.cds.archived.java.heap is only true when G1 is enabled. 
> > So, this is probably even better: > > * @requires vm.cds > * @requires vm.opt.final.UseCompressedOops > * @requires vm.opt.final.UseCompressedClassPointers I think the @requires vm.cds calls into VMProps.vmCDS() which calls the WB_IsCDSIncludedInVmBuild() where it already checks for compressed oops and pointers: if (!UseCompressedOops || !UseCompressedClassPointers) { // On 64-bit VMs, CDS is supported only with compressed oops/pointers return false; } Are the last two @requires needed? thanks, Calvin > > Updated webrev: http://cr.openjdk.java.net/~pliden/8205702/webrev.2 > > /Per > > On 06/27/2018 10:37 AM, Per Liden wrote: >> Updated webrev, which adjusts the @requires tag, from: >> >> @requires vm.cds & vm.gc != "Z" >> >> to: >> >> @requires vm.cds.archived.java.heap >> >> which I believe is more correct in this case. >> >> http://cr.openjdk.java.net/~pliden/8205702/webrev.1 >> >> cheers, >> Per >> >> >> On 06/27/2018 09:15 AM, Per Liden wrote: >>> Hi Coleen, >>> >>> This doesn't look quite right to me. ZGC already disables >>> UseCompressedOops and UseCompressedClassPointers, which should be the >>> indicators that we can't use CDS. The problem is that CDS checks >>> those flags _before_ the heap has had a chance to say what it >>> supports. So if we just move the call to set_shared_spaces_flags() >>> after the call to GCConfig::arguments()->initialize() (which should >>> be safe), then we're all good and you'll get the usual: >>> >>> $ ./build/fastdebug/images/jdk/bin/java -Xshare:dump >>> -XX:+UnlockExperimentalVMOptions -XX:+UseZGC >>> Error occurred during initialization of VM >>> Cannot dump shared archive when UseCompressedOops or >>> UseCompressedClassPointers is off.
>>> >>> Here's a proposed patch for this, which also adjusts the appropriate >>> tests for this: >>> >>> http://cr.openjdk.java.net/~pliden/8205702/webrev.0 >>> >>> cheers, >>> Per >>> >>> On 06/27/2018 01:09 AM, coleen.phillimore at oracle.com wrote: >>>> >>>> Hi Calvin, thank you for reporting the bug and the code review and >>>> test code. >>>> >>>> On 6/26/18 5:42 PM, Calvin Cheung wrote: >>>>> Hi Coleen, >>>>> >>>>> The code changes look good. >>>>> >>>>> Since there's a new error message, I'd suggest adding a test to >>>>> runtime/SharedArchiveFile/SharedArchiveFile.java as follows: >>>>> >>>>> diff --git >>>>> a/test/hotspot/jtreg/runtime/SharedArchiveFile/SharedArchiveFile.java >>>>> b/test/hotspot/jtreg/runtime/SharedArchiveFile/SharedArchiveFile.java >>>>> --- >>>>> a/test/hotspot/jtreg/runtime/SharedArchiveFile/SharedArchiveFile.java >>>>> +++ >>>>> b/test/hotspot/jtreg/runtime/SharedArchiveFile/SharedArchiveFile.java >>>>> @@ -52,5 +52,13 @@ >>>>> "-Xshare:on", "-version"); >>>>> out = CDSTestUtils.executeAndLog(pb, "SharedArchiveFile"); >>>>> CDSTestUtils.checkExec(out); >>>>> + >>>>> + // CDS dumping doesn't work with ZGC >>>>> + ProcessBuilder pb = >>>>> ProcessTools.createJavaProcessBuilder(true, >>>>> + "-XX:SharedArchiveFile=./SharedArchiveFile.jsa", >>>>> + "-XX:+UseZGC", >>>>> + "-Xshare:dump"); >>>>> + out = CDSTestUtils.executeAndLog(pb, "SharedArchiveFile"); >>>>> + CDSTestUtils.checkExecExpectError(out, 1, >>>>> "DumpSharedSpaces (-Xshare:dump) is not supported with ZGC."); >>>>> } >>>>> } >>>>> >>>>> (I haven't tested the above) >>>> >>>> It needed an -XX:+UnlockExperimentalVMOptions as well, and not >>>> reclare pb. >>>> >>>> open webrev at http://cr.openjdk.java.net/~coleenp/8205702.02/webrev >>>> >>>>> >>>>> Also, I think the new error message should be included in the >>>>> release notes. >>>>> >>>> >>>> I added the test case and it passes. 
I don't think having a >>>> release note for something that nobody would ever do for an >>>> experimental option is worth having. But I can look into the ZGC >>>> release notes and see if there's something that says CDS is not >>>> supported. >>>> >>>> Thanks, >>>> Coleen >>>>> thanks, >>>>> Calvin >>>>> >>>>> On 6/26/18, 2:13 PM, coleen.phillimore at oracle.com wrote: >>>>>> Summary: Disable CDS with ZGC >>>>>> >>>>>> Tested with: >>>>>> java -XX:+UnlockExperimentalVMOptions -XX:+UseZGC -Xshare:dump >>>>>> java -XX:+UnlockExperimentalOptions -XX:+UseZGC -Xshare:on -version >>>>>> >>>>>> open webrev at http://cr.openjdk.java.net/~coleenp/8205702.01/webrev >>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8205702 >>>>>> >>>>>> Thanks, >>>>>> Coleen >>>> From per.liden at oracle.com Wed Jun 27 18:08:30 2018 From: per.liden at oracle.com (Per Liden) Date: Wed, 27 Jun 2018 20:08:30 +0200 Subject: RFR (XS) 8205702: assert(UseCompressedClassPointers) failed in universe.hpp In-Reply-To: <5B33CBE5.80502@oracle.com> References: <5B32B34C.2060506@oracle.com> <1246989c-e8e8-22e9-cdc8-1253c344aac6@oracle.com> <92de1a95-e09a-82ef-e2f4-9c2a8979fd26@oracle.com> <88cd8e8b-3692-ae23-7475-0fbd44692ba9@oracle.com> <5B33CBE5.80502@oracle.com> Message-ID: <268d97b8-d174-f60b-ae78-be760bbd0075@oracle.com> Hi Calvin, On 06/27/2018 07:39 PM, Calvin Cheung wrote: > Hi Per, > > Thanks for coming up with a simpler fix. It looks good. Just one comment > below. > > On 6/27/18, 2:33 AM, Per Liden wrote: >> Actually, that seems a bit too restrictive as >> vm.cds.archived.java.heap is only true when G1 is enabled. 
>> >> So, this is probably even better: >> >> * @requires vm.cds >> * @requires vm.opt.final.UseCompressedOops >> * @requires vm.opt.final.UseCompressedClassPointers > I think the @requires vm.cds calls into VMProps.vmCDS() which calls the > WB_IsCDSIncludedInVmBuild() where it already checks for compressed oops > and pointers: > > if (!UseCompressedOops || !UseCompressedClassPointers) { > // On 64-bit VMs, CDS is supported only with compressed > oops/pointers > return false; > } > > Are the last two @requires needed? That's an excellent point, and you're right, those extra @requires are not needed. New webrev: http://cr.openjdk.java.net/~pliden/8205702/webrev.3 Thanks for reviewing! cheers, Per > > thanks, > Calvin >> >> Updated webrev: http://cr.openjdk.java.net/~pliden/8205702/webrev.2 >> >> /Per >> >> On 06/27/2018 10:37 AM, Per Liden wrote: >>> Updated webrev, which adjusts the @requires tag, from: >>> >>> @requires vm.cds & vm.gc != "Z" >>> >>> to: >>> >>> @requires vm.cds.archived.java.heap >>> >>> which I believe is more correct in this case. >>> >>> http://cr.openjdk.java.net/~pliden/8205702/webrev.1 >>> >>> cheers, >>> Per >>> >>> >>> On 06/27/2018 09:15 AM, Per Liden wrote: >>>> Hi Coleen, >>>> >>>> This doesn't look quite right to me. ZGC already disables >>>> UseCompressedOop and UseCompressedClassPointers, which should be the >>>> indicators that we can't use CDS. The problem is that CDS checks >>>> those flags _before_ the heap has had a change to say they it >>>> supports. So if we just move the call to set_shared_spaces_flags() >>>> after the call to GCConfig::arguments()->initialize() (which should >>>> be safe), then we're all good and you'll get the usual: >>>> >>>> $ ./build/fastdebug/images/jdk/bin/java -Xshare:dump >>>> -XX:+UnlockExperimentalVMOptions -XX:+UseZGC >>>> Error occurred during initialization of VM >>>> Cannot dump shared archive when UseCompressedOops or >>>> UseCompressedClassPointers is off. 
>>>> >>>> Here's a proposed patch for this, which also adjusts the appropriate >>>> tests for this: >>>> >>>> http://cr.openjdk.java.net/~pliden/8205702/webrev.0 >>>> >>>> cheers, >>>> Per >>>> >>>> On 06/27/2018 01:09 AM, coleen.phillimore at oracle.com wrote: >>>>> >>>>> Hi Calvin, thank you for reporting the bug and the code review and >>>>> test code. >>>>> >>>>> On 6/26/18 5:42 PM, Calvin Cheung wrote: >>>>>> Hi Coleen, >>>>>> >>>>>> The code changes look good. >>>>>> >>>>>> Since there's a new error message, I'd suggest adding a test to >>>>>> runtime/SharedArchiveFile/SharedArchiveFile.java as follows: >>>>>> >>>>>> diff --git >>>>>> a/test/hotspot/jtreg/runtime/SharedArchiveFile/SharedArchiveFile.java >>>>>> b/test/hotspot/jtreg/runtime/SharedArchiveFile/SharedArchiveFile.java >>>>>> --- >>>>>> a/test/hotspot/jtreg/runtime/SharedArchiveFile/SharedArchiveFile.java >>>>>> +++ >>>>>> b/test/hotspot/jtreg/runtime/SharedArchiveFile/SharedArchiveFile.java >>>>>> @@ -52,5 +52,13 @@ >>>>>> "-Xshare:on", "-version"); >>>>>> out = CDSTestUtils.executeAndLog(pb, "SharedArchiveFile"); >>>>>> CDSTestUtils.checkExec(out); >>>>>> + >>>>>> + // CDS dumping doesn't work with ZGC >>>>>> + ProcessBuilder pb = >>>>>> ProcessTools.createJavaProcessBuilder(true, >>>>>> + "-XX:SharedArchiveFile=./SharedArchiveFile.jsa", >>>>>> + "-XX:+UseZGC", >>>>>> + "-Xshare:dump"); >>>>>> + out = CDSTestUtils.executeAndLog(pb, "SharedArchiveFile"); >>>>>> + CDSTestUtils.checkExecExpectError(out, 1, >>>>>> "DumpSharedSpaces (-Xshare:dump) is not supported with ZGC."); >>>>>> } >>>>>> } >>>>>> >>>>>> (I haven't tested the above) >>>>> >>>>> It needed an -XX:+UnlockExperimentalVMOptions as well, and not >>>>> reclare pb. >>>>> >>>>> open webrev at http://cr.openjdk.java.net/~coleenp/8205702.02/webrev >>>>> >>>>>> >>>>>> Also, I think the new error message should be included in the >>>>>> release notes. >>>>>> >>>>> >>>>> I added the test case and it passes. 
I don't think having a >>>>> release note for something that nobody would ever do for an >>>>> experimental option is worth having. But I can look into the ZGC >>>>> release notes and see if there's something that says CDS is not >>>>> supported. >>>>> >>>>> Thanks, >>>>> Coleen >>>>>> thanks, >>>>>> Calvin >>>>>> >>>>>> On 6/26/18, 2:13 PM, coleen.phillimore at oracle.com wrote: >>>>>>> Summary: Disable CDS with ZGC >>>>>>> >>>>>>> Tested with: >>>>>>> java -XX:+UnlockExperimentalVMOptions -XX:+UseZGC -Xshare:dump >>>>>>> java -XX:+UnlockExperimentalOptions -XX:+UseZGC -Xshare:on -version >>>>>>> >>>>>>> open webrev at http://cr.openjdk.java.net/~coleenp/8205702.01/webrev >>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8205702 >>>>>>> >>>>>>> Thanks, >>>>>>> Coleen >>>>> From per.liden at oracle.com Wed Jun 27 18:14:10 2018 From: per.liden at oracle.com (Per Liden) Date: Wed, 27 Jun 2018 20:14:10 +0200 Subject: RFR (XS) 8205702: assert(UseCompressedClassPointers) failed in universe.hpp In-Reply-To: References: <81FCAF82-25AB-4C46-86AA-3D5F1A236C58@oracle.com> <78b4517e-df7d-1e01-19e7-eff5ec79e1ef@oracle.com> Message-ID: Hi Jiangli, On 06/27/2018 06:16 PM, Jiangli Zhou wrote: > Hi Per, > > Here is the tmpfs I ran into. Setting -XX:ZPath=/run/shm seems to work. Ok, good! Setting -XX:ZPath is needed when running on Linux kernels < 3.17 _and_ there are multiple tmpfs filesystems mounted, in which case ZGC needs some guidance on which one to use. When running on kernels >= 3.17 we instead use memfd_create() to get a tmpfs file handle, and -XX:ZPath is never needed. cheers, Per > > [0.013s][error][gc,init] More than one tmpfs filesystem found: > [0.013s][error][gc,init] /run/lock > [0.013s][error][gc,init] /run/shm > [0.013s][error][gc,init] Use -XX:ZPath to specify the path to a tmpfs > filesystem > Error: Could not create the Java Virtual Machine. > Error: A fatal exception has occurred. Program will exit. 
> > > Thanks, > Jiangli > >> On Jun 27, 2018, at 12:33 AM, Per Liden > > wrote: >> >> On 06/26/2018 11:57 PM, Jiangli Zhou wrote: >>> Hi Coleen, >>> This looks good. >>> Should we also disable UseSharedSpaces at runtime for ZGC in case an >>> archive was dumped using a different GC algorithm? I ran into tmpfs >>> error when trying to run with ZGC, so I couldn?t double check for >>> that case... >> >> What kind of tmpfs errors? I would like to know if we have a bug >> somewhere or if it's user error. >> >> /Per >> >>> Thanks, >>> Jiangli >>>> On Jun 26, 2018, at 2:13 PM, coleen.phillimore at oracle.com >>>> wrote: >>>> >>>> Summary: Disable CDS with ZGC >>>> >>>> Tested with: >>>> java -XX:+UnlockExperimentalVMOptions -XX:+UseZGC -Xshare:dump >>>> java -XX:+UnlockExperimentalOptions -XX:+UseZGC -Xshare:on -version >>>> >>>> open webrev at http://cr.openjdk.java.net/~coleenp/8205702.01/webrev >>>> bug link https://bugs.openjdk.java.net/browse/JDK-8205702 >>>> >>>> Thanks, >>>> Coleen > From calvin.cheung at oracle.com Wed Jun 27 18:29:12 2018 From: calvin.cheung at oracle.com (Calvin Cheung) Date: Wed, 27 Jun 2018 11:29:12 -0700 Subject: RFR (XS) 8205702: assert(UseCompressedClassPointers) failed in universe.hpp In-Reply-To: <268d97b8-d174-f60b-ae78-be760bbd0075@oracle.com> References: <5B32B34C.2060506@oracle.com> <1246989c-e8e8-22e9-cdc8-1253c344aac6@oracle.com> <92de1a95-e09a-82ef-e2f4-9c2a8979fd26@oracle.com> <88cd8e8b-3692-ae23-7475-0fbd44692ba9@oracle.com> <5B33CBE5.80502@oracle.com> <268d97b8-d174-f60b-ae78-be760bbd0075@oracle.com> Message-ID: <5B33D778.5010500@oracle.com> On 6/27/18, 11:08 AM, Per Liden wrote: > Hi Calvin, > > On 06/27/2018 07:39 PM, Calvin Cheung wrote: >> Hi Per, >> >> Thanks for coming up with a simpler fix. It looks good. Just one >> comment below. >> >> On 6/27/18, 2:33 AM, Per Liden wrote: >>> Actually, that seems a bit too restrictive as >>> vm.cds.archived.java.heap is only true when G1 is enabled. 
>>> >>> So, this is probably even better: >>> >>> * @requires vm.cds >>> * @requires vm.opt.final.UseCompressedOops >>> * @requires vm.opt.final.UseCompressedClassPointers >> I think the @requires vm.cds calls into VMProps.vmCDS() which calls >> the WB_IsCDSIncludedInVmBuild() where it already checks for >> compressed oops and pointers: >> >> if (!UseCompressedOops || !UseCompressedClassPointers) { >> // On 64-bit VMs, CDS is supported only with compressed >> oops/pointers >> return false; >> } >> >> Are the last two @requires needed? > > That's an excellent point, and you're right, those extra @requires are > not needed. New webrev: > > http://cr.openjdk.java.net/~pliden/8205702/webrev.3 Looks good. thanks, Calvin > > Thanks for reviewing! > > cheers, > Per > >> >> thanks, >> Calvin >>> >>> Updated webrev: http://cr.openjdk.java.net/~pliden/8205702/webrev.2 >>> >>> /Per >>> >>> On 06/27/2018 10:37 AM, Per Liden wrote: >>>> Updated webrev, which adjusts the @requires tag, from: >>>> >>>> @requires vm.cds & vm.gc != "Z" >>>> >>>> to: >>>> >>>> @requires vm.cds.archived.java.heap >>>> >>>> which I believe is more correct in this case. >>>> >>>> http://cr.openjdk.java.net/~pliden/8205702/webrev.1 >>>> >>>> cheers, >>>> Per >>>> >>>> >>>> On 06/27/2018 09:15 AM, Per Liden wrote: >>>>> Hi Coleen, >>>>> >>>>> This doesn't look quite right to me. ZGC already disables >>>>> UseCompressedOop and UseCompressedClassPointers, which should be >>>>> the indicators that we can't use CDS. The problem is that CDS >>>>> checks those flags _before_ the heap has had a change to say they >>>>> it supports. 
So if we just move the call to >>>>> set_shared_spaces_flags() after the call to >>>>> GCConfig::arguments()->initialize() (which should be safe), then >>>>> we're all good and you'll get the usual: >>>>> >>>>> $ ./build/fastdebug/images/jdk/bin/java -Xshare:dump >>>>> -XX:+UnlockExperimentalVMOptions -XX:+UseZGC >>>>> Error occurred during initialization of VM >>>>> Cannot dump shared archive when UseCompressedOops or >>>>> UseCompressedClassPointers is off. >>>>> >>>>> Here's a proposed patch for this, which also adjusts the >>>>> appropriate tests for this: >>>>> >>>>> http://cr.openjdk.java.net/~pliden/8205702/webrev.0 >>>>> >>>>> cheers, >>>>> Per >>>>> >>>>> On 06/27/2018 01:09 AM, coleen.phillimore at oracle.com wrote: >>>>>> >>>>>> Hi Calvin, thank you for reporting the bug and the code review >>>>>> and test code. >>>>>> >>>>>> On 6/26/18 5:42 PM, Calvin Cheung wrote: >>>>>>> Hi Coleen, >>>>>>> >>>>>>> The code changes look good. >>>>>>> >>>>>>> Since there's a new error message, I'd suggest adding a test to >>>>>>> runtime/SharedArchiveFile/SharedArchiveFile.java as follows: >>>>>>> >>>>>>> diff --git >>>>>>> a/test/hotspot/jtreg/runtime/SharedArchiveFile/SharedArchiveFile.java >>>>>>> b/test/hotspot/jtreg/runtime/SharedArchiveFile/SharedArchiveFile.java >>>>>>> >>>>>>> --- >>>>>>> a/test/hotspot/jtreg/runtime/SharedArchiveFile/SharedArchiveFile.java >>>>>>> >>>>>>> +++ >>>>>>> b/test/hotspot/jtreg/runtime/SharedArchiveFile/SharedArchiveFile.java >>>>>>> >>>>>>> @@ -52,5 +52,13 @@ >>>>>>> "-Xshare:on", "-version"); >>>>>>> out = CDSTestUtils.executeAndLog(pb, "SharedArchiveFile"); >>>>>>> CDSTestUtils.checkExec(out); >>>>>>> + >>>>>>> + // CDS dumping doesn't work with ZGC >>>>>>> + ProcessBuilder pb = >>>>>>> ProcessTools.createJavaProcessBuilder(true, >>>>>>> + "-XX:SharedArchiveFile=./SharedArchiveFile.jsa", >>>>>>> + "-XX:+UseZGC", >>>>>>> + "-Xshare:dump"); >>>>>>> + out = CDSTestUtils.executeAndLog(pb, "SharedArchiveFile"); >>>>>>> + 
CDSTestUtils.checkExecExpectError(out, 1, >>>>>>> "DumpSharedSpaces (-Xshare:dump) is not supported with ZGC."); >>>>>>> } >>>>>>> } >>>>>>> >>>>>>> (I haven't tested the above) >>>>>> >>>>>> It needed an -XX:+UnlockExperimentalVMOptions as well, and not >>>>>> reclare pb. >>>>>> >>>>>> open webrev at http://cr.openjdk.java.net/~coleenp/8205702.02/webrev >>>>>> >>>>>>> >>>>>>> Also, I think the new error message should be included in the >>>>>>> release notes. >>>>>>> >>>>>> >>>>>> I added the test case and it passes. I don't think having a >>>>>> release note for something that nobody would ever do for an >>>>>> experimental option is worth having. But I can look into the >>>>>> ZGC release notes and see if there's something that says CDS is >>>>>> not supported. >>>>>> >>>>>> Thanks, >>>>>> Coleen >>>>>>> thanks, >>>>>>> Calvin >>>>>>> >>>>>>> On 6/26/18, 2:13 PM, coleen.phillimore at oracle.com wrote: >>>>>>>> Summary: Disable CDS with ZGC >>>>>>>> >>>>>>>> Tested with: >>>>>>>> java -XX:+UnlockExperimentalVMOptions -XX:+UseZGC -Xshare:dump >>>>>>>> java -XX:+UnlockExperimentalOptions -XX:+UseZGC -Xshare:on >>>>>>>> -version >>>>>>>> >>>>>>>> open webrev at >>>>>>>> http://cr.openjdk.java.net/~coleenp/8205702.01/webrev >>>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8205702 >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Coleen >>>>>> From per.liden at oracle.com Wed Jun 27 18:34:59 2018 From: per.liden at oracle.com (Per Liden) Date: Wed, 27 Jun 2018 20:34:59 +0200 Subject: RFR (XS) 8205702: assert(UseCompressedClassPointers) failed in universe.hpp In-Reply-To: <5B33D778.5010500@oracle.com> References: <5B32B34C.2060506@oracle.com> <1246989c-e8e8-22e9-cdc8-1253c344aac6@oracle.com> <92de1a95-e09a-82ef-e2f4-9c2a8979fd26@oracle.com> <88cd8e8b-3692-ae23-7475-0fbd44692ba9@oracle.com> <5B33CBE5.80502@oracle.com> <268d97b8-d174-f60b-ae78-be760bbd0075@oracle.com> <5B33D778.5010500@oracle.com> Message-ID: <1599422d-c7e1-a957-cd08-21ae1d27b4ef@oracle.com> On 06/27/2018 
08:29 PM, Calvin Cheung wrote: > > > On 6/27/18, 11:08 AM, Per Liden wrote: >> Hi Calvin, >> >> On 06/27/2018 07:39 PM, Calvin Cheung wrote: >>> Hi Per, >>> >>> Thanks for coming up with a simpler fix. It looks good. Just one >>> comment below. >>> >>> On 6/27/18, 2:33 AM, Per Liden wrote: >>>> Actually, that seems a bit too restrictive as >>>> vm.cds.archived.java.heap is only true when G1 is enabled. >>>> >>>> So, this is probably even better: >>>> >>>> * @requires vm.cds >>>> * @requires vm.opt.final.UseCompressedOops >>>> * @requires vm.opt.final.UseCompressedClassPointers >>> I think the @requires vm.cds calls into VMProps.vmCDS() which calls >>> the WB_IsCDSIncludedInVmBuild() where it already checks for >>> compressed oops and pointers: >>> >>> if (!UseCompressedOops || !UseCompressedClassPointers) { >>> // On 64-bit VMs, CDS is supported only with compressed >>> oops/pointers >>> return false; >>> } >>> >>> Are the last two @requires needed? >> >> That's an excellent point, and you're right, those extra @requires are >> not needed. New webrev: >> >> http://cr.openjdk.java.net/~pliden/8205702/webrev.3 > Looks good. Thanks Calvin! /Per > > thanks, > Calvin >> >> Thanks for reviewing! >> >> cheers, >> Per >> >>> >>> thanks, >>> Calvin >>>> >>>> Updated webrev: http://cr.openjdk.java.net/~pliden/8205702/webrev.2 >>>> >>>> /Per >>>> >>>> On 06/27/2018 10:37 AM, Per Liden wrote: >>>>> Updated webrev, which adjusts the @requires tag, from: >>>>> >>>>> @requires vm.cds & vm.gc != "Z" >>>>> >>>>> to: >>>>> >>>>> @requires vm.cds.archived.java.heap >>>>> >>>>> which I believe is more correct in this case. >>>>> >>>>> http://cr.openjdk.java.net/~pliden/8205702/webrev.1 >>>>> >>>>> cheers, >>>>> Per >>>>> >>>>> >>>>> On 06/27/2018 09:15 AM, Per Liden wrote: >>>>>> Hi Coleen, >>>>>> >>>>>> This doesn't look quite right to me. ZGC already disables >>>>>> UseCompressedOop and UseCompressedClassPointers, which should be >>>>>> the indicators that we can't use CDS. 
The problem is that CDS >>>>>> checks those flags _before_ the heap has had a change to say they >>>>>> it supports. So if we just move the call to >>>>>> set_shared_spaces_flags() after the call to >>>>>> GCConfig::arguments()->initialize() (which should be safe), then >>>>>> we're all good and you'll get the usual: >>>>>> >>>>>> $ ./build/fastdebug/images/jdk/bin/java -Xshare:dump >>>>>> -XX:+UnlockExperimentalVMOptions -XX:+UseZGC >>>>>> Error occurred during initialization of VM >>>>>> Cannot dump shared archive when UseCompressedOops or >>>>>> UseCompressedClassPointers is off. >>>>>> >>>>>> Here's a proposed patch for this, which also adjusts the >>>>>> appropriate tests for this: >>>>>> >>>>>> http://cr.openjdk.java.net/~pliden/8205702/webrev.0 >>>>>> >>>>>> cheers, >>>>>> Per >>>>>> >>>>>> On 06/27/2018 01:09 AM, coleen.phillimore at oracle.com wrote: >>>>>>> >>>>>>> Hi Calvin, thank you for reporting the bug and the code review >>>>>>> and test code. >>>>>>> >>>>>>> On 6/26/18 5:42 PM, Calvin Cheung wrote: >>>>>>>> Hi Coleen, >>>>>>>> >>>>>>>> The code changes look good. 
>>>>>>>> >>>>>>>> Since there's a new error message, I'd suggest adding a test to >>>>>>>> runtime/SharedArchiveFile/SharedArchiveFile.java as follows: >>>>>>>> >>>>>>>> diff --git >>>>>>>> a/test/hotspot/jtreg/runtime/SharedArchiveFile/SharedArchiveFile.java >>>>>>>> b/test/hotspot/jtreg/runtime/SharedArchiveFile/SharedArchiveFile.java >>>>>>>> >>>>>>>> --- >>>>>>>> a/test/hotspot/jtreg/runtime/SharedArchiveFile/SharedArchiveFile.java >>>>>>>> >>>>>>>> +++ >>>>>>>> b/test/hotspot/jtreg/runtime/SharedArchiveFile/SharedArchiveFile.java >>>>>>>> >>>>>>>> @@ -52,5 +52,13 @@ >>>>>>>> "-Xshare:on", "-version"); >>>>>>>> out = CDSTestUtils.executeAndLog(pb, "SharedArchiveFile"); >>>>>>>> CDSTestUtils.checkExec(out); >>>>>>>> + >>>>>>>> + // CDS dumping doesn't work with ZGC >>>>>>>> + ProcessBuilder pb = >>>>>>>> ProcessTools.createJavaProcessBuilder(true, >>>>>>>> + "-XX:SharedArchiveFile=./SharedArchiveFile.jsa", >>>>>>>> + "-XX:+UseZGC", >>>>>>>> + "-Xshare:dump"); >>>>>>>> + out = CDSTestUtils.executeAndLog(pb, "SharedArchiveFile"); >>>>>>>> + CDSTestUtils.checkExecExpectError(out, 1, >>>>>>>> "DumpSharedSpaces (-Xshare:dump) is not supported with ZGC."); >>>>>>>> } >>>>>>>> } >>>>>>>> >>>>>>>> (I haven't tested the above) >>>>>>> >>>>>>> It needed an -XX:+UnlockExperimentalVMOptions as well, and not >>>>>>> reclare pb. >>>>>>> >>>>>>> open webrev at http://cr.openjdk.java.net/~coleenp/8205702.02/webrev >>>>>>> >>>>>>>> >>>>>>>> Also, I think the new error message should be included in the >>>>>>>> release notes. >>>>>>>> >>>>>>> >>>>>>> I added the test case and it passes. I don't think having a >>>>>>> release note for something that nobody would ever do for an >>>>>>> experimental option is worth having. But I can look into the >>>>>>> ZGC release notes and see if there's something that says CDS is >>>>>>> not supported. 
>>>>>>> >>>>>>> Thanks, >>>>>>> Coleen >>>>>>>> thanks, >>>>>>>> Calvin >>>>>>>> >>>>>>>> On 6/26/18, 2:13 PM, coleen.phillimore at oracle.com wrote: >>>>>>>>> Summary: Disable CDS with ZGC >>>>>>>>> >>>>>>>>> Tested with: >>>>>>>>> java -XX:+UnlockExperimentalVMOptions -XX:+UseZGC -Xshare:dump >>>>>>>>> java -XX:+UnlockExperimentalOptions -XX:+UseZGC -Xshare:on >>>>>>>>> -version >>>>>>>>> >>>>>>>>> open webrev at >>>>>>>>> http://cr.openjdk.java.net/~coleenp/8205702.01/webrev >>>>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8205702 >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Coleen >>>>>>> From zgu at redhat.com Wed Jun 27 18:39:10 2018 From: zgu at redhat.com (Zhengyu Gu) Date: Wed, 27 Jun 2018 14:39:10 -0400 Subject: RFR(S) 8205921: Optimizing best-of-2 work stealing queue selection Message-ID: <904c2ea5-0935-2c4d-fcbd-6b90238b4dc4@redhat.com> Hi, Please review this small enhancement, based on paper [1], that keeps the last successfully stolen queue as one of the best-of-2 candidates for work stealing. Based on experiments done by Thomas Schatzl and myself, it shows positive impacts on task termination and average pause time.
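[Editor's aside] The heuristic Zhengyu describes — classic best-of-2 victim selection, extended to remember the last queue a steal succeeded from — can be sketched in a few lines. This Java toy is not HotSpot's actual C++ taskqueue code and all names are invented; it picks two candidate queues (the remembered victim when valid, plus a random one) and steals from whichever currently holds more work:

```java
import java.util.ArrayDeque;
import java.util.Random;

// Toy model of best-of-2 work stealing with a "last successful victim" hint.
// Hypothetical names; HotSpot's real implementation lives in the C++ taskqueue.
class StealSelector {
    final ArrayDeque<Integer>[] queues; // one work queue per worker thread
    final Random rnd = new Random();
    int lastVictim = -1;                // -1: no successful steal recorded yet

    @SuppressWarnings("unchecked")
    StealSelector(int workers) {
        queues = new ArrayDeque[workers];
        for (int i = 0; i < workers; i++) {
            queues[i] = new ArrayDeque<>();
        }
    }

    // Candidate 1 is the remembered victim (when valid), candidate 2 is random;
    // steal from whichever of the two currently looks fuller.
    Integer steal(int self) {
        int c1 = (lastVictim >= 0 && lastVictim != self) ? lastVictim : randomOther(self);
        int c2 = randomOther(self);
        int victim = queues[c1].size() >= queues[c2].size() ? c1 : c2;
        Integer task = queues[victim].pollFirst();
        lastVictim = (task != null) ? victim : -1;  // keep the hint only on success
        return task;
    }

    private int randomOther(int self) {
        int v;
        do { v = rnd.nextInt(queues.length); } while (v == self);
        return v;
    }
}
```

Retaining lastVictim biases one of the two probes toward a queue that recently had surplus work, which is the locality argument behind the cited paper's improvement.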
Bug: https://bugs.openjdk.java.net/browse/JDK-8205921 Webrev: http://cr.openjdk.java.net/~zgu/8205921/webrev.00/index.html Test: hotspot_gc on Linux 64 (fastdebug and release) [1] Characterizing and Optimizing Hotspot Parallel Garbage Collection on Multicore Systems http://ranger.uta.edu/~jrao/papers/EuroSys18.pdf Thanks, -Zhengyu From per.liden at oracle.com Wed Jun 27 21:22:25 2018 From: per.liden at oracle.com (Per Liden) Date: Wed, 27 Jun 2018 23:22:25 +0200 Subject: RFR (XS) 8205702: assert(UseCompressedClassPointers) failed in universe.hpp In-Reply-To: <1599422d-c7e1-a957-cd08-21ae1d27b4ef@oracle.com> References: <5B32B34C.2060506@oracle.com> <1246989c-e8e8-22e9-cdc8-1253c344aac6@oracle.com> <92de1a95-e09a-82ef-e2f4-9c2a8979fd26@oracle.com> <88cd8e8b-3692-ae23-7475-0fbd44692ba9@oracle.com> <5B33CBE5.80502@oracle.com> <268d97b8-d174-f60b-ae78-be760bbd0075@oracle.com> <5B33D778.5010500@oracle.com> <1599422d-c7e1-a957-cd08-21ae1d27b4ef@oracle.com> Message-ID: <1c1418f1-a97e-517c-49aa-97b4e284df48@oracle.com> Sorry, but I noticed that the IncompatibleOptions.java assumes that ZGC is always available, but it's only available on linux-x64, so this will fail on other platforms. Updated webrev to take this into account. It's currently running in mach5 t{1-3} on all platforms. http://cr.openjdk.java.net/~pliden/8205702/webrev.4 /Per On 06/27/2018 08:34 PM, Per Liden wrote: > On 06/27/2018 08:29 PM, Calvin Cheung wrote: >> >> >> On 6/27/18, 11:08 AM, Per Liden wrote: >>> Hi Calvin, >>> >>> On 06/27/2018 07:39 PM, Calvin Cheung wrote: >>>> Hi Per, >>>> >>>> Thanks for coming up with a simpler fix. It looks good. Just one >>>> comment below. >>>> >>>> On 6/27/18, 2:33 AM, Per Liden wrote: >>>>> Actually, that seems a bit too restrictive as >>>>> vm.cds.archived.java.heap is only true when G1 is enabled. 
>>>>> >>>>> So, this is probably even better: >>>>> >>>>> * @requires vm.cds >>>>> * @requires vm.opt.final.UseCompressedOops >>>>> * @requires vm.opt.final.UseCompressedClassPointers >>>> I think the @requires vm.cds calls into VMProps.vmCDS() which calls >>>> the WB_IsCDSIncludedInVmBuild() where it already checks for >>>> compressed oops and pointers: >>>> >>>> if (!UseCompressedOops || !UseCompressedClassPointers) { >>>> // On 64-bit VMs, CDS is supported only with compressed >>>> oops/pointers >>>> return false; >>>> } >>>> >>>> Are the last two @requires needed? >>> >>> That's an excellent point, and you're right, those extra @requires >>> are not needed. New webrev: >>> >>> http://cr.openjdk.java.net/~pliden/8205702/webrev.3 >> Looks good. > > Thanks Calvin! > > /Per > >> >> thanks, >> Calvin >>> >>> Thanks for reviewing! >>> >>> cheers, >>> Per >>> >>>> >>>> thanks, >>>> Calvin >>>>> >>>>> Updated webrev: http://cr.openjdk.java.net/~pliden/8205702/webrev.2 >>>>> >>>>> /Per >>>>> >>>>> On 06/27/2018 10:37 AM, Per Liden wrote: >>>>>> Updated webrev, which adjusts the @requires tag, from: >>>>>> >>>>>> @requires vm.cds & vm.gc != "Z" >>>>>> >>>>>> to: >>>>>> >>>>>> @requires vm.cds.archived.java.heap >>>>>> >>>>>> which I believe is more correct in this case. >>>>>> >>>>>> http://cr.openjdk.java.net/~pliden/8205702/webrev.1 >>>>>> >>>>>> cheers, >>>>>> Per >>>>>> >>>>>> >>>>>> On 06/27/2018 09:15 AM, Per Liden wrote: >>>>>>> Hi Coleen, >>>>>>> >>>>>>> This doesn't look quite right to me. ZGC already disables >>>>>>> UseCompressedOop and UseCompressedClassPointers, which should be >>>>>>> the indicators that we can't use CDS. The problem is that CDS >>>>>>> checks those flags _before_ the heap has had a change to say they >>>>>>> it supports. 
So if we just move the call to >>>>>>> set_shared_spaces_flags() after the call to >>>>>>> GCConfig::arguments()->initialize() (which should be safe), then >>>>>>> we're all good and you'll get the usual: >>>>>>> >>>>>>> $ ./build/fastdebug/images/jdk/bin/java -Xshare:dump >>>>>>> -XX:+UnlockExperimentalVMOptions -XX:+UseZGC >>>>>>> Error occurred during initialization of VM >>>>>>> Cannot dump shared archive when UseCompressedOops or >>>>>>> UseCompressedClassPointers is off. >>>>>>> >>>>>>> Here's a proposed patch for this, which also adjusts the >>>>>>> appropriate tests for this: >>>>>>> >>>>>>> http://cr.openjdk.java.net/~pliden/8205702/webrev.0 >>>>>>> >>>>>>> cheers, >>>>>>> Per >>>>>>> >>>>>>> On 06/27/2018 01:09 AM, coleen.phillimore at oracle.com wrote: >>>>>>>> >>>>>>>> Hi Calvin, thank you for reporting the bug and the code review >>>>>>>> and test code. >>>>>>>> >>>>>>>> On 6/26/18 5:42 PM, Calvin Cheung wrote: >>>>>>>>> Hi Coleen, >>>>>>>>> >>>>>>>>> The code changes look good. 
>>>>>>>>> >>>>>>>>> Since there's a new error message, I'd suggest adding a test to >>>>>>>>> runtime/SharedArchiveFile/SharedArchiveFile.java as follows: >>>>>>>>> >>>>>>>>> diff --git >>>>>>>>> a/test/hotspot/jtreg/runtime/SharedArchiveFile/SharedArchiveFile.java >>>>>>>>> b/test/hotspot/jtreg/runtime/SharedArchiveFile/SharedArchiveFile.java >>>>>>>>> >>>>>>>>> --- >>>>>>>>> a/test/hotspot/jtreg/runtime/SharedArchiveFile/SharedArchiveFile.java >>>>>>>>> >>>>>>>>> +++ >>>>>>>>> b/test/hotspot/jtreg/runtime/SharedArchiveFile/SharedArchiveFile.java >>>>>>>>> >>>>>>>>> @@ -52,5 +52,13 @@ >>>>>>>>> "-Xshare:on", "-version"); >>>>>>>>> out = CDSTestUtils.executeAndLog(pb, >>>>>>>>> "SharedArchiveFile"); >>>>>>>>> CDSTestUtils.checkExec(out); >>>>>>>>> + >>>>>>>>> + // CDS dumping doesn't work with ZGC >>>>>>>>> + ProcessBuilder pb = >>>>>>>>> ProcessTools.createJavaProcessBuilder(true, >>>>>>>>> + "-XX:SharedArchiveFile=./SharedArchiveFile.jsa", >>>>>>>>> + "-XX:+UseZGC", >>>>>>>>> + "-Xshare:dump"); >>>>>>>>> + out = CDSTestUtils.executeAndLog(pb, >>>>>>>>> "SharedArchiveFile"); >>>>>>>>> + CDSTestUtils.checkExecExpectError(out, 1, >>>>>>>>> "DumpSharedSpaces (-Xshare:dump) is not supported with ZGC."); >>>>>>>>> } >>>>>>>>> } >>>>>>>>> >>>>>>>>> (I haven't tested the above) >>>>>>>> >>>>>>>> It needed an -XX:+UnlockExperimentalVMOptions as well, and not >>>>>>>> reclare pb. >>>>>>>> >>>>>>>> open webrev at >>>>>>>> http://cr.openjdk.java.net/~coleenp/8205702.02/webrev >>>>>>>> >>>>>>>>> >>>>>>>>> Also, I think the new error message should be included in the >>>>>>>>> release notes. >>>>>>>>> >>>>>>>> >>>>>>>> I added the test case and it passes. I don't think having a >>>>>>>> release note for something that nobody would ever do for an >>>>>>>> experimental option is worth having. But I can look into the >>>>>>>> ZGC release notes and see if there's something that says CDS is >>>>>>>> not supported. 
>>>>>>>> >>>>>>>> Thanks, >>>>>>>> Coleen >>>>>>>>> thanks, >>>>>>>>> Calvin >>>>>>>>> >>>>>>>>> On 6/26/18, 2:13 PM, coleen.phillimore at oracle.com wrote: >>>>>>>>>> Summary: Disable CDS with ZGC >>>>>>>>>> >>>>>>>>>> Tested with: >>>>>>>>>> java -XX:+UnlockExperimentalVMOptions -XX:+UseZGC -Xshare:dump >>>>>>>>>> java -XX:+UnlockExperimentalOptions -XX:+UseZGC -Xshare:on >>>>>>>>>> -version >>>>>>>>>> >>>>>>>>>> open webrev at >>>>>>>>>> http://cr.openjdk.java.net/~coleenp/8205702.01/webrev >>>>>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8205702 >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> Coleen >>>>>>>> From calvin.cheung at oracle.com Wed Jun 27 21:49:04 2018 From: calvin.cheung at oracle.com (Calvin Cheung) Date: Wed, 27 Jun 2018 14:49:04 -0700 Subject: RFR (XS) 8205702: assert(UseCompressedClassPointers) failed in universe.hpp In-Reply-To: <1c1418f1-a97e-517c-49aa-97b4e284df48@oracle.com> References: <5B32B34C.2060506@oracle.com> <1246989c-e8e8-22e9-cdc8-1253c344aac6@oracle.com> <92de1a95-e09a-82ef-e2f4-9c2a8979fd26@oracle.com> <88cd8e8b-3692-ae23-7475-0fbd44692ba9@oracle.com> <5B33CBE5.80502@oracle.com> <268d97b8-d174-f60b-ae78-be760bbd0075@oracle.com> <5B33D778.5010500@oracle.com> <1599422d-c7e1-a957-cd08-21ae1d27b4ef@oracle.com> <1c1418f1-a97e-517c-49aa-97b4e284df48@oracle.com> Message-ID: <5B340650.9000308@oracle.com> Looks good. thanks, Calvin On 6/27/18, 2:22 PM, Per Liden wrote: > Sorry, but I noticed that the IncompatibleOptions.java assumes that > ZGC is always available, but it's only available on linux-x64, so this > will fail on other platforms. Updated webrev to take this into > account. It's currently running in mach5 t{1-3} on all platforms. 
> > http://cr.openjdk.java.net/~pliden/8205702/webrev.4 > > /Per > > On 06/27/2018 08:34 PM, Per Liden wrote: >> On 06/27/2018 08:29 PM, Calvin Cheung wrote: >>> >>> >>> On 6/27/18, 11:08 AM, Per Liden wrote: >>>> Hi Calvin, >>>> >>>> On 06/27/2018 07:39 PM, Calvin Cheung wrote: >>>>> Hi Per, >>>>> >>>>> Thanks for coming up with a simpler fix. It looks good. Just one >>>>> comment below. >>>>> >>>>> On 6/27/18, 2:33 AM, Per Liden wrote: >>>>>> Actually, that seems a bit too restrictive as >>>>>> vm.cds.archived.java.heap is only true when G1 is enabled. >>>>>> >>>>>> So, this is probably even better: >>>>>> >>>>>> * @requires vm.cds >>>>>> * @requires vm.opt.final.UseCompressedOops >>>>>> * @requires vm.opt.final.UseCompressedClassPointers >>>>> I think the @requires vm.cds calls into VMProps.vmCDS() which >>>>> calls the WB_IsCDSIncludedInVmBuild() where it already checks for >>>>> compressed oops and pointers: >>>>> >>>>> if (!UseCompressedOops || !UseCompressedClassPointers) { >>>>> // On 64-bit VMs, CDS is supported only with compressed >>>>> oops/pointers >>>>> return false; >>>>> } >>>>> >>>>> Are the last two @requires needed? >>>> >>>> That's an excellent point, and you're right, those extra @requires >>>> are not needed. New webrev: >>>> >>>> http://cr.openjdk.java.net/~pliden/8205702/webrev.3 >>> Looks good. >> >> Thanks Calvin! >> >> /Per >> >>> >>> thanks, >>> Calvin >>>> >>>> Thanks for reviewing! >>>> >>>> cheers, >>>> Per >>>> >>>>> >>>>> thanks, >>>>> Calvin >>>>>> >>>>>> Updated webrev: http://cr.openjdk.java.net/~pliden/8205702/webrev.2 >>>>>> >>>>>> /Per >>>>>> >>>>>> On 06/27/2018 10:37 AM, Per Liden wrote: >>>>>>> Updated webrev, which adjusts the @requires tag, from: >>>>>>> >>>>>>> @requires vm.cds & vm.gc != "Z" >>>>>>> >>>>>>> to: >>>>>>> >>>>>>> @requires vm.cds.archived.java.heap >>>>>>> >>>>>>> which I believe is more correct in this case. 
>>>>>>> >>>>>>> http://cr.openjdk.java.net/~pliden/8205702/webrev.1 >>>>>>> >>>>>>> cheers, >>>>>>> Per >>>>>>> >>>>>>> >>>>>>> On 06/27/2018 09:15 AM, Per Liden wrote: >>>>>>>> Hi Coleen, >>>>>>>> >>>>>>>> This doesn't look quite right to me. ZGC already disables >>>>>>>> UseCompressedOop and UseCompressedClassPointers, which should >>>>>>>> be the indicators that we can't use CDS. The problem is that >>>>>>>> CDS checks those flags _before_ the heap has had a change to >>>>>>>> say they it supports. So if we just move the call to >>>>>>>> set_shared_spaces_flags() after the call to >>>>>>>> GCConfig::arguments()->initialize() (which should be safe), >>>>>>>> then we're all good and you'll get the usual: >>>>>>>> >>>>>>>> $ ./build/fastdebug/images/jdk/bin/java -Xshare:dump >>>>>>>> -XX:+UnlockExperimentalVMOptions -XX:+UseZGC >>>>>>>> Error occurred during initialization of VM >>>>>>>> Cannot dump shared archive when UseCompressedOops or >>>>>>>> UseCompressedClassPointers is off. >>>>>>>> >>>>>>>> Here's a proposed patch for this, which also adjusts the >>>>>>>> appropriate tests for this: >>>>>>>> >>>>>>>> http://cr.openjdk.java.net/~pliden/8205702/webrev.0 >>>>>>>> >>>>>>>> cheers, >>>>>>>> Per >>>>>>>> >>>>>>>> On 06/27/2018 01:09 AM, coleen.phillimore at oracle.com wrote: >>>>>>>>> >>>>>>>>> Hi Calvin, thank you for reporting the bug and the code review >>>>>>>>> and test code. >>>>>>>>> >>>>>>>>> On 6/26/18 5:42 PM, Calvin Cheung wrote: >>>>>>>>>> Hi Coleen, >>>>>>>>>> >>>>>>>>>> The code changes look good. 
>>>>>>>>>> >>>>>>>>>> Since there's a new error message, I'd suggest adding a test >>>>>>>>>> to runtime/SharedArchiveFile/SharedArchiveFile.java as follows: >>>>>>>>>> >>>>>>>>>> diff --git >>>>>>>>>> a/test/hotspot/jtreg/runtime/SharedArchiveFile/SharedArchiveFile.java >>>>>>>>>> b/test/hotspot/jtreg/runtime/SharedArchiveFile/SharedArchiveFile.java >>>>>>>>>> >>>>>>>>>> --- >>>>>>>>>> a/test/hotspot/jtreg/runtime/SharedArchiveFile/SharedArchiveFile.java >>>>>>>>>> >>>>>>>>>> +++ >>>>>>>>>> b/test/hotspot/jtreg/runtime/SharedArchiveFile/SharedArchiveFile.java >>>>>>>>>> >>>>>>>>>> @@ -52,5 +52,13 @@ >>>>>>>>>> "-Xshare:on", "-version"); >>>>>>>>>> out = CDSTestUtils.executeAndLog(pb, >>>>>>>>>> "SharedArchiveFile"); >>>>>>>>>> CDSTestUtils.checkExec(out); >>>>>>>>>> + >>>>>>>>>> + // CDS dumping doesn't work with ZGC >>>>>>>>>> + ProcessBuilder pb = >>>>>>>>>> ProcessTools.createJavaProcessBuilder(true, >>>>>>>>>> + "-XX:SharedArchiveFile=./SharedArchiveFile.jsa", >>>>>>>>>> + "-XX:+UseZGC", >>>>>>>>>> + "-Xshare:dump"); >>>>>>>>>> + out = CDSTestUtils.executeAndLog(pb, >>>>>>>>>> "SharedArchiveFile"); >>>>>>>>>> + CDSTestUtils.checkExecExpectError(out, 1, >>>>>>>>>> "DumpSharedSpaces (-Xshare:dump) is not supported with ZGC."); >>>>>>>>>> } >>>>>>>>>> } >>>>>>>>>> >>>>>>>>>> (I haven't tested the above) >>>>>>>>> >>>>>>>>> It needed an -XX:+UnlockExperimentalVMOptions as well, and not >>>>>>>>> reclare pb. >>>>>>>>> >>>>>>>>> open webrev at >>>>>>>>> http://cr.openjdk.java.net/~coleenp/8205702.02/webrev >>>>>>>>> >>>>>>>>>> >>>>>>>>>> Also, I think the new error message should be included in the >>>>>>>>>> release notes. >>>>>>>>>> >>>>>>>>> >>>>>>>>> I added the test case and it passes. I don't think having a >>>>>>>>> release note for something that nobody would ever do for an >>>>>>>>> experimental option is worth having. But I can look into the >>>>>>>>> ZGC release notes and see if there's something that says CDS >>>>>>>>> is not supported. 
>>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Coleen >>>>>>>>>> thanks, >>>>>>>>>> Calvin >>>>>>>>>> >>>>>>>>>> On 6/26/18, 2:13 PM, coleen.phillimore at oracle.com wrote: >>>>>>>>>>> Summary: Disable CDS with ZGC >>>>>>>>>>> >>>>>>>>>>> Tested with: >>>>>>>>>>> java -XX:+UnlockExperimentalVMOptions -XX:+UseZGC -Xshare:dump >>>>>>>>>>> java -XX:+UnlockExperimentalOptions -XX:+UseZGC -Xshare:on >>>>>>>>>>> -version >>>>>>>>>>> >>>>>>>>>>> open webrev at >>>>>>>>>>> http://cr.openjdk.java.net/~coleenp/8205702.01/webrev >>>>>>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8205702 >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> Coleen >>>>>>>>> From per.liden at oracle.com Wed Jun 27 21:51:45 2018 From: per.liden at oracle.com (Per Liden) Date: Wed, 27 Jun 2018 23:51:45 +0200 Subject: RFR (XS) 8205702: assert(UseCompressedClassPointers) failed in universe.hpp In-Reply-To: <5B340650.9000308@oracle.com> References: <5B32B34C.2060506@oracle.com> <1246989c-e8e8-22e9-cdc8-1253c344aac6@oracle.com> <92de1a95-e09a-82ef-e2f4-9c2a8979fd26@oracle.com> <88cd8e8b-3692-ae23-7475-0fbd44692ba9@oracle.com> <5B33CBE5.80502@oracle.com> <268d97b8-d174-f60b-ae78-be760bbd0075@oracle.com> <5B33D778.5010500@oracle.com> <1599422d-c7e1-a957-cd08-21ae1d27b4ef@oracle.com> <1c1418f1-a97e-517c-49aa-97b4e284df48@oracle.com> <5B340650.9000308@oracle.com> Message-ID: <435d050f-c25e-bdc0-235b-ed85bf65c095@oracle.com> Thanks Calvin! /Per On 06/27/2018 11:49 PM, Calvin Cheung wrote: > Looks good. > > thanks, > Calvin > > On 6/27/18, 2:22 PM, Per Liden wrote: >> Sorry, but I noticed that the IncompatibleOptions.java assumes that >> ZGC is always available, but it's only available on linux-x64, so this >> will fail on other platforms. Updated webrev to take this into >> account. It's currently running in mach5 t{1-3} on all platforms. 
>> >> http://cr.openjdk.java.net/~pliden/8205702/webrev.4 >> >> /Per >> >> On 06/27/2018 08:34 PM, Per Liden wrote: >>> On 06/27/2018 08:29 PM, Calvin Cheung wrote: >>>> >>>> >>>> On 6/27/18, 11:08 AM, Per Liden wrote: >>>>> Hi Calvin, >>>>> >>>>> On 06/27/2018 07:39 PM, Calvin Cheung wrote: >>>>>> Hi Per, >>>>>> >>>>>> Thanks for coming up with a simpler fix. It looks good. Just one >>>>>> comment below. >>>>>> >>>>>> On 6/27/18, 2:33 AM, Per Liden wrote: >>>>>>> Actually, that seems a bit too restrictive as >>>>>>> vm.cds.archived.java.heap is only true when G1 is enabled. >>>>>>> >>>>>>> So, this is probably even better: >>>>>>> >>>>>>> * @requires vm.cds >>>>>>> * @requires vm.opt.final.UseCompressedOops >>>>>>> * @requires vm.opt.final.UseCompressedClassPointers >>>>>> I think the @requires vm.cds calls into VMProps.vmCDS() which >>>>>> calls the WB_IsCDSIncludedInVmBuild() where it already checks for >>>>>> compressed oops and pointers: >>>>>> >>>>>> if (!UseCompressedOops || !UseCompressedClassPointers) { >>>>>> // On 64-bit VMs, CDS is supported only with compressed >>>>>> oops/pointers >>>>>> return false; >>>>>> } >>>>>> >>>>>> Are the last two @requires needed? >>>>> >>>>> That's an excellent point, and you're right, those extra @requires >>>>> are not needed. New webrev: >>>>> >>>>> http://cr.openjdk.java.net/~pliden/8205702/webrev.3 >>>> Looks good. >>> >>> Thanks Calvin! >>> >>> /Per >>> >>>> >>>> thanks, >>>> Calvin >>>>> >>>>> Thanks for reviewing! >>>>> >>>>> cheers, >>>>> Per >>>>> >>>>>> >>>>>> thanks, >>>>>> Calvin >>>>>>> >>>>>>> Updated webrev: http://cr.openjdk.java.net/~pliden/8205702/webrev.2 >>>>>>> >>>>>>> /Per >>>>>>> >>>>>>> On 06/27/2018 10:37 AM, Per Liden wrote: >>>>>>>> Updated webrev, which adjusts the @requires tag, from: >>>>>>>> >>>>>>>> @requires vm.cds & vm.gc != "Z" >>>>>>>> >>>>>>>> to: >>>>>>>> >>>>>>>> @requires vm.cds.archived.java.heap >>>>>>>> >>>>>>>> which I believe is more correct in this case. 
>>>>>>>> >>>>>>>> http://cr.openjdk.java.net/~pliden/8205702/webrev.1 >>>>>>>> >>>>>>>> cheers, >>>>>>>> Per >>>>>>>> >>>>>>>> >>>>>>>> On 06/27/2018 09:15 AM, Per Liden wrote: >>>>>>>>> Hi Coleen, >>>>>>>>> >>>>>>>>> This doesn't look quite right to me. ZGC already disables >>>>>>>>> UseCompressedOop and UseCompressedClassPointers, which should >>>>>>>>> be the indicators that we can't use CDS. The problem is that >>>>>>>>> CDS checks those flags _before_ the heap has had a change to >>>>>>>>> say they it supports. So if we just move the call to >>>>>>>>> set_shared_spaces_flags() after the call to >>>>>>>>> GCConfig::arguments()->initialize() (which should be safe), >>>>>>>>> then we're all good and you'll get the usual: >>>>>>>>> >>>>>>>>> $ ./build/fastdebug/images/jdk/bin/java -Xshare:dump >>>>>>>>> -XX:+UnlockExperimentalVMOptions -XX:+UseZGC >>>>>>>>> Error occurred during initialization of VM >>>>>>>>> Cannot dump shared archive when UseCompressedOops or >>>>>>>>> UseCompressedClassPointers is off. >>>>>>>>> >>>>>>>>> Here's a proposed patch for this, which also adjusts the >>>>>>>>> appropriate tests for this: >>>>>>>>> >>>>>>>>> http://cr.openjdk.java.net/~pliden/8205702/webrev.0 >>>>>>>>> >>>>>>>>> cheers, >>>>>>>>> Per >>>>>>>>> >>>>>>>>> On 06/27/2018 01:09 AM, coleen.phillimore at oracle.com wrote: >>>>>>>>>> >>>>>>>>>> Hi Calvin, thank you for reporting the bug and the code review >>>>>>>>>> and test code. >>>>>>>>>> >>>>>>>>>> On 6/26/18 5:42 PM, Calvin Cheung wrote: >>>>>>>>>>> Hi Coleen, >>>>>>>>>>> >>>>>>>>>>> The code changes look good. 
>>>>>>>>>>> >>>>>>>>>>> Since there's a new error message, I'd suggest adding a test >>>>>>>>>>> to runtime/SharedArchiveFile/SharedArchiveFile.java as follows: >>>>>>>>>>> >>>>>>>>>>> diff --git >>>>>>>>>>> a/test/hotspot/jtreg/runtime/SharedArchiveFile/SharedArchiveFile.java >>>>>>>>>>> b/test/hotspot/jtreg/runtime/SharedArchiveFile/SharedArchiveFile.java >>>>>>>>>>> >>>>>>>>>>> --- >>>>>>>>>>> a/test/hotspot/jtreg/runtime/SharedArchiveFile/SharedArchiveFile.java >>>>>>>>>>> >>>>>>>>>>> +++ >>>>>>>>>>> b/test/hotspot/jtreg/runtime/SharedArchiveFile/SharedArchiveFile.java >>>>>>>>>>> >>>>>>>>>>> @@ -52,5 +52,13 @@ >>>>>>>>>>> "-Xshare:on", "-version"); >>>>>>>>>>> out = CDSTestUtils.executeAndLog(pb, >>>>>>>>>>> "SharedArchiveFile"); >>>>>>>>>>> CDSTestUtils.checkExec(out); >>>>>>>>>>> + >>>>>>>>>>> + // CDS dumping doesn't work with ZGC >>>>>>>>>>> + ProcessBuilder pb = >>>>>>>>>>> ProcessTools.createJavaProcessBuilder(true, >>>>>>>>>>> + "-XX:SharedArchiveFile=./SharedArchiveFile.jsa", >>>>>>>>>>> + "-XX:+UseZGC", >>>>>>>>>>> + "-Xshare:dump"); >>>>>>>>>>> + out = CDSTestUtils.executeAndLog(pb, >>>>>>>>>>> "SharedArchiveFile"); >>>>>>>>>>> + CDSTestUtils.checkExecExpectError(out, 1, >>>>>>>>>>> "DumpSharedSpaces (-Xshare:dump) is not supported with ZGC."); >>>>>>>>>>> } >>>>>>>>>>> } >>>>>>>>>>> >>>>>>>>>>> (I haven't tested the above) >>>>>>>>>> >>>>>>>>>> It needed an -XX:+UnlockExperimentalVMOptions as well, and not >>>>>>>>>> reclare pb. >>>>>>>>>> >>>>>>>>>> open webrev at >>>>>>>>>> http://cr.openjdk.java.net/~coleenp/8205702.02/webrev >>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Also, I think the new error message should be included in the >>>>>>>>>>> release notes. >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> I added the test case and it passes. I don't think having a >>>>>>>>>> release note for something that nobody would ever do for an >>>>>>>>>> experimental option is worth having. 
But I can look into the >>>>>>>>>> ZGC release notes and see if there's something that says CDS >>>>>>>>>> is not supported. >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> Coleen >>>>>>>>>>> thanks, >>>>>>>>>>> Calvin >>>>>>>>>>> >>>>>>>>>>> On 6/26/18, 2:13 PM, coleen.phillimore at oracle.com wrote: >>>>>>>>>>>> Summary: Disable CDS with ZGC >>>>>>>>>>>> >>>>>>>>>>>> Tested with: >>>>>>>>>>>> java -XX:+UnlockExperimentalVMOptions -XX:+UseZGC -Xshare:dump >>>>>>>>>>>> java -XX:+UnlockExperimentalOptions -XX:+UseZGC -Xshare:on >>>>>>>>>>>> -version >>>>>>>>>>>> >>>>>>>>>>>> open webrev at >>>>>>>>>>>> http://cr.openjdk.java.net/~coleenp/8205702.01/webrev >>>>>>>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8205702 >>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> Coleen >>>>>>>>>> From kim.barrett at oracle.com Thu Jun 28 02:39:03 2018 From: kim.barrett at oracle.com (Kim Barrett) Date: Wed, 27 Jun 2018 22:39:03 -0400 Subject: RFR: 8205683: Refactor heap allocation to separate concerns In-Reply-To: <0017cfb1-4067-6670-4fb7-6ad504e0adb1@oracle.com> References: <0017cfb1-4067-6670-4fb7-6ad504e0adb1@oracle.com> Message-ID: <7E0CB80E-91B6-47CB-8262-B535FBA13A9D@oracle.com> > On Jun 27, 2018, at 9:23 AM, Erik ?sterlund wrote: > > Incremental: > http://cr.openjdk.java.net/~eosterlund/8205683/webrev.02_03/ > > Full: > http://cr.openjdk.java.net/~eosterlund/8205683/webrev.03/ I really like the overall approach. ------------------------------------------------------------------------------ src/hotspot/share/gc/shared/memAllocator.hpp 40 CollectedHeap *const _heap; 41 Thread * const _thread; 42 Klass *const _klass; Inconsistent placement of "*"; and it seems more usual in our code (and C++ more widely) to type-cuddle pointer/reference (like everywhere else in this file). 
------------------------------------------------------------------------------ src/hotspot/share/gc/shared/memAllocator.hpp 52 MemAllocator(Klass* klass, size_t word_size) 53 : _heap(Universe::heap()), 54 _thread(Thread::current()), 55 _klass(klass), 56 _word_size(word_size) 57 { } It seems like there would be a lot of allocators constructed in contexts where the heap or thread are already known (e.g. CollectedHeap::obj_allocate and friends, which can refer to THREAD), so that having additional constructor overloads would be useful. Though that would be nicer with C++11 forwarding constructors. [Aside: That is my personal preferred formatting for constructors.] ------------------------------------------------------------------------------ src/hotspot/share/gc/shared/memAllocator.hpp 66 // Raw memory allocation. This may or may not use TLAB allocations to satisfy the 67 // allocation. A GC implementation may override this function to satisfy the allocation 68 // in any way. But the default is to try a TLAB allocation, and otherwise perform 69 // mem_allocate. 70 virtual HeapWord* mem_allocate(Allocation& allocation) const; How would a GC override this function? I don't see any examples, and had some trouble coming up with a mechanism / idiom. My guess is that the idea is that a collector overrides CollectedHeap::obj_allocate and friends and uses a different set of Allocator classes? If so, it might be worth spelling that out. If not, then I've completely missed the intended design for extension here. [And I wonder how nicely that will work out. I guess the Shenandoah folks can complain if it seems overly contorted.] ------------------------------------------------------------------------------ src/hotspot/share/gc/shared/memAllocator.cpp 47 oop* _obj; I think we usually use "obj" for an oop, and use other names for oop*. 
------------------------------------------------------------------------------ src/hotspot/share/gc/shared/memAllocator.cpp 88 void set_obj(oop obj) const { *_obj = obj; } I was wondering why we need this function. I did track it down eventually; it's to preserve the oop across a potential safepoint by the jvmti sampler. Maybe use a focused protection RAII object rather than exposing a public setter? ------------------------------------------------------------------------------ src/hotspot/share/gc/shared/memAllocator.hpp 95 const size_t hs = oopDesc::header_size() + 1; I think this should be using objArrayOopDesc::header_size(_klass->element_type()). ------------------------------------------------------------------------------ src/hotspot/share/gc/shared/memAllocator.cpp 272 // 3. Try refilling the TLAB and allocating the object in it. "3." seems left-over from earlier version of code. ------------------------------------------------------------------------------ There are a number of places (particularly in memAllocator.cpp) where there are calls related to "optional" features (JVMTI, DTRACE, &etc) where it seems like there ought to be some conditionalization. But the old code didn't have any of that either, so okay for now. ------------------------------------------------------------------------------ From kim.barrett at oracle.com Thu Jun 28 05:00:44 2018 From: kim.barrett at oracle.com (Kim Barrett) Date: Thu, 28 Jun 2018 01:00:44 -0400 Subject: RFC: Patch for 8203848: Missing remembered set entry in j.l.ref.references after JDK-8203028 In-Reply-To: <9d7be491c8d2445d92b4ff00aeaf735c405ff9cb.camel@oracle.com> References: <9d7be491c8d2445d92b4ff00aeaf735c405ff9cb.camel@oracle.com> Message-ID: <1F627247-6EF1-4350-81C4-4E2D0C74AFD2@oracle.com> > On Jun 7, 2018, at 5:47 AM, Thomas Schatzl wrote: > > Hi all, > > I would like to ask for comments on the fix for the issue handled in > JDK-8203848. 
> > In particular, the problem is that currently the "discovered" field of > j.l.ref.References is managed completely opaque to the rest of the VM, > which causes the described error: remembered set entries are missing > for that reference when doing *concurrent* reference discovery. > > There is no problem with liveness of objects referenced by that because > a) a j.l.ref.Reference object found by reference discovery will be > automatically kept alive and > b) no GC in Hotspot at this time evacuates old gen objects during > marking (and Z does not use the reference processing framework at all), > so that reference in the "discovered" field will never be outdated. > > However in the future, G1 might want to move objects in old gen at any > time for e.g. defragmentation purposes, and I am a bit unclear about > Shenandoah tbh :) > > I see two solutions for this issue: > - improve the access modifier so that at least the post-barrier that is > responsible for adding remembered set entries is invoked on this field. > E.g. in ReferenceProcessor::add_to_discovered_list_mt(), instead of > > oop retest = RawAccess<>::oop_atomic_cmpxchg(next_discovered, > discovered_addr, oop(NULL)); > > do a > > oop retest = > HeapAccess::oop_atomic_cmpxchg(next_discovered, > discovered_addr, oop(NULL)); > > Note that I am almost confident that this only makes G1 work as far as > I understand the access framework; since the previous value is NULL > when we cmpxchg, G1 can skip the pre-barrier; maybe more is needed for > Shenandoah, but I hope that Shenandoah devs can chime in here. > > I tested this, and with this change the problem does not occur after > 2000 iterations of the test. > > (see the preliminary webrev at http://cr.openjdk.java.net/~tschatzl/820 > 3848/webrev/ ; only the change to referenceProcessor.cpp is relevant > here). 
> > - the other "solution" is to fix the remembered set verification to > ignore this field, and try to fix this again in the future when/if G1 > evacuates old regions during marking. > > Any comments? > > Thanks, > Thomas Using AS_NO_KEEPALIVE seems okay to me. From shade at redhat.com Thu Jun 28 06:52:58 2018 From: shade at redhat.com (Aleksey Shipilev) Date: Thu, 28 Jun 2018 08:52:58 +0200 Subject: RFR: 8205679: Remove unused ThreadLocalAllocBuffer::undo_allocate() In-Reply-To: <0c186b36-db83-9f8a-2f32-c6415daeba2a@oracle.com> References: <1815fb83-9bd0-423e-bb36-af14b1033e58@redhat.com> <0c186b36-db83-9f8a-2f32-c6415daeba2a@oracle.com> Message-ID: <5b036256-b06e-e701-9669-360bcd09a307@redhat.com> On 06/27/2018 10:53 AM, Per Liden wrote: > Hi Aleksey, > > Thanks for reviewing. > > On 06/26/2018 06:09 PM, Aleksey Shipilev wrote: >> On 06/26/2018 12:45 PM, Per Liden wrote: >>> After JDK-8205676, the function ThreadLocalAllocBuffer::undo_allocate() is unused and can be >>> removed. >>> >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8205679 >>> Webrev: http://cr.openjdk.java.net/~pliden/8205679/webrev.0 >> >> The patch looks good, but the method itself looks generic enough to keep around. At some point, >> Shenandoah may switch to using this instead of PLABs. > > We typically don't leave unused code around (as it tends to rot quickly), unless we know it will be > used in the near/mid-term future. So, I guess the question is, are you already using this in the > shanandoah repo, or will soon-ish? [1] If not I'd prefer to remove it. The hg history will preserve > it, so it's trivial to bring it back if you find that you really do needed it in the future. We don't have the patch in Shenandoah repos yet. OK, let's remove it, and reinstate later if needed. -Aleksey -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: OpenPGP digital signature URL: From kim.barrett at oracle.com Thu Jun 28 09:13:25 2018 From: kim.barrett at oracle.com (Kim Barrett) Date: Thu, 28 Jun 2018 05:13:25 -0400 Subject: RFR: 8205683: Refactor heap allocation to separate concerns In-Reply-To: <7E0CB80E-91B6-47CB-8262-B535FBA13A9D@oracle.com> References: <0017cfb1-4067-6670-4fb7-6ad504e0adb1@oracle.com> <7E0CB80E-91B6-47CB-8262-B535FBA13A9D@oracle.com> Message-ID: > On Jun 27, 2018, at 10:39 PM, Kim Barrett wrote: > >> On Jun 27, 2018, at 9:23 AM, Erik ?sterlund wrote: >> >> Incremental: >> http://cr.openjdk.java.net/~eosterlund/8205683/webrev.02_03/ >> >> Full: >> http://cr.openjdk.java.net/~eosterlund/8205683/webrev.03/ > > I really like the overall approach. Erik and I talked off-line. He confirmed my understanding of how the extension protocol for mem_allocate is supposed to work. The remaining suggestions were easily dealt with, so I don't need to wait for another webrev. Looks good. From per.liden at oracle.com Thu Jun 28 10:11:52 2018 From: per.liden at oracle.com (Per Liden) Date: Thu, 28 Jun 2018 12:11:52 +0200 Subject: RFR: 8205993: ZGC: Fix typos and incorrect indentations Message-ID: <9adf512d-6852-102c-c7a7-35cd458a50a4@oracle.com> Stefan and I found various typos and incorrect code/comment indentations in ZGC code. Here's a patch to address those. All changes are trivial and not touching any logic. 
Bug: https://bugs.openjdk.java.net/browse/JDK-8205993 Webrev: http://cr.openjdk.java.net/~pliden/8205993/webrev.0 /Per From stefan.karlsson at oracle.com Thu Jun 28 10:11:15 2018 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Thu, 28 Jun 2018 12:11:15 +0200 Subject: RFR: 8205993: ZGC: Fix typos and incorrect indentations In-Reply-To: <9adf512d-6852-102c-c7a7-35cd458a50a4@oracle.com> References: <9adf512d-6852-102c-c7a7-35cd458a50a4@oracle.com> Message-ID: <002ff647-a6bf-0dd5-36ad-f56c4190d963@oracle.com> Looks good. StefanK On 2018-06-28 12:11, Per Liden wrote: > Stefan and I found various typos and incorrect code/comment indentations > in ZGC code. Here's a patch to address those. All changes are trivial > and not touching any logic. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8205993 > Webrev: http://cr.openjdk.java.net/~pliden/8205993/webrev.0 > > /Per From per.liden at oracle.com Thu Jun 28 10:15:38 2018 From: per.liden at oracle.com (Per Liden) Date: Thu, 28 Jun 2018 12:15:38 +0200 Subject: RFR: 8205993: ZGC: Fix typos and incorrect indentations In-Reply-To: <002ff647-a6bf-0dd5-36ad-f56c4190d963@oracle.com> References: <9adf512d-6852-102c-c7a7-35cd458a50a4@oracle.com> <002ff647-a6bf-0dd5-36ad-f56c4190d963@oracle.com> Message-ID: Thanks Stefan! I'll push immediately. /Per On 06/28/2018 12:11 PM, Stefan Karlsson wrote: > Looks good. > > StefanK > > On 2018-06-28 12:11, Per Liden wrote: >> Stefan and I found various typos and incorrect code/comment >> indentations in ZGC code. Here's a patch to address those. All changes >> are trivial and not touching any logic. 
>> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8205993 >> Webrev: http://cr.openjdk.java.net/~pliden/8205993/webrev.0 >> >> /Per From erik.osterlund at oracle.com Thu Jun 28 13:00:08 2018 From: erik.osterlund at oracle.com (=?UTF-8?Q?Erik_=c3=96sterlund?=) Date: Thu, 28 Jun 2018 15:00:08 +0200 Subject: RFR: 8205683: Refactor heap allocation to separate concerns In-Reply-To: References: <0017cfb1-4067-6670-4fb7-6ad504e0adb1@oracle.com> <7E0CB80E-91B6-47CB-8262-B535FBA13A9D@oracle.com> Message-ID: <25791262-7030-7779-b7af-c0188119e2bb@oracle.com> Hi Kim, Thank you for the review. /Erik On 2018-06-28 11:13, Kim Barrett wrote: >> On Jun 27, 2018, at 10:39 PM, Kim Barrett wrote: >> >>> On Jun 27, 2018, at 9:23 AM, Erik ?sterlund wrote: >>> >>> Incremental: >>> http://cr.openjdk.java.net/~eosterlund/8205683/webrev.02_03/ >>> >>> Full: >>> http://cr.openjdk.java.net/~eosterlund/8205683/webrev.03/ >> I really like the overall approach. > Erik and I talked off-line. He confirmed my understanding of how the > extension protocol for mem_allocate is supposed to work. The > remaining suggestions were easily dealt with, so I don't need to wait > for another webrev. Looks good. > From thomas.schatzl at oracle.com Thu Jun 28 13:26:58 2018 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Thu, 28 Jun 2018 15:26:58 +0200 Subject: RFC: Patch for 8203848: Missing remembered set entry in j.l.ref.references after JDK-8203028 In-Reply-To: <1F627247-6EF1-4350-81C4-4E2D0C74AFD2@oracle.com> References: <9d7be491c8d2445d92b4ff00aeaf735c405ff9cb.camel@oracle.com> <1F627247-6EF1-4350-81C4-4E2D0C74AFD2@oracle.com> Message-ID: Hi Kim and Erik, thanks for your comments. I updated the webrev at http://cr.openjdk.java.net/~tschatzl/8203848/webrev/ to only contain the cmpxchg change. Unless somebody objects I will push this in the next few days. 
Thanks, Thomas On Thu, 2018-06-28 at 01:00 -0400, Kim Barrett wrote: > > On Jun 7, 2018, at 5:47 AM, Thomas Schatzl > om> wrote: > > > > Hi all, > > > > I would like to ask for comments on the fix for the issue handled > > in JDK-8203848. > > > > In particular, the problem is that currently the "discovered" field > > of j.l.ref.References is managed completely opaque to the rest of > > the VM, which causes the described error: remembered set entries > > are missing for that reference when doing *concurrent* reference > > discovery. > > > > There is no problem with liveness of objects referenced by that > > because > > a) a j.l.ref.Reference object found by reference discovery will be > > automatically kept alive and > > b) no GC in Hotspot at this time evacuates old gen objects during > > marking (and Z does not use the reference processing framework at > > all), > > so that reference in the "discovered" field will never be outdated. > > > > However in the future, G1 might want to move objects in old gen at > > any time for e.g. defragmentation purposes, and I am a bit unclear > > about Shenandoah tbh :) > > > > I see two solutions for this issue: > > - improve the access modifier so that at least the post-barrier > > that is responsible for adding remembered set entries is invoked on > > this field. E.g. in > > ReferenceProcessor::add_to_discovered_list_mt(), instead of > > > > oop retest = RawAccess<>::oop_atomic_cmpxchg(next_discovered, > > discovered_addr, oop(NULL)); > > > > do a > > > > oop retest = > > HeapAccess::oop_atomic_cmpxchg(next_discovered, > > discovered_addr, oop(NULL)); > > > > Note that I am almost confident that this only makes G1 work as far > > as I understand the access framework; since the previous value is > > NULL when we cmpxchg, G1 can skip the pre-barrier; maybe more is > > needed for Shenandoah, but I hope that Shenandoah devs can chime in > > here. 
> > > > I tested this, and with this change the problem does not occur > > after 2000 iterations of the test. > > > > (see the preliminary webrev at > > http://cr.openjdk.java.net/~tschatzl/8203848/webrev/ ; only the > > change to referenceProcessor.cpp is > > relevant > > here). > > > > - the other "solution" is to fix the remembered set verification to > > ignore this field, and try to fix this again in the future when/if > > G1 > > evacuates old regions during marking. > > > > Any comments? > > > > Thanks, > > Thomas > > Using AS_NO_KEEPALIVE seems okay to me. > From erik.osterlund at oracle.com Thu Jun 28 15:13:52 2018 From: erik.osterlund at oracle.com (=?UTF-8?Q?Erik_=c3=96sterlund?=) Date: Thu, 28 Jun 2018 17:13:52 +0200 Subject: RFC: Patch for 8203848: Missing remembered set entry in j.l.ref.references after JDK-8203028 In-Reply-To: References: <9d7be491c8d2445d92b4ff00aeaf735c405ff9cb.camel@oracle.com> <1F627247-6EF1-4350-81C4-4E2D0C74AFD2@oracle.com> Message-ID: <0263e52e-a0a0-be65-18c0-8cb5c7e86191@oracle.com> Hi Thomas, Ship it. Thanks, /Erik On 2018-06-28 15:26, Thomas Schatzl wrote: > Hi Kim and Erik, > > thanks for your comments. > > I updated the webrev at > > http://cr.openjdk.java.net/~tschatzl/8203848/webrev/ > > to only contain the cmpxchg change. > > Unless somebody objects I will push this in the next few days. > > Thanks, > Thomas > > > On Thu, 2018-06-28 at 01:00 -0400, Kim Barrett wrote: >>> On Jun 7, 2018, at 5:47 AM, Thomas Schatzl >> om> wrote: >>> >>> Hi all, >>> >>> I would like to ask for comments on the fix for the issue handled >>> in JDK-8203848. >>> >>> In particular, the problem is that currently the "discovered" field >>> of j.l.ref.References is managed completely opaque to the rest of >>> the VM, which causes the described error: remembered set entries >>> are missing for that reference when doing *concurrent* reference >>> discovery. 
>>> >>> There is no problem with liveness of objects referenced by that >>> because >>> a) a j.l.ref.Reference object found by reference discovery will be >>> automatically kept alive and >>> b) no GC in Hotspot at this time evacuates old gen objects during >>> marking (and Z does not use the reference processing framework at >>> all), >>> so that reference in the "discovered" field will never be outdated. >>> >>> However in the future, G1 might want to move objects in old gen at >>> any time for e.g. defragmentation purposes, and I am a bit unclear >>> about Shenandoah tbh :) >>> >>> I see two solutions for this issue: >>> - improve the access modifier so that at least the post-barrier >>> that is responsible for adding remembered set entries is invoked on >>> this field. E.g. in >>> ReferenceProcessor::add_to_discovered_list_mt(), instead of >>> >>> oop retest = RawAccess<>::oop_atomic_cmpxchg(next_discovered, >>> discovered_addr, oop(NULL)); >>> >>> do a >>> >>> oop retest = >>> HeapAccess::oop_atomic_cmpxchg(next_discovered, >>> discovered_addr, oop(NULL)); >>> >>> Note that I am almost confident that this only makes G1 work as far >>> as I understand the access framework; since the previous value is >>> NULL when we cmpxchg, G1 can skip the pre-barrier; maybe more is >>> needed for Shenandoah, but I hope that Shenandoah devs can chime in >>> here. >>> >>> I tested this, and with this change the problem does not occur >>> after 2000 iterations of the test. >>> >>> (see the preliminary webrev at >>> http://cr.openjdk.java.net/~tschatzl/8203848/webrev/ ; only the >>> change to referenceProcessor.cpp is >>> relevant >>> here). >>> >>> - the other "solution" is to fix the remembered set verification to >>> ignore this field, and try to fix this again in the future when/if >>> G1 >>> evacuates old regions during marking. >>> >>> Any comments? >>> >>> Thanks, >>> Thomas >> Using AS_NO_KEEPALIVE seems okay to me. 
>> From jiangli.zhou at Oracle.COM Thu Jun 28 23:15:07 2018 From: jiangli.zhou at Oracle.COM (Jiangli Zhou) Date: Thu, 28 Jun 2018 16:15:07 -0700 Subject: RFR(L): 8202035: Archive the set of ModuleDescriptor and ModuleReference objects for system modules Message-ID: <386DA770-8E9D-43A0-87CE-0E380977F884@oracle.com> This is a follow-up RFE of JDK-8201650 (Move iteration order randomization of unmodifiable Set and Map to iterators), which was resolved to allow Set/Map objects being archived at CDS dump time (thanks Claes and Stuart Marks). In the current RFE, it archives the set of system ModuleReference and ModuleDescriptor objects (including their referenced objects) in 'open' archive heap region at CDS dump time. It allows reusing of the objects and bypassing the process of creating the system ModuleDescriptors and ModuleReferences at runtime for startup improvement. My preliminary measurements on linux-x64 showed ~5% startup improvement when running HelloWorld from -cp using archived module objects at runtime (without extra tuning). The library changes in the following webrev are contributed by Alan Bateman. Thanks Alan and Mandy for discussions and help. Thanks Karen, Lois and Ioi for discussion and suggestions on initialization ordering. The majority of the module object archiving code are in heapShared.hpp and heapShared.cpp. Thanks Coleen for pre-review and Eric Caspole for helping performance tests. webrev: http://cr.openjdk.java.net/~jiangli/8202035/webrev.00/ RFE: https://bugs.openjdk.java.net/browse/JDK-8202035?filter=14921 Tested using tier1 - tier6 via mach5 including all new test cases added in the webrev. Following are the details of system module archiving, which are duplicated in above bug report. --------------------------------------------------------------------------------------------------------------------------- Support archiving system module graph when the initial module is unnamed module from -cp currently. 
Support G1 GC, 64-bit (non-Windows). Requires UseCompressedOops and UseCompressedClassPointers. Dump time system module object archiving ================================= At dump time, the following fields in ArchivedModuleGraph are set to record the system module information created by ModuleBootstrap for archiving. private static SystemModules archivedSystemModules; private static ModuleFinder archivedSystemModuleFinder; private static String archivedMainModule; The archiving process starts from a given static field in ArchivedModuleGraph class instance (java mirror object). The process archives the complete network of java heap objects that are reachable directly or indirectly from the starting object by following references. 1. Starts from a given static field within the Class instance (java mirror). If the static field is a refererence field and points to a non-null java object, proceed to the next step. The static field and it's value is recorded and stored outside the archived mirror. 2. Archives the referenced java object. If an archived copy of the current object already exists, updates the pointer in the archived copy of the referencing object to point to the current archived object. Otherwise, proceed to the next step. 3. Follows all references within the current java object and recursively archive the sub-graph of objects starting from each reference encountered within the object. 4. Updates the pointer in the archived copy of referecing object to point to the current archived object. 5. The Klass of the current java object is added to a list of Klasses for loading and initializing before any object in the archived graph can be accessed at runtime. Runtime initialization from archived system module objects ============================================ VM.initializeFromArchive() is called from ArchivedModuleGraph's static initializer to initialize from the archived module information. Klasses in the recorded list are loaded, linked and initialized. 
The static fields in ArchivedModuleGraph class instance are initialized using the archived field values. After initialization, the archived system module objects can be used directly. If the archived java heap data is not successfully mapped at runtime, or there is an error during VM.initializeFromArchive(), then all static fields in ArchivedModuleGraph are not initialized. In that case, system ModuleDescriptor and ModuleReference objects are created as normal. In non-CDS mode, VM.initializeFromArchive() returns immediately with minimum added overhead for normal execution. Thanks, Jiangli From erik.joelsson at oracle.com Fri Jun 29 00:44:12 2018 From: erik.joelsson at oracle.com (Erik Joelsson) Date: Thu, 28 Jun 2018 17:44:12 -0700 Subject: RFR(L): 8202035: Archive the set of ModuleDescriptor and ModuleReference objects for system modules In-Reply-To: <386DA770-8E9D-43A0-87CE-0E380977F884@oracle.com> References: <386DA770-8E9D-43A0-87CE-0E380977F884@oracle.com> Message-ID: <673771d5-0e61-6338-d222-9d1bd7f2826b@oracle.com> Build changes look good. /Erik On 2018-06-28 16:15, Jiangli Zhou wrote: > This is a follow-up RFE of JDK-8201650 (Move iteration order randomization of unmodifiable Set and Map to iterators), which was resolved to allow Set/Map objects being archived at CDS dump time (thanks Claes and Stuart Marks). In the current RFE, it archives the set of system ModuleReference and ModuleDescriptor objects (including their referenced objects) in 'open' archive heap region at CDS dump time. It allows reusing of the objects and bypassing the process of creating the system ModuleDescriptors and ModuleReferences at runtime for startup improvement. My preliminary measurements on linux-x64 showed ~5% startup improvement when running HelloWorld from -cp using archived module objects at runtime (without extra tuning). > > The library changes in the following webrev are contributed by Alan Bateman. Thanks Alan and Mandy for discussions and help. 
Thanks Karen, Lois and Ioi for discussion and suggestions on initialization ordering.
>
> The majority of the module object archiving code is in heapShared.hpp and heapShared.cpp. Thanks Coleen for pre-review and Eric Caspole for helping with performance tests.
>
> webrev: http://cr.openjdk.java.net/~jiangli/8202035/webrev.00/
> RFE: https://bugs.openjdk.java.net/browse/JDK-8202035?filter=14921
>
> Tested using tier1 - tier6 via mach5, including all new test cases added in the webrev.
>
> Following are the details of system module archiving, which are duplicated in the above bug report.
> ---------------------------------------------------------------------------------------------------------------------------
> Support archiving the system module graph when the initial module is the unnamed module from -cp currently.
>
> [dump-time and runtime archiving details quoted in full in the original mail; snipped]
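The dump-time walk described in the numbered steps earlier in the thread is essentially a memoized deep copy of a heap object graph. A minimal sketch on a toy object graph (hypothetical Java for illustration only; the actual implementation is C++ code in heapShared.cpp, and all names here are invented):

```java
import java.util.IdentityHashMap;
import java.util.Map;

// Hypothetical sketch of the copy-and-memoize archiving walk (steps 2-4).
// An identity map plays the role of the "archived copy already exists"
// check, so shared (and even cyclic) sub-graphs are copied exactly once.
public class ArchiveWalkSketch {
    static final class Node {
        String name;
        Node left, right;
        Node(String name) { this.name = name; }
    }

    // Maps each original object to its archived copy.
    final Map<Node, Node> archived = new IdentityHashMap<>();

    Node archive(Node obj) {
        if (obj == null) return null;
        Node copy = archived.get(obj);
        if (copy != null) return copy;   // step 2: reuse the existing archived copy
        copy = new Node(obj.name);       // shallow copy into the "archive"
        archived.put(obj, copy);         // record before recursing, so cycles terminate
        copy.left = archive(obj.left);   // steps 3-4: recurse and repoint references
        copy.right = archive(obj.right);
        return copy;
    }

    public static void main(String[] args) {
        Node root = new Node("root");
        Node shared = new Node("shared");
        root.left = shared;
        root.right = shared;             // diamond: two references to one object
        ArchiveWalkSketch w = new ArchiveWalkSketch();
        Node copy = w.archive(root);
        System.out.println(copy.left == copy.right);   // prints true: archived once
        System.out.println(w.archived.size());         // prints 2
    }
}
```

The identity map (rather than equals()-based hashing) matters: two distinct but equal objects in the source graph must each get their own archived copy, and the map must survive object graphs with cycles.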
>
> Thanks,
> Jiangli

From jiangli.zhou at oracle.com  Fri Jun 29 01:02:03 2018
From: jiangli.zhou at oracle.com (Jiangli Zhou)
Date: Thu, 28 Jun 2018 18:02:03 -0700
Subject: RFR(L): 8202035: Archive the set of ModuleDescriptor and ModuleReference objects for system modules
In-Reply-To: <673771d5-0e61-6338-d222-9d1bd7f2826b@oracle.com>
References: <386DA770-8E9D-43A0-87CE-0E380977F884@oracle.com> <673771d5-0e61-6338-d222-9d1bd7f2826b@oracle.com>
Message-ID: <8BE8EC92-9624-4D8B-BF09-E7A5D41765DC@oracle.com>

Hi Erik,

Thank you for the quick review!

Jiangli

> On Jun 28, 2018, at 5:44 PM, Erik Joelsson wrote:
>
> Build changes look good.
>
> /Erik
>
> On 2018-06-28 16:15, Jiangli Zhou wrote:
>> This is a follow-up RFE of JDK-8201650 (Move iteration order randomization of unmodifiable Set and Map to iterators), which was resolved to allow Set/Map objects to be archived at CDS dump time (thanks Claes and Stuart Marks). [full RFR description quoted in the earlier messages; snipped]
>>
>> Thanks,
>> Jiangli
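The runtime side described in the thread, where ArchivedModuleGraph's static fields are either filled in from the mapped archive or left null so the module objects are created as normal, follows a simple fallback pattern. A self-contained sketch with stand-in types (hypothetical; the real code lives in ArchivedModuleGraph and jdk.internal.misc.VM, and the names below are illustrative only):

```java
// Hypothetical sketch of the runtime fallback pattern: ask the VM to
// initialize the static fields from the archive; if that did nothing
// (no CDS archive, heap region failed to map, or an error occurred),
// build the objects the normal way.
public class ArchivedGraphFallbackSketch {
    static final class SystemModules {
        final String origin;
        SystemModules(String origin) { this.origin = origin; }
    }

    // Simulates whether the archived java heap data was mapped successfully.
    static boolean archiveMapped = false;

    static SystemModules archivedSystemModules;   // set from the archive, or null

    // Stand-in for VM.initializeFromArchive(ArchivedModuleGraph.class):
    // fills in the field only when an archive is mapped, otherwise a
    // cheap no-op (mirroring the non-CDS fast path).
    static void initializeFromArchive() {
        if (archiveMapped) {
            archivedSystemModules = new SystemModules("archive");
        }
    }

    static SystemModules systemModules() {
        initializeFromArchive();
        if (archivedSystemModules == null) {
            // Archive missing or mapping failed: create the objects as normal.
            archivedSystemModules = new SystemModules("computed");
        }
        return archivedSystemModules;
    }

    public static void main(String[] args) {
        System.out.println(systemModules().origin);   // prints computed (no archive mapped)
    }
}
```

The design point the thread emphasizes is that the slow path is reached only when the archived fields are still null, so correctness never depends on the archive being present.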