From yasuenag at gmail.com Sat Jul 1 14:44:23 2017 From: yasuenag at gmail.com (Yasumasa Suenaga) Date: Sat, 1 Jul 2017 23:44:23 +0900 Subject: JDK-8153333: [REDO] STW phases at Concurrent GC should count in PerfCounter In-Reply-To: References: Message-ID: PING: Have you checked this issue? Yasumasa On 2017/06/14 13:22, Yasumasa Suenaga wrote: > Hi all, > > I changed the PerfCounters to show CGC STW phases in jstat in JDK-8151674. > However, it caused several jtreg test failures, so it was backed out. > > I want to resume work on this issue. > > http://cr.openjdk.java.net/~ysuenaga/JDK-8153333/webrev.03/hotspot/ > http://cr.openjdk.java.net/~ysuenaga/JDK-8153333/webrev.03/jdk/ > > These changes pass the following jtreg tests: > > hotspot/test/serviceability/tmtools/jstat > jdk/test/sun/tools > > > Since JDK 9, the default GC algorithm is G1. > So I think this change is useful for watching GC behavior through jstat. > > I cannot access JPRT. Could you help? > > > Thanks, > > Yasumasa > From mikael.gerdin at oracle.com Mon Jul 3 07:35:26 2017 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Mon, 3 Jul 2017 09:35:26 +0200 Subject: RFR: 8179387: Factor out CMS specific code from GenCollectedHeap into its own subclass In-Reply-To: <3d8b55a2-a787-3051-b351-ab9b0a24f5e0@redhat.com> References: <3521009f-6fab-4f8e-2375-b9d665a4c70b@redhat.com> <3d8b55a2-a787-3051-b351-ab9b0a24f5e0@redhat.com> Message-ID: <47e22e86-7d7c-606f-1936-346229f39ca2@oracle.com> Hi Roman, On 2017-06-30 18:32, Roman Kennke wrote: > I came across one problem using this approach: We will have 2 instances > of CollectedHeap around, where there's usually only 1, and some code > expects only 1. For example, in the CollectedHeap constructor, we create new > PerfData variables, and we now create them twice, which leads to an assert > being thrown. I suspect there is more code like that.
> > I will attempt to refactor this a little more, maybe it's not that bad, > but it's probably not worth spending too much time on it. I think refactoring the code to not expect a singleton CollectedHeap instance is a bit too much. Perhaps there is another way to share common code between Serial and CMS but that might require a bit more thought. /Mikael > > Roman >> Hi Roman, >> >> thanks for putting this patch together, it is a great step forward! One >> thung that (in my mind) would improve it even further is if we embed a >> GenCollectedHeap in CMSHeap and then make CMSHeap inherit directly from >> CollectedHeap. >> >> With this solution, the definition of CMSHeap would look like something >> along the lines of: >> >> class CMSHeap : public CollectedHeap { >> WorkGang* _wg; >> GenCollectedHeap _gch; >> >> public: >> CMSHeap(GenCollectorPolicy* policy) : >> _wg(new WorkGang("GC Thread", ParallelGCThreads, true, true), >> _gch(policy) { >> _wg->initialize_workers(); >> } >> >> // a bunch of "facade" methods >> virtual bool supports_tlab_allocation() const { >> return _gch->supports_tlab_allocation(); >> } >> >> virtual size_t tlab_capacity(Thread* t) const { >> return _gch->tlab_capacity(t); >> } >> }; >> >> With this approach, you would have to implement a bunch of "facade" >> methods that just delegates to _gch, such as the methods >> supports_tlab_allocation and tlab_capacity above. There are two reasons >> why I prefer this approach: >> 1. In the end we want CMSHeap to inherit from CollectedHeap anyway :) >> 2. It makes it very clear which methods we gradually have to >> re-implement in CMSHeap to eventually get rid of the _gch field (the >> end goal). This is much harder to see if CMSHeap inherits from >> GenCollectedHeap (see more below). >> >> The second point will most likely cause some initial problems with >> `protected` code in GenCollectedHeap. 
For example, as you noticed when >> creating this patch, CMSHeap make use of a few `protected` fields and >> methods from GenCollectedHeap, most notably: >> - _process_strong_tasks >> - process_roots() >> - process_string_table_roots() >> >> It would be much better (IMO) to share this code via composition rather >> than inheritance. In this particular case, I would prefer to create a >> class StrongRootsProcessor that encapsulates the root processing logic. >> Then GenCollectedHeap and CMSHeap can both contain an instance of >> StrongRootsProcessor. >> >> What do you think of this approach? Do you have some spare cycles to try >> this approach out? >> >> Thanks, >> Erik >> >> On 06/02/2017 10:55 AM, Roman Kennke wrote: >>> Take this patch. It #ifdef ASSERT's a call to check_gen_kinds() that is >>> only present in debug builds. >>> >>> >>> http://cr.openjdk.java.net/~rkennke/8179387/webrev.01/ >>> >>> >>> Roman >>> >>> Am 01.06.2017 um 22:50 schrieb Roman Kennke: >>>> What $SUBJECT says. >>>> >>>> I went over genCollectedHeap.[hpp|cpp] and moved everything that I could >>>> find that is CMS-only into a new CMSHeap class. >>>> >>>> http://cr.openjdk.java.net/~rkennke/8179387/webrev.00/ >>>> >>>> >>>> It is possible that I overlooked something there. There may be code in >>>> there that doesn't shout "CMS" at me, but is still intrinsically CMS stuff. >>>> >>>> Also not that I have not removed that little part: >>>> >>>> always_do_update_barrier = UseConcMarkSweepGC; >>>> >>>> because I expect it to go away with Erik ?'s big refactoring. >>>> >>>> What do you think? 
>>>> >>>> Testing: hotspot_gc, specjvm, some little apps with -XX:+UseConcMarkSweepGC >>>> >>>> Roman >>>> > From stefan.johansson at oracle.com Mon Jul 3 08:38:47 2017 From: stefan.johansson at oracle.com (Stefan Johansson) Date: Mon, 3 Jul 2017 10:38:47 +0200 Subject: RFR: 8183281: Remove unnecessary call to increment_gc_time_stamp In-Reply-To: References: <646a5d9a-b6d3-82c9-3937-027c3193d4c0@oracle.com> <5b2dff36-0a55-feb8-7e80-52e4562a5651@oracle.com> Message-ID: On 2017-06-30 17:34, Erik Helin wrote: > On 06/30/2017 01:53 PM, Stefan Johansson wrote: >> Hi Erik, >> >> On 2017-06-30 11:37, Erik Helin wrote: >>> Hi all, >>> >>> the following small patch removes an unnecessary call to >>> increment_gc_time_stamp from >>> G1CollectedHeap::do_collection_pause_at_safepoint (and the long, >>> wrong, comment above the call). >>> >>> We already do a call increment_gc_time_stamp much earlier in >>> do_collection_pause_at_safepoint, which is enough. The reasons >>> outlined in the comment motivating a second call is no longer true, >>> the code has changed (but the comment has not). >>> >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8183281 >>> Patch: see below >>> Testing: make hotspot >>> >> Patch looks good, but I would like to see some more testing than just >> building hotspot. Running the gc jtreg tests for example. > > Thanks for reviewing! All pass for both fastdebug and product when > running `make test TEST=hotspot_gc` on my Linux workstation. > Thanks for running the tests, ship it! 
StefanJ > Thanks, > Erik > >> Thanks for cleaning up the code, >> Stefan >>> Thanks, >>> Erik >>> >>> # HG changeset patch >>> # User ehelin >>> # Date 1498814642 -7200 >>> # Fri Jun 30 11:24:02 2017 +0200 >>> # Node ID 62400b3cbec4e0d06e0d6c21c9486070d8c906a4 >>> # Parent 10ccf0a5f63fdca04d9eda2c774ccdd0e12bc1a1 >>> 8183281: Remove unnecessary call to increment_gc_time_stamp >>> >>> diff -r 10ccf0a5f63f -r 62400b3cbec4 >>> src/share/vm/gc/g1/g1CollectedHeap.cpp >>> --- a/src/share/vm/gc/g1/g1CollectedHeap.cpp Thu Jun 29 19:09:04 >>> 2017 +0000 >>> +++ b/src/share/vm/gc/g1/g1CollectedHeap.cpp Fri Jun 30 11:24:02 >>> 2017 +0200 >>> @@ -3266,29 +3266,6 @@ >>> >>> MemoryService::track_memory_usage(); >>> >>> - // In prepare_for_verify() below we'll need to scan the >>> deferred >>> - // update buffers to bring the RSets up-to-date if >>> - // G1HRRSFlushLogBuffersOnVerify has been set. While scanning >>> - // the update buffers we'll probably need to scan cards on the >>> - // regions we just allocated to (i.e., the GC alloc >>> - // regions). However, during the last GC we called >>> - // set_saved_mark() on all the GC alloc regions, so card >>> - // scanning might skip the [saved_mark_word()...top()] area of >>> - // those regions (i.e., the area we allocated objects into >>> - // during the last GC). But it shouldn't. Given that >>> - // saved_mark_word() is conditional on whether the GC time >>> stamp >>> - // on the region is current or not, by incrementing the GC >>> time >>> - // stamp here we invalidate all the GC time stamps on all the >>> - // regions and saved_mark_word() will simply return top() for >>> - // all the regions. This is a nicer way of ensuring this >>> rather >>> - // than iterating over the regions and fixing them. In >>> fact, the >>> - // GC time stamp increment here also ensures that >>> - // saved_mark_word() will return top() between pauses, i.e., >>> - // during concurrent refinement. 
So we don't need the >>> - // is_gc_active() check to decided which top to use when >>> - // scanning cards (see CR 7039627). >>> - increment_gc_time_stamp(); >>> - >>> if (VerifyRememberedSets) { >>> log_info(gc, verify)("[Verifying RemSets after GC]"); >>> VerifyRegionRemSetClosure v_cl; >> From rkennke at redhat.com Mon Jul 3 09:13:43 2017 From: rkennke at redhat.com (Roman Kennke) Date: Mon, 3 Jul 2017 11:13:43 +0200 Subject: RFR: 8179387: Factor out CMS specific code from GenCollectedHeap into its own subclass In-Reply-To: <47e22e86-7d7c-606f-1936-346229f39ca2@oracle.com> References: <3521009f-6fab-4f8e-2375-b9d665a4c70b@redhat.com> <3d8b55a2-a787-3051-b351-ab9b0a24f5e0@redhat.com> <47e22e86-7d7c-606f-1936-346229f39ca2@oracle.com> Message-ID: <9a846161-c8ac-dedf-5952-f457d546fd9a@redhat.com> Am 03.07.2017 um 09:35 schrieb Mikael Gerdin: > Hi Roman, > > On 2017-06-30 18:32, Roman Kennke wrote: >> I came across one problem using this approach: We will have 2 instances >> of CollectedHeap around, where there's usually only 1, and some code >> expects only 1. For example, in CollectedHeap constructor, we create new >> PerfData variables, and we now create them 2x, which leads to an assert >> being thrown. I suspect there is more code like that. >> >> I will attempt to refactor this a little more, maybe it's not that bad, >> but it's probably not worth spending too much time on it. > > I think refactoring the code to not expect a singleton CollectedHeap > instance is a bit too much. > Perhaps there is another way to share common code between Serial and > CMS but that might require a bit more thought. Yeah, definitely. I hit another difficulty: pretty much the same issues that I'm having with GenCollectedHeap/CMSHeap/CollectedHeap now show up with Generation and its subclasses.. How about we push the original patch that I've posted, and work from there? 
In fact, I *have* found some little things I would change (some more if (UseConcMarkSweepGC) branches in GenCollectedHeap that I have overlooked in my first pass...) Roman > > /Mikael > >> >> Roman >>> Hi Roman, >>> >>> thanks for putting this patch together, it is a great step forward! One >>> thung that (in my mind) would improve it even further is if we embed a >>> GenCollectedHeap in CMSHeap and then make CMSHeap inherit directly from >>> CollectedHeap. >>> >>> With this solution, the definition of CMSHeap would look like something >>> along the lines of: >>> >>> class CMSHeap : public CollectedHeap { >>> WorkGang* _wg; >>> GenCollectedHeap _gch; >>> >>> public: >>> CMSHeap(GenCollectorPolicy* policy) : >>> _wg(new WorkGang("GC Thread", ParallelGCThreads, true, true), >>> _gch(policy) { >>> _wg->initialize_workers(); >>> } >>> >>> // a bunch of "facade" methods >>> virtual bool supports_tlab_allocation() const { >>> return _gch->supports_tlab_allocation(); >>> } >>> >>> virtual size_t tlab_capacity(Thread* t) const { >>> return _gch->tlab_capacity(t); >>> } >>> }; >>> >>> With this approach, you would have to implement a bunch of "facade" >>> methods that just delegates to _gch, such as the methods >>> supports_tlab_allocation and tlab_capacity above. There are two reasons >>> why I prefer this approach: >>> 1. In the end we want CMSHeap to inherit from CollectedHeap anyway :) >>> 2. It makes it very clear which methods we gradually have to >>> re-implement in CMSHeap to eventually get rid of the _gch field >>> (the >>> end goal). This is much harder to see if CMSHeap inherits from >>> GenCollectedHeap (see more below). >>> >>> The second point will most likely cause some initial problems with >>> `protected` code in GenCollectedHeap. 
For example, as you noticed when >>> creating this patch, CMSHeap make use of a few `protected` fields and >>> methods from GenCollectedHeap, most notably: >>> - _process_strong_tasks >>> - process_roots() >>> - process_string_table_roots() >>> >>> It would be much better (IMO) to share this code via composition rather >>> than inheritance. In this particular case, I would prefer to create a >>> class StrongRootsProcessor that encapsulates the root processing logic. >>> Then GenCollectedHeap and CMSHeap can both contain an instance of >>> StrongRootsProcessor. >>> >>> What do you think of this approach? Do you have some spare cycles to >>> try >>> this approach out? >>> >>> Thanks, >>> Erik >>> >>> On 06/02/2017 10:55 AM, Roman Kennke wrote: >>>> Take this patch. It #ifdef ASSERT's a call to check_gen_kinds() >>>> that is >>>> only present in debug builds. >>>> >>>> >>>> http://cr.openjdk.java.net/~rkennke/8179387/webrev.01/ >>>> >>>> >>>> Roman >>>> >>>> Am 01.06.2017 um 22:50 schrieb Roman Kennke: >>>>> What $SUBJECT says. >>>>> >>>>> I went over genCollectedHeap.[hpp|cpp] and moved everything that I >>>>> could >>>>> find that is CMS-only into a new CMSHeap class. >>>>> >>>>> http://cr.openjdk.java.net/~rkennke/8179387/webrev.00/ >>>>> >>>>> >>>>> It is possible that I overlooked something there. There may be >>>>> code in >>>>> there that doesn't shout "CMS" at me, but is still intrinsically >>>>> CMS stuff. >>>>> >>>>> Also not that I have not removed that little part: >>>>> >>>>> always_do_update_barrier = UseConcMarkSweepGC; >>>>> >>>>> because I expect it to go away with Erik ?'s big refactoring. >>>>> >>>>> What do you think? 
>>>>> >>>>> Testing: hotspot_gc, specjvm, some little apps with >>>>> -XX:+UseConcMarkSweepGC >>>>> >>>>> Roman >>>>> >> From thomas.schatzl at oracle.com Mon Jul 3 09:16:50 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Mon, 03 Jul 2017 11:16:50 +0200 Subject: RFR: 8183281: Remove unnecessary call to increment_gc_time_stamp In-Reply-To: <646a5d9a-b6d3-82c9-3937-027c3193d4c0@oracle.com> References: <646a5d9a-b6d3-82c9-3937-027c3193d4c0@oracle.com> Message-ID: <1499073410.2802.0.camel@oracle.com> Hi, On Fri, 2017-06-30 at 11:37 +0200, Erik Helin wrote: > Hi all, > > the following small patch removes an unnecessary call to > increment_gc_time_stamp from > G1CollectedHeap::do_collection_pause_at_safepoint (and the long, > wrong > comment above the call). > > We already call increment_gc_time_stamp much earlier in > do_collection_pause_at_safepoint, which is enough. The reasons > outlined > in the comment motivating a second call are no longer true; the code > has > changed (but the comment has not). > > Bug: https://bugs.openjdk.java.net/browse/JDK-8183281 > Patch: see below > Testing: make hotspot > Looks good. Thomas From thomas.schatzl at oracle.com Mon Jul 3 09:53:32 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Mon, 03 Jul 2017 11:53:32 +0200 Subject: RFR (XXS): 8183394: Remove unused G1RemSet::n_workers() method Message-ID: <1499075612.2802.5.camel@oracle.com> Hi all, can I have a review for this trivial removal of an unused method? One Reviewer should be sufficient for this ;) CR: https://bugs.openjdk.java.net/browse/JDK-8183394 Webrev: http://cr.openjdk.java.net/~tschatzl/8183394/webrev/ Testing: Local compilation Thanks,
Thomas From thomas.schatzl at oracle.com Mon Jul 3 09:58:37 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Mon, 03 Jul 2017 11:58:37 +0200 Subject: RFR (XS): 8179677: Let G1Remset::_conc_refined_cards only count number of cards concurrently refined Message-ID: <1499075917.2802.8.camel@oracle.com> Hi all, can I have reviews for this small change that makes G1RemSet::_conc_refined_cards only count the number of concurrently refined cards (+ some trivial renaming of the variable)? The reason is that I plan to separately add the number of cards refined during GC soon. This has been suggested earlier in some internal discussion, and I agree. CR: https://bugs.openjdk.java.net/browse/JDK-8179677 Webrev: http://cr.openjdk.java.net/~tschatzl/8179677/webrev.1/ Testing: jprt Thanks, Thomas From thomas.schatzl at oracle.com Mon Jul 3 11:24:48 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Mon, 03 Jul 2017 13:24:48 +0200 Subject: RFR (XS): 8183397: Ensure consistent closure filtering during evacuation Message-ID: <1499081088.2802.29.camel@oracle.com> Hi all, can I have reviews for this change that fixes an observation made recently by Erik, i.e. that the "else" part of several evacuation closures inconsistently filters out non-cross-region references before checking whether the referenced object is in a humongous or ext region. This causes somewhat hard-to-diagnose performance issues, and earlier filtering does not hurt if done anyway. (Note that the current way of checking in all but the UpdateRS closure using HeapRegion::is_in_same_region() seems optimal. The only reason the other way is better in the UpdateRS closure is that the code needs the "to" HeapRegion pointer anyway.) CR: https://bugs.openjdk.java.net/browse/JDK-8183397 Webrev: http://cr.openjdk.java.net/~tschatzl/8183397/webrev/ Testing: jprt, performance regression analysis Thanks,
Thomas From thomas.schatzl at oracle.com Mon Jul 3 11:24:53 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Mon, 03 Jul 2017 13:24:53 +0200 Subject: RFR (S): 8179679: Rearrange filters before card scanning Message-ID: <1499081093.2802.30.camel@oracle.com> Hi all, please have a look at this change that rearranges the checks in the G1RemSet card scanning a bit in order to: - remove some redundant checking made possible recently with JDK-8177044 - group together similar checks (so that the compiler can more easily reuse some intermediate values) - minimize unnecessary card claiming CR: https://bugs.openjdk.java.net/browse/JDK-8179679 Webrev: http://cr.openjdk.java.net/~tschatzl/8179679/webrev.1/ (note: there has been a previous webrev, but without reviews; still there is a webrev.0_to_1 for the curious) Testing: jprt, performance regression analysis Thanks, Thomas From mikael.gerdin at oracle.com Mon Jul 3 11:54:54 2017 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Mon, 3 Jul 2017 13:54:54 +0200 Subject: RFR (XXS): 8183394: Remove unused G1RemSet::n_workers() method In-Reply-To: <1499075612.2802.5.camel@oracle.com> References: <1499075612.2802.5.camel@oracle.com> Message-ID: <866973a1-3698-e36e-e38d-8a7631fcf1c6@oracle.com> Hi Thomas, On 2017-07-03 11:53, Thomas Schatzl wrote: > Hi all, > > can I have a review for this trivial removal of an unused method? One > Reviewer should be sufficient for this ;) > > CR: > https://bugs.openjdk.java.net/browse/JDK-8183394 > Webrev: > http://cr.openjdk.java.net/~tschatzl/8183394/webrev/ Looks good and trivial enough to me.
/Mikael > Testing: > Local compilation > > Thanks, > Thomas > From mikael.gerdin at oracle.com Mon Jul 3 11:57:48 2017 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Mon, 3 Jul 2017 13:57:48 +0200 Subject: RFR (S): 8182703: Correct G1 barrier queue lock orderings In-Reply-To: <59510D5E.10009@oracle.com> References: <59510D5E.10009@oracle.com> Message-ID: Hi Erik, On 2017-06-26 15:34, Erik ?sterlund wrote: > Hi, > > Webrev: http://cr.openjdk.java.net/~eosterlund/8182703/webrev.02/ I think this change makes sense and I agree with your reasoning below. I'm leaning towards suggesting creating a named enum value for "access+1" to begin a move towards getting rid of adding and subtracting values from enums in this code. I don't have a good name for it, though. /Mikael > Bug: https://bugs.openjdk.java.net/browse/JDK-8182703 > > The G1 barrier queues have very awkward lock orderings for the following > reasons: > > 1) These queues may queue up things when performing a reference write or > resolving a jweak (intentionally or just happened to be jweak, even > though it looks like a jobject), which can happen in a lot of places in > the code. We resolve JNIHandles while holding special locks in many > places. We perform reference writes also in many places. Now the > unsuspecting hotspot developer might think that it is okay to resolve a > JNIHandle or perform a reference write while possibly holding a special > lock. But no. In some cases, object writes have been moved out of locks > and replaced with lock-free CAS, only to dodge the G1 write barrier > locks. I don't think the G1 lock ordering issues should shape the shared > code rather than the other way around. > 2) There is an issue that the shared queue locks have a "special" rank, > which is below the lock ranks used by the cbl monitor and free list > monitor. This leads to an issue when these locks have to be taken while > holding the shared queue locks. 
The current solution is to drop the > shared queue locks temporarily, introducing nasty data races. These > races are guarded, but the whole race seems very unnecessary. > > I argue that if the G1 write barrier queue locks were simply set > appropriately in the first place by analyzing what ranks they should > have, none of the above issues would exist. Therefore I propose this new > ordering. > > Specifically, I recognize that locks required for performing memory > accesses and resolving JNIHandles are more special than the "special" > rank. Therefore, this change introduces a new lock ordering category > called "access", which is to be used by barriers required to perform > memory accesses. In other words, by recognizing the rank is more special > than "special", we can remove "special" code to walk around making its > rank more "special". That seems desirable to me. The access locks need > to comply to the same constraints as the special locks: they may not > perform safepoint checks. > > The old lock ranks were: > > SATB_Q_FL_lock: special > SATB_Q_CBL_mon: leaf - 1 > Shared_SATB_Q_lock: leaf - 1 > > DirtyCardQ_FL_lock: special > DirtyCardQ_CBL_mon: leaf - 1 > Shared_DirtyCardQ_lock: leaf - 1 > > The new lock ranks are: > > SATB_Q_FL_lock: access (special - 2) > SATB_Q_CBL_mon: access (special - 2) > Shared_SATB_Q_lock: access + 1 (special - 1) > > DirtyCardQ_FL_lock: access (special - 2) > DirtyCardQ_CBL_mon: access (special - 2) > Shared_DirtyCardQ_lock: access + 1 (special - 1) > > Analysis: > > Each PtrQueue and PtrQueueSet group, SATB or DirtyCardQ have the same > group of locks. The free list lock, the completed buffer list monitor > and the shared queue lock. > > Observations: > 1) The free list lock and completed buffer list monitors (members of > PtrQueueSet) are disjoint. We never hold both of them at the same time. 
> Rationale: The free list lock is only used from > PtrQueueSet::allocate_buffer, PtrQueueSet::deallocate_buffer and > PtrQueueSet::reduce_free_list, and no callsite from there can be > expanded where the cbl monitor is acquired. So therefore it is > impossible to acquire the cbl monitor while holding the free list lock. > The opposite case of acquiring the free list lock while holding the cbl > monitor is also not possible; only the following places acquire the cbl > monitor: PtrQueueSet::enqueue_complete_buffer, > PtrQueueSet::merge_bufferlists, > PtrQueueSet::assert_completed_buffer_list_len_correct, > PtrQueueSet::notify_if_necessary, FreeIdSet::claim_par_id, > FreeIdSet::release_par_id, DirtyCardQueueSet::get_completed_buffer, > DirtyCardQueueSet::clear, > SATBMarkQueueSet::apply_closure_to_completed_buffer and > SATBMarkQueueSet::abandon_partial_marking. Again, neither of these paths > where the cbl monitor is held can expand callsites to a place where the > free list locks are held. Therefore it holds that the cbl monitor can > not be held while the free list lock is held, and the free list lock can > not be held while the cbl monitor is held. Therefore they are held > disjointly. > 2) We might hold the shared queue locks before acquiring the completed > buffer list monitor. (today we drop the shared queue lock then and > reacquire it later as a hack as already described) > 3) We do not acquire a shared queue lock while holding the free list > lock or completed buffer list monitor, as there is no reference from a > PtrQueueSet to its shared queue, so those code paths do not know how to > reach the shared PtrQueue to acquire its lock. The derived classes are > exceptions but they never use the shared queue lock while holding the > completed buffer list monitor or free list lock. DirtyCardQueueSet uses > the shared queue for concatenating logs (in a safepoint without holding > those locks). 
The SATBMarkQueueSet uses the shared queue for filtering > the buffers, fiddling with activeness, printing and resetting, all > without grabbing any locks. > 4) We do not acquire any other lock (above event) while holding the free > list lock or completed buffer list monitors. This was discovered by > manually expanding the call graphs from where these two locks are held. > > Derived constraints: > a) Because of observation 1, the free list lock and completed buffer > list monitors can have the same rank. > b) Because of observations 1 and 2, the shared queue lock ought to have > a rank higher than the ranks of the free list lock and the completed > buffer list monitors (not the case today). > c) Because of of observation 3 and 2, the free list lock and completed > buffer list monitors ought to have a rank lower than the rank of the > shared queue lock. > d) Because of observation 4 (and constraints a-c), all the barrier locks > should be below the "special" rank without violating any existing ranks. > > The proposed new lock ranks conform to the constraints derived from my > observations. It is worth noting that the potential relationship that > could break (and why they do not) are: > 1) If a lock is acquired from within the barriers that does not involve > the shared queue lock, the free list lock or the completed buffer list > monitor, we have now inverted their relationship as that other lock > would probably have a rank higher than or equal to "special". But due to > observation 4, there are no such cases. > 2) The relationship between the shared queue lock and the completed > buffer list monitor has been changed so both can be held at the same > time if the shared queue lock is acquired first (which it is). This is > arguably the way it should have been from the first place, and the old > solution had ugly hacks where we would drop the shared queue lock to not > run into the lock order assert (and only not to run into the lock order > assert, i.e. 
not to avoid potential deadlock) to ensure the locks are > not held at the same time. That code has now been removed, so that the > shared queue lock is still held when enqueueing completed buffers (no > dodgy dropping and reclaiming), and the code for handling the races due > to multiple concurrent enqueuers has also been removed and replaced with > an assertion that there simply should not be multiple concurrent > enqueuers. Since the shared queue lock is now held throughout the whole > operation, there will be no concurrent enqueuers. > 3) The completed buffer list monitor used to have a higher rank than the > free list lock. Now they have the same. Therefore, they could previously > allow them to be held at the same time if the cbl monitor was acquired > first. However, as discussed, there is no such case, and they ought to > have the same rank not to confuse their true disjointness. If anyone > insists we do not break this relationship despite the true disjointness, > I could consent to adding another access lock rank, like this: > http://cr.openjdk.java.net/~eosterlund/8182703/webrev.01/ but I think it > seems better to have the same rank since they are actually truly > disjoint and should remain disjoint. > > I do recognize that long term, we *might* want a lock-free solution or > something (not saying we do or do not). But until then, the ranks ought > to be corrected so that they do not cause these problems causing > everyone to bash their head against the awkward G1 lock ranks throughout > the code and make hacks around it. > > Testing: JPRT with hotspot all and lots of local testing. > > Thanks, > /Erik From thomas.schatzl at oracle.com Mon Jul 3 12:12:50 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Mon, 03 Jul 2017 14:12:50 +0200 Subject: RFR (S): 8183226: Remembered set summarization accesses not fully initialized java thread DCQS Message-ID: <1499083970.2802.33.camel@oracle.com> Hi all, ? 
can I get reviews for the following change that breaks a dependency cycle in G1RemSet initialization to fix a (currently benign) bug when printing remembered set summarization information? The problem is that G1RemSet initializes its internal remembered set summarization helper data structure in the constructor, which accesses some DCQS members before we call the initialize methods on the various global DCQS'es in G1CollectedHeap::initialize(). By splitting the initialization of the remembered set summarization into an extra method, this one can be called at the very end of G1CollectedHeap::initialize(), thus breaking the dependency. The bug is benign because the values accessed at that time are the same as the values after initialization. This also allows for grouping together the initialization of G1RemSet/DCQS/G1ConcurrentRefine related data structures more easily in G1CollectedHeap::initialize(). CR: https://bugs.openjdk.java.net/browse/JDK-8183226 Webrev: http://cr.openjdk.java.net/~tschatzl/8183226/webrev/ Testing: local testing running remembered set summarization manually, jprt Thanks, Thomas From erik.helin at oracle.com Mon Jul 3 12:44:18 2017 From: erik.helin at oracle.com (Erik Helin) Date: Mon, 3 Jul 2017 14:44:18 +0200 Subject: RFR (XXS): 8183394: Remove unused G1RemSet::n_workers() method In-Reply-To: <1499075612.2802.5.camel@oracle.com> References: <1499075612.2802.5.camel@oracle.com> Message-ID: <55027601-074b-b92a-7516-b08282291b70@oracle.com> On 07/03/2017 11:53 AM, Thomas Schatzl wrote: > Hi all, > > can I have a review for this trivial removal of an unused method? One > Reviewer should be sufficient for this ;) > > CR: > https://bugs.openjdk.java.net/browse/JDK-8183394 > Webrev: > http://cr.openjdk.java.net/~tschatzl/8183394/webrev/ Looks good, Reviewed. Thanks for cleaning this up!
Erik > Testing: > Local compilation > > Thanks, > Thomas > From erik.osterlund at oracle.com Mon Jul 3 12:53:58 2017 From: erik.osterlund at oracle.com (=?UTF-8?Q?Erik_=c3=96sterlund?=) Date: Mon, 3 Jul 2017 14:53:58 +0200 Subject: RFR (S): 8182703: Correct G1 barrier queue lock orderings In-Reply-To: References: <59510D5E.10009@oracle.com> Message-ID: <595A3E66.5050705@oracle.com> Hi Mikael, Thank you for the review! Regarding the use of + x in the current enum system for lock rankings, I agree that it is not a brilliant system and you feel a bit sad when your lock rank is "leaf+2". However, sometimes I feel like abstracting numbers with names can become confusing as well - even misleading. Like for example how leaf is no longer a leaf and how it is questionable whether special is really all that special any longer. When I thought about possible name for access + 0 and access + 1, I was thinking something in the lines of "perhaps access_inner/outer or access_leaf/nonleaf", but then that might get confusing if suddenly access will need 3 ranks for some reason and we get an "access_special" rank or something. Perhaps a different solution than enum names would be nice long-term for lock ranks and deadlock detection, but I believe that might be outside of the current scope for this change. Thanks, /Erik On 2017-07-03 13:57, Mikael Gerdin wrote: > Hi Erik, > > On 2017-06-26 15:34, Erik ?sterlund wrote: >> Hi, >> >> Webrev: http://cr.openjdk.java.net/~eosterlund/8182703/webrev.02/ > > I think this change makes sense and I agree with your reasoning below. > > I'm leaning towards suggesting creating a named enum value for > "access+1" to begin a move towards getting rid of adding and > subtracting values from enums in this code. I don't have a good name > for it, though. 
> > /Mikael > > >> Bug: https://bugs.openjdk.java.net/browse/JDK-8182703 >> >> The G1 barrier queues have very awkward lock orderings for the >> following reasons: >> >> 1) These queues may queue up things when performing a reference write >> or resolving a jweak (intentionally or just happened to be jweak, >> even though it looks like a jobject), which can happen in a lot of >> places in the code. We resolve JNIHandles while holding special locks >> in many places. We perform reference writes also in many places. Now >> the unsuspecting hotspot developer might think that it is okay to >> resolve a JNIHandle or perform a reference write while possibly >> holding a special lock. But no. In some cases, object writes have >> been moved out of locks and replaced with lock-free CAS, only to >> dodge the G1 write barrier locks. I don't think the G1 lock ordering >> issues should shape the shared code rather than the other way around. >> 2) There is an issue that the shared queue locks have a "special" >> rank, which is below the lock ranks used by the cbl monitor and free >> list monitor. This leads to an issue when these locks have to be >> taken while holding the shared queue locks. The current solution is >> to drop the shared queue locks temporarily, introducing nasty data >> races. These races are guarded, but the whole race seems very >> unnecessary. >> >> I argue that if the G1 write barrier queue locks were simply set >> appropriately in the first place by analyzing what ranks they should >> have, none of the above issues would exist. Therefore I propose this >> new ordering. >> >> Specifically, I recognize that locks required for performing memory >> accesses and resolving JNIHandles are more special than the "special" >> rank. Therefore, this change introduces a new lock ordering category >> called "access", which is to be used by barriers required to perform >> memory accesses. 
In other words, by recognizing the rank is more >> special than "special", we can remove "special" code to walk around >> making its rank more "special". That seems desirable to me. The >> access locks need to comply to the same constraints as the special >> locks: they may not perform safepoint checks. >> >> The old lock ranks were: >> >> SATB_Q_FL_lock: special >> SATB_Q_CBL_mon: leaf - 1 >> Shared_SATB_Q_lock: leaf - 1 >> >> DirtyCardQ_FL_lock: special >> DirtyCardQ_CBL_mon: leaf - 1 >> Shared_DirtyCardQ_lock: leaf - 1 >> >> The new lock ranks are: >> >> SATB_Q_FL_lock: access (special - 2) >> SATB_Q_CBL_mon: access (special - 2) >> Shared_SATB_Q_lock: access + 1 (special - 1) >> >> DirtyCardQ_FL_lock: access (special - 2) >> DirtyCardQ_CBL_mon: access (special - 2) >> Shared_DirtyCardQ_lock: access + 1 (special - 1) >> >> Analysis: >> >> Each PtrQueue and PtrQueueSet group, SATB or DirtyCardQ have the same >> group of locks. The free list lock, the completed buffer list monitor >> and the shared queue lock. >> >> Observations: >> 1) The free list lock and completed buffer list monitors (members of >> PtrQueueSet) are disjoint. We never hold both of them at the same time. >> Rationale: The free list lock is only used from >> PtrQueueSet::allocate_buffer, PtrQueueSet::deallocate_buffer and >> PtrQueueSet::reduce_free_list, and no callsite from there can be >> expanded where the cbl monitor is acquired. So therefore it is >> impossible to acquire the cbl monitor while holding the free list >> lock. 
The opposite case of acquiring the free list lock while holding >> the cbl monitor is also not possible; only the following places >> acquire the cbl monitor: PtrQueueSet::enqueue_complete_buffer, >> PtrQueueSet::merge_bufferlists, >> PtrQueueSet::assert_completed_buffer_list_len_correct, >> PtrQueueSet::notify_if_necessary, FreeIdSet::claim_par_id, >> FreeIdSet::release_par_id, DirtyCardQueueSet::get_completed_buffer, >> DirtyCardQueueSet::clear, >> SATBMarkQueueSet::apply_closure_to_completed_buffer and >> SATBMarkQueueSet::abandon_partial_marking. Again, neither of these >> paths where the cbl monitor is held can expand callsites to a place >> where the free list locks are held. Therefore it holds that the cbl >> monitor can not be held while the free list lock is held, and the >> free list lock can not be held while the cbl monitor is held. >> Therefore they are held disjointly. >> 2) We might hold the shared queue locks before acquiring the >> completed buffer list monitor. (today we drop the shared queue lock >> then and reacquire it later as a hack as already described) >> 3) We do not acquire a shared queue lock while holding the free list >> lock or completed buffer list monitor, as there is no reference from >> a PtrQueueSet to its shared queue, so those code paths do not know >> how to reach the shared PtrQueue to acquire its lock. The derived >> classes are exceptions but they never use the shared queue lock while >> holding the completed buffer list monitor or free list lock. >> DirtyCardQueueSet uses the shared queue for concatenating logs (in a >> safepoint without holding those locks). The SATBMarkQueueSet uses the >> shared queue for filtering the buffers, fiddling with activeness, >> printing and resetting, all without grabbing any locks. >> 4) We do not acquire any other lock (above event) while holding the >> free list lock or completed buffer list monitors. 
This was discovered >> by manually expanding the call graphs from where these two locks are >> held. >> >> Derived constraints: >> a) Because of observation 1, the free list lock and completed buffer >> list monitors can have the same rank. >> b) Because of observations 1 and 2, the shared queue lock ought to >> have a rank higher than the ranks of the free list lock and the >> completed buffer list monitors (not the case today). >> c) Because of of observation 3 and 2, the free list lock and >> completed buffer list monitors ought to have a rank lower than the >> rank of the shared queue lock. >> d) Because of observation 4 (and constraints a-c), all the barrier >> locks should be below the "special" rank without violating any >> existing ranks. >> >> The proposed new lock ranks conform to the constraints derived from >> my observations. It is worth noting that the potential relationship >> that could break (and why they do not) are: >> 1) If a lock is acquired from within the barriers that does not >> involve the shared queue lock, the free list lock or the completed >> buffer list monitor, we have now inverted their relationship as that >> other lock would probably have a rank higher than or equal to >> "special". But due to observation 4, there are no such cases. >> 2) The relationship between the shared queue lock and the completed >> buffer list monitor has been changed so both can be held at the same >> time if the shared queue lock is acquired first (which it is). This >> is arguably the way it should have been from the first place, and the >> old solution had ugly hacks where we would drop the shared queue lock >> to not run into the lock order assert (and only not to run into the >> lock order assert, i.e. not to avoid potential deadlock) to ensure >> the locks are not held at the same time. 
That code has now been >> removed, so that the shared queue lock is still held when enqueueing >> completed buffers (no dodgy dropping and reclaiming), and the code >> for handling the races due to multiple concurrent enqueuers has also >> been removed and replaced with an assertion that there simply should >> not be multiple concurrent enqueuers. Since the shared queue lock is >> now held throughout the whole operation, there will be no concurrent >> enqueuers. >> 3) The completed buffer list monitor used to have a higher rank than >> the free list lock. Now they have the same. Therefore, they could >> previously allow them to be held at the same time if the cbl monitor >> was acquired first. However, as discussed, there is no such case, and >> they ought to have the same rank not to confuse their true >> disjointness. If anyone insists we do not break this relationship >> despite the true disjointness, I could consent to adding another >> access lock rank, like this: >> http://cr.openjdk.java.net/~eosterlund/8182703/webrev.01/ but I think >> it seems better to have the same rank since they are actually truly >> disjoint and should remain disjoint. >> >> I do recognize that long term, we *might* want a lock-free solution >> or something (not saying we do or do not). But until then, the ranks >> ought to be corrected so that they do not cause these problems >> causing everyone to bash their head against the awkward G1 lock ranks >> throughout the code and make hacks around it. >> >> Testing: JPRT with hotspot all and lots of local testing. 
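The ordering invariant argued for in the analysis above can be illustrated with a toy rank checker (this is not HotSpot's Mutex code; all names and ranks here are illustrative). The rule it enforces is the one the new ranks rely on: a thread may only acquire a lock whose rank is strictly lower than the lowest rank it already holds, which is why Shared_DirtyCardQ_lock (access + 1) may be held while taking DirtyCardQ_CBL_mon (access), but not vice versa.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Toy lock-rank checker (illustrative only -- not HotSpot's Mutex class).
enum Rank { access = 0, access_plus_1 = 1, special = 2, leaf = 5 };

struct HeldLocks {
  std::vector<int> ranks;
  bool try_acquire(int rank) {
    if (!ranks.empty() &&
        rank >= *std::min_element(ranks.begin(), ranks.end())) {
      return false;  // would violate the ordering -> potential deadlock
    }
    ranks.push_back(rank);
    return true;
  }
  void release() { ranks.pop_back(); }
};

bool shared_then_cbl_ok() {
  HeldLocks t;
  bool a = t.try_acquire(access_plus_1);  // Shared_DirtyCardQ_lock first
  bool b = t.try_acquire(access);         // then DirtyCardQ_CBL_mon
  return a && b;
}

bool cbl_then_shared_rejected() {
  HeldLocks t;
  t.try_acquire(access);                // cbl monitor held...
  return !t.try_acquire(access_plus_1); // ...shared lock must be refused
}
```

Under the old ranks (shared queue lock and cbl monitor both at leaf - 1, free list lock at special), the first sequence already required the drop-and-reacquire hack; with the new ranks it is simply a legal descending acquisition.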
>> >> Thanks, >> /Erik From rkennke at redhat.com Mon Jul 3 13:08:22 2017 From: rkennke at redhat.com (Roman Kennke) Date: Mon, 3 Jul 2017 15:08:22 +0200 Subject: RFR: 8179387: Factor out CMS specific code from GenCollectedHeap into its own subclass In-Reply-To: <47e22e86-7d7c-606f-1936-346229f39ca2@oracle.com> References: <3521009f-6fab-4f8e-2375-b9d665a4c70b@redhat.com> <3d8b55a2-a787-3051-b351-ab9b0a24f5e0@redhat.com> <47e22e86-7d7c-606f-1936-346229f39ca2@oracle.com> Message-ID: In fact, my original plan was to also factor out the serial specific stuff into a new subclass. This means everything that is truly shared between SerialHeap and CMSHeap would remain in GenCollectedHeap (for now), and everything else would move down to the specific subclasses. Then we can see what remains shared, and what is GC specific, and go from there. What do you think? Roman On 03.07.2017 at 09:35, Mikael Gerdin wrote: > Hi Roman, > > On 2017-06-30 18:32, Roman Kennke wrote: >> I came across one problem using this approach: We will have 2 instances >> of CollectedHeap around, where there's usually only 1, and some code >> expects only 1. For example, in CollectedHeap constructor, we create new >> PerfData variables, and we now create them 2x, which leads to an assert >> being thrown. I suspect there is more code like that. >> >> I will attempt to refactor this a little more, maybe it's not that bad, >> but it's probably not worth spending too much time on it. > > I think refactoring the code to not expect a singleton CollectedHeap > instance is a bit too much. > Perhaps there is another way to share common code between Serial and > CMS but that might require a bit more thought. > > /Mikael > >> >> Roman >>> Hi Roman, >>> >>> thanks for putting this patch together, it is a great step forward! One >>> thing that (in my mind) would improve it even further is if we embed a >>> GenCollectedHeap in CMSHeap and then make CMSHeap inherit directly from >>> CollectedHeap.
>>> >>> With this solution, the definition of CMSHeap would look like something >>> along the lines of: >>> >>> class CMSHeap : public CollectedHeap { >>> WorkGang* _wg; >>> GenCollectedHeap _gch; >>> >>> public: >>> CMSHeap(GenCollectorPolicy* policy) : >>> _wg(new WorkGang("GC Thread", ParallelGCThreads, true, true), >>> _gch(policy) { >>> _wg->initialize_workers(); >>> } >>> >>> // a bunch of "facade" methods >>> virtual bool supports_tlab_allocation() const { >>> return _gch->supports_tlab_allocation(); >>> } >>> >>> virtual size_t tlab_capacity(Thread* t) const { >>> return _gch->tlab_capacity(t); >>> } >>> }; >>> >>> With this approach, you would have to implement a bunch of "facade" >>> methods that just delegates to _gch, such as the methods >>> supports_tlab_allocation and tlab_capacity above. There are two reasons >>> why I prefer this approach: >>> 1. In the end we want CMSHeap to inherit from CollectedHeap anyway :) >>> 2. It makes it very clear which methods we gradually have to >>> re-implement in CMSHeap to eventually get rid of the _gch field >>> (the >>> end goal). This is much harder to see if CMSHeap inherits from >>> GenCollectedHeap (see more below). >>> >>> The second point will most likely cause some initial problems with >>> `protected` code in GenCollectedHeap. For example, as you noticed when >>> creating this patch, CMSHeap make use of a few `protected` fields and >>> methods from GenCollectedHeap, most notably: >>> - _process_strong_tasks >>> - process_roots() >>> - process_string_table_roots() >>> >>> It would be much better (IMO) to share this code via composition rather >>> than inheritance. In this particular case, I would prefer to create a >>> class StrongRootsProcessor that encapsulates the root processing logic. >>> Then GenCollectedHeap and CMSHeap can both contain an instance of >>> StrongRootsProcessor. >>> >>> What do you think of this approach? Do you have some spare cycles to >>> try >>> this approach out? 
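The composition idea suggested above (a StrongRootsProcessor owned by value instead of inherited `protected` members) can be sketched like this; all class and method names are simplified stand-ins, not the actual HotSpot declarations:

```cpp
#include <cassert>
#include <string>

// Shared root-processing logic, owned by value rather than inherited
// (illustrative stand-in for the suggested StrongRootsProcessor).
class StrongRootsProcessor {
  int _roots_processed = 0;
public:
  void process_roots() { ++_roots_processed; }
  int roots_processed() const { return _roots_processed; }
};

class CollectedHeapLike {
public:
  virtual ~CollectedHeapLike() {}
  virtual std::string name() const = 0;
  virtual void collect() = 0;
};

class GenCollectedHeapLike : public CollectedHeapLike {
protected:
  StrongRootsProcessor _root_processor;  // composition, not inheritance
public:
  std::string name() const override { return "Serial"; }
  void collect() override { _root_processor.process_roots(); }
  int roots_processed() const { return _root_processor.roots_processed(); }
};

class CMSHeapLike : public CollectedHeapLike {
  StrongRootsProcessor _root_processor;  // its own instance of the shared logic
public:
  std::string name() const override { return "CMS"; }
  void collect() override { _root_processor.process_roots(); }
  int roots_processed() const { return _root_processor.roots_processed(); }
};

// Helper: run n collections on a fresh heap of type HeapT and report
// how many root-processing passes its processor saw.
template <typename HeapT>
int roots_after(int n) {
  HeapT heap;
  for (int i = 0; i < n; i++) heap.collect();
  return heap.roots_processed();
}
```

The point of the sketch is that neither heap class needs `protected` access to the other's internals: each owns the shared logic, which is exactly what makes the remaining coupling visible.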
>>> >>> Thanks, >>> Erik >>> >>> On 06/02/2017 10:55 AM, Roman Kennke wrote: >>>> Take this patch. It #ifdef ASSERT's a call to check_gen_kinds() >>>> that is >>>> only present in debug builds. >>>> >>>> >>>> http://cr.openjdk.java.net/~rkennke/8179387/webrev.01/ >>>> >>>> >>>> Roman >>>> >>>> Am 01.06.2017 um 22:50 schrieb Roman Kennke: >>>>> What $SUBJECT says. >>>>> >>>>> I went over genCollectedHeap.[hpp|cpp] and moved everything that I >>>>> could >>>>> find that is CMS-only into a new CMSHeap class. >>>>> >>>>> http://cr.openjdk.java.net/~rkennke/8179387/webrev.00/ >>>>> >>>>> >>>>> It is possible that I overlooked something there. There may be >>>>> code in >>>>> there that doesn't shout "CMS" at me, but is still intrinsically >>>>> CMS stuff. >>>>> >>>>> Also not that I have not removed that little part: >>>>> >>>>> always_do_update_barrier = UseConcMarkSweepGC; >>>>> >>>>> because I expect it to go away with Erik ?'s big refactoring. >>>>> >>>>> What do you think? >>>>> >>>>> Testing: hotspot_gc, specjvm, some little apps with >>>>> -XX:+UseConcMarkSweepGC >>>>> >>>>> Roman >>>>> >> From stefan.johansson at oracle.com Mon Jul 3 13:12:51 2017 From: stefan.johansson at oracle.com (Stefan Johansson) Date: Mon, 3 Jul 2017 15:12:51 +0200 Subject: RFR (XS): 8179677: Let G1Remset::_conc_refined_cards only count number of cards concurrently refined In-Reply-To: <1499075917.2802.8.camel@oracle.com> References: <1499075917.2802.8.camel@oracle.com> Message-ID: <4331c480-7c20-8d39-9ea9-7418a86a878d@oracle.com> On 2017-07-03 11:58, Thomas Schatzl wrote: > Hi all, > > can I have reviews for this small change that > makes G1Remset::_conc_refined_cards only count the number of > concurrently refined cards (+ some trivial renaming of the variable)? > > The reason is that I plan to add the number of refined cards during gc > as separately soon. This has been suggested earlier in some internal > discussion, and I agree. 
> > CR: > https://bugs.openjdk.java.net/browse/JDK-8179677 > Webrev: > http://cr.openjdk.java.net/~tschatzl/8179677/webrev.1/ Looks good, StefanJ > Testing: > jprt > > Thanks, > Thomas From stefan.johansson at oracle.com Mon Jul 3 14:27:14 2017 From: stefan.johansson at oracle.com (Stefan Johansson) Date: Mon, 3 Jul 2017 16:27:14 +0200 Subject: RFR (XS): 8183397: Ensure consistent closure filtering during evacuation In-Reply-To: <1499081088.2802.29.camel@oracle.com> References: <1499081088.2802.29.camel@oracle.com> Message-ID: <80f207b6-5458-10d7-f40b-8001887adef4@oracle.com> On 2017-07-03 13:24, Thomas Schatzl wrote: > Hi all, > > can I have reviews for this change that fixes an observation that has > been made recently by Erik, i.e. that the "else" part of several > evacuation closures inconsistently filters out non-cross-region > references before checking whether the referenced object is a humongous > or ext region. > > This causes somewhat hard to diagnose performance issues, and earlier > filtering does not hurt if done anyway. > > (Note that the current way of checking in all but the UpdateRS closure > using HeapRegion::is_in_same_region() seems optimal. 
The only reason > why the other way in the UpdateRS closure is better because the code > needs the "to" HeapRegion pointer anyway) > > CR: > https://bugs.openjdk.java.net/browse/JDK-8183397 > Webrev: > http://cr.openjdk.java.net/~tschatzl/8183397/webrev/ Looks good, StefanJ > Testing: > jprt, performance regression analysis > > Thanks, > Thomas From rkennke at redhat.com Mon Jul 3 14:39:34 2017 From: rkennke at redhat.com (Roman Kennke) Date: Mon, 3 Jul 2017 16:39:34 +0200 Subject: RFR: Parallelize safepoint cleanup In-Reply-To: <72d197f7-a99b-84bc-26f7-c9a84da26ccd@oracle.com> References: <46ad874e-eb41-7927-265a-40dea92dfe1e@oracle.com> <5711258b-99b0-e06f-ba6e-0b6b55d88345@redhat.com> <0e1e2779-9316-b756-6cc8-e0c8add14a94@oracle.com> <1910961c-11bd-0e86-dd03-4fce66b9969f@redhat.com> <2b466176-b688-53a8-bef9-c7ec2c8c745b@oracle.com> <42872a15-d26c-9798-c6a2-f3f7c945baf7@redhat.com> <5e7c7d00-4acd-bea3-3525-33dbd9159efb@oracle.com> <6f2c6de7-298b-bf14-ab1f-430c4acd43c9@redhat.com> <5cd676de-872d-6d4a-691b-da561173f7d0@oracle.com> <61d80e98-275f-b2b8-4ac7-6d5d03b047de@redhat.com> <676d3b56-cee0-b68a-d700-e43695355148@redhat.com> <1fbd2b4a-9aef-d6db-726e-929b6b466e4c@oracle.com> <08391C19-4675-475C-A30D-F10B364B5AF3@redhat.com> <9a882506-282a-ec74-27de-5b22e258e352@oracle.com> <47667919-0786-56a0-ebf9-d7c1b48766c2@redhat.com> <72d197f7-a99b-84bc-26f7-c9a84da26ccd@oracle.com> Message-ID: Hi Robbin, does this require another review? I am not sure about David Holmes? If not, I'm going to need a sponsor. 
Thanks and cheers, Roman Am 29.06.2017 um 21:27 schrieb Robbin Ehn: > Hi Roman, > > Thanks, > > There seem to be a performance gain vs old just running VM thread > (again shaky numbers, but an indication): > > Old code with, MonitorUsedDeflationThreshold=0, 0.003099s, avg of 10 > worsed cleanups 0.0213s > Do it VM thread, MonitorUsedDeflationThreshold=0, 0.002782s, avg of 10 > worsed cleanups 0.0173s > > I'm assuming that combining deflation and nmethods marking in same > pass is the reason for this. > Great! > > I'm happy, looks good! > > Thanks for fixing! > > /Robbin > > On 06/29/2017 08:25 PM, Roman Kennke wrote: >> I just did a run with gcbench. >> I am running: >> >> build/linux-x86_64-normal-server-release/images/jdk/bin/java -jar >> target/benchmarks.jar roots.Sync --jvmArgs "-Xmx8g -Xms8g >> -XX:ParallelSafepointCleanupThreads=1 -XX:-UseBiasedLocking --add-opens >> java.base/jdk.internal.misc=ALL-UNNAMED -XX:+PrintSafepointStatistics" >> -p size=500000 -wi 5 -i 5 -f 1 >> >> i.e. I am giving it 500,000 monitors per thread on 8 java threads. >> >> with VMThread I am getting: >> >> vmop [ threads: total >> initially_running wait_to_block ][ time: spin block sync >> cleanup vmop ] page_trap_count >> 0,646: G1IncCollectionPause [ >> 19 4 6 ][ 0 0 0 >> 158 225 ] 4 >> 1,073: G1IncCollectionPause [ >> 19 5 6 ][ 1 0 1 >> 159 174 ] 5 >> 1,961: G1IncCollectionPause [ >> 19 2 6 ][ 0 0 0 >> 130 66 ] 2 >> 2,202: G1IncCollectionPause [ >> 19 5 6 ][ 1 0 1 >> 127 70 ] 5 >> 2,445: G1IncCollectionPause [ >> 19 7 7 ][ 1 0 1 >> 127 66 ] 7 >> 2,684: G1IncCollectionPause [ >> 19 7 7 ][ 1 0 1 >> 127 66 ] 7 >> 3,371: G1IncCollectionPause [ >> 19 5 7 ][ 1 0 1 >> 127 74 ] 5 >> 3,619: G1IncCollectionPause [ >> 19 5 6 ][ 1 0 1 >> 127 66 ] 5 >> 3,857: G1IncCollectionPause [ >> 19 6 6 ][ 1 0 1 >> 126 68 ] 6 >> >> I.e. it gets to fairly consistent >120us for cleanup. 
>> >> With 4 safepoint cleanup threads I get: >> >> >> vmop [ threads: total >> initially_running wait_to_block ][ time: spin block sync >> cleanup vmop ] page_trap_count >> 0,650: G1IncCollectionPause [ >> 19 4 6 ][ 0 0 0 >> 63 197 ] 4 >> 0,951: G1IncCollectionPause [ >> 19 0 1 ][ 0 0 0 >> 64 151 ] 0 >> 1,214: G1IncCollectionPause [ >> 19 7 8 ][ 0 0 0 >> 62 93 ] 6 >> 1,942: G1IncCollectionPause [ >> 19 4 6 ][ 1 0 1 >> 59 71 ] 4 >> 2,118: G1IncCollectionPause [ >> 19 6 6 ][ 1 0 1 >> 59 72 ] 6 >> 2,296: G1IncCollectionPause [ >> 19 5 6 ][ 0 0 0 >> 59 69 ] 5 >> >> i.e. fairly consistently around 60 us (I think it's us?!) >> >> I grant you that I'm throwing way way more monitors at it. With just >> 12000 monitors per thread I get columns of 0s under cleanup. :-) >> >> Roman >> >> Here's with 1 tAm 29.06.2017 um 14:17 schrieb Robbin Ehn: >>> The test is using 24 threads (whatever that means), total number of >>> javathreads is 57 (including compiler, etc...). >>> >>> [29.186s][error][os ] Num threads:57 >>> [29.186s][error][os ] omInUseCount:0 >>> [29.186s][error][os ] omInUseCount:2064 >>> [29.187s][error][os ] omInUseCount:1861 >>> [29.188s][error][os ] omInUseCount:1058 >>> [29.188s][error][os ] omInUseCount:2 >>> [29.188s][error][os ] omInUseCount:577 >>> [29.189s][error][os ] omInUseCount:1443 >>> [29.189s][error][os ] omInUseCount:122 >>> [29.189s][error][os ] omInUseCount:47 >>> [29.189s][error][os ] omInUseCount:497 >>> [29.189s][error][os ] omInUseCount:16 >>> [29.189s][error][os ] omInUseCount:113 >>> [29.189s][error][os ] omInUseCount:5 >>> [29.189s][error][os ] omInUseCount:678 >>> [29.190s][error][os ] omInUseCount:105 >>> [29.190s][error][os ] omInUseCount:609 >>> [29.190s][error][os ] omInUseCount:286 >>> [29.190s][error][os ] omInUseCount:228 >>> [29.190s][error][os ] omInUseCount:1391 >>> [29.191s][error][os ] omInUseCount:1652 >>> [29.191s][error][os ] omInUseCount:325 >>> [29.191s][error][os ] omInUseCount:439 >>> [29.192s][error][os ] 
omInUseCount:994 >>> [29.192s][error][os ] omInUseCount:103 >>> [29.192s][error][os ] omInUseCount:2337 >>> [29.193s][error][os ] omInUseCount:0 >>> [29.193s][error][os ] omInUseCount:0 >>> [29.193s][error][os ] omInUseCount:0 >>> [29.193s][error][os ] omInUseCount:1 >>> [29.193s][error][os ] omInUseCount:1 >>> [29.193s][error][os ] omInUseCount:0 >>> [29.193s][error][os ] omInUseCount:1 >>> [29.193s][error][os ] omInUseCount:2 >>> [29.193s][error][os ] omInUseCount:1 >>> [29.193s][error][os ] omInUseCount:1 >>> [29.193s][error][os ] omInUseCount:0 >>> [29.193s][error][os ] omInUseCount:1 >>> [29.193s][error][os ] omInUseCount:0 >>> [29.193s][error][os ] omInUseCount:0 >>> [29.193s][error][os ] omInUseCount:0 >>> [29.193s][error][os ] omInUseCount:0 >>> [29.193s][error][os ] omInUseCount:0 >>> [29.193s][error][os ] omInUseCount:0 >>> [29.193s][error][os ] omInUseCount:0 >>> [29.193s][error][os ] omInUseCount:0 >>> [29.193s][error][os ] omInUseCount:0 >>> [29.193s][error][os ] omInUseCount:0 >>> [29.193s][error][os ] omInUseCount:0 >>> [29.193s][error][os ] omInUseCount:0 >>> [29.193s][error][os ] omInUseCount:0 >>> [29.193s][error][os ] omInUseCount:0 >>> [29.193s][error][os ] omInUseCount:0 >>> [29.193s][error][os ] omInUseCount:0 >>> [29.193s][error][os ] omInUseCount:0 >>> [29.193s][error][os ] omInUseCount:1 >>> [29.193s][error][os ] omInUseCount:0 >>> [29.193s][error][os ] omInUseCount:0 >>> >>> So in my setup even if you parallel the per thread in use monitors >>> work the synchronization overhead is still larger. >>> >>> /Robbin >>> >>> On 06/29/2017 01:42 PM, Roman Kennke wrote: >>>> How many Java threads are involved in monitor Inflation ? >>>> Parallelization is spread by Java threads (i.e. each worker claims >>>> and deflates monitors of 1 java thread per step). >>>> >>>> Roman >>>> >>>> Am 29. 
Juni 2017 12:49:58 MESZ schrieb Robbin Ehn >>>> : >>>> >>>> Hi Roman, >>>> >>>> I haven't had the time to test all scenarios, and the numbers are >>>> just an indication: >>>> >>>> Do it VM thread, MonitorUsedDeflationThreshold=0, 0.002782s avg, >>>> avg of 10 worsed cleanups 0.0173s >>>> Do it 4 workers, MonitorUsedDeflationThreshold=0, 0.002923s avg, >>>> avg of 10 worsed cleanups 0.0199s >>>> Do it VM thread, MonitorUsedDeflationThreshold=1, 0.001889s avg, >>>> avg of 10 worsed cleanups 0.0066s >>>> >>>> When MonitorUsedDeflationThreshold=0 we are talking about 120000 >>>> free monitors to deflate. >>>> And I get worse numbers doing the cleanup in 4 threads. >>>> >>>> Any idea why I see these numbers? >>>> >>>> Thanks, Robbin >>>> >>>> On 06/28/2017 10:23 PM, Roman Kennke wrote: >>>> >>>> >>>> >>>> On 06/27/2017 09:47 PM, Roman Kennke wrote: >>>> >>>> Hi Robbin, >>>> >>>> Ugh. Thanks for catching this. >>>> Problem was that I was accounting the thread-local >>>> deflations twice: >>>> once in thread-local processing (basically a leftover >>>> from my earlier >>>> attempt to implement this accounting) and then >>>> again in >>>> finish_deflate_idle_monitors(). Should be fixed here: >>>> >>>> >>>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.09/ >>>> >>>> >>>> >>>> >>>> Nit: >>>> safepoint.cpp : ParallelSPCleanupTask >>>> "const char* name = " is not needed and 1 is unused >>>> >>>> >>>> Sorry, I don't understand what you mean by this. I see code >>>> like this: >>>> >>>> const char* name = "deflating idle monitors"; >>>> >>>> and it is used a few lines below, even 2x. >>>> >>>> What's '1 is unused' ? >>>> >>>> >>>> Side question: which jtreg targets do you usually >>>> run? >>>> >>>> >>>> Right now I cherry pick directories from: hotspot/test/ >>>> >>>> I'm going to add a decent test group for local testing. >>>> >>>> That would be good! 
>>>> >>>> >>>> >>>> >>>> Trying: make test TEST=hotspot_all >>>> gives me *lots* of failures due to missing jcstress >>>> stuff (?!) >>>> And even other subsets seem to depend on several bits >>>> and pieces >>>> that I >>>> have no idea about. >>>> >>>> >>>> Yes, you need to use internal tool 'jib' java integrate >>>> build to get >>>> that work or you can set some environment where the >>>> jcstress >>>> application stuff is... >>>> >>>> Uhhh. We really do want a subset of tests that we can run >>>> reliably and >>>> that are self-contained, how else are people (without that >>>> jib thingy) >>>> supposed to do some sanity checking with their patches? ;-) >>>> >>>> I have a regression on ClassLoaderData root scanning, >>>> this should not >>>> be related, >>>> but I only have 3 patches which could cause this, if it's >>>> not >>>> something in the environment that have changed. >>>> >>>> Let me know if it's my patch :-) >>>> >>>> >>>> Also do not see any immediate performance gains (off vs 4 >>>> threads), it >>>> might be >>>> http://hg.openjdk.java.net/jdk10/hs/hotspot/rev/06994badeb24 >>>> , but I need to-do some more testing. I know you often >>>> run with none >>>> default GSI. >>>> >>>> >>>> First of all, during the course of this review I reduced the >>>> change from >>>> an actual implementation to a kind of framework, and it needs >>>> some >>>> separate changes in the GC to make use of it. Not sure if you >>>> added >>>> corresponding code in (e.g.) G1? >>>> >>>> Also, this is only really visible in code that makes >>>> excessive use of >>>> monitors, i.e. the one linked by Carsten's original patch, or >>>> the test >>>> org.openjdk.gcbench.roots.Synchronizers.test in gc-bench: >>>> >>>> http://icedtea.classpath.org/hg/gc-bench/ >>>> >>>> There are also some popular real-world apps that tend to do >>>> this. From >>>> the top off my head, Cassandra is such an application. >>>> >>>> Thanks, Roman >>>> >>>> >>>> I'll get back to you. 
>>>> >>>> Thanks, Robbin >>>> >>>> >>>> Roman >>>> >>>> Am 27.06.2017 um 16:51 schrieb Robbin Ehn: >>>> >>>> Hi Roman, >>>> >>>> There is something wrong in calculations: >>>> INFO: Deflate: InCirc=43 InUse=18 Scavenged=25 >>>> ForceMonitorScavenge=0 >>>> : pop=27051 free=215487 >>>> >>>> free is larger than population, have not had the >>>> time to dig into this. >>>> >>>> Thanks, Robbin >>>> >>>> On 06/22/2017 10:19 PM, Roman Kennke wrote: >>>> >>>> So here's the latest iteration of that patch: >>>> >>>> >>>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.08/ >>>> >>>> >>>> >>>> I checked and fixed all the counters. The >>>> problem here is that they >>>> are >>>> not updated in a single place >>>> (deflate_idle_monitors() ) but in >>>> several >>>> places, potentially by multiple threads. I >>>> split up deflation into >>>> prepare_.. and a finish_.. methods to >>>> initialize local and update >>>> global >>>> counters respectively, and pass around a >>>> counters object (allocated on >>>> stack) to the various code paths that use it. >>>> Updating the counters >>>> always happen under a lock, there's no need >>>> to do anything special >>>> with >>>> regards to concurrency. >>>> >>>> I also checked the nmethod marking, but there >>>> doesn't seem to be >>>> anything in that code that looks problematic >>>> under concurrency. The >>>> worst that can happen is that two threads >>>> write the same value into an >>>> nmethod field. I think we can live with >>>> that ;-) >>>> >>>> Good to go? >>>> >>>> Tested by running specjvm and jcstress >>>> fastdebug+release without >>>> issues. >>>> >>>> Roman >>>> >>>> Am 02.06.2017 um 12:39 schrieb Robbin Ehn: >>>> >>>> Hi Roman, >>>> >>>> On 06/02/2017 11:41 AM, Roman Kennke >>>> wrote: >>>> >>>> Hi David, >>>> thanks for reviewing. I'll be on >>>> vacation the next two weeks too, >>>> with >>>> only sporadic access to work stuff. 
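The counter handling described in the quoted message above (deflation split into prepare/finish steps, with a stack-allocated counters object published once under a lock) can be sketched as follows; the names and the "every third monitor is idle" rule are invented for illustration, not taken from the actual patch:

```cpp
#include <cassert>
#include <mutex>

// Global totals, only updated while holding g_counter_lock
// (illustrative stand-in for the real deflation statistics).
std::mutex g_counter_lock;
long g_total_scavenged = 0;

// Stack-allocated per-worker counters, as in the prepare_/finish_ split:
// workers accumulate locally with no synchronization, then publish once.
struct DeflateCounters {
  long scavenged = 0;
};

void deflate_for_thread(int monitors_to_scan, DeflateCounters* counters) {
  for (int i = 0; i < monitors_to_scan; i++) {
    if (i % 3 == 0) counters->scavenged++;  // pretend every third monitor is idle
  }
}

void finish_deflate(DeflateCounters* counters) {
  std::lock_guard<std::mutex> guard(g_counter_lock);  // single publish point
  g_total_scavenged += counters->scavenged;
}

long run_deflation(int num_workers, int monitors_per_thread) {
  // In the real patch each iteration runs on a separate safepoint cleanup
  // worker; done sequentially here to keep the sketch self-contained.
  for (int w = 0; w < num_workers; w++) {
    DeflateCounters counters;  // on the worker's stack
    deflate_for_thread(monitors_per_thread, &counters);
    finish_deflate(&counters);
  }
  return g_total_scavenged;
}
```

Because each worker touches only its own stack object until the final locked merge, the totals cannot be double-counted the way the earlier webrev accidentally did.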
>>>> Yes, exposure will not be as good as >>>> otherwise, but it's not totally >>>> untested either: the serial code path >>>> is the same as the >>>> parallel, the >>>> only difference is that it's not >>>> actually called by multiple >>>> threads. >>>> It's ok I think. >>>> >>>> I found two more issues that I think >>>> should be addressed: >>>> - There are some counters in >>>> deflate_idle_monitors() and I'm not >>>> sure I >>>> correctly handle them in the split-up >>>> and MT'ed thread-local/ global >>>> list deflation >>>> - nmethod marking seems to >>>> unconditionally poke true or something >>>> like >>>> that in nmethod fields. This doesn't >>>> hurt correctness-wise, but it's >>>> probably worth checking if it's >>>> already true, especially when doing >>>> this >>>> with multiple threads concurrently. >>>> >>>> I'll send an updated patch around >>>> later, I hope I can get to it >>>> today... >>>> >>>> >>>> I'll review that when you get it out. >>>> I think this looks as a reasonable step >>>> before we tackle this with a >>>> major effort, such as the JEP you and >>>> Carsten are doing. >>>> And another effort to 'fix' nmethods >>>> marking. >>>> >>>> Internal discussion yesterday led us to >>>> conclude that the runtime >>>> will probably need more threads. >>>> This would be a good driver to do a >>>> 'global' worker pool which serves >>>> both gc, runtime and safepoints with >>>> threads. >>>> >>>> >>>> Roman >>>> >>>> Hi Roman, >>>> >>>> I am about to disappear on an >>>> extended vacation so will let others >>>> pursue this. IIUC this is no longer >>>> an opt-in by the user at runtime, >>>> but >>>> an opt-in by the particular GC >>>> developers. Okay. My only concern >>>> with >>>> that is if Shenandoah is the only >>>> GC that currently opts in then >>>> this >>>> code is not going to get much >>>> testing and will be more prone to >>>> incidental breakage. >>>> >>>> >>>> As I mentioned before, it seems like Erik >>>> Ö
have some idea, maybe he >>>> can do this after his barrier patch. >>>> >>>> Thanks! >>>> >>>> /Robbin >>>> >>>> >>>> Cheers, >>>> David >>>> >>>> On 2/06/2017 2:21 AM, Roman >>>> Kennke wrote: >>>> >>>> Am 01.06.2017 um 17:50 >>>> schrieb Roman Kennke: >>>> >>>> Am 01.06.2017 um 14:18 >>>> schrieb Robbin Ehn: >>>> >>>> Hi Roman, >>>> >>>> On 06/01/2017 11:29 >>>> AM, Roman Kennke wrote: >>>> >>>> Am 31.05.2017 um >>>> 22:06 schrieb Robbin Ehn: >>>> >>>> Hi Roman, I >>>> agree that is really needed but: >>>> >>>> On 05/31/2017 >>>> 10:27 AM, Roman Kennke wrote: >>>> >>>> I >>>> realized that sharing workers with GC is not so easy. >>>> >>>> We need >>>> to be able to use the workers at a safepoint during >>>> >>>> concurrent >>>> GC work >>>> (which also uses the same workers). This does not >>>> only >>>> require >>>> that >>>> those workers be suspended, like e.g. >>>> >>>> SuspendibleThreadSet::yield(), but they need to be idle, i.e. >>>> have >>>> finished >>>> their tasks. This needs some careful handling to >>>> work >>>> without >>>> races: it >>>> requires a SuspendibleThreadSetJoiner around the >>>> >>>> corresponding >>>> >>>> run_task() call and also the tasks themselves need to join >>>> the >>>> STS and >>>> handle >>>> requests for safepoints not by yielding, but by >>>> leaving >>>> the >>>> task. >>>> This is >>>> far too peculiar for me to make the call to hook >>>> up GC >>>> workers >>>> for >>>> safepoint cleanup, and I thus removed those parts. I >>>> left the >>>> API in >>>> >>>> CollectedHeap in place. I think GC devs who know better >>>> about G1 >>>> and CMS >>>> should >>>> make that call, or else just use a separate thread >>>> pool. >>>> >>>> >>>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.05/ >>>> >>>> >>>> >>>> Is it ok >>>> now? 
>>>> >>>> I still think >>>> you should put the "Parallel Safepoint Cleanup" >>>> workers >>>> inside >>>> Shenandoah, >>>> so the >>>> SafepointSynchronizer only calls get_safepoint_workers, >>>> e.g.: >>>> >>>> >>>> _cleanup_workers = heap->get_safepoint_workers(); >>>> >>>> _num_cleanup_workers = _cleanup_workers != NULL ? >>>> >>>> _cleanup_workers->total_workers() : 1; >>>> >>>> ParallelSPCleanupTask cleanup(_cleanup_subtasks); >>>> >>>> StrongRootsScope srs(_num_cleanup_workers); >>>> if >>>> (_cleanup_workers != NULL) { >>>> >>>> _cleanup_workers->run_task(&cleanup, >>>> >>>> _num_cleanup_workers); >>>> } else { >>>> cleanup.work >>>> (0); >>>> } >>>> >>>> That way you >>>> don't even need your new flags, but it will be >>>> up to >>>> the >>>> other GCs to >>>> make their worker available >>>> or cheat with >>>> a separate workgang. >>>> >>>> I can do that, I >>>> don't mind. The question is, do we want that? >>>> >>>> The problem is that >>>> we do not want to haste such decision, we >>>> believe >>>> there is a better >>>> solution. >>>> I think you also >>>> would want another solution. >>>> But it's seems like >>>> such solution with 1 'global' thread pool >>>> either >>>> own by GC or the VM >>>> it self is quite the undertaking. >>>> Since this probably >>>> will not be done any time soon my >>>> suggestion is, >>>> to not hold you back >>>> (we also want this), just to make >>>> the code parallel and >>>> as an intermediate step ask the GC if it >>>> minds >>>> sharing it's thread. >>>> >>>> Now when Shenandoah >>>> is merged it's possible that e.g. G1 will >>>> share >>>> the code for a >>>> separate thread pool, do something of it's own or >>>> wait until the bigger >>>> question about thread pool(s) have been >>>> resolved. >>>> >>>> By adding a thread >>>> pool directly to the SafepointSynchronizer >>>> and >>>> flags for it we might >>>> limit our future options. >>>> >>>> I wouldn't call >>>> it 'cheating with a separate workgang' >>>> though. 
I >>>> see >>>> that both G1 and >>>> CMS suspend their worker threads at a >>>> safepoint. >>>> However: >>>> >>>> Yes it's not cheating >>>> but I want decent heuristics between e.g. >>>> number >>>> of concurrent marking >>>> threads and parallel safepoint threads >>>> since >>>> they compete for cpu >>>> time. >>>> As the code looks >>>> now, I think that decisions must be made by >>>> the >>>> GC. >>>> >>>> Ok, I see your point. I >>>> updated the proposed patch accordingly: >>>> >>>> >>>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.06/ >>>> >>>> >>>> >>>> Oops. Minor mistake there. >>>> Correction: >>>> >>>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.07/ >>>> >>>> >>>> >>>> (Removed 'class WorkGang' >>>> from safepoint.hpp, and forgot to add it >>>> into >>>> collectedHeap.hpp, resulting >>>> in build failure...) >>>> >>>> Roman >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> -- >>>> This message was sent from my Android device with K-9 Mail. >> >> From thomas.schatzl at oracle.com Mon Jul 3 15:04:42 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Mon, 03 Jul 2017 17:04:42 +0200 Subject: RFR (XS): 8179677: Let G1Remset::_conc_refined_cards only count number of cards concurrently refined In-Reply-To: <1499075917.2802.8.camel@oracle.com> References: <1499075917.2802.8.camel@oracle.com> Message-ID: <1499094282.2802.132.camel@oracle.com> Hi all, Erik asked for a few renamings and some additional comments. Here are the new webrevs: http://cr.openjdk.java.net/~tschatzl/8179677/webrev.1_to_2 (diff) http://cr.openjdk.java.net/~tschatzl/8179677/webrev.2 (full) Thanks, Thomas On Mon, 2017-07-03 at 11:58 +0200, Thomas Schatzl wrote: > Hi all, > > can I have reviews for this small change that > makes G1Remset::_conc_refined_cards only count the number of > concurrently refined cards (+ some trivial renaming of the variable)? > > The reason is that I plan to add the number of refined cards during > gc > as separately soon.
This has been suggested earlier in some internal > discussion, and I agree. > > CR: > https://bugs.openjdk.java.net/browse/JDK-8179677 > Webrev: > http://cr.openjdk.java.net/~tschatzl/8179677/webrev.1/ > Testing: > jprt > > Thanks, > Thomas From rkennke at redhat.com Mon Jul 3 15:05:29 2017 From: rkennke at redhat.com (Roman Kennke) Date: Mon, 3 Jul 2017 17:05:29 +0200 Subject: RFR: 8179387: Factor out CMS specific code from GenCollectedHeap into its own subclass In-Reply-To: <9a846161-c8ac-dedf-5952-f457d546fd9a@redhat.com> References: <3521009f-6fab-4f8e-2375-b9d665a4c70b@redhat.com> <3d8b55a2-a787-3051-b351-ab9b0a24f5e0@redhat.com> <47e22e86-7d7c-606f-1936-346229f39ca2@oracle.com> <9a846161-c8ac-dedf-5952-f457d546fd9a@redhat.com> Message-ID: <4d5e6af8-d975-7803-64c5-7295e0d56154@redhat.com> Am 03.07.2017 um 11:13 schrieb Roman Kennke: > Am 03.07.2017 um 09:35 schrieb Mikael Gerdin: >> Hi Roman, >> >> On 2017-06-30 18:32, Roman Kennke wrote: >>> I came across one problem using this approach: We will have 2 instances >>> of CollectedHeap around, where there's usually only 1, and some code >>> expects only 1. For example, in CollectedHeap constructor, we create new >>> PerfData variables, and we now create them 2x, which leads to an assert >>> being thrown. I suspect there is more code like that. >>> >>> I will attempt to refactor this a little more, maybe it's not that bad, >>> but it's probably not worth spending too much time on it. >> I think refactoring the code to not expect a singleton CollectedHeap >> instance is a bit too much. >> Perhaps there is another way to share common code between Serial and >> CMS but that might require a bit more thought. > Yeah, definitely. I hit another difficulty: pretty much the same issues > that I'm having with GenCollectedHeap/CMSHeap/CollectedHeap now show up > with Generation and its subclasses.. > > How about we push the original patch that I've posted, and work from > there?
In fact, I *have* found some little things I would change (some > more if (UseConcMarkSweepGC) branches in GenCollectedHeap that I have > overlooked in my first pass...) So here's the little change (two more places in genCollectedHeap.hpp where UseConcMarkSweepGC was used to alter behaviour): http://cr.openjdk.java.net/~rkennke/8179387/webrev.02/ Ok to push this? Roman From thomas.schatzl at oracle.com Mon Jul 3 15:22:01 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Mon, 03 Jul 2017 17:22:01 +0200 Subject: RFR (XXS): 8183394: Remove unused G1RemSet::n_workers() method In-Reply-To: <55027601-074b-b92a-7516-b08282291b70@oracle.com> References: <1499075612.2802.5.camel@oracle.com> <55027601-074b-b92a-7516-b08282291b70@oracle.com> Message-ID: <1499095321.2802.134.camel@oracle.com> Thanks Erik, Mikael for your reviews! Thomas On Mon, 2017-07-03 at 14:44 +0200, Erik Helin wrote: > On 07/03/2017 11:53 AM, Thomas Schatzl wrote: > > > > Hi all, > > > > can I have a review for this trivial removal of an unused method? > > One > > Reviewer should be sufficient for this ;) > > > > CR: > > https://bugs.openjdk.java.net/browse/JDK-8183394 > > Webrev: > > http://cr.openjdk.java.net/~tschatzl/8183394/webrev/ > Looks good, Reviewed. Thanks for cleaning this up! > Erik > > > > > Testing: > > Local compilation > > > > Thanks, > > Thomas > > From erik.helin at oracle.com Mon Jul 3 15:41:44 2017 From: erik.helin at oracle.com (Erik Helin) Date: Mon, 3 Jul 2017 17:41:44 +0200 Subject: RFR (XS): 8179677: Let G1Remset::_conc_refined_cards only count number of cards concurrently refined In-Reply-To: <1499094282.2802.132.camel@oracle.com> References: <1499075917.2802.8.camel@oracle.com> <1499094282.2802.132.camel@oracle.com> Message-ID: <47787650-1ade-d89c-29a8-3b8b6e4e8bd0@oracle.com> On 07/03/2017 05:04 PM, Thomas Schatzl wrote: > Hi all, > > Erik asked for a few renamings and some additional comments.
Here are > the new webrevs: > http://cr.openjdk.java.net/~tschatzl/8179677/webrev.1_to_2 (diff) > http://cr.openjdk.java.net/~tschatzl/8179677/webrev.2 (full) Looks good, Reviewed. Thanks Thomas! Erik > Thanks, > Thomas > > On Mon, 2017-07-03 at 11:58 +0200, Thomas Schatzl wrote: >> Hi all, >> >> can I have reviews for this small change that >> makes G1Remset::_conc_refined_cards only count the number of >> concurrently refined cards (+ some trivial renaming of the variable)? >> >> The reason is that I plan to add the number of refined cards during >> gc >> as separately soon. This has been suggested earlier in some internal >> discussion, and I agree. >> >> CR: >> https://bugs.openjdk.java.net/browse/JDK-8179677 >> Webrev: >> http://cr.openjdk.java.net/~tschatzl/8179677/webrev.1/ >> Testing: >> jprt >> >> Thanks, >> Thomas From robbin.ehn at oracle.com Tue Jul 4 07:11:50 2017 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Tue, 4 Jul 2017 09:11:50 +0200 Subject: RFR: Parallelize safepoint cleanup In-Reply-To: References: <5711258b-99b0-e06f-ba6e-0b6b55d88345@redhat.com> <0e1e2779-9316-b756-6cc8-e0c8add14a94@oracle.com> <1910961c-11bd-0e86-dd03-4fce66b9969f@redhat.com> <2b466176-b688-53a8-bef9-c7ec2c8c745b@oracle.com> <42872a15-d26c-9798-c6a2-f3f7c945baf7@redhat.com> <5e7c7d00-4acd-bea3-3525-33dbd9159efb@oracle.com> <6f2c6de7-298b-bf14-ab1f-430c4acd43c9@redhat.com> <5cd676de-872d-6d4a-691b-da561173f7d0@oracle.com> <61d80e98-275f-b2b8-4ac7-6d5d03b047de@redhat.com> <676d3b56-cee0-b68a-d700-e43695355148@redhat.com> <1fbd2b4a-9aef-d6db-726e-929b6b466e4c@oracle.com> <08391C19-4675-475C-A30D-F10B364B5AF3@redhat.com> <9a882506-282a-ec74-27de-5b22e258e352@oracle.com> <47667919-0786-56a0-ebf9-d7c1b48766c2@redhat.com> <72d197f7-a99b-84bc-26f7-c9a84da26ccd@oracle.com> Message-ID: Hi Roman, On 07/03/2017 04:39 PM, Roman Kennke wrote: > Hi Robbin, > > does this require another review? I am not sure about David Holmes? 
David is back in Aug, I think he was pretty okay with it, but I think we should get another review. Most of our people had an extended weekend and are back tomorrow. I'm soon off for 5 weeks and I really want this to be pushed before that. > > If not, I'm going to need a sponsor. I will of course take care of that! /Robbin > > Thanks and cheers, > Roman > > Am 29.06.2017 um 21:27 schrieb Robbin Ehn: >> Hi Roman, >> >> Thanks, >> >> There seem to be a performance gain vs old just running VM thread >> (again shaky numbers, but an indication): >> >> Old code with, MonitorUsedDeflationThreshold=0, 0.003099s, avg of 10 >> worsed cleanups 0.0213s >> Do it VM thread, MonitorUsedDeflationThreshold=0, 0.002782s, avg of 10 >> worsed cleanups 0.0173s >> >> I'm assuming that combining deflation and nmethods marking in same >> pass is the reason for this. >> Great! >> >> I'm happy, looks good! >> >> Thanks for fixing! >> >> /Robbin >> >> On 06/29/2017 08:25 PM, Roman Kennke wrote: >>> I just did a run with gcbench. >>> I am running: >>> >>> build/linux-x86_64-normal-server-release/images/jdk/bin/java -jar >>> target/benchmarks.jar roots.Sync --jvmArgs "-Xmx8g -Xms8g >>> -XX:ParallelSafepointCleanupThreads=1 -XX:-UseBiasedLocking --add-opens >>> java.base/jdk.internal.misc=ALL-UNNAMED -XX:+PrintSafepointStatistics" >>> -p size=500000 -wi 5 -i 5 -f 1 >>> >>> i.e. I am giving it 500,000 monitors per thread on 8 java threads.
>>> >>> with VMThread I am getting: >>> >>> vmop [ threads: total >>> initially_running wait_to_block ][ time: spin block sync >>> cleanup vmop ] page_trap_count >>> 0,646: G1IncCollectionPause [ >>> 19 4 6 ][ 0 0 0 >>> 158 225 ] 4 >>> 1,073: G1IncCollectionPause [ >>> 19 5 6 ][ 1 0 1 >>> 159 174 ] 5 >>> 1,961: G1IncCollectionPause [ >>> 19 2 6 ][ 0 0 0 >>> 130 66 ] 2 >>> 2,202: G1IncCollectionPause [ >>> 19 5 6 ][ 1 0 1 >>> 127 70 ] 5 >>> 2,445: G1IncCollectionPause [ >>> 19 7 7 ][ 1 0 1 >>> 127 66 ] 7 >>> 2,684: G1IncCollectionPause [ >>> 19 7 7 ][ 1 0 1 >>> 127 66 ] 7 >>> 3,371: G1IncCollectionPause [ >>> 19 5 7 ][ 1 0 1 >>> 127 74 ] 5 >>> 3,619: G1IncCollectionPause [ >>> 19 5 6 ][ 1 0 1 >>> 127 66 ] 5 >>> 3,857: G1IncCollectionPause [ >>> 19 6 6 ][ 1 0 1 >>> 126 68 ] 6 >>> >>> I.e. it gets to fairly consistent >120us for cleanup. >>> >>> With 4 safepoint cleanup threads I get: >>> >>> >>> vmop [ threads: total >>> initially_running wait_to_block ][ time: spin block sync >>> cleanup vmop ] page_trap_count >>> 0,650: G1IncCollectionPause [ >>> 19 4 6 ][ 0 0 0 >>> 63 197 ] 4 >>> 0,951: G1IncCollectionPause [ >>> 19 0 1 ][ 0 0 0 >>> 64 151 ] 0 >>> 1,214: G1IncCollectionPause [ >>> 19 7 8 ][ 0 0 0 >>> 62 93 ] 6 >>> 1,942: G1IncCollectionPause [ >>> 19 4 6 ][ 1 0 1 >>> 59 71 ] 4 >>> 2,118: G1IncCollectionPause [ >>> 19 6 6 ][ 1 0 1 >>> 59 72 ] 6 >>> 2,296: G1IncCollectionPause [ >>> 19 5 6 ][ 0 0 0 >>> 59 69 ] 5 >>> >>> i.e. fairly consistently around 60 us (I think it's us?!) >>> >>> I grant you that I'm throwing way way more monitors at it. With just >>> 12000 monitors per thread I get columns of 0s under cleanup. :-) >>> >>> Roman >>> >>> Here's with 1 tAm 29.06.2017 um 14:17 schrieb Robbin Ehn: >>>> The test is using 24 threads (whatever that means), total number of >>>> javathreads is 57 (including compiler, etc...). 
>>>> >>>> [29.186s][error][os ] Num threads:57 >>>> [29.186s][error][os ] omInUseCount:0 >>>> [29.186s][error][os ] omInUseCount:2064 >>>> [29.187s][error][os ] omInUseCount:1861 >>>> [29.188s][error][os ] omInUseCount:1058 >>>> [29.188s][error][os ] omInUseCount:2 >>>> [29.188s][error][os ] omInUseCount:577 >>>> [29.189s][error][os ] omInUseCount:1443 >>>> [29.189s][error][os ] omInUseCount:122 >>>> [29.189s][error][os ] omInUseCount:47 >>>> [29.189s][error][os ] omInUseCount:497 >>>> [29.189s][error][os ] omInUseCount:16 >>>> [29.189s][error][os ] omInUseCount:113 >>>> [29.189s][error][os ] omInUseCount:5 >>>> [29.189s][error][os ] omInUseCount:678 >>>> [29.190s][error][os ] omInUseCount:105 >>>> [29.190s][error][os ] omInUseCount:609 >>>> [29.190s][error][os ] omInUseCount:286 >>>> [29.190s][error][os ] omInUseCount:228 >>>> [29.190s][error][os ] omInUseCount:1391 >>>> [29.191s][error][os ] omInUseCount:1652 >>>> [29.191s][error][os ] omInUseCount:325 >>>> [29.191s][error][os ] omInUseCount:439 >>>> [29.192s][error][os ] omInUseCount:994 >>>> [29.192s][error][os ] omInUseCount:103 >>>> [29.192s][error][os ] omInUseCount:2337 >>>> [29.193s][error][os ] omInUseCount:0 >>>> [29.193s][error][os ] omInUseCount:0 >>>> [29.193s][error][os ] omInUseCount:0 >>>> [29.193s][error][os ] omInUseCount:1 >>>> [29.193s][error][os ] omInUseCount:1 >>>> [29.193s][error][os ] omInUseCount:0 >>>> [29.193s][error][os ] omInUseCount:1 >>>> [29.193s][error][os ] omInUseCount:2 >>>> [29.193s][error][os ] omInUseCount:1 >>>> [29.193s][error][os ] omInUseCount:1 >>>> [29.193s][error][os ] omInUseCount:0 >>>> [29.193s][error][os ] omInUseCount:1 >>>> [29.193s][error][os ] omInUseCount:0 >>>> [29.193s][error][os ] omInUseCount:0 >>>> [29.193s][error][os ] omInUseCount:0 >>>> [29.193s][error][os ] omInUseCount:0 >>>> [29.193s][error][os ] omInUseCount:0 >>>> [29.193s][error][os ] omInUseCount:0 >>>> [29.193s][error][os ] omInUseCount:0 >>>> [29.193s][error][os ] omInUseCount:0 >>>> 
[29.193s][error][os ] omInUseCount:0 >>>> [29.193s][error][os ] omInUseCount:0 >>>> [29.193s][error][os ] omInUseCount:0 >>>> [29.193s][error][os ] omInUseCount:0 >>>> [29.193s][error][os ] omInUseCount:0 >>>> [29.193s][error][os ] omInUseCount:0 >>>> [29.193s][error][os ] omInUseCount:0 >>>> [29.193s][error][os ] omInUseCount:0 >>>> [29.193s][error][os ] omInUseCount:0 >>>> [29.193s][error][os ] omInUseCount:1 >>>> [29.193s][error][os ] omInUseCount:0 >>>> [29.193s][error][os ] omInUseCount:0 >>>> >>>> So in my setup even if you parallel the per thread in use monitors >>>> work the synchronization overhead is still larger. >>>> >>>> /Robbin >>>> >>>> On 06/29/2017 01:42 PM, Roman Kennke wrote: >>>>> How many Java threads are involved in monitor Inflation ? >>>>> Parallelization is spread by Java threads (i.e. each worker claims >>>>> and deflates monitors of 1 java thread per step). >>>>> >>>>> Roman >>>>> >>>>> Am 29. Juni 2017 12:49:58 MESZ schrieb Robbin Ehn >>>>> : >>>>> >>>>> Hi Roman, >>>>> >>>>> I haven't had the time to test all scenarios, and the numbers are >>>>> just an indication: >>>>> >>>>> Do it VM thread, MonitorUsedDeflationThreshold=0, 0.002782s avg, >>>>> avg of 10 worsed cleanups 0.0173s >>>>> Do it 4 workers, MonitorUsedDeflationThreshold=0, 0.002923s avg, >>>>> avg of 10 worsed cleanups 0.0199s >>>>> Do it VM thread, MonitorUsedDeflationThreshold=1, 0.001889s avg, >>>>> avg of 10 worsed cleanups 0.0066s >>>>> >>>>> When MonitorUsedDeflationThreshold=0 we are talking about 120000 >>>>> free monitors to deflate. >>>>> And I get worse numbers doing the cleanup in 4 threads. >>>>> >>>>> Any idea why I see these numbers? >>>>> >>>>> Thanks, Robbin >>>>> >>>>> On 06/28/2017 10:23 PM, Roman Kennke wrote: >>>>> >>>>> >>>>> >>>>> On 06/27/2017 09:47 PM, Roman Kennke wrote: >>>>> >>>>> Hi Robbin, >>>>> >>>>> Ugh. Thanks for catching this. 
>>>>> Problem was that I was accounting the thread-local >>>>> deflations twice: >>>>> once in thread-local processing (basically a leftover >>>>> from my earlier >>>>> attempt to implement this accounting) and then >>>>> again in >>>>> finish_deflate_idle_monitors(). Should be fixed here: >>>>> >>>>> >>>>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.09/ >>>>> >>>>> >>>>> >>>>> >>>>> Nit: >>>>> safepoint.cpp : ParallelSPCleanupTask >>>>> "const char* name = " is not needed and 1 is unused >>>>> >>>>> >>>>> Sorry, I don't understand what you mean by this. I see code >>>>> like this: >>>>> >>>>> const char* name = "deflating idle monitors"; >>>>> >>>>> and it is used a few lines below, even 2x. >>>>> >>>>> What's '1 is unused' ? >>>>> >>>>> >>>>> Side question: which jtreg targets do you usually >>>>> run? >>>>> >>>>> >>>>> Right now I cherry pick directories from: hotspot/test/ >>>>> >>>>> I'm going to add a decent test group for local testing. >>>>> >>>>> That would be good! >>>>> >>>>> >>>>> >>>>> >>>>> Trying: make test TEST=hotspot_all >>>>> gives me *lots* of failures due to missing jcstress >>>>> stuff (?!) >>>>> And even other subsets seem to depend on several bits >>>>> and pieces >>>>> that I >>>>> have no idea about. >>>>> >>>>> >>>>> Yes, you need to use internal tool 'jib' java integrate >>>>> build to get >>>>> that work or you can set some environment where the >>>>> jcstress >>>>> application stuff is... >>>>> >>>>> Uhhh. We really do want a subset of tests that we can run >>>>> reliably and >>>>> that are self-contained, how else are people (without that >>>>> jib thingy) >>>>> supposed to do some sanity checking with their patches? ;-) >>>>> >>>>> I have a regression on ClassLoaderData root scanning, >>>>> this should not >>>>> be related, >>>>> but I only have 3 patches which could cause this, if it's >>>>> not >>>>> something in the environment that have changed. 
>>>>> >>>>> Let me know if it's my patch :-) >>>>> >>>>> >>>>> Also do not see any immediate performance gains (off vs 4 >>>>> threads), it >>>>> might be >>>>> http://hg.openjdk.java.net/jdk10/hs/hotspot/rev/06994badeb24 >>>>> , but I need to-do some more testing. I know you often >>>>> run with none >>>>> default GSI. >>>>> >>>>> >>>>> First of all, during the course of this review I reduced the >>>>> change from >>>>> an actual implementation to a kind of framework, and it needs >>>>> some >>>>> separate changes in the GC to make use of it. Not sure if you >>>>> added >>>>> corresponding code in (e.g.) G1? >>>>> >>>>> Also, this is only really visible in code that makes >>>>> excessive use of >>>>> monitors, i.e. the one linked by Carsten's original patch, or >>>>> the test >>>>> org.openjdk.gcbench.roots.Synchronizers.test in gc-bench: >>>>> >>>>> http://icedtea.classpath.org/hg/gc-bench/ >>>>> >>>>> There are also some popular real-world apps that tend to do >>>>> this. From >>>>> the top off my head, Cassandra is such an application. >>>>> >>>>> Thanks, Roman >>>>> >>>>> >>>>> I'll get back to you. >>>>> >>>>> Thanks, Robbin >>>>> >>>>> >>>>> Roman >>>>> >>>>> Am 27.06.2017 um 16:51 schrieb Robbin Ehn: >>>>> >>>>> Hi Roman, >>>>> >>>>> There is something wrong in calculations: >>>>> INFO: Deflate: InCirc=43 InUse=18 Scavenged=25 >>>>> ForceMonitorScavenge=0 >>>>> : pop=27051 free=215487 >>>>> >>>>> free is larger than population, have not had the >>>>> time to dig into this. >>>>> >>>>> Thanks, Robbin >>>>> >>>>> On 06/22/2017 10:19 PM, Roman Kennke wrote: >>>>> >>>>> So here's the latest iteration of that patch: >>>>> >>>>> >>>>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.08/ >>>>> >>>>> >>>>> >>>>> I checked and fixed all the counters. The >>>>> problem here is that they >>>>> are >>>>> not updated in a single place >>>>> (deflate_idle_monitors() ) but in >>>>> several >>>>> places, potentially by multiple threads. 
I >>>>> split up deflation into >>>>> prepare_.. and a finish_.. methods to >>>>> initialize local and update >>>>> global >>>>> counters respectively, and pass around a >>>>> counters object (allocated on >>>>> stack) to the various code paths that use it. >>>>> Updating the counters >>>>> always happen under a lock, there's no need >>>>> to do anything special >>>>> with >>>>> regards to concurrency. >>>>> >>>>> I also checked the nmethod marking, but there >>>>> doesn't seem to be >>>>> anything in that code that looks problematic >>>>> under concurrency. The >>>>> worst that can happen is that two threads >>>>> write the same value into an >>>>> nmethod field. I think we can live with >>>>> that ;-) >>>>> >>>>> Good to go? >>>>> >>>>> Tested by running specjvm and jcstress >>>>> fastdebug+release without >>>>> issues. >>>>> >>>>> Roman >>>>> >>>>> Am 02.06.2017 um 12:39 schrieb Robbin Ehn: >>>>> >>>>> Hi Roman, >>>>> >>>>> On 06/02/2017 11:41 AM, Roman Kennke >>>>> wrote: >>>>> >>>>> Hi David, >>>>> thanks for reviewing. I'll be on >>>>> vacation the next two weeks too, >>>>> with >>>>> only sporadic access to work stuff. >>>>> Yes, exposure will not be as good as >>>>> otherwise, but it's not totally >>>>> untested either: the serial code path >>>>> is the same as the >>>>> parallel, the >>>>> only difference is that it's not >>>>> actually called by multiple >>>>> threads. >>>>> It's ok I think. >>>>> >>>>> I found two more issues that I think >>>>> should be addressed: >>>>> - There are some counters in >>>>> deflate_idle_monitors() and I'm not >>>>> sure I >>>>> correctly handle them in the split-up >>>>> and MT'ed thread-local/ global >>>>> list deflation >>>>> - nmethod marking seems to >>>>> unconditionally poke true or something >>>>> like >>>>> that in nmethod fields. This doesn't >>>>> hurt correctness-wise, but it's >>>>> probably worth checking if it's >>>>> already true, especially when doing >>>>> this >>>>> with multiple threads concurrently. 
>>>>> >>>>> I'll send an updated patch around >>>>> later, I hope I can get to it >>>>> today... >>>>> >>>>> >>>>> I'll review that when you get it out. >>>>> I think this looks as a reasonable step >>>>> before we tackle this with a >>>>> major effort, such as the JEP you and >>>>> Carsten doing. >>>>> And another effort to 'fix' nmethods >>>>> marking. >>>>> >>>>> Internal discussion yesterday lead us to >>>>> conclude that the runtime >>>>> will probably need more threads. >>>>> This would be a good driver to do a >>>>> 'global' worker pool which serves >>>>> both gc, runtime and safepoints with >>>>> threads. >>>>> >>>>> >>>>> Roman >>>>> >>>>> Hi Roman, >>>>> >>>>> I am about to disappear on an >>>>> extended vacation so will let others >>>>> pursue this. IIUC this is longer >>>>> an opt-in by the user at runtime, >>>>> but >>>>> an opt-in by the particular GC >>>>> developers. Okay. My only concern >>>>> with >>>>> that is if Shenandoah is the only >>>>> GC that currently opts in then >>>>> this >>>>> code is not going to get much >>>>> testing and will be more prone to >>>>> incidental breakage. >>>>> >>>>> >>>>> As I mentioned before, it seem like Erik >>>>> ? have some idea, maybe he >>>>> can do this after his barrier patch. >>>>> >>>>> Thanks! >>>>> >>>>> /Robbin >>>>> >>>>> >>>>> Cheers, >>>>> David >>>>> >>>>> On 2/06/2017 2:21 AM, Roman >>>>> Kennke wrote: >>>>> >>>>> Am 01.06.2017 um 17:50 >>>>> schrieb Roman Kennke: >>>>> >>>>> Am 01.06.2017 um 14:18 >>>>> schrieb Robbin Ehn: >>>>> >>>>> Hi Roman, >>>>> >>>>> On 06/01/2017 11:29 >>>>> AM, Roman Kennke wrote: >>>>> >>>>> Am 31.05.2017 um >>>>> 22:06 schrieb Robbin Ehn: >>>>> >>>>> Hi Roman, I >>>>> agree that is really needed but: >>>>> >>>>> On 05/31/2017 >>>>> 10:27 AM, Roman Kennke wrote: >>>>> >>>>> I >>>>> realized that sharing workers with GC is not so easy. 
>>>>> >>>>> We need >>>>> to be able to use the workers at a safepoint during >>>>> >>>>> concurrent >>>>> GC work >>>>> (which also uses the same workers). This does not >>>>> only >>>>> require >>>>> that >>>>> those workers be suspended, like e.g. >>>>> >>>>> SuspendibleThreadSet::yield(), but they need to be idle, i.e. >>>>> have >>>>> finished >>>>> their tasks. This needs some careful handling to >>>>> work >>>>> without >>>>> races: it >>>>> requires a SuspendibleThreadSetJoiner around the >>>>> >>>>> corresponding >>>>> >>>>> run_task() call and also the tasks themselves need to join >>>>> the >>>>> STS and >>>>> handle >>>>> requests for safepoints not by yielding, but by >>>>> leaving >>>>> the >>>>> task. >>>>> This is >>>>> far too peculiar for me to make the call to hook >>>>> up GC >>>>> workers >>>>> for >>>>> safepoint cleanup, and I thus removed those parts. I >>>>> left the >>>>> API in >>>>> >>>>> CollectedHeap in place. I think GC devs who know better >>>>> about G1 >>>>> and CMS >>>>> should >>>>> make that call, or else just use a separate thread >>>>> pool. >>>>> >>>>> >>>>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.05/ >>>>> >>>>> >>>>> >>>>> Is it ok >>>>> now? >>>>> >>>>> I still think >>>>> you should put the "Parallel Safepoint Cleanup" >>>>> workers >>>>> inside >>>>> Shenandoah, >>>>> so the >>>>> SafepointSynchronizer only calls get_safepoint_workers, >>>>> e.g.: >>>>> >>>>> >>>>> _cleanup_workers = heap->get_safepoint_workers(); >>>>> >>>>> _num_cleanup_workers = _cleanup_workers != NULL ? 
>>>>> >>>>> _cleanup_workers->total_workers() : 1; >>>>> >>>>> ParallelSPCleanupTask cleanup(_cleanup_subtasks); >>>>> >>>>> StrongRootsScope srs(_num_cleanup_workers); >>>>> if >>>>> (_cleanup_workers != NULL) { >>>>> >>>>> _cleanup_workers->run_task(&cleanup, >>>>> >>>>> _num_cleanup_workers); >>>>> } else { >>>>> cleanup.work >>>>> (0); >>>>> } >>>>> >>>>> That way you >>>>> don't even need your new flags, but it will be >>>>> up to >>>>> the >>>>> other GCs to >>>>> make their worker available >>>>> or cheat with >>>>> a separate workgang. >>>>> >>>>> I can do that, I >>>>> don't mind. The question is, do we want that? >>>>> >>>>> The problem is that >>>>> we do not want to haste such decision, we >>>>> believe >>>>> there is a better >>>>> solution. >>>>> I think you also >>>>> would want another solution. >>>>> But it's seems like >>>>> such solution with 1 'global' thread pool >>>>> either >>>>> own by GC or the VM >>>>> it self is quite the undertaking. >>>>> Since this probably >>>>> will not be done any time soon my >>>>> suggestion is, >>>>> to not hold you back >>>>> (we also want this), just to make >>>>> the code parallel and >>>>> as an intermediate step ask the GC if it >>>>> minds >>>>> sharing it's thread. >>>>> >>>>> Now when Shenandoah >>>>> is merged it's possible that e.g. G1 will >>>>> share >>>>> the code for a >>>>> separate thread pool, do something of it's own or >>>>> wait until the bigger >>>>> question about thread pool(s) have been >>>>> resolved. >>>>> >>>>> By adding a thread >>>>> pool directly to the SafepointSynchronizer >>>>> and >>>>> flags for it we might >>>>> limit our future options. >>>>> >>>>> I wouldn't call >>>>> it 'cheating with a separate workgang' >>>>> though. I >>>>> see >>>>> that both G1 and >>>>> CMS suspend their worker threads at a >>>>> safepoint. >>>>> However: >>>>> >>>>> Yes it's not cheating >>>>> but I want decent heuristics between e.g. 
>>>>> number >>>>> of concurrent marking >>>>> threads and parallel safepoint threads >>>>> since >>>>> they compete for cpu >>>>> time. >>>>> As the code looks >>>>> now, I think that decisions must be made by >>>>> the >>>>> GC. >>>>> >>>>> Ok, I see your point. I >>>>> updated the proposed patch accordingly: >>>>> >>>>> >>>>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.06/ >>>>> >>>>> >>>>> >>>>> Oops. Minor mistake there. >>>>> Correction: >>>>> >>>>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.07/ >>>>> >>>>> >>>>> >>>>> (Removed 'class WorkGang' >>>>> from safepoint.hpp, and forgot to add it >>>>> into >>>>> collectedHeap.hpp, resulting >>>>> in build failure...) >>>>> >>>>> Roman >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> -- >>>>> This message was sent from my Android device with K-9 Mail. >>> >>> > From thomas.schatzl at oracle.com Tue Jul 4 07:17:08 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Tue, 04 Jul 2017 09:17:08 +0200 Subject: RFR (XS): 8179677: Let G1Remset::_conc_refined_cards only count number of cards concurrently refined In-Reply-To: <4331c480-7c20-8d39-9ea9-7418a86a878d@oracle.com> References: <1499075917.2802.8.camel@oracle.com> <4331c480-7c20-8d39-9ea9-7418a86a878d@oracle.com> Message-ID: <1499152628.2761.0.camel@oracle.com> Hi Stefan, On Mon, 2017-07-03 at 15:12 +0200, Stefan Johansson wrote: > > On 2017-07-03 11:58, Thomas Schatzl wrote: > > > > Hi all, > > > > can I have reviews for this small change that > > makes G1Remset::_conc_refined_cards only count the number of > > concurrently refined cards (+ some trivial renaming of the > > variable)? > > [...] > > > > CR: > > https://bugs.openjdk.java.net/browse/JDK-8179677 > > Webrev: > > http://cr.openjdk.java.net/~tschatzl/8179677/webrev.1/ > Looks good, > StefanJ thanks for your review.
Thomas From thomas.schatzl at oracle.com Tue Jul 4 07:17:59 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Tue, 04 Jul 2017 09:17:59 +0200 Subject: RFR (XS): 8179677: Let G1Remset::_conc_refined_cards only count number of cards concurrently refined In-Reply-To: <47787650-1ade-d89c-29a8-3b8b6e4e8bd0@oracle.com> References: <1499075917.2802.8.camel@oracle.com> <1499094282.2802.132.camel@oracle.com> <47787650-1ade-d89c-29a8-3b8b6e4e8bd0@oracle.com> Message-ID: <1499152679.2761.1.camel@oracle.com> Hi Erik, On Mon, 2017-07-03 at 17:41 +0200, Erik Helin wrote: > On 07/03/2017 05:04 PM, Thomas Schatzl wrote: > > > > Hi all, > > > > Erik asked for a few renamings and some additional comments. Here > > are > > the new webrevs: > > http://cr.openjdk.java.net/~tschatzl/8179677/webrev.1_to_2 (diff) > > http://cr.openjdk.java.net/~tschatzl/8179677/webrev.2 (full) > Looks good, Reviewed. Thanks Thomas! > Erik > thanks for your review. Thomas From mikael.gerdin at oracle.com Tue Jul 4 08:10:34 2017 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Tue, 4 Jul 2017 10:10:34 +0200 Subject: RFR (S): 8182703: Correct G1 barrier queue lock orderings In-Reply-To: <595A3E66.5050705@oracle.com> References: <59510D5E.10009@oracle.com> <595A3E66.5050705@oracle.com> Message-ID: <012c4b90-ab34-b683-a641-751714b53bcd@oracle.com> Hi Erik, On 2017-07-03 14:53, Erik Österlund wrote: > Hi Mikael, > > Thank you for the review! > > Regarding the use of + x in the current enum system for lock rankings, I > agree that it is not a brilliant system and you feel a bit sad when your > lock rank is "leaf+2". However, sometimes I feel like abstracting > numbers with names can become confusing as well - even misleading. Like > for example how leaf is no longer a leaf and how it is questionable > whether special is really all that special any longer.
> > When I thought about possible name for access + 0 and access + 1, I was > thinking something in the lines of "perhaps access_inner/outer or > access_leaf/nonleaf", but then that might get confusing if suddenly > access will need 3 ranks for some reason and we get an "access_special" > rank or something. I suppose you're right. Let's leave the values as you suggested. > > Perhaps a different solution than enum names would be nice long-term for > lock ranks and deadlock detection, but I believe that might be outside > of the current scope for this change. Agreed. /Mikael > > Thanks, > /Erik > > On 2017-07-03 13:57, Mikael Gerdin wrote: >> Hi Erik, >> >> On 2017-06-26 15:34, Erik Österlund wrote: >>> Hi, >>> >>> Webrev: http://cr.openjdk.java.net/~eosterlund/8182703/webrev.02/ >> >> I think this change makes sense and I agree with your reasoning below. >> >> I'm leaning towards suggesting creating a named enum value for >> "access+1" to begin a move towards getting rid of adding and >> subtracting values from enums in this code. I don't have a good name >> for it, though. >> >> /Mikael >> >> >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8182703 >>> >>> The G1 barrier queues have very awkward lock orderings for the >>> following reasons: >>> >>> 1) These queues may queue up things when performing a reference write >>> or resolving a jweak (intentionally or just happened to be jweak, >>> even though it looks like a jobject), which can happen in a lot of >>> places in the code. We resolve JNIHandles while holding special locks >>> in many places. We perform reference writes also in many places. Now >>> the unsuspecting hotspot developer might think that it is okay to >>> resolve a JNIHandle or perform a reference write while possibly >>> holding a special lock. But no. In some cases, object writes have >>> been moved out of locks and replaced with lock-free CAS, only to >>> dodge the G1 write barrier locks.
I don't think the G1 lock ordering >>> issues should shape the shared code rather than the other way around. >>> 2) There is an issue that the shared queue locks have a "special" >>> rank, which is below the lock ranks used by the cbl monitor and free >>> list monitor. This leads to an issue when these locks have to be >>> taken while holding the shared queue locks. The current solution is >>> to drop the shared queue locks temporarily, introducing nasty data >>> races. These races are guarded, but the whole race seems very >>> unnecessary. >>> >>> I argue that if the G1 write barrier queue locks were simply set >>> appropriately in the first place by analyzing what ranks they should >>> have, none of the above issues would exist. Therefore I propose this >>> new ordering. >>> >>> Specifically, I recognize that locks required for performing memory >>> accesses and resolving JNIHandles are more special than the "special" >>> rank. Therefore, this change introduces a new lock ordering category >>> called "access", which is to be used by barriers required to perform >>> memory accesses. In other words, by recognizing the rank is more >>> special than "special", we can remove "special" code to walk around >>> making its rank more "special". That seems desirable to me. The >>> access locks need to comply to the same constraints as the special >>> locks: they may not perform safepoint checks. 
>>> >>> The old lock ranks were: >>> >>> SATB_Q_FL_lock: special >>> SATB_Q_CBL_mon: leaf - 1 >>> Shared_SATB_Q_lock: leaf - 1 >>> >>> DirtyCardQ_FL_lock: special >>> DirtyCardQ_CBL_mon: leaf - 1 >>> Shared_DirtyCardQ_lock: leaf - 1 >>> >>> The new lock ranks are: >>> >>> SATB_Q_FL_lock: access (special - 2) >>> SATB_Q_CBL_mon: access (special - 2) >>> Shared_SATB_Q_lock: access + 1 (special - 1) >>> >>> DirtyCardQ_FL_lock: access (special - 2) >>> DirtyCardQ_CBL_mon: access (special - 2) >>> Shared_DirtyCardQ_lock: access + 1 (special - 1) >>> >>> Analysis: >>> >>> Each PtrQueue and PtrQueueSet group, SATB or DirtyCardQ have the same >>> group of locks. The free list lock, the completed buffer list monitor >>> and the shared queue lock. >>> >>> Observations: >>> 1) The free list lock and completed buffer list monitors (members of >>> PtrQueueSet) are disjoint. We never hold both of them at the same time. >>> Rationale: The free list lock is only used from >>> PtrQueueSet::allocate_buffer, PtrQueueSet::deallocate_buffer and >>> PtrQueueSet::reduce_free_list, and no callsite from there can be >>> expanded where the cbl monitor is acquired. So therefore it is >>> impossible to acquire the cbl monitor while holding the free list >>> lock. The opposite case of acquiring the free list lock while holding >>> the cbl monitor is also not possible; only the following places >>> acquire the cbl monitor: PtrQueueSet::enqueue_complete_buffer, >>> PtrQueueSet::merge_bufferlists, >>> PtrQueueSet::assert_completed_buffer_list_len_correct, >>> PtrQueueSet::notify_if_necessary, FreeIdSet::claim_par_id, >>> FreeIdSet::release_par_id, DirtyCardQueueSet::get_completed_buffer, >>> DirtyCardQueueSet::clear, >>> SATBMarkQueueSet::apply_closure_to_completed_buffer and >>> SATBMarkQueueSet::abandon_partial_marking. Again, neither of these >>> paths where the cbl monitor is held can expand callsites to a place >>> where the free list locks are held. 
Therefore it holds that the cbl >>> monitor can not be held while the free list lock is held, and the >>> free list lock can not be held while the cbl monitor is held. >>> Therefore they are held disjointly. >>> 2) We might hold the shared queue locks before acquiring the >>> completed buffer list monitor. (today we drop the shared queue lock >>> then and reacquire it later as a hack as already described) >>> 3) We do not acquire a shared queue lock while holding the free list >>> lock or completed buffer list monitor, as there is no reference from >>> a PtrQueueSet to its shared queue, so those code paths do not know >>> how to reach the shared PtrQueue to acquire its lock. The derived >>> classes are exceptions but they never use the shared queue lock while >>> holding the completed buffer list monitor or free list lock. >>> DirtyCardQueueSet uses the shared queue for concatenating logs (in a >>> safepoint without holding those locks). The SATBMarkQueueSet uses the >>> shared queue for filtering the buffers, fiddling with activeness, >>> printing and resetting, all without grabbing any locks. >>> 4) We do not acquire any other lock (above event) while holding the >>> free list lock or completed buffer list monitors. This was discovered >>> by manually expanding the call graphs from where these two locks are >>> held. >>> >>> Derived constraints: >>> a) Because of observation 1, the free list lock and completed buffer >>> list monitors can have the same rank. >>> b) Because of observations 1 and 2, the shared queue lock ought to >>> have a rank higher than the ranks of the free list lock and the >>> completed buffer list monitors (not the case today). >>> c) Because of observations 3 and 2, the free list lock and >>> completed buffer list monitors ought to have a rank lower than the >>> rank of the shared queue lock.
>>> d) Because of observation 4 (and constraints a-c), all the barrier >>> locks should be below the "special" rank without violating any >>> existing ranks. >>> >>> The proposed new lock ranks conform to the constraints derived from >>> my observations. It is worth noting that the potential relationship >>> that could break (and why they do not) are: >>> 1) If a lock is acquired from within the barriers that does not >>> involve the shared queue lock, the free list lock or the completed >>> buffer list monitor, we have now inverted their relationship as that >>> other lock would probably have a rank higher than or equal to >>> "special". But due to observation 4, there are no such cases. >>> 2) The relationship between the shared queue lock and the completed >>> buffer list monitor has been changed so both can be held at the same >>> time if the shared queue lock is acquired first (which it is). This >>> is arguably the way it should have been from the first place, and the >>> old solution had ugly hacks where we would drop the shared queue lock >>> to not run into the lock order assert (and only not to run into the >>> lock order assert, i.e. not to avoid potential deadlock) to ensure >>> the locks are not held at the same time. That code has now been >>> removed, so that the shared queue lock is still held when enqueueing >>> completed buffers (no dodgy dropping and reclaiming), and the code >>> for handling the races due to multiple concurrent enqueuers has also >>> been removed and replaced with an assertion that there simply should >>> not be multiple concurrent enqueuers. Since the shared queue lock is >>> now held throughout the whole operation, there will be no concurrent >>> enqueuers. >>> 3) The completed buffer list monitor used to have a higher rank than >>> the free list lock. Now they have the same. Therefore, they could >>> previously allow them to be held at the same time if the cbl monitor >>> was acquired first. 
However, as discussed, there is no such case, and >>> they ought to have the same rank not to confuse their true >>> disjointness. If anyone insists we do not break this relationship >>> despite the true disjointness, I could consent to adding another >>> access lock rank, like this: >>> http://cr.openjdk.java.net/~eosterlund/8182703/webrev.01/ but I think >>> it seems better to have the same rank since they are actually truly >>> disjoint and should remain disjoint. >>> >>> I do recognize that long term, we *might* want a lock-free solution >>> or something (not saying we do or do not). But until then, the ranks >>> ought to be corrected so that they do not cause these problems >>> causing everyone to bash their head against the awkward G1 lock ranks >>> throughout the code and make hacks around it. >>> >>> Testing: JPRT with hotspot all and lots of local testing. >>> >>> Thanks, >>> /Erik > From thomas.schatzl at oracle.com Tue Jul 4 08:24:23 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Tue, 04 Jul 2017 10:24:23 +0200 Subject: RFR (S): 8183128: Update RefineCardTableEntryClosure Message-ID: <1499156663.2761.6.camel@oracle.com> Hi all, can I get reviews for this change that renames and cleans up the use of RefineCardTableEntryClosure in the code? RefineCardTableEntryClosure is the closure that is applied by the concurrent refinement threads. This change renames it slightly to indicate its use (G1RefineCardConcurrentlyClosure) and moves it to the G1RemSet files close to the closure that we use for refinement/Update RS during GC. This change is dependent on "JDK-8183226: Remembered set summarization accesses not fully initialized java thread DCQS" which is also currently out for review - that change reorganizes G1CollectedHeap initialization so that the change can actually move the closure. CR: https://bugs.openjdk.java.net/browse/JDK-8183128 Webrev: http://cr.openjdk.java.net/~tschatzl/8183128/webrev/ Testing: jprt, local benchmarks Thanks,
Thomas From erik.osterlund at oracle.com Tue Jul 4 08:27:55 2017 From: erik.osterlund at oracle.com (Erik Österlund) Date: Tue, 4 Jul 2017 10:27:55 +0200 Subject: RFR (S): 8182703: Correct G1 barrier queue lock orderings In-Reply-To: <012c4b90-ab34-b683-a641-751714b53bcd@oracle.com> References: <59510D5E.10009@oracle.com> <595A3E66.5050705@oracle.com> <012c4b90-ab34-b683-a641-751714b53bcd@oracle.com> Message-ID: <595B518B.2030703@oracle.com> Hi Mikael, Thank you for the review. /Erik On 2017-07-04 10:10, Mikael Gerdin wrote: > Hi Erik, > > On 2017-07-03 14:53, Erik Österlund wrote: >> Hi Mikael, >> >> Thank you for the review! >> >> Regarding the use of + x in the current enum system for lock >> rankings, I agree that it is not a brilliant system and you feel a >> bit sad when your lock rank is "leaf+2". However, sometimes I feel >> like abstracting numbers with names can become confusing as well - >> even misleading. Like for example how leaf is no longer a leaf and >> how it is questionable whether special is really all that special any >> longer. >> >> When I thought about possible name for access + 0 and access + 1, I >> was thinking something in the lines of "perhaps access_inner/outer or >> access_leaf/nonleaf", but then that might get confusing if suddenly >> access will need 3 ranks for some reason and we get an >> "access_special" rank or something. > > I suppose you're right. Let's leave the values as you suggested. > >> >> Perhaps a different solution than enum names would be nice long-term >> for lock ranks and deadlock detection, but I believe that might be >> outside of the current scope for this change. > > Agreed. > /Mikael > >> >> Thanks, >> /Erik >> >> On 2017-07-03 13:57, Mikael Gerdin wrote: >>> Hi Erik, >>> >>> On 2017-06-26 15:34, Erik Österlund wrote: >>>> Hi, >>>> >>>> Webrev: http://cr.openjdk.java.net/~eosterlund/8182703/webrev.02/ >>> >>> I think this change makes sense and I agree with your reasoning below.
>>> >>> I'm leaning towards suggesting creating a named enum value for >>> "access+1" to begin a move towards getting rid of adding and >>> subtracting values from enums in this code. I don't have a good name >>> for it, though. >>> >>> /Mikael >>> >>> >>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8182703 >>>> >>>> The G1 barrier queues have very awkward lock orderings for the >>>> following reasons: >>>> >>>> 1) These queues may queue up things when performing a reference >>>> write or resolving a jweak (intentionally or just happened to be >>>> jweak, even though it looks like a jobject), which can happen in a >>>> lot of places in the code. We resolve JNIHandles while holding >>>> special locks in many places. We perform reference writes also in >>>> many places. Now the unsuspecting hotspot developer might think >>>> that it is okay to resolve a JNIHandle or perform a reference write >>>> while possibly holding a special lock. But no. In some cases, >>>> object writes have been moved out of locks and replaced with >>>> lock-free CAS, only to dodge the G1 write barrier locks. I don't >>>> think the G1 lock ordering issues should shape the shared code >>>> rather than the other way around. >>>> 2) There is an issue that the shared queue locks have a "special" >>>> rank, which is below the lock ranks used by the cbl monitor and >>>> free list monitor. This leads to an issue when these locks have to >>>> be taken while holding the shared queue locks. The current solution >>>> is to drop the shared queue locks temporarily, introducing nasty >>>> data races. These races are guarded, but the whole race seems very >>>> unnecessary. >>>> >>>> I argue that if the G1 write barrier queue locks were simply set >>>> appropriately in the first place by analyzing what ranks they >>>> should have, none of the above issues would exist. Therefore I >>>> propose this new ordering. 
>>>> >>>> Specifically, I recognize that locks required for performing memory >>>> accesses and resolving JNIHandles are more special than the >>>> "special" rank. Therefore, this change introduces a new lock >>>> ordering category called "access", which is to be used by barriers >>>> required to perform memory accesses. In other words, by recognizing >>>> the rank is more special than "special", we can remove "special" >>>> code to walk around making its rank more "special". That seems >>>> desirable to me. The access locks need to comply to the same >>>> constraints as the special locks: they may not perform safepoint >>>> checks. >>>> >>>> The old lock ranks were: >>>> >>>> SATB_Q_FL_lock: special >>>> SATB_Q_CBL_mon: leaf - 1 >>>> Shared_SATB_Q_lock: leaf - 1 >>>> >>>> DirtyCardQ_FL_lock: special >>>> DirtyCardQ_CBL_mon: leaf - 1 >>>> Shared_DirtyCardQ_lock: leaf - 1 >>>> >>>> The new lock ranks are: >>>> >>>> SATB_Q_FL_lock: access (special - 2) >>>> SATB_Q_CBL_mon: access (special - 2) >>>> Shared_SATB_Q_lock: access + 1 (special - 1) >>>> >>>> DirtyCardQ_FL_lock: access (special - 2) >>>> DirtyCardQ_CBL_mon: access (special - 2) >>>> Shared_DirtyCardQ_lock: access + 1 (special - 1) >>>> >>>> Analysis: >>>> >>>> Each PtrQueue and PtrQueueSet group, SATB or DirtyCardQ have the >>>> same group of locks. The free list lock, the completed buffer list >>>> monitor and the shared queue lock. >>>> >>>> Observations: >>>> 1) The free list lock and completed buffer list monitors (members >>>> of PtrQueueSet) are disjoint. We never hold both of them at the >>>> same time. >>>> Rationale: The free list lock is only used from >>>> PtrQueueSet::allocate_buffer, PtrQueueSet::deallocate_buffer and >>>> PtrQueueSet::reduce_free_list, and no callsite from there can be >>>> expanded where the cbl monitor is acquired. So therefore it is >>>> impossible to acquire the cbl monitor while holding the free list >>>> lock. 
The opposite case of acquiring the free list lock while >>>> holding the cbl monitor is also not possible; only the following >>>> places acquire the cbl monitor: >>>> PtrQueueSet::enqueue_complete_buffer, >>>> PtrQueueSet::merge_bufferlists, >>>> PtrQueueSet::assert_completed_buffer_list_len_correct, >>>> PtrQueueSet::notify_if_necessary, FreeIdSet::claim_par_id, >>>> FreeIdSet::release_par_id, DirtyCardQueueSet::get_completed_buffer, >>>> DirtyCardQueueSet::clear, >>>> SATBMarkQueueSet::apply_closure_to_completed_buffer and >>>> SATBMarkQueueSet::abandon_partial_marking. Again, neither of these >>>> paths where the cbl monitor is held can expand callsites to a place >>>> where the free list locks are held. Therefore it holds that the cbl >>>> monitor can not be held while the free list lock is held, and the >>>> free list lock can not be held while the cbl monitor is held. >>>> Therefore they are held disjointly. >>>> 2) We might hold the shared queue locks before acquiring the >>>> completed buffer list monitor. (today we drop the shared queue lock >>>> then and reacquire it later as a hack as already described) >>>> 3) We do not acquire a shared queue lock while holding the free >>>> list lock or completed buffer list monitor, as there is no >>>> reference from a PtrQueueSet to its shared queue, so those code >>>> paths do not know how to reach the shared PtrQueue to acquire its >>>> lock. The derived classes are exceptions but they never use the >>>> shared queue lock while holding the completed buffer list monitor >>>> or free list lock. DirtyCardQueueSet uses the shared queue for >>>> concatenating logs (in a safepoint without holding those locks). >>>> The SATBMarkQueueSet uses the shared queue for filtering the >>>> buffers, fiddling with activeness, printing and resetting, all >>>> without grabbing any locks. >>>> 4) We do not acquire any other lock (above event) while holding the >>>> free list lock or completed buffer list monitors. 
This was >>>> discovered by manually expanding the call graphs from where these >>>> two locks are held. >>>> >>>> Derived constraints: >>>> a) Because of observation 1, the free list lock and completed >>>> buffer list monitors can have the same rank. >>>> b) Because of observations 1 and 2, the shared queue lock ought to >>>> have a rank higher than the ranks of the free list lock and the >>>> completed buffer list monitors (not the case today). >>>> c) Because of observations 3 and 2, the free list lock and >>>> completed buffer list monitors ought to have a rank lower than the >>>> rank of the shared queue lock. >>>> d) Because of observation 4 (and constraints a-c), all the barrier >>>> locks should be below the "special" rank without violating any >>>> existing ranks. >>>> >>>> The proposed new lock ranks conform to the constraints derived from >>>> my observations. It is worth noting that the potential relationship >>>> that could break (and why they do not) are: >>>> 1) If a lock is acquired from within the barriers that does not >>>> involve the shared queue lock, the free list lock or the completed >>>> buffer list monitor, we have now inverted their relationship as >>>> that other lock would probably have a rank higher than or equal to >>>> "special". But due to observation 4, there are no such cases. >>>> 2) The relationship between the shared queue lock and the completed >>>> buffer list monitor has been changed so both can be held at the >>>> same time if the shared queue lock is acquired first (which it is). >>>> This is arguably the way it should have been from the first place, >>>> and the old solution had ugly hacks where we would drop the shared >>>> queue lock to not run into the lock order assert (and only not to >>>> run into the lock order assert, i.e. not to avoid potential >>>> deadlock) to ensure the locks are not held at the same time.
That >>>> code has now been removed, so that the shared queue lock is still >>>> held when enqueueing completed buffers (no dodgy dropping and >>>> reclaiming), and the code for handling the races due to multiple >>>> concurrent enqueuers has also been removed and replaced with an >>>> assertion that there simply should not be multiple concurrent >>>> enqueuers. Since the shared queue lock is now held throughout the >>>> whole operation, there will be no concurrent enqueuers. >>>> 3) The completed buffer list monitor used to have a higher rank >>>> than the free list lock. Now they have the same. Therefore, they >>>> could previously allow them to be held at the same time if the cbl >>>> monitor was acquired first. However, as discussed, there is no such >>>> case, and they ought to have the same rank not to confuse their >>>> true disjointness. If anyone insists we do not break this >>>> relationship despite the true disjointness, I could consent to >>>> adding another access lock rank, like this: >>>> http://cr.openjdk.java.net/~eosterlund/8182703/webrev.01/ but I >>>> think it seems better to have the same rank since they are actually >>>> truly disjoint and should remain disjoint. >>>> >>>> I do recognize that long term, we *might* want a lock-free solution >>>> or something (not saying we do or do not). But until then, the >>>> ranks ought to be corrected so that they do not cause these >>>> problems causing everyone to bash their head against the awkward G1 >>>> lock ranks throughout the code and make hacks around it. >>>> >>>> Testing: JPRT with hotspot all and lots of local testing. 
>>>> >>>> Thanks, >>>> /Erik >> From erik.helin at oracle.com Tue Jul 4 11:40:19 2017 From: erik.helin at oracle.com (Erik Helin) Date: Tue, 4 Jul 2017 13:40:19 +0200 Subject: RFR: 8183539: Remove G1RemSet::_into_cset_dirty_card_queue_set Message-ID: <1390aea1-d25a-a465-b0bf-c66490cf682a@oracle.com> Hi all, here comes a simple patch (just removing code) with a quite complicated justification :) So grab a cup of coffee, take out that good old pen and paper (it is almost impossible to convince yourself that this is correct without drawing) and enjoy the following little text: The G1RemSet::_into_cset_dirty_card_queue_set is no longer needed. It was originally added to keep track of cards with pointers into the collection set. In the case of evacuation failure, this set of cards would then be enqueued for refinement in order to construct/update remembered sets for regions that encountered evacuation failure (only regions in the collection set can encounter evacuation failure). However, this functionality is already provided by the call to G1ParScanThreadState::update_rs and the evac failure handling code. For pointers in regions outside of the collection set pointing into the collection set, we will always call G1ParScanThreadState::update_rs. G1ParScanThreadState::update_rs will enqueue the card containing the pointer pointing into the collection set onto G1CollectedHeap::_dirty_card_queue_set. So G1CollectedHeap::_dirty_card_queue_set will contain all the cards with pointers into the collection set (that are not themselves in the collection set). If an evacuation failure happens, then we will still trace through the object graph, calling do_oop_evac (but do_oop_evac will just return a pointer to the "from" object) for each object pointing into the collection set. This means that all cards in regions outside of the collection set that contain pointers into the collection set will end up on G1CollectedHeap::_dirty_card_queue_set.
For pointers in regions in the collection set pointing into the collection set, those will be handled by the evacuation failure handling code. The evacuation failure handling code will iterate over all objects in all regions that encountered an evacuation failure. If it encounters an object with a forwarding pointer pointing to itself, then it will enqueue the cards that contain that object's fields onto G1CollectedHeap::_dirty_card_queue_set. The two above paragraphs mean that after a collection, G1CollectedHeap::_dirty_card_queue_set will always contain all cards that contained pointers into the collection set. This is true for both a successful collection and a collection that encountered evacuation failure. However, these cards are exactly the cards that G1RemSet::_into_cset_dirty_card_queue_set contains, so we might as well remove the G1RemSet::_into_cset_dirty_card_queue_set. Patch: http://cr.openjdk.java.net/~ehelin/8183539/00/ Issue: https://bugs.openjdk.java.net/browse/JDK-8183539 Testing: make test TEST=hotspot_gc on Linux x86-64 fastdebug Thanks, Erik From mikael.gerdin at oracle.com Tue Jul 4 12:17:52 2017 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Tue, 4 Jul 2017 14:17:52 +0200 Subject: RFR: 8183539: Remove G1RemSet::_into_cset_dirty_card_queue_set In-Reply-To: <1390aea1-d25a-a465-b0bf-c66490cf682a@oracle.com> References: <1390aea1-d25a-a465-b0bf-c66490cf682a@oracle.com> Message-ID: Hi Erik, On 2017-07-04 13:40, Erik Helin wrote: > Hi all, > > here comes a simple patch (just removing code) with a quite complicated > justification :) So grab a cup of coffee, take out that good old pen and > paper (it is almost impossible to convince yourself that this is correct > without drawing) and enjoy the following little text: > > The G1RemSet::_into_cset_dirty_card_queue_set is no longer needed. It > was originally added to keep track of cards with pointers into the > collection set.
In the case of evacuation failure, this set of cards > would then be enqueued for refinement in order to construct/update > remembered sets for regions that encountered evacuation failure (only > regions in the collection set can encounter evacuation failure). > However, this functionality is already provided by the call to > G1ParScanThreadState::update_rs and the evac failure handling code. > > For pointers in regions outside of the collection set pointing into the > collection set, we will always call G1ParScanThreadState::update_rs. > G1ParScanThreadState::update_rs will enqueue the card containing the > pointer pointing into the collection set onto > G1CollectedHeap::_dirty_card_queue_set. So > G1CollectedHeap::_dirty_card_queue_set will contain all the cards with > pointers into the collection set (that are not themselves in the > collection set). If an evacuation failure happens, then we will still > trace through the object graph, calling do_oop_evac (but do_oop_evac > will just return a pointer to the "from" object) for each object > pointing into the collection set. This means that all cards in regions > outside of the collection set that contain pointers into the collection set > will end up on G1CollectedHeap::_dirty_card_queue_set. > > For pointers in regions in the collection set pointing into the > collection set, those will be handled by the evacuation failure handling > code. The evacuation failure handling code will iterate over all objects > in all regions that encountered an evacuation failure. If it encounters > an object with a forwarding pointer pointing to itself, then it will > enqueue the cards that contain that object's fields onto > G1CollectedHeap::_dirty_card_queue_set. > > The two above paragraphs mean that after a collection, > G1CollectedHeap::_dirty_card_queue_set will always contain all cards > that contained pointers into the collection set.
This is true for both a > successful collection and a collection that encountered evacuation > failure. However, these cards are exactly the cards that > G1RemSet::_into_cset_dirty_card_queue_set contains, so we might as well > remove the G1RemSet::_into_cset_dirty_card_queue_set. > > Patch: http://cr.openjdk.java.net/~ehelin/8183539/00/ > Issue: https://bugs.openjdk.java.net/browse/JDK-8183539 > Testing: make test TEST=hotspot_gc on Linux x86-64 fastdebug Do you know if any of the tests actually would have failed if rem set reconstruction after evacuation failure didn't work properly? I'd feel safer with this change if you ran with some verification code to ensure that the into_cset queue was always useless when evac failure occurs. Thanks /Mikael > > Thanks, > Erik From thomas.schatzl at oracle.com Tue Jul 4 15:24:56 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Tue, 04 Jul 2017 17:24:56 +0200 Subject: RFR: 8183539: Remove G1RemSet::_into_cset_dirty_card_queue_set In-Reply-To: <1390aea1-d25a-a465-b0bf-c66490cf682a@oracle.com> References: <1390aea1-d25a-a465-b0bf-c66490cf682a@oracle.com> Message-ID: <1499181896.2757.19.camel@oracle.com> Hi Erik, On Tue, 2017-07-04 at 13:40 +0200, Erik Helin wrote: > Hi all, > > here comes a simple patch (just removing code) with a quite > complicated justification :) So grab a cup of coffee, take out that > good old pen and paper (it is almost impossible to convince yourself > that this is correct without drawing) and enjoy the following little > text: > > [... long explanation...] > Patch: http://cr.openjdk.java.net/~ehelin/8183539/00/ > Issue: https://bugs.openjdk.java.net/browse/JDK-8183539 > Testing: make test TEST=hotspot_gc on Linux x86-64 fastdebug while I think the explanation is good, and actually we discussed this together, some more testing would be nice ;) Something like gcbasher with G1EvacuationFailureALot. Minor nit: g1RemSet.cpp:710 the "return" statement is superfluous.
(although I have already a change in mind that re-adds a return value ;)) Thanks, Thomas From thomas.schatzl at oracle.com Tue Jul 4 15:25:45 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Tue, 04 Jul 2017 17:25:45 +0200 Subject: RFR (XS): 8183127: UpdateRSetDeferred should not enqueue cards for NULL references Message-ID: <1499181945.2757.20.camel@oracle.com> Hi, can I have reviews for this change that adds a NULL-check in the UpdateRSetDeferred closure so that we do not enqueue cards with NULL references in it during evacuation failure? CR: https://bugs.openjdk.java.net/browse/JDK-8183127 Webrev: http://cr.openjdk.java.net/~tschatzl/8183127/webrev/ Testing: jprt, gcbasher with G1EvacuationFailureALot for 1/2 hour I think this amount of testing is sufficient as the reasoning for this change is not *that* complicated. Thanks, Thomas From thomas.schatzl at oracle.com Tue Jul 4 15:42:33 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Tue, 04 Jul 2017 17:42:33 +0200 Subject: RFR (S): 8179679: Rearrange filters before card scanning In-Reply-To: <1499081093.2802.30.camel@oracle.com> References: <1499081093.2802.30.camel@oracle.com> Message-ID: <1499182953.2757.21.camel@oracle.com> Hi all, On Mon, 2017-07-03 at 13:24 +0200, Thomas Schatzl wrote: > Hi all, > > please have a look at this change that rearranges the checks in the > G1RemSet card scanning a bit in order to: > Erik had a look at this change with the following comments: - rename card_region_idx -> region_idx_for_card - factor out the two calls to claim a card and dirty its region into a method - move calculation of "card_region" into the scan_card() method. - he pointed out that the change can use G1CollectedHeap::region_at() instead of G1CollectedHeap::heap_region_containing() as it is simpler.
- there has been another comment on why the change claims the card after checking whether the card is within the region's boundaries, and if that wouldn't be better performed right after the is_claimed check. Doing so will claim cards originating from stray remembered set entries into the current survivor regions as claimed, since we do not clear these regions later again (see G1ClearCardTableTask::work()) - their cards need to be "Young", and this is done during allocation of the region. This results in the card table verification failing later. I think we should think about changing the handling of survivor regions during the clear CT phase as part of a different CR. For now I added a comment. Webrev: http://cr.openjdk.java.net/~tschatzl/8179679/webrev.1_to_2 (diff) http://cr.openjdk.java.net/~tschatzl/8179679/webrev.2 (full) Testing: gcbasher Thanks, Thomas From thomas.schatzl at oracle.com Tue Jul 4 17:43:35 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Tue, 04 Jul 2017 19:43:35 +0200 Subject: RFR (S): 8182703: Correct G1 barrier queue lock orderings In-Reply-To: <59510D5E.10009@oracle.com> References: <59510D5E.10009@oracle.com> Message-ID: <1499190215.2423.3.camel@oracle.com> Hi, On Mon, 2017-06-26 at 15:34 +0200, Erik Österlund wrote: > Hi, > > Webrev: http://cr.openjdk.java.net/~eosterlund/8182703/webrev.02/ > Bug: https://bugs.openjdk.java.net/browse/JDK-8182703 > looks good apart from the comment at Monitor::event_types. It now contradicts itself from one sentence to the next ("special must be lowest" and then "oh no, after all access must be lowest"). Please try to find some better wording here :) > The G1 barrier queues have very awkward lock orderings for the > following reasons: > [...] > > I do recognize that long term, we *might* want a lock-free solution > or something (not saying we do or do not).
But until then, the ranks > ought to be corrected so that they do not cause these problems > causing everyone to bash their head against the awkward G1 lock ranks > throughout the code and make hacks around it. > > Testing: JPRT with hotspot all and lots of local testing. Thanks, Thomas From thomas.schatzl at oracle.com Tue Jul 4 18:14:04 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Tue, 04 Jul 2017 20:14:04 +0200 Subject: RFR (S): 8182703: Correct G1 barrier queue lock orderings In-Reply-To: <1499190215.2423.3.camel@oracle.com> References: <59510D5E.10009@oracle.com> <1499190215.2423.3.camel@oracle.com> Message-ID: <1499192044.2423.8.camel@oracle.com> Hi, On Tue, 2017-07-04 at 19:43 +0200, Thomas Schatzl wrote: > Hi, > > On Mon, 2017-06-26 at 15:34 +0200, Erik Österlund wrote: > > > > Hi, > > > > Webrev: http://cr.openjdk.java.net/~eosterlund/8182703/webrev.02/ > > Bug: https://bugs.openjdk.java.net/browse/JDK-8182703 > > > looks good apart from the comment at Monitor::event_types. It now > contradicts itself from one sentence to the next ("special must be > lowest" and then "oh no, after all access must be lowest"). Please > try to find some better wording here :) Some more comments about the comment added in this change: 96 // The rank access is reserved for locks that may be required to perform 97 // memory accesses that require special GC barriers, such as SATB barriers. 98 // Since memory accesses should be able to be performed pretty much anywhere 99 // in the code, that wannts being more special than the "special" rank. - s/wannts/requires in that comment. - I do not think the access lock rank is used for special GC barriers, at least the "SATB barrier" is a bad example. The SATB barrier is commonly the pre-write barrier in generated code, and the locks do not have a lot in common with write barriers.
Maybe the text wanted to give an example for why locks of this rank could be called at any time - because the lock might be taken as part of some SATB barrier code? Thanks, Thomas From rkennke at redhat.com Tue Jul 4 18:47:52 2017 From: rkennke at redhat.com (Roman Kennke) Date: Tue, 4 Jul 2017 20:47:52 +0200 Subject: RFR: 8179268: Factor out AdaptiveSizePolicy from top-level interfaces CollectorPolicy and CollectedHeap Message-ID: AdaptiveSizePolicy is not used/called from outside the GCs, and not all GCs need it. It makes sense to remove it from the CollectedHeap and CollectorPolicy interfaces and move it down to the actual subclasses that use it. I moved AdaptiveSizePolicyOutput to parallelScavengeHeap.hpp, it's only used/implemented in the parallel GC. Also, I made this class AllStatic (was StackObj). Tested by running hotspot_gc jtreg tests without regressions. http://cr.openjdk.java.net/~rkennke/8179268/webrev.00/ Roman From kim.barrett at oracle.com Wed Jul 5 02:00:26 2017 From: kim.barrett at oracle.com (Kim Barrett) Date: Tue, 4 Jul 2017 22:00:26 -0400 Subject: RFR (S): 8182703: Correct G1 barrier queue lock orderings In-Reply-To: <59510D5E.10009@oracle.com> References: <59510D5E.10009@oracle.com> Message-ID: <25F423D9-F8D5-4E62-8300-CCE106E70777@oracle.com> > On Jun 26, 2017, at 9:34 AM, Erik Österlund wrote: > > Hi, > > Webrev: http://cr.openjdk.java.net/~eosterlund/8182703/webrev.02/ > Bug: https://bugs.openjdk.java.net/browse/JDK-8182703 ------------------------------------------------------------------------------ src/share/vm/gc/g1/ptrQueue.cpp Removing unlock / relock around 78 qset()->enqueue_complete_buffer(node); I would prefer that this part of this changeset not be made at this time. This part isn't necessary for the main point of this changeset. It's a cleanup that is enabled by the lock rank changes, where the rank changes are required for other reasons. It also at least conflicts with, and probably breaks, a pending change of mine. 
(I have a largish stack of patches in this area that didn't quite make it into JDK 9 before the original FC date, and which I've been (all too slowly) trying to work my way through and bring into JDK 10.) The RFR says: > 2) There is an issue that the shared queue locks have a "special" > rank, which is below the lock ranks used by the cbl monitor and free > list monitor. This leads to an issue when these locks have to be taken > while holding the shared queue locks. The current solution is to drop > the shared queue locks temporarily, introducing nasty data > races. These races are guarded, but the whole race seems very > unnecessary. This isn't entirely accurate, as the shared queue locks are not "special" rank; the current lock ranks are described correctly later in the RFR. It's true there is an "interesting" bit of code there to temporarily drop the shared queue lock. I don't think it's harmful to do so, and it could have some small benefit now. More importantly, one of the changes in that aforementioned stack of patches puts more (possibly significantly more in some cases) work into that dropped-lock region. And if that idea ultimately doesn't pan out, simply removing the unlock/relock pair is not, IMO, the right way to clean things up; there is some additional refactoring that ought to be done. ------------------------------------------------------------------------------ The lock ranking changes look good. 
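To make the disagreement concrete, the unlock/relock pattern in question can be sketched in isolation. This is a simplified model with hypothetical names, using std::mutex as a stand-in for the VM's Monitor type; it is not the actual ptrQueue.cpp code:

```cpp
#include <mutex>
#include <vector>

// Simplified model of the enqueue path under discussion: with the old
// lock ranks, the completed-buffer-list lock could not be taken while
// holding the lower-ranked shared queue lock, so the queue lock was
// dropped around the enqueue and reacquired afterwards.
struct BufferSet {
  std::mutex cbl_lock;               // stands in for the cbl monitor
  std::vector<int> completed;

  void enqueue_complete_buffer(int node) {
    std::lock_guard<std::mutex> g(cbl_lock);
    completed.push_back(node);
  }
};

struct SharedQueue {
  std::mutex queue_lock;             // stands in for the shared queue lock
  BufferSet* qset = nullptr;
  int index = 0;

  void handle_full_buffer(int node) {
    std::unique_lock<std::mutex> g(queue_lock);
    // Drop the queue lock before taking the higher-ranked cbl lock...
    g.unlock();
    qset->enqueue_complete_buffer(node);
    g.lock();
    // ...and only mutate queue state again after reacquiring it. Any
    // state observed before the unlock may be stale by this point.
    index = 0;
  }
};
```

The races being guarded against live entirely in the window between unlock() and lock(): another thread may mutate the queue there, which is why anything computed before the drop has to be re-validated once the lock is reacquired.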
From mikael.gerdin at oracle.com Wed Jul 5 08:30:14 2017 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Wed, 5 Jul 2017 10:30:14 +0200 Subject: RFR (XS): 8183127: UpdateRSetDeferred should not enqueue cards for NULL references In-Reply-To: <1499181945.2757.20.camel@oracle.com> References: <1499181945.2757.20.camel@oracle.com> Message-ID: <727d03c9-c206-ce87-093c-3eee21a20049@oracle.com> Hi Thomas, On 2017-07-04 17:25, Thomas Schatzl wrote: > Hi, > > can I have reviews for this change that adds a NULL-check in the > UpdateRSetDeferred closure so that we do not enqueue cards with NULL > references in it during evacuation failure? > > CR: > https://bugs.openjdk.java.net/browse/JDK-8183127 > Webrev: > http://cr.openjdk.java.net/~tschatzl/8183127/webrev/ Looks good to me. I agree that the amount of testing seems sufficient. /Mikael > Testing: > jprt, gcbasher with G1EvacuationFailureALot for 1/2 hour > > I think this amount of testing is sufficient as the reasoning for this > change is not *that* complicated. > > Thanks, > Thomas > From erik.osterlund at oracle.com Wed Jul 5 08:51:45 2017 From: erik.osterlund at oracle.com (=?UTF-8?Q?Erik_=c3=96sterlund?=) Date: Wed, 5 Jul 2017 10:51:45 +0200 Subject: RFR (S): 8182703: Correct G1 barrier queue lock orderings In-Reply-To: <1499192044.2423.8.camel@oracle.com> References: <59510D5E.10009@oracle.com> <1499190215.2423.3.camel@oracle.com> <1499192044.2423.8.camel@oracle.com> Message-ID: <595CA8A1.4040101@oracle.com> Hi Thomas, On 2017-07-04 20:14, Thomas Schatzl wrote: > Hi, > > On Tue, 2017-07-04 at 19:43 +0200, Thomas Schatzl wrote: >> Hi, >> >> On Mon, 2017-06-26 at 15:34 +0200, Erik ?sterlund wrote: >>> Hi, >>> >>> Webrev: http://cr.openjdk.java.net/~eosterlund/8182703/webrev.02/ >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8182703 >>> >> looks good apart from the comment at Monitor::event_types. 
It now >> contradicts itself from one sentence to the next ("special must be >> lowest" and then "oh no, after all access must be lowest"). Please >> try to find some better wording here :) > Some more comments about the comment added in this change: > > 96 // The rank access is reserved for locks that may be required to > perform > 97 // memory accesses that require special GC barriers, such as > SATB barriers. > 98 // Since memory accesses should be able to be performed pretty > much anywhere > 99 // in the code, that wannts being more special than the > "special" rank. > > - s/wannts/requires in that comment. Fixed. > - I do not think the access lock rank is used for special GC barriers, > at least the "SATB barrier" is a bad example. The SATB barrier is > commonly the pre-write barrier in generated code, and the locks do not > have a lot in common with write barriers. > Maybe the text wanted to give an example for why locks of this rank > could be called at any time - because the lock might be taken as part > of some SATB barrier code? I do not understand why SATB is a bad example. Perhaps you could elaborate? It is specifically the SATB barriers that are the biggest issue for me and what made me want to make this change. It is required for both writes but also for all weak loads. And it is specifically the weak loads that give me the most headache and serves as the main motivator for this. These include resolving jweaks, looking up strings, keeping class holders alive on compiler threads when looking up metadata in ciEnv, and a whole bunch of other stuff. 
The current code for handling weak loads is full of hacks like these: { MutexLockerEx m(...); oop obj = load_weak_oop(...); } keep_alive(obj); return obj; ...where the keep alive barrier required by SATB for weak loads has been moved way out from the critical section (even multiple levels up in the call tree) due to lock ordering problems with G1 SATB barrier code that forbids this barrier while holding certain locks. For the new GC barrier interface that introduces declarative semantics, I need the barriers to be tightly bound to the access, and I need accesses to not be disallowed due to holding other VM locks. We already perform JNIHandles::resolve while holding "special" ranked locks today, and hopefully get away with it by making sure these resolutions can never be passed a jweak disguised as a jobject. But I do not want to require hotspot developers to have to think about whether what they are resolving could be weak and then have to consider that SATB barriers require locks with high ranks, requiring them to rewrite the code. Having said that, of course the post-write barriers are problematic as well, as I want it to be possible to perform stores without having to think about random G1 locks in a similar fashion. Speaking of which, I am entertaining the idea that perhaps the HeapRegionRemSet::_m lock ought to get the new access rank too. It seems to me like it could happen that a JavaThread performing a reference store decides to join in on concurrent refinement and has to take that _m lock when adding a reference to a remembered set. Therefore, this current "leaf" ranked lock could be acquired when performing a store on JavaThreads. The current leaf rank is not conflicting with my currently proposed changes, and I do not require it for refactoring weak loads. The reason is that: 1) Only JavaThreads help out with refinement, and only due to their local queue being full (not when e.g. a card could not be parsed and the shared queue is grabbed). 
2) JavaThreads do not acquire the shared queue lock before calling enqueue on the local queue in their barriers as they use their own local queue instead. 3) Because of 1 and 2, when collaborative concurrent refinement is called from the queue, the shared queue lock is not held. 4) The cbl monitor and free list lock are not held either when concurrent refinement is called from the queue. 5) Due to 3 and 4, no access locks from the queues are held when calling concurrent refinement helper code. Having said that, it would still be good for consistency to move that lock down to access too, so that JavaThread reference stores can be performed more freely in the code. If there are plans of getting rid of that lock from refinement, then I think I can live with the current leaf rank, but if there are no plans of getting rid of that lock from refinement, I think I should probably squeeze that lock order change into this change. Perhaps the rank should be changed meanwhile anyway. Do you agree about this? Thanks, /Erik > Thanks, > Thomas > From erik.osterlund at oracle.com Wed Jul 5 08:53:46 2017 From: erik.osterlund at oracle.com (=?UTF-8?Q?Erik_=c3=96sterlund?=) Date: Wed, 5 Jul 2017 10:53:46 +0200 Subject: RFR (S): 8182703: Correct G1 barrier queue lock orderings In-Reply-To: <1499190215.2423.3.camel@oracle.com> References: <59510D5E.10009@oracle.com> <1499190215.2423.3.camel@oracle.com> Message-ID: <595CA91A.5070901@oracle.com> Hi Thomas, Thanks for the review. On 2017-07-04 19:43, Thomas Schatzl wrote: > Hi, > > On Mon, 2017-06-26 at 15:34 +0200, Erik ?sterlund wrote: >> Hi, >> >> Webrev: http://cr.openjdk.java.net/~eosterlund/8182703/webrev.02/ >> Bug: https://bugs.openjdk.java.net/browse/JDK-8182703 >> > looks good apart from the comment at Monitor::event_types. It now > contradicts itself from one sentence to the next ("special must be > lowest" and then "oh no, after all access must be lowest"). Please try > to find some better wording here :) Agreed. 
Will fix and send out new webrev after I receive a reply to my reply to your other email. That turned into a more complicated sentence than I anticipated. Thanks, /Erik >> The G1 barrier queues have very awkward lock orderings for the >> following reasons: >> > [...] >> I do recognize that long term, we *might* want a lock-free solution >> or something (not saying we do or do not). But until then, the ranks >> ought to be corrected so that they do not cause these problems >> causing everyone to bash their head against the awkward G1 lock ranks >> throughout the code and make hacks around it. >> >> Testing: JPRT with hotspot all and lots of local testing. > Thanks, > Thomas > From erik.osterlund at oracle.com Wed Jul 5 10:24:00 2017 From: erik.osterlund at oracle.com (=?UTF-8?Q?Erik_=c3=96sterlund?=) Date: Wed, 5 Jul 2017 12:24:00 +0200 Subject: RFR (S): 8182703: Correct G1 barrier queue lock orderings In-Reply-To: <25F423D9-F8D5-4E62-8300-CCE106E70777@oracle.com> References: <59510D5E.10009@oracle.com> <25F423D9-F8D5-4E62-8300-CCE106E70777@oracle.com> Message-ID: <595CBE40.5050603@oracle.com> Hi Kim, Thank you for the review. On 2017-07-05 04:00, Kim Barrett wrote: >> On Jun 26, 2017, at 9:34 AM, Erik ?sterlund wrote: >> >> Hi, >> >> Webrev: http://cr.openjdk.java.net/~eosterlund/8182703/webrev.02/ >> Bug: https://bugs.openjdk.java.net/browse/JDK-8182703 > ------------------------------------------------------------------------------ > src/share/vm/gc/g1/ptrQueue.cpp > Removing unlock / relock around > 78 qset()->enqueue_complete_buffer(node); > > I would prefer that this part of this changeset not be made at this > time. > > This part isn't necessary for the main point of this changeset. It's > a cleanup that is enabled by the lock rank changes, where the rank > changes are required for other reasons. Okay. > It also at least conflicts with, and probably breaks, a pending change > of mine. 
(I have a largish stack of patches in this area that didn't > quite make it into JDK 9 before the original FC date, and which I've > been (all too slowly) trying to work my way through and bring into JDK > 10.) I agree that it would be possible to just correct the ranks while allowing the spaghetti synchronization code to remain in the code base. Here are some comments about that to me not so attractive idea: 1) I would really like to get rid of that code, because I think it is poor synchronization practice and its stated reason for existence is gone now. 2) I have to do *something* about that part in the current change, otherwise the comment motivating its existence will be incorrect after my lock rank change. There is no longer a legitimate motivation for doing that unlock and re-lock. So we have the choice of making a new made up motivation why we do this anyway, or to remove it. For me the choice is easily to remove it. 3) If some new actual motivation for dropping that lock arises later on down the road (which I am dubious about), then I do not see an issue with simply re-adding it then, when/if that becomes necessary again, with a new corresponding motivation added in appropriately. As far as your new changes go, I am curious what they do to motivate unlocking/re-locking this shared queue lock again. As outlined in my recent email to Thomas, we do not hold either of these queue locks when concurrent refinement helper code is called from GC barriers invoked from JavaThreads, even with my new changes. If it is in this code path that you will perform more work (just speculating), then that should be invariant of this cleanup. Therefore I would like to know, since I am asked to withdraw the code that cleans up the hacky spaghetti synchronization code, with the motivation that there might be a new reason for doing this later on, that we are at least certain that this unlock/re-lock will for sure be needed then. 
> The RFR says: > >> 2) There is an issue that the shared queue locks have a "special" >> rank, which is below the lock ranks used by the cbl monitor and free >> list monitor. This leads to an issue when these locks have to be taken >> while holding the shared queue locks. The current solution is to drop >> the shared queue locks temporarily, introducing nasty data >> races. These races are guarded, but the whole race seems very >> unnecessary. > This isn't entirely accurate, as the shared queue locks are not > "special" rank; the current lock ranks are described correctly later > in the RFR. Yes you are right. > It's true there is an "interesting" bit of code there to temporarily > drop the shared queue lock. I don't think it's harmful to do so, and > could have some small benefit now. More importantly, one of the > changes in that afore mentioned stack of patches puts more (possibly > significantly more in some cases) work into that dropped-lock region. > And if that idea ultimately doesn't pan out, simply removing the > unlock/relock pair is not, IMO, the right way to clean things up; > there is some additional refactoring that ought to be done. Could you please elaborate why you do not consider removing the unlock/lock due to incorrect lock ranks being the right cleanup after that very incorrect lock rank issue has been resolved? > ------------------------------------------------------------------------------ > > The lock ranking changes look good. Thanks, I am glad we agree about this. 
/Erik From mikael.gerdin at oracle.com Wed Jul 5 11:12:20 2017 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Wed, 5 Jul 2017 13:12:20 +0200 Subject: RFR: 8179268: Factor out AdaptiveSizePolicy from top-level interfaces CollectorPolicy and CollectedHeap In-Reply-To: References: Message-ID: <134884f8-7001-f0ee-9e57-9ec0b2520752@oracle.com> Hi Roman, On 2017-07-04 20:47, Roman Kennke wrote: > AdaptiveSizePolicy is not used/called from outside the GCs, and not all > GCs need them. It makes sense to remove it from the CollectedHeap and > CollectorPolicy interfaces and move them down to the actual subclasses > that used them. > > I moved AdaptiveSizePolicyOutput to parallelScavengeHeap.hpp, it's only > used/implemented in the parallel GC. Also, I made this class AllStatic > (was StackObj) > > Tested by running hotspot_gc jtreg tests without regressions. > > http://cr.openjdk.java.net/~rkennke/8179268/webrev.00/ Please correct me if I'm wrong here but it looks like all the non-G1 collectors set the _should_clear_all_soft_refs based on gc_overhead_limit_near. Perhaps the ClearedAllSoftRefs scoped object could be modified to only work with GenCollectorPolicy derived policies (which include parallel *shrugs*) and G1 should just stop worrying about _all_soft_refs_clear. Looking closer, I can't even find G1 code looking at that member so maybe it, too, should be moved to GenCollectorPolicy? What do you think? 
/Mikael > > > Roman > From erik.osterlund at oracle.com Wed Jul 5 11:39:48 2017 From: erik.osterlund at oracle.com (=?UTF-8?Q?Erik_=c3=96sterlund?=) Date: Wed, 5 Jul 2017 13:39:48 +0200 Subject: RFR (S): 8182703: Correct G1 barrier queue lock orderings In-Reply-To: <595CA91A.5070901@oracle.com> References: <59510D5E.10009@oracle.com> <1499190215.2423.3.camel@oracle.com> <595CA91A.5070901@oracle.com> Message-ID: <595CD004.6000206@oracle.com> Hi, Thomas and I discussed offline and came to the following conclusions: 1) Lowering the lock rank of HeapRegionRemSet::_m to access would be nice indeed, but probably deserves a separate RFE with further reasoning and analysis. Will stick to the queue-related lock ranks in this RFE. 2) We agree mostly about the comments, but I have a new webrev that hopefully has even more clear comments regarding the new access rank. Incremental webrev: http://cr.openjdk.java.net/~eosterlund/8182703/webrev.02_03/ Full webrev: http://cr.openjdk.java.net/~eosterlund/8182703/webrev.03/ Thanks for reviewing, and hope the new comments are satisfactory. /Erik On 2017-07-05 10:53, Erik ?sterlund wrote: > Hi Thomas, > > Thanks for the review. > > On 2017-07-04 19:43, Thomas Schatzl wrote: >> Hi, >> >> On Mon, 2017-06-26 at 15:34 +0200, Erik ?sterlund wrote: >>> Hi, >>> >>> Webrev: http://cr.openjdk.java.net/~eosterlund/8182703/webrev.02/ >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8182703 >>> >> looks good apart from the comment at Monitor::event_types. It now >> contradicts itself from one sentence to the next ("special must be >> lowest" and then "oh no, after all access must be lowest"). Please try >> to find some better wording here :) > > Agreed. Will fix and send out new webrev after I receive a reply to my > reply to your other email. That turned into a more complicated > sentence than I anticipated. > > Thanks, > /Erik > >>> The G1 barrier queues have very awkward lock orderings for the >>> following reasons: >>> >> [...] 
>>> I do recognize that long term, we *might* want a lock-free solution >>> or something (not saying we do or do not). But until then, the ranks >>> ought to be corrected so that they do not cause these problems >>> causing everyone to bash their head against the awkward G1 lock ranks >>> throughout the code and make hacks around it. >>> >>> Testing: JPRT with hotspot all and lots of local testing. >> Thanks, >> Thomas >> > From mikael.gerdin at oracle.com Wed Jul 5 11:58:14 2017 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Wed, 5 Jul 2017 13:58:14 +0200 Subject: RFR: 8179387: Factor out CMS specific code from GenCollectedHeap into its own subclass In-Reply-To: <4d5e6af8-d975-7803-64c5-7295e0d56154@redhat.com> References: <3521009f-6fab-4f8e-2375-b9d665a4c70b@redhat.com> <3d8b55a2-a787-3051-b351-ab9b0a24f5e0@redhat.com> <47e22e86-7d7c-606f-1936-346229f39ca2@oracle.com> <9a846161-c8ac-dedf-5952-f457d546fd9a@redhat.com> <4d5e6af8-d975-7803-64c5-7295e0d56154@redhat.com> Message-ID: Hi Roman, On 2017-07-03 17:05, Roman Kennke wrote: > Am 03.07.2017 um 11:13 schrieb Roman Kennke: >> Am 03.07.2017 um 09:35 schrieb Mikael Gerdin: >>> Hi Roman, >>> >>> On 2017-06-30 18:32, Roman Kennke wrote: >>>> I came across one problem using this approach: We will have 2 instances >>>> of CollectedHeap around, where there's usually only 1, and some code >>>> expects only 1. For example, in CollectedHeap constructor, we create new >>>> PerfData variables, and we now create them 2x, which leads to an assert >>>> being thrown. I suspect there is more code like that. >>>> >>>> I will attempt to refactor this a little more, maybe it's not that bad, >>>> but it's probably not worth spending too much time on it. >>> I think refactoring the code to not expect a singleton CollectedHeap >>> instance is a bit too much. >>> Perhaps there is another way to share common code between Serial and >>> CMS but that might require a bit more thought. >> Yeah, definitely. 
I hit another difficulty: pretty much the same issues >> that I'm having with GenCollectedHeap/CMSHeap/CollectedHeap now show up >> with Generation and its subclasses... >> >> How about we push the original patch that I've posted, and work from >> there? In fact, I *have* found some little things I would change (some >> more if (UseConcMarkSweepGC) branches in GenCollectedHeap that I have >> overlooked in my first pass...) > > So here's the little change (two more places in genCollectedHeap.hpp > where UseConcMarkSweepGC was used to alter behaviour): > > http://cr.openjdk.java.net/~rkennke/8179387/webrev.02/ > > > Ok to push this? I think this looks like a good step in the right direction! One thing I noticed is that you can put "enum GCH_strong_roots_tasks" inside of GenCollectedHeap to avoid tainting the global namespace with the enum members. Just above the declaration of _process_strong_tasks seems like an excellent location for the enum declaration :) This looks like it's not needed anymore. bool CMSHeap::should_do_concurrent_full_gc(GCCause::Cause cause) { if (!UseConcMarkSweepGC) { return false; } /Mikael > > Roman > From thomas.schatzl at oracle.com Wed Jul 5 12:37:26 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Wed, 05 Jul 2017 14:37:26 +0200 Subject: RFR (XS): 8183127: UpdateRSetDeferred should not enqueue cards for NULL references In-Reply-To: <727d03c9-c206-ce87-093c-3eee21a20049@oracle.com> References: <1499181945.2757.20.camel@oracle.com> <727d03c9-c206-ce87-093c-3eee21a20049@oracle.com> Message-ID: <1499258246.15955.3.camel@oracle.com> Hi Mikael, On Wed, 2017-07-05 at 10:30 +0200, Mikael Gerdin wrote: > Hi Thomas, > > On 2017-07-04 17:25, Thomas Schatzl wrote: > > > > Hi, > > > > can I have reviews for this change that adds a NULL-check in the > > UpdateRSetDeferred closure so that we do not enqueue cards with > > NULL > > references in it during evacuation failure? 
> > > > CR: > > https://bugs.openjdk.java.net/browse/JDK-8183127 > > Webrev: > > http://cr.openjdk.java.net/~tschatzl/8183127/webrev/ > Looks good to me. I agree that the amount of testing seems > sufficient. ? thanks for your review. Thomas From daniel.daugherty at oracle.com Wed Jul 5 18:30:37 2017 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Wed, 5 Jul 2017 12:30:37 -0600 Subject: RFR: Parallelize safepoint cleanup In-Reply-To: <61d80e98-275f-b2b8-4ac7-6d5d03b047de@redhat.com> References: <486b5a72-bef8-4ebc-2729-3fe3aa3ab3b9@oracle.com> <5c80f8df-27c9-f9a9-dc6d-47f9c6019a61@redhat.com> <46ad874e-eb41-7927-265a-40dea92dfe1e@oracle.com> <5711258b-99b0-e06f-ba6e-0b6b55d88345@redhat.com> <0e1e2779-9316-b756-6cc8-e0c8add14a94@oracle.com> <1910961c-11bd-0e86-dd03-4fce66b9969f@redhat.com> <2b466176-b688-53a8-bef9-c7ec2c8c745b@oracle.com> <42872a15-d26c-9798-c6a2-f3f7c945baf7@redhat.com> <5e7c7d00-4acd-bea3-3525-33dbd9159efb@oracle.com> <6f2c6de7-298b-bf14-ab1f-430c4acd43c9@redhat.com> <5cd676de-872d-6d4a-691b-da561173f7d0@oracle.com> <61d80e98-275f-b2b8-4ac7-6d5d03b047de@redhat.com> Message-ID: On 6/27/17 1:47 PM, Roman Kennke wrote: > Hi Robbin, > > Ugh. Thanks for catching this. > Problem was that I was accounting the thread-local deflations twice: > once in thread-local processing (basically a leftover from my earlier > attempt to implement this accounting) and then again in > finish_deflate_idle_monitors(). Should be fixed here: > > http://cr.openjdk.java.net/~rkennke/8180932/webrev.09/ > Are you thinking that this fix resolves all three bugs: 8132849 Increased stop time in cleanup phase because of single-threaded walk of thread stacks in NMethodSweeper::mark_active_nmethods() 8153224 Monitor deflation prolong safepoints 8180932 Parallelize safepoint cleanup JDK-8132849 is assigned to Tobias; it would be good to get Tobias' review of this fix also. 
General comments: - Please don't forget to update Copyright years as needed before pushing src/share/vm/gc/shared/collectedHeap.hpp No comments. src/share/vm/runtime/safepoint.hpp L78: enum SafepointCleanupTasks { You might want to add a comment here: // The enums are listed in the order of the tasks when done serially. src/share/vm/runtime/safepoint.cpp L556: ! thread->is_Code_cache_sweeper_thread()) { L581: if (! _subtasks.is_task_claimed(SafepointSynchronize::SAFEPOINT_CLEANUP_DEFLATE_MONITORS)) { L589: if (! _subtasks.is_task_claimed(SafepointSynchronize::SAFEPOINT_CLEANUP_UPDATE_INLINE_CACHES)) { L597: if (! _subtasks.is_task_claimed(SafepointSynchronize::SAFEPOINT_CLEANUP_COMPILATION_POLICY)) { L605: if (! _subtasks.is_task_claimed(SafepointSynchronize::SAFEPOINT_CLEANUP_SYMBOL_TABLE_REHASH)) { L615: if (! _subtasks.is_task_claimed(SafepointSynchronize::SAFEPOINT_CLEANUP_STRING_TABLE_REHASH)) { L625: if (! _subtasks.is_task_claimed(SafepointSynchronize::SAFEPOINT_CLEANUP_CLD_PURGE)) { nit: HotSpot style doesn't usually have a space after unary '!'. L638: // Various cleaning tasks that should be done periodically at safepoints L641: // Prepare for monitor deflation nit: Please add a period to the end of these sentences. src/share/vm/runtime/sweeper.hpp No comments. src/share/vm/runtime/sweeper.cpp L205: // TODO: Is this really needed? L206: OrderAccess::storestore(); That's a good question. 
Looks like that storestore() was added by this changeset: $ hg log -r 5357 src/share/vm/runtime/sweeper.cpp changeset: 5357:510fbd28919c user: anoll date: Fri Sep 27 10:50:55 2013 +0200 summary: 8020151: PSR:PERF Large performance regressions when code cache is filled The changeset is not small and it looks like two OrderAccess::storestore() calls were added (and one load_ptr_acquire() was deleted): $ hg diff -r 5356 -r 5357 | grep OrderAccess + OrderAccess::storestore(); - nmethod *code = (nmethod *)OrderAccess::load_ptr_acquire(&_code); + OrderAccess::storestore(); It could be that the storestore() is matching an existing OrderAccess operation or it could have been added in an abundance of caution. We definitely need a Compiler team person to take a look here. src/share/vm/runtime/synchronizer.hpp L36: int nInuse; // currently associated with objects L37: int nInCirculation; // extant L38: int nScavenged; // reclaimed nit: Please add one more space before '//' on L36,L37. src/share/vm/runtime/synchronizer.cpp L1663: // Walk a given monitor list, and deflate idle monitors L1664: // The given list could be a per-thread list or a global list L1665: // Caller acquires gListLock L1666: int ObjectSynchronizer::deflate_monitor_list(ObjectMonitor** listHeadp, L1802: int deflated_count = deflate_monitor_list(thread->omInUseList_addr(), &freeHeadp, &freeTailp); L1804: Thread::muxAcquire(&gListLock, "scavenge - return"); The above deflate_monitor_list() now occurs outside of the gListLock where the old code held the gListLock for this call. Yes, it is operating on the thread local list, but what keeps two different worker threads from trying to deflate_monitor_list() on the same JavaThread at the same time? Update: OK, so it looks like when we're doing parallel cleanup, each worker thread cleans up thread local monitors for the JavaThreads. I don't know this WorkGang stuff, but are these distinct threads from the JavaThreads? 
Or is each JavaThread "borrowed" to do its down monitor cleanup while we're at the safepoint? (How in the world would that idea work? Maybe I need more coffee here...) Without the gListLock, I don't see how the worker threads avoid conflicting over the same thread local list. Minimally, the comment on L1665 needs updating. L1697: counters->nInuse = 0; // currently associated with objects L1698: counters->nInCirculation = 0; // extant L1699: counters->nScavenged = 0; // reclaimed nit: Please add one more space before '//' on L1697, L1698. old L1698: int nInuse = 0; old L1713: int inUse = 0; Nice catch here. I've read this code countless times and missed this bug until now. It explains why some of my Java monitor testing had odd "in use" counts. L1797: if (! MonitorInUseLists) return; nit: HotSpot style doesn't usually have a space after unary '!'. L1808: thread->omInUseCount-= deflated_count; nit: Please add a space before '-='. src/share/vm/runtime/thread.hpp No comments. src/share/vm/runtime/thread.cpp No comments. This is very nice work and a great cleanup for a complicated part of the system. David Simms did some recent work on the MonitorInUseLists stuff. If he has time, it might be good for him to take a quick look at this changeset, but I don't know his summer vacation schedule so that may not be possible. The only comment I need resolved is about the locking for the thread local deflate_monitor_list() call. Everything else is minor. Dan > > Side question: which jtreg targets do you usually run? > > Trying: make test TEST=hotspot_all > gives me *lots* of failures due to missing jcstress stuff (?!) > And even other subsets seem to depend on several bits and pieces that I > have no idea about. 
> > Roman > > Am 27.06.2017 um 16:51 schrieb Robbin Ehn: >> Hi Roman, >> >> There is something wrong in calculations: >> INFO: Deflate: InCirc=43 InUse=18 Scavenged=25 ForceMonitorScavenge=0 >> : pop=27051 free=215487 >> >> free is larger than population, have not had the time to dig into this. >> >> Thanks, Robbin >> >> On 06/22/2017 10:19 PM, Roman Kennke wrote: >>> So here's the latest iteration of that patch: >>> >>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.08/ >>> >>> >>> I checked and fixed all the counters. The problem here is that they are >>> not updated in a single place (deflate_idle_monitors() ) but in several >>> places, potentially by multiple threads. I split up deflation into >>> prepare_.. and a finish_.. methods to initialize local and update global >>> counters respectively, and pass around a counters object (allocated on >>> stack) to the various code paths that use it. Updating the counters >>> always happen under a lock, there's no need to do anything special with >>> regards to concurrency. >>> >>> I also checked the nmethod marking, but there doesn't seem to be >>> anything in that code that looks problematic under concurrency. The >>> worst that can happen is that two threads write the same value into an >>> nmethod field. I think we can live with that ;-) >>> >>> Good to go? >>> >>> Tested by running specjvm and jcstress fastdebug+release without issues. >>> >>> Roman >>> >>> Am 02.06.2017 um 12:39 schrieb Robbin Ehn: >>>> Hi Roman, >>>> >>>> On 06/02/2017 11:41 AM, Roman Kennke wrote: >>>>> Hi David, >>>>> thanks for reviewing. I'll be on vacation the next two weeks too, with >>>>> only sporadic access to work stuff. >>>>> Yes, exposure will not be as good as otherwise, but it's not totally >>>>> untested either: the serial code path is the same as the parallel, the >>>>> only difference is that it's not actually called by multiple threads. >>>>> It's ok I think. 
>>>>> >>>>> I found two more issues that I think should be addressed: >>>>> - There are some counters in deflate_idle_monitors() and I'm not >>>>> sure I >>>>> correctly handle them in the split-up and MT'ed thread-local/ global >>>>> list deflation >>>>> - nmethod marking seems to unconditionally poke true or something like >>>>> that in nmethod fields. This doesn't hurt correctness-wise, but it's >>>>> probably worth checking if it's already true, especially when doing >>>>> this >>>>> with multiple threads concurrently. >>>>> >>>>> I'll send an updated patch around later, I hope I can get to it >>>>> today... >>>> I'll review that when you get it out. >>>> I think this looks as a reasonable step before we tackle this with a >>>> major effort, such as the JEP you and Carsten doing. >>>> And another effort to 'fix' nmethods marking. >>>> >>>> Internal discussion yesterday lead us to conclude that the runtime >>>> will probably need more threads. >>>> This would be a good driver to do a 'global' worker pool which serves >>>> both gc, runtime and safepoints with threads. >>>> >>>>> Roman >>>>> >>>>>> Hi Roman, >>>>>> >>>>>> I am about to disappear on an extended vacation so will let others >>>>>> pursue this. IIUC this is longer an opt-in by the user at runtime, >>>>>> but >>>>>> an opt-in by the particular GC developers. Okay. My only concern with >>>>>> that is if Shenandoah is the only GC that currently opts in then this >>>>>> code is not going to get much testing and will be more prone to >>>>>> incidental breakage. >>>> As I mentioned before, it seem like Erik ? have some idea, maybe he >>>> can do this after his barrier patch. >>>> >>>> Thanks! 
>>>> >>>> /Robbin >>>> >>>>>> Cheers, >>>>>> David >>>>>> >>>>>> On 2/06/2017 2:21 AM, Roman Kennke wrote: >>>>>>> Am 01.06.2017 um 17:50 schrieb Roman Kennke: >>>>>>>> Am 01.06.2017 um 14:18 schrieb Robbin Ehn: >>>>>>>>> Hi Roman, >>>>>>>>> >>>>>>>>> On 06/01/2017 11:29 AM, Roman Kennke wrote: >>>>>>>>>> Am 31.05.2017 um 22:06 schrieb Robbin Ehn: >>>>>>>>>>> Hi Roman, I agree that is really needed but: >>>>>>>>>>> >>>>>>>>>>> On 05/31/2017 10:27 AM, Roman Kennke wrote: >>>>>>>>>>>> I realized that sharing workers with GC is not so easy. >>>>>>>>>>>> >>>>>>>>>>>> We need to be able to use the workers at a safepoint during >>>>>>>>>>>> concurrent >>>>>>>>>>>> GC work (which also uses the same workers). This does not only >>>>>>>>>>>> require >>>>>>>>>>>> that those workers be suspended, like e.g. >>>>>>>>>>>> SuspendibleThreadSet::yield(), but they need to be idle, i.e. >>>>>>>>>>>> have >>>>>>>>>>>> finished their tasks. This needs some careful handling to work >>>>>>>>>>>> without >>>>>>>>>>>> races: it requires a SuspendibleThreadSetJoiner around the >>>>>>>>>>>> corresponding >>>>>>>>>>>> run_task() call and also the tasks themselves need to join the >>>>>>>>>>>> STS and >>>>>>>>>>>> handle requests for safepoints not by yielding, but by leaving >>>>>>>>>>>> the >>>>>>>>>>>> task. >>>>>>>>>>>> This is far too peculiar for me to make the call to hook up GC >>>>>>>>>>>> workers >>>>>>>>>>>> for safepoint cleanup, and I thus removed those parts. I >>>>>>>>>>>> left the >>>>>>>>>>>> API in >>>>>>>>>>>> CollectedHeap in place. I think GC devs who know better >>>>>>>>>>>> about G1 >>>>>>>>>>>> and CMS >>>>>>>>>>>> should make that call, or else just use a separate thread pool. >>>>>>>>>>>> >>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.05/ >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Is it ok now? 
>>>>>>>>>>> I still think you should put the "Parallel Safepoint Cleanup" >>>>>>>>>>> workers >>>>>>>>>>> inside Shenandoah, >>>>>>>>>>> so the SafepointSynchronizer only calls get_safepoint_workers, >>>>>>>>>>> e.g.: >>>>>>>>>>> >>>>>>>>>>> _cleanup_workers = heap->get_safepoint_workers(); >>>>>>>>>>> _num_cleanup_workers = _cleanup_workers != NULL ? >>>>>>>>>>> _cleanup_workers->total_workers() : 1; >>>>>>>>>>> ParallelSPCleanupTask cleanup(_cleanup_subtasks); >>>>>>>>>>> StrongRootsScope srs(_num_cleanup_workers); >>>>>>>>>>> if (_cleanup_workers != NULL) { >>>>>>>>>>> _cleanup_workers->run_task(&cleanup, >>>>>>>>>>> _num_cleanup_workers); >>>>>>>>>>> } else { >>>>>>>>>>> cleanup.work(0); >>>>>>>>>>> } >>>>>>>>>>> >>>>>>>>>>> That way you don't even need your new flags, but it will be >>>>>>>>>>> up to >>>>>>>>>>> the >>>>>>>>>>> other GCs to make their worker available >>>>>>>>>>> or cheat with a separate workgang. >>>>>>>>>> I can do that, I don't mind. The question is, do we want that? >>>>>>>>> The problem is that we do not want to haste such decision, we >>>>>>>>> believe >>>>>>>>> there is a better solution. >>>>>>>>> I think you also would want another solution. >>>>>>>>> But it's seems like such solution with 1 'global' thread pool >>>>>>>>> either >>>>>>>>> own by GC or the VM it self is quite the undertaking. >>>>>>>>> Since this probably will not be done any time soon my >>>>>>>>> suggestion is, >>>>>>>>> to not hold you back (we also want this), just to make >>>>>>>>> the code parallel and as an intermediate step ask the GC if it >>>>>>>>> minds >>>>>>>>> sharing it's thread. >>>>>>>>> >>>>>>>>> Now when Shenandoah is merged it's possible that e.g. G1 will >>>>>>>>> share >>>>>>>>> the code for a separate thread pool, do something of it's own or >>>>>>>>> wait until the bigger question about thread pool(s) have been >>>>>>>>> resolved. 
>>>>>>>>> >>>>>>>>> By adding a thread pool directly to the SafepointSynchronizer and >>>>>>>>> flags for it we might limit our future options. >>>>>>>>> >>>>>>>>>> I wouldn't call it 'cheating with a separate workgang' though. I >>>>>>>>>> see >>>>>>>>>> that both G1 and CMS suspend their worker threads at a safepoint. >>>>>>>>>> However: >>>>>>>>> Yes it's not cheating but I want decent heuristics between e.g. >>>>>>>>> number >>>>>>>>> of concurrent marking threads and parallel safepoint threads since >>>>>>>>> they compete for cpu time. >>>>>>>>> As the code looks now, I think that decisions must be made by the >>>>>>>>> GC. >>>>>>>> Ok, I see your point. I updated the proposed patch accordingly: >>>>>>>> >>>>>>>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.06/ >>>>>>>> >>>>>>> Oops. Minor mistake there. Correction: >>>>>>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.07/ >>>>>>> >>>>>>> >>>>>>> (Removed 'class WorkGang' from safepoint.hpp, and forgot to add it >>>>>>> into >>>>>>> collectedHeap.hpp, resulting in build failure...) >>>>>>> >>>>>>> Roman >>>>>>> > From rkennke at redhat.com Wed Jul 5 21:17:51 2017 From: rkennke at redhat.com (Roman Kennke) Date: Wed, 5 Jul 2017 23:17:51 +0200 Subject: RFR: Parallelize safepoint cleanup In-Reply-To: References: <486b5a72-bef8-4ebc-2729-3fe3aa3ab3b9@oracle.com> <5c80f8df-27c9-f9a9-dc6d-47f9c6019a61@redhat.com> <46ad874e-eb41-7927-265a-40dea92dfe1e@oracle.com> <5711258b-99b0-e06f-ba6e-0b6b55d88345@redhat.com> <0e1e2779-9316-b756-6cc8-e0c8add14a94@oracle.com> <1910961c-11bd-0e86-dd03-4fce66b9969f@redhat.com> <2b466176-b688-53a8-bef9-c7ec2c8c745b@oracle.com> <42872a15-d26c-9798-c6a2-f3f7c945baf7@redhat.com> <5e7c7d00-4acd-bea3-3525-33dbd9159efb@oracle.com> <6f2c6de7-298b-bf14-ab1f-430c4acd43c9@redhat.com> <5cd676de-872d-6d4a-691b-da561173f7d0@oracle.com> <61d80e98-275f-b2b8-4ac7-6d5d03b047de@redhat.com> Message-ID: <8def1665-1fb3-c7a2-bc0d-0b63601a0c56@redhat.com> Am 05.07.2017 um 20:30 schrieb Daniel D. 
Daugherty: > On 6/27/17 1:47 PM, Roman Kennke wrote: >> Hi Robbin, >> >> Ugh. Thanks for catching this. >> Problem was that I was accounting the thread-local deflations twice: >> once in thread-local processing (basically a leftover from my earlier >> attempt to implement this accounting) and then again in >> finish_deflate_idle_monitors(). Should be fixed here: >> >> http://cr.openjdk.java.net/~rkennke/8180932/webrev.09/ >> > > Are you thinking that this fix resolves all three bugs: > > 8132849 Increased stop time in cleanup phase because of > single-threaded > walk of thread stacks in > NMethodSweeper::mark_active_nmethods() Yes. It requires additional support code by a GC though to become actually multithreaded. > 8153224 Monitor deflation prolong safepoints Yes. But there's more that we want to do: - deflate monitors during GC thread scanning (this is a huge winner) - ultimately, deflate monitors concurrently (a JEP is on the way to address this) > 8180932 Parallelize safepoint cleanup Yes :-) > JDK-8132849 is assigned to Tobias; it would be good to get Tobias' > review of this fix also. Ok, I will reach out to him. > General comments: > - Please don't forget to update Copyright years as needed before > pushing Fixed. > > src/share/vm/runtime/safepoint.hpp > L78: enum SafepointCleanupTasks { > You might want to add a comment here: > // The enums are listed in the order of the tasks when > done serially. Good idea. Done. > src/share/vm/runtime/safepoint.cpp > L556: ! thread->is_Code_cache_sweeper_thread()) { > L581: if (! > _subtasks.is_task_claimed(SafepointSynchronize::SAFEPOINT_CLEANUP_DEFLATE_MONITORS)) > { > L589: if (! > _subtasks.is_task_claimed(SafepointSynchronize::SAFEPOINT_CLEANUP_UPDATE_INLINE_CACHES)) > { > L597: if (! > _subtasks.is_task_claimed(SafepointSynchronize::SAFEPOINT_CLEANUP_COMPILATION_POLICY)) > { > L605: if (! > _subtasks.is_task_claimed(SafepointSynchronize::SAFEPOINT_CLEANUP_SYMBOL_TABLE_REHASH)) > { > L615: if (! 
> _subtasks.is_task_claimed(SafepointSynchronize::SAFEPOINT_CLEANUP_STRING_TABLE_REHASH)) > { > L625: if (! > _subtasks.is_task_claimed(SafepointSynchronize::SAFEPOINT_CLEANUP_CLD_PURGE)) > { > nit: HotSpot style doesn't usually have a space after unary '!'. Ok, thanks! I didn't know that. Is there a document that describes the Hotspot style? Because, from the top of my head, I can name 3 source files all in entirely different styles ;-) > > L638: // Various cleaning tasks that should be done periodically > at safepoints > L641: // Prepare for monitor deflation > nit: Please add a period to the end of these sentences. > Done. > src/share/vm/runtime/sweeper.cpp > L205: // TODO: Is this really needed? > L206: OrderAccess::storestore(); > That's a good question. Looks like that storestore() was > added by this changeset: > > $ hg log -r 5357 src/share/vm/runtime/sweeper.cpp > changeset: 5357:510fbd28919c > user: anoll > date: Fri Sep 27 10:50:55 2013 +0200 > summary: 8020151: PSR:PERF Large performance regressions > when code cache is filled > > The changeset is not small and it looks like two > OrderAccess::storestore() calls were added (and one > load_ptr_acquire() was deleted): > > $ hg diff -r 5356 -r 5357 | grep OrderAccess > + OrderAccess::storestore(); > - nmethod *code = (nmethod > *)OrderAccess::load_ptr_acquire(&_code); > + OrderAccess::storestore(); > > It could be that the storestore() is matching an existing > OrderAccess operation or it could have been added in an > abundance of caution. We definitely need a Compiler team > person to take a look here. I looked around a little bit. As far as I can tell, all compiler threads are stopped at a safepoint there. And I don't see anything else that uses the affected fields during the safepoint. There's a fence() before resuming safepointed threads. I think it should be safe without storestore(), but would like to get confirmation from compiler team too. 
> src/share/vm/runtime/synchronizer.hpp > L36: int nInuse; // currently associated with objects > L37: int nInCirculation; // extant > L38: int nScavenged; // reclaimed > nit: Please add one more space before '//' on L36,L37. Oops. Done. > src/share/vm/runtime/synchronizer.cpp > L1663: // Walk a given monitor list, and deflate idle monitors > L1664: // The given list could be a per-thread list or a global list > L1665: // Caller acquires gListLock > L1666: int > ObjectSynchronizer::deflate_monitor_list(ObjectMonitor** listHeadp, > > L1802: int deflated_count = > deflate_monitor_list(thread->omInUseList_addr(), &freeHeadp, &freeTailp); > L1804: Thread::muxAcquire(&gListLock, "scavenge - return"); > The above deflate_monitor_list() now occurs outside of the > gListLock where the old code held the gListLock for this call. > > Yes, it is operating on the thread local list, but what keeps > two different worker threads from trying to > deflate_monitor_list() > on the same JavaThread at the same time? The mechanics in Threads::parallel_java_threads_do() (which I adapted from Threads::possibly_parallel_oops_do()) ensure that each worker thread claims a Java thread before processing it. This ensures that each Java thread is processed by exactly one worker thread. > Without the gListLock, I don't see how the worker threads > avoid conflicting over the same thread local list. Minimally, > the comment on L1665 needs updating. Okidoki, I added those blocks there: // In the case of parallel processing of thread local monitor lists, // work is done by Threads::parallel_threads_do() which ensures that // each Java thread is processed by exactly one worker thread, and // thus avoid conflicts that would arise when worker threads would // process the same monitor lists concurrently. // // See also ParallelSPCleanupTask and // SafepointSynchronizer::do_cleanup_tasks() in safepoint.cpp and // Threads::parallel_java_threads_do() in thread.cpp. 
> > L1697: counters->nInuse = 0; // currently associated > with objects > L1698: counters->nInCirculation = 0; // extant > L1699: counters->nScavenged = 0; // reclaimed > nit: Please add one more space before '//' on L1697, L1698. Done. > old L1698: int nInuse = 0; > old L1713: int inUse = 0; > Nice catch here. I've read this code countless times and missed > this bug until now. It explains why some of my Java monitor > testing > had odd "in use" counts. Hmm. I am not aware of a bug there. the inUse declaration was unused, that is all (I think..) > L1797: if (! MonitorInUseLists) return; > nit: HotSpot style doesn't usually have a space after unary '!'. Done. > L1808: thread->omInUseCount-= deflated_count; > nit: Please add a space before '-='. Done. Also some lines up: gOmInUseCount-= deflated_count; > The only comment I need resolved is about the locking for the thread > local deflate_monitor_list() call. Everything else is minor. Thanks so much for the thorough review! So here's revision #11: http://cr.openjdk.java.net/~rkennke/8180932/webrev.10/ Roman -------------- next part -------------- An HTML attachment was scrubbed... URL: From daniel.daugherty at oracle.com Wed Jul 5 23:30:42 2017 From: daniel.daugherty at oracle.com (Daniel D. 
Daugherty) Date: Wed, 5 Jul 2017 17:30:42 -0600 Subject: RFR: Parallelize safepoint cleanup In-Reply-To: <8def1665-1fb3-c7a2-bc0d-0b63601a0c56@redhat.com> References: <486b5a72-bef8-4ebc-2729-3fe3aa3ab3b9@oracle.com> <5c80f8df-27c9-f9a9-dc6d-47f9c6019a61@redhat.com> <46ad874e-eb41-7927-265a-40dea92dfe1e@oracle.com> <5711258b-99b0-e06f-ba6e-0b6b55d88345@redhat.com> <0e1e2779-9316-b756-6cc8-e0c8add14a94@oracle.com> <1910961c-11bd-0e86-dd03-4fce66b9969f@redhat.com> <2b466176-b688-53a8-bef9-c7ec2c8c745b@oracle.com> <42872a15-d26c-9798-c6a2-f3f7c945baf7@redhat.com> <5e7c7d00-4acd-bea3-3525-33dbd9159efb@oracle.com> <6f2c6de7-298b-bf14-ab1f-430c4acd43c9@redhat.com> <5cd676de-872d-6d4a-691b-da561173f7d0@oracle.com> <61d80e98-275f-b2b8-4ac7-6d5d03b047de@redhat.com> <8def1665-1fb3-c7a2-bc0d-0b63601a0c56@redhat.com> Message-ID: <2770cc80-3dfe-4c0b-7e64-36778d82fbae@oracle.com> On 7/5/17 3:17 PM, Roman Kennke wrote: > Am 05.07.2017 um 20:30 schrieb Daniel D. Daugherty: >> On 6/27/17 1:47 PM, Roman Kennke wrote: >>> Hi Robbin, >>> >>> Ugh. Thanks for catching this. >>> Problem was that I was accounting the thread-local deflations twice: >>> once in thread-local processing (basically a leftover from my earlier >>> attempt to implement this accounting) and then again in >>> finish_deflate_idle_monitors(). Should be fixed here: >>> >>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.09/ >>> >> >> Are you thinking that this fix resolves all three bugs: >> >> 8132849 Increased stop time in cleanup phase because of >> single-threaded >> walk of thread stacks in >> NMethodSweeper::mark_active_nmethods() > Yes. It requires additional support code by a GC though to become > actually multithreaded. >> 8153224 Monitor deflation prolong safepoints > Yes. 
But there's more that we want to do: > - deflate monitors during GC thread scanning (this is a huge winner) > - ultimately, deflate monitors concurrently (a JEP is on the way to > address this) > >> 8180932 Parallelize safepoint cleanup > Yes :-) > >> JDK-8132849 is assigned to Tobias; it would be good to get Tobias' >> review of this fix also. > Ok, I will reach out to him. > >> General comments: >> - Please don't forget to update Copyright years as needed before >> pushing > Fixed. >> >> src/share/vm/runtime/safepoint.hpp >> L78: enum SafepointCleanupTasks { >> You might want to add a comment here: >> // The enums are listed in the order of the tasks when >> done serially. > Good idea. Done. >> src/share/vm/runtime/safepoint.cpp >> L556: ! thread->is_Code_cache_sweeper_thread()) { >> L581: if (! >> _subtasks.is_task_claimed(SafepointSynchronize::SAFEPOINT_CLEANUP_DEFLATE_MONITORS)) >> { >> L589: if (! >> _subtasks.is_task_claimed(SafepointSynchronize::SAFEPOINT_CLEANUP_UPDATE_INLINE_CACHES)) >> { >> L597: if (! >> _subtasks.is_task_claimed(SafepointSynchronize::SAFEPOINT_CLEANUP_COMPILATION_POLICY)) >> { >> L605: if (! >> _subtasks.is_task_claimed(SafepointSynchronize::SAFEPOINT_CLEANUP_SYMBOL_TABLE_REHASH)) >> { >> L615: if (! >> _subtasks.is_task_claimed(SafepointSynchronize::SAFEPOINT_CLEANUP_STRING_TABLE_REHASH)) >> { >> L625: if (! >> _subtasks.is_task_claimed(SafepointSynchronize::SAFEPOINT_CLEANUP_CLD_PURGE)) >> { >> nit: HotSpot style doesn't usually have a space after unary '!'. > Ok, thanks! I didn't know that. Is there a document that describes the > Hotspot style? There is such a document: https://wiki.openjdk.java.net/display/HotSpot/StyleGuide I believe John Rose is the usual maintainer of the doc... > Because, from the top of my head, I can name 3 source files all in > entirely different styles ;-) True, very true... unfortunately. I don't know if John's doc mentions it, but a general rule is to follow the prevailing style in the file. 
Sometime this is impossible because sometimes we see multiple styles in the same file (and we pull our hair out)... >> >> L638: // Various cleaning tasks that should be done periodically >> at safepoints >> L641: // Prepare for monitor deflation >> nit: Please add a period to the end of these sentences. >> > Done. >> src/share/vm/runtime/sweeper.cpp >> L205: // TODO: Is this really needed? >> L206: OrderAccess::storestore(); >> That's a good question. Looks like that storestore() was >> added by this changeset: >> >> $ hg log -r 5357 src/share/vm/runtime/sweeper.cpp >> changeset: 5357:510fbd28919c >> user: anoll >> date: Fri Sep 27 10:50:55 2013 +0200 >> summary: 8020151: PSR:PERF Large performance regressions >> when code cache is filled >> >> The changeset is not small and it looks like two >> OrderAccess::storestore() calls were added (and one >> load_ptr_acquire() was deleted): >> >> $ hg diff -r 5356 -r 5357 | grep OrderAccess >> + OrderAccess::storestore(); >> - nmethod *code = (nmethod >> *)OrderAccess::load_ptr_acquire(&_code); >> + OrderAccess::storestore(); >> >> It could be that the storestore() is matching an existing >> OrderAccess operation or it could have been added in an >> abundance of caution. We definitely need a Compiler team >> person to take a look here. > I looked around a little bit. As far as I can tell, all compiler > threads are stopped at a safepoint there. And I don't see anything > else that uses the affected fields during the safepoint. There's a > fence() before resuming safepointed threads. I think it should be safe > without storestore(), but would like to get confirmation from compiler > team too. Good idea! :-) >> src/share/vm/runtime/synchronizer.hpp >> L36: int nInuse; // currently associated with objects >> L37: int nInCirculation; // extant >> L38: int nScavenged; // reclaimed >> nit: Please add one more space before '//' on L36,L37. > Oops. Done. 
>> src/share/vm/runtime/synchronizer.cpp >> L1663: // Walk a given monitor list, and deflate idle monitors >> L1664: // The given list could be a per-thread list or a global list >> L1665: // Caller acquires gListLock >> L1666: int >> ObjectSynchronizer::deflate_monitor_list(ObjectMonitor** listHeadp, >> >> L1802: int deflated_count = >> deflate_monitor_list(thread->omInUseList_addr(), &freeHeadp, >> &freeTailp); >> L1804: Thread::muxAcquire(&gListLock, "scavenge - return"); >> The above deflate_monitor_list() now occurs outside of the >> gListLock where the old code held the gListLock for this call. >> >> Yes, it is operating on the thread local list, but what keeps >> two different worker threads from trying to >> deflate_monitor_list() >> on the same JavaThread at the same time? > The mechanics in Threads::parallel_java_threads_do() (which I adapted > from Threads::possibly_parallel_oops_do()) ensure that each worker > thread claims a Java thread before processing it. This ensures that > each Java thread is processed by exactly one worker thread. Cool. No race there. >> Without the gListLock, I don't see how the worker threads >> avoid conflicting over the same thread local list. Minimally, >> the comment on L1665 needs updating. > Okidoki, I added those blocks there: > > // In the case of parallel processing of thread local monitor lists, > // work is done by Threads::parallel_threads_do() which ensures that > // each Java thread is processed by exactly one worker thread, and > // thus avoid conflicts that would arise when worker threads would > // process the same monitor lists concurrently. > // > // See also ParallelSPCleanupTask and > // SafepointSynchronizer::do_cleanup_tasks() in safepoint.cpp and > // Threads::parallel_java_threads_do() in thread.cpp. I like the comment. (Others may find it wordy, but my comments are often thought to be wordy...) 
> >> L1697: counters->nInuse = 0; // currently associated >> with objects >> L1698: counters->nInCirculation = 0; // extant >> L1699: counters->nScavenged = 0; // reclaimed >> nit: Please add one more space before '//' on L1697, L1698. > Done. >> old L1698: int nInuse = 0; >> old L1713: int inUse = 0; >> Nice catch here. I've read this code countless times and missed >> this bug until now. It explains why some of my Java monitor >> testing >> had odd "in use" counts. > Hmm. I am not aware of a bug there. the inUse declaration was unused, > that is all (I think..) You would have thought that when I pasted the two lines into the comment, I would have noticed the difference in the names... sigh... >> L1797: if (! MonitorInUseLists) return; >> nit: HotSpot style doesn't usually have a space after unary '!'. > Done. >> L1808: thread->omInUseCount-= deflated_count; >> nit: Please add a space before '-='. > Done. Also some lines up: > > gOmInUseCount-= deflated_count; > >> The only comment I need resolved is about the locking for the thread >> local deflate_monitor_list() call. Everything else is minor. > > Thanks so much for the thorough review! > > So here's revision #11: > > http://cr.openjdk.java.net/~rkennke/8180932/webrev.10/ > > > Roman src/share/vm/runtime/synchronizer.cpp L1664: // Caller acquires gListLock. The new stuff you added below the existing comment is fine. However, that existing comment is still wrong because the caller doesn't always acquire gListLock. Perhaps: // Caller acquires gListLock when operating on a global list. Thanks for making the changes. Thumbs up! Dan -------------- next part -------------- An HTML attachment was scrubbed...
URL: From erik.helin at oracle.com Thu Jul 6 08:06:25 2017 From: erik.helin at oracle.com (Erik Helin) Date: Thu, 6 Jul 2017 10:06:25 +0200 Subject: RFR (S): 8179679: Rearrange filters before card scanning In-Reply-To: <1499182953.2757.21.camel@oracle.com> References: <1499081093.2802.30.camel@oracle.com> <1499182953.2757.21.camel@oracle.com> Message-ID: <51aa3d5a-70d9-da28-12cd-2a05e949c4f9@oracle.com> Hi Thomas, looks good to me, Reviewed. Thanks, Erik On 07/04/2017 05:42 PM, Thomas Schatzl wrote: > Hi all, > > On Mon, 2017-07-03 at 13:24 +0200, Thomas Schatzl wrote: >> Hi all, >> >> please have a look at this change that rearranges the checks in the >> G1RemSet card scanning a bit in order to: >> > > Erik had a look at this change with the following comments: > > - rename card_region_idx -> region_idx_for_card > - factor out the two calls to claim a card and dirty its region into a > method > - move calculation of "card_region" into the scan_card() method. > - he pointed out that the change can use G1CollectedHeap::region_at() > instead of G1CollectedHeap::heap_region_containing() as it is simpler. > - there has been another comment on why the change claims the card > after checking whether the card is within the region's boundaries, and > if that wouldn't be better performed right after the is_claimed check. > > Doing so will claim cards originating from stray remembered set entries > into the current survivor regions as claimed, since we do not clear > these regions later again (see G1ClearCardTableTask::work()) - their > cards need to be "Young", and this is done during allocation of the > region. > > This results in the card table verification to fail later. > > I think if we should think of changing the handling of survivor regions > during the clear CT phase as part of a different CR. For now I added a > comment. 
> > Webrev: > http://cr.openjdk.java.net/~tschatzl/8179679/webrev.1_to_2 (diff) > http://cr.openjdk.java.net/~tschatzl/8179679/webrev.2 (full) > Testing: > gcbasher > > Thanks, > Thomas > From erik.helin at oracle.com Thu Jul 6 08:12:26 2017 From: erik.helin at oracle.com (Erik Helin) Date: Thu, 6 Jul 2017 10:12:26 +0200 Subject: RFR (XS): 8183127: UpdateRSetDeferred should not enqueue cards for NULL references In-Reply-To: <1499181945.2757.20.camel@oracle.com> References: <1499181945.2757.20.camel@oracle.com> Message-ID: On 07/04/2017 05:25 PM, Thomas Schatzl wrote: > Hi, > > can I have reviews for this change that adds a NULL-check in the > UpdateRSetDeferred closure so that we do not enqueue cards with NULL > references in it during evacuation failure? > > CR: > https://bugs.openjdk.java.net/browse/JDK-8183127 > Webrev: > http://cr.openjdk.java.net/~tschatzl/8183127/webrev/ Looks good, Reviewed. Thanks, Erik > Testing: > jprt, gcbasher with G1EvacuationFailureALot for 1/2 hour > > I think this amount of testing is sufficient as the reasoning for this > change is not *that* complicated. > > Thanks, > Thomas > From erik.helin at oracle.com Thu Jul 6 08:20:42 2017 From: erik.helin at oracle.com (Erik Helin) Date: Thu, 6 Jul 2017 10:20:42 +0200 Subject: RFR (XS): 8183397: Ensure consistent closure filtering during evacuation In-Reply-To: <1499081088.2802.29.camel@oracle.com> References: <1499081088.2802.29.camel@oracle.com> Message-ID: <64943738-9d9f-0d88-b44e-9a9ec0812f33@oracle.com> On 07/03/2017 01:24 PM, Thomas Schatzl wrote: > Hi all, Hi Thomas, > can I have reviews for this change that fixes an observation that has > been made recently by Erik, i.e. that the "else" part of several > evacuation closures inconsistently filters out non-cross-region > references before checking whether the referenced object is a humongous > or ext region. > > This causes somewhat hard to diagnose performance issues, and earlier > filtering does not hurt if done anyway. 
> > (Note that the current way of checking in all but the UpdateRS closure > using HeapRegion::is_in_same_region() seems optimal. The only reason > why the other way in the UpdateRS closure is better because the code > needs the "to" HeapRegion pointer anyway) > > CR: > https://bugs.openjdk.java.net/browse/JDK-8183397 > Webrev: > http://cr.openjdk.java.net/~tschatzl/8183397/webrev/ - } else if (in_cset_state.is_humongous()) { + } else { + if (in_cset_state.is_humongous()) { Why change `else if` to `else { if (...) {` here? Does it result in the compiler generating faster code for this case? Thanks, Erik > Testing: > jprt, performance regression analysis > > Thanks, > Thomas > From thomas.schatzl at oracle.com Thu Jul 6 08:28:21 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Thu, 06 Jul 2017 10:28:21 +0200 Subject: RFR (XS): 8183397: Ensure consistent closure filtering during evacuation In-Reply-To: <64943738-9d9f-0d88-b44e-9a9ec0812f33@oracle.com> References: <1499081088.2802.29.camel@oracle.com> <64943738-9d9f-0d88-b44e-9a9ec0812f33@oracle.com> Message-ID: <1499329701.2760.3.camel@oracle.com> Hi Erik, On Thu, 2017-07-06 at 10:20 +0200, Erik Helin wrote: > On 07/03/2017 01:24 PM, Thomas Schatzl wrote: > > > > Hi all, > Hi Thomas, > > > > > ? can I have reviews for this change that fixes an observation that > > has > > been made recently by Erik, i.e. that the "else" part of several > > evacuation closures inconsistently filters out non-cross-region > > references before checking whether the referenced object is a > > humongous > > or ext region. > > > > This causes somewhat hard to diagnose performance issues, and > > earlier > > filtering does not hurt if done anyway. > > > > (Note that the current way of checking in all but the UpdateRS > > closure > > using HeapRegion::is_in_same_region() seems optimal. 
The only > > reason > > why the other way in the UpdateRS closure is better because the > > code > > needs the "to" HeapRegion pointer anyway) > > > > CR: > > https://bugs.openjdk.java.net/browse/JDK-8183397 > > Webrev: > > http://cr.openjdk.java.net/~tschatzl/8183397/webrev/ > -  } else if (in_cset_state.is_humongous()) { > +  } else { > +    if (in_cset_state.is_humongous()) { > > Why change `else if` to `else { if (...) {` here? Does it result in > the > compiler generating faster code for this case? No. It only makes this do_oop_*() method look similar in structure to our do_oop_*() methods in the closures. I.e. if (in_cset_state.is_in_cset()) { // do stuff for refs into cset } else { // expanding handle_non_cset_obj_common() if (state.is_humongous()) { } else ... } I felt this improves overall readability, but this may only be because I have been working in this code a lot recently. I can revert this change. Thanks for your review, Thomas From thomas.schatzl at oracle.com Thu Jul 6 08:29:08 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Thu, 06 Jul 2017 10:29:08 +0200 Subject: RFR (XS): 8183127: UpdateRSetDeferred should not enqueue cards for NULL references In-Reply-To: References: <1499181945.2757.20.camel@oracle.com> Message-ID: <1499329748.2760.4.camel@oracle.com> Hi Erik, On Thu, 2017-07-06 at 10:12 +0200, Erik Helin wrote: > On 07/04/2017 05:25 PM, Thomas Schatzl wrote: > > > > [...] > > CR: > > https://bugs.openjdk.java.net/browse/JDK-8183127 > > Webrev: > > http://cr.openjdk.java.net/~tschatzl/8183127/webrev/ > Looks good, Reviewed. > > Thanks, > Erik Thanks for your review,
Thomas From stefan.johansson at oracle.com Thu Jul 6 08:46:15 2017 From: stefan.johansson at oracle.com (Stefan Johansson) Date: Thu, 6 Jul 2017 10:46:15 +0200 Subject: RFR (S): 8179679: Rearrange filters before card scanning In-Reply-To: <51aa3d5a-70d9-da28-12cd-2a05e949c4f9@oracle.com> References: <1499081093.2802.30.camel@oracle.com> <1499182953.2757.21.camel@oracle.com> <51aa3d5a-70d9-da28-12cd-2a05e949c4f9@oracle.com> Message-ID: On 2017-07-06 10:06, Erik Helin wrote: > Hi Thomas, > > looks good to me, Reviewed. +1 Nice cleanup, StefanJ > Thanks, > Erik > > On 07/04/2017 05:42 PM, Thomas Schatzl wrote: >> Hi all, >> >> On Mon, 2017-07-03 at 13:24 +0200, Thomas Schatzl wrote: >>> Hi all, >>> >>> please have a look at this change that rearranges the checks in the >>> G1RemSet card scanning a bit in order to: >>> >> Erik had a look at this change with the following comments: >> >> - rename card_region_idx -> region_idx_for_card >> - factor out the two calls to claim a card and dirty its region into a >> method >> - move calculation of "card_region" into the scan_card() method. >> - he pointed out that the change can use G1CollectedHeap::region_at() >> instead of G1CollectedHeap::heap_region_containing() as it is simpler. >> - there has been another comment on why the change claims the card >> after checking whether the card is within the region's boundaries, and >> if that wouldn't be better performed right after the is_claimed check. >> >> Doing so will claim cards originating from stray remembered set entries >> into the current survivor regions as claimed, since we do not clear >> these regions later again (see G1ClearCardTableTask::work()) - their >> cards need to be "Young", and this is done during allocation of the >> region. >> >> This results in the card table verification to fail later. >> >> I think if we should think of changing the handling of survivor regions >> during the clear CT phase as part of a different CR. For now I added a >> comment. 
>> >> Webrev: >> http://cr.openjdk.java.net/~tschatzl/8179679/webrev.1_to_2 (diff) >> http://cr.openjdk.java.net/~tschatzl/8179679/webrev.2 (full) >> Testing: >> gcbasher >> >> Thanks, >> Thomas >> From mikael.gerdin at oracle.com Thu Jul 6 09:20:32 2017 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Thu, 6 Jul 2017 11:20:32 +0200 Subject: RFR (M) 8183923: Get rid of FreeBlockDictionary and dithering Message-ID: <6ac9125f-87ac-4211-b90c-29526b6aae29@oracle.com> Hi all, Please review this cleanup inspired by looking at Roman's CMS cleanup :) FreeBlockDictionary is an old abstraction for multiple CMS freelist datastructures which never appear to have been implemented, getting rid of it also simplifies some code in Metaspace so it's not all CMS stuff. Testing: jprt Bug: https://bugs.openjdk.java.net/browse/JDK-8183923 Webrev: http://cr.openjdk.java.net/~mgerdin/8183923/webrev.0/index.html Thanks /Mikael From thomas.schatzl at oracle.com Thu Jul 6 10:08:35 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Thu, 06 Jul 2017 12:08:35 +0200 Subject: RFR (S): 8179679: Rearrange filters before card scanning In-Reply-To: References: <1499081093.2802.30.camel@oracle.com> <1499182953.2757.21.camel@oracle.com> <51aa3d5a-70d9-da28-12cd-2a05e949c4f9@oracle.com> Message-ID: <1499335715.2760.6.camel@oracle.com> On Thu, 2017-07-06 at 10:46 +0200, Stefan Johansson wrote: > > On 2017-07-06 10:06, Erik Helin wrote: > > > > Hi Thomas, > > > > looks good to me, Reviewed. > +1 > Thanks for your reviews Stefan and Erik! 
Thomas From tobias.hartmann at oracle.com Thu Jul 6 10:14:26 2017 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Thu, 6 Jul 2017 12:14:26 +0200 Subject: RFR: Parallelize safepoint cleanup In-Reply-To: References: <486b5a72-bef8-4ebc-2729-3fe3aa3ab3b9@oracle.com> <5c80f8df-27c9-f9a9-dc6d-47f9c6019a61@redhat.com> <46ad874e-eb41-7927-265a-40dea92dfe1e@oracle.com> <5711258b-99b0-e06f-ba6e-0b6b55d88345@redhat.com> <0e1e2779-9316-b756-6cc8-e0c8add14a94@oracle.com> <1910961c-11bd-0e86-dd03-4fce66b9969f@redhat.com> <2b466176-b688-53a8-bef9-c7ec2c8c745b@oracle.com> <42872a15-d26c-9798-c6a2-f3f7c945baf7@redhat.com> <5e7c7d00-4acd-bea3-3525-33dbd9159efb@oracle.com> <6f2c6de7-298b-bf14-ab1f-430c4acd43c9@redhat.com> <5cd676de-872d-6d4a-691b-da561173f7d0@oracle.com> <61d80e98-275f-b2b8-4ac7-6d5d03b047de@redhat.com> Message-ID: <0e44cc90-b384-0820-93d6-a70d22c501c3@oracle.com> Hi, On 05.07.2017 20:30, Daniel D. Daugherty wrote: > JDK-8132849 is assigned to Tobias; it would be good to get Tobias' > review of this fix also. Thanks for the notification. The sweeper/safepoint changes look good to me! > src/share/vm/runtime/sweeper.cpp > L205: // TODO: Is this really needed? > L206: OrderAccess::storestore(); > That's a good question. Looks like that storestore() was > added by this changeset: > > $ hg log -r 5357 src/share/vm/runtime/sweeper.cpp > changeset: 5357:510fbd28919c > user: anoll > date: Fri Sep 27 10:50:55 2013 +0200 > summary: 8020151: PSR:PERF Large performance regressions when code cache is filled > > The changeset is not small and it looks like two > OrderAccess::storestore() calls were added (and one > load_ptr_acquire() was deleted): > > $ hg diff -r 5356 -r 5357 | grep OrderAccess > + OrderAccess::storestore(); > - nmethod *code = (nmethod *)OrderAccess::load_ptr_acquire(&_code); > + OrderAccess::storestore(); > > It could be that the storestore() is matching an existing > OrderAccess operation or it could have been added in an > abundance of caution. 
We definitely need a Compiler team > person to take a look here. Unfortunately, I'm also not sure if that barrier is required. Looking at the old RFR thread: http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2013-September/011588.html It seems that Igor V. suggested this: "You definitely need a store-store barrier for non-TSO architectures after the mark_as_seen_on_stack() call on line 1360. Otherwise it still can be reordered by the CPU with respect to the following state assignment. Also neither of these state variables are volatile in nmethod, so even the compiler may reorder the stores." http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2013-September/011729.html The requested OrderAccess::storestore() was added to nmethod::make_not_entrant_or_zombie() but seems like Albert also added one to NMethodSweeper::mark_active_nmethods(). I'll ping Igor, maybe he knows more. Thanks, Tobias From erik.helin at oracle.com Thu Jul 6 12:52:27 2017 From: erik.helin at oracle.com (Erik Helin) Date: Thu, 6 Jul 2017 14:52:27 +0200 Subject: RFR (S): 8183128: Update RefineCardTableEntryClosure In-Reply-To: <1499156663.2761.6.camel@oracle.com> References: <1499156663.2761.6.camel@oracle.com> Message-ID: <08286762-411b-3079-9802-814c806af946@oracle.com> Hi Thomas, On 07/04/2017 10:24 AM, Thomas Schatzl wrote: > Hi all, > > can I get reviews for this change that renames and cleans up the use > of RefineCardTableEntryClosure in the code? > > RefineCardTableEntryClosure is the closure that is applied by the > concurrent refinement threads. This change renames it slightly to > indicate its use (G1RefineCardConcurrentlyClosure) and moves it to the > G1RemSet files close to the closure that we use for refinement/Update > RS during GC. great cleanup! Looking at the code, what do you think about moving G1RefineCardConcurrentlyClosure into concurrentG1RefineThread.cpp (and make it a private class to ConcurrentG1RefineThread)? 
AFAICS, ConcurrentG1RefineThread is the only code using this closure. If we do it this way, then we can actually make DirtyCardQueueSet::apply_closure_to_completed_buffer a template method, taking the Closure as a template parameter, as in: template <typename Closure> bool apply_closure_to_completed_buffer(Closure* cl, uint worker_i, size_t stop_at, bool during_pause) This means that closures could get inlined, which doesn't mean that much for G1RefineCardConcurrentlyClosure, but could give a small boost for G1RefineCardClosure (for that to work, G1CollectedHeap::iterate_dirty_card_closure must take a G1RefineCardClosure, but that is ok, because that is the only closure type we pass to that method). Also, you do not need the forward declaration in G1CollectedHeap, it will not make use of this closure then :) If you want to "go the extra mile", then you can also pass a G1RemSet* as an argument to the G1RefineCardConcurrentlyClosure constructor and store it in a field, to avoid accessing the G1CollectedHeap via the singleton: G1CollectedHeap::heap()->g1_rem_set()->refine_card_concurrently(card_ptr, worker_i); (plus, G1RefineCardConcurrentlyClosure only needs a G1RemSet* pointer anyway ;)) Thanks, Erik > This change is dependent on "JDK-8183226: Remembered set summarization > accesses not fully initialized java thread DCQS" which is also > currently out for review - that change reorganizes G1CollectedHeap > initialization so that the change can actually move the closure. 
> > CR: > https://bugs.openjdk.java.net/browse/JDK-8183128 > Webrev: > http://cr.openjdk.java.net/~tschatzl/8183128/webrev/ > Testing: > jprt, local benchmarks > > Thanks, > Thomas > From rkennke at redhat.com Thu Jul 6 13:18:07 2017 From: rkennke at redhat.com (Roman Kennke) Date: Thu, 6 Jul 2017 15:18:07 +0200 Subject: RFR: Parallelize safepoint cleanup In-Reply-To: <0e44cc90-b384-0820-93d6-a70d22c501c3@oracle.com> References: <486b5a72-bef8-4ebc-2729-3fe3aa3ab3b9@oracle.com> <5c80f8df-27c9-f9a9-dc6d-47f9c6019a61@redhat.com> <46ad874e-eb41-7927-265a-40dea92dfe1e@oracle.com> <5711258b-99b0-e06f-ba6e-0b6b55d88345@redhat.com> <0e1e2779-9316-b756-6cc8-e0c8add14a94@oracle.com> <1910961c-11bd-0e86-dd03-4fce66b9969f@redhat.com> <2b466176-b688-53a8-bef9-c7ec2c8c745b@oracle.com> <42872a15-d26c-9798-c6a2-f3f7c945baf7@redhat.com> <5e7c7d00-4acd-bea3-3525-33dbd9159efb@oracle.com> <6f2c6de7-298b-bf14-ab1f-430c4acd43c9@redhat.com> <5cd676de-872d-6d4a-691b-da561173f7d0@oracle.com> <61d80e98-275f-b2b8-4ac7-6d5d03b047de@redhat.com> <0e44cc90-b384-0820-93d6-a70d22c501c3@oracle.com> Message-ID: Am 06.07.2017 um 12:14 schrieb Tobias Hartmann: > Hi, > > On 05.07.2017 20:30, Daniel D. Daugherty wrote: >> JDK-8132849 is assigned to Tobias; it would be good to get Tobias' >> review of this fix also. > Thanks for the notification. The sweeper/safepoint changes look good to me! Thanks! I guess I'm going to need a sponsor when the orderAccess::storestore() issue is resolved. 
I'd say *if* we decide to keep the storestore() as conservative measure, it makes sense to also add it to the parallel processing routines like this: diff --git a/src/share/vm/runtime/safepoint.cpp b/src/share/vm/runtime/safepoint.cpp --- a/src/share/vm/runtime/safepoint.cpp +++ b/src/share/vm/runtime/safepoint.cpp @@ -550,6 +550,12 @@ _counters(counters), _nmethod_cl(NMethodSweeper::prepare_mark_active_nmethods()) {} + ~ParallelSPCleanupThreadClosure() { + // This is here to be consistent with sweeper.cpp NMethodSweeper::mark_active_nmethods(). + // TODO: Is this really needed? + OrderAccess::storestore(); + } + void do_thread(Thread* thread) { ObjectSynchronizer::deflate_thread_local_monitors(thread, _counters); if (_nmethod_cl != NULL && thread->is_Java_thread() && I've included this in the following (final?) webrev: http://cr.openjdk.java.net/~rkennke/8180932/webrev.11/ (I've also added Tobias to Reviewed-by: list... if anybody wants to sponsor it as-is, simply grab the changeset from here: http://cr.openjdk.java.net/~rkennke/8180932/webrev.11/hotspot.changeset ) Cheers, Roman >> src/share/vm/runtime/sweeper.cpp >> L205: // TODO: Is this really needed? >> L206: OrderAccess::storestore(); >> That's a good question. 
Looks like that storestore() was >> added by this changeset: >> >> $ hg log -r 5357 src/share/vm/runtime/sweeper.cpp >> changeset: 5357:510fbd28919c >> user: anoll >> date: Fri Sep 27 10:50:55 2013 +0200 >> summary: 8020151: PSR:PERF Large performance regressions when code cache is filled >> >> The changeset is not small and it looks like two >> OrderAccess::storestore() calls were added (and one >> load_ptr_acquire() was deleted): >> >> $ hg diff -r 5356 -r 5357 | grep OrderAccess >> + OrderAccess::storestore(); >> - nmethod *code = (nmethod *)OrderAccess::load_ptr_acquire(&_code); >> + OrderAccess::storestore(); >> >> It could be that the storestore() is matching an existing >> OrderAccess operation or it could have been added in an >> abundance of caution. We definitely need a Compiler team >> person to take a look here. > Unfortunately, I'm also not sure if that barrier is required. Looking at the old RFR thread: > http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2013-September/011588.html > > It seems that Igor V. suggested this: > "You definitely need a store-store barrier for non-TSO architectures after the mark_as_seen_on_stack() call on line 1360. Otherwise it still can be reordered by the CPU with respect to the following state assignment. Also neither of these state variables are volatile in nmethod, so even the compiler may reorder the stores." > http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2013-September/011729.html > > The requested OrderAccess::storestore() was added to nmethod::make_not_entrant_or_zombie() but seems like Albert also added one to NMethodSweeper::mark_active_nmethods(). > > I'll ping Igor, maybe he knows more. 
> > Thanks, > Tobias From mikael.gerdin at oracle.com Thu Jul 6 13:48:57 2017 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Thu, 6 Jul 2017 15:48:57 +0200 Subject: RFR: 8179387: Factor out CMS specific code from GenCollectedHeap into its own subclass In-Reply-To: References: <3521009f-6fab-4f8e-2375-b9d665a4c70b@redhat.com> <3d8b55a2-a787-3051-b351-ab9b0a24f5e0@redhat.com> <47e22e86-7d7c-606f-1936-346229f39ca2@oracle.com> <9a846161-c8ac-dedf-5952-f457d546fd9a@redhat.com> <4d5e6af8-d975-7803-64c5-7295e0d56154@redhat.com> Message-ID: Hi Roman, On 2017-07-05 13:58, Mikael Gerdin wrote: > Hi Roman, > > On 2017-07-03 17:05, Roman Kennke wrote: >> Am 03.07.2017 um 11:13 schrieb Roman Kennke: >>> Am 03.07.2017 um 09:35 schrieb Mikael Gerdin: >>>> Hi Roman, >>>> >>>> On 2017-06-30 18:32, Roman Kennke wrote: >>>>> I came across one problem using this approach: We will have 2 >>>>> instances >>>>> of CollectedHeap around, where there's usually only 1, and some code >>>>> expects only 1. For example, in CollectedHeap constructor, we >>>>> create new >>>>> PerfData variables, and we now create them 2x, which leads to an >>>>> assert >>>>> being thrown. I suspect there is more code like that. >>>>> >>>>> I will attempt to refactor this a little more, maybe it's not that >>>>> bad, >>>>> but it's probably not worth spending too much time on it. >>>> I think refactoring the code to not expect a singleton CollectedHeap >>>> instance is a bit too much. >>>> Perhaps there is another way to share common code between Serial and >>>> CMS but that might require a bit more thought. >>> Yeah, definitely. I hit another difficulty: pretty much the same issues >>> that I'm having with GenCollectedHeap/CMSHeap/CollectedHeap now show up >>> with Generation and its subclasses.. >>> >>> How about we push the original patch that I've posted, and work from >>> there? 
In fact, I *have* found some little things I would change (some >>> more if (UseConcMarkSweepGC) branches in GenCollectedHeap that I have >>> overlooked in my first pass...) >> >> So here's the little change (two more places in genCollectedHeap.hpp >> where UseConcMarkSweepGC was used to alter behaviour: >> >> http://cr.openjdk.java.net/~rkennke/8179387/webrev.02/ >> >> >> Ok to push this? I just realized that your change doesn't build on Windows since you didn't #include "precompiled.hpp" in cmsHeap.cpp. MSVC is really picky about that. /Mikael > > I think this looks like a good step in the right direction! > One thing I noticed is that you can put "enum GCH_strong_roots_tasks" > inside of GenCollectedHeap to avoid tainting the global namespace with > the enum members. Just above the declaration of _process_strong_tasks > seems like an excellent location for the enum declaration :) > > This looks like it's not needed anymore. > bool CMSHeap::should_do_concurrent_full_gc(GCCause::Cause cause) { > if (!UseConcMarkSweepGC) { > return false; > } > > /Mikael > >> >> Roman >> From rkennke at redhat.com Thu Jul 6 16:23:39 2017 From: rkennke at redhat.com (Roman Kennke) Date: Thu, 6 Jul 2017 18:23:39 +0200 Subject: RFR: 8179387: Factor out CMS specific code from GenCollectedHeap into its own subclass In-Reply-To: References: <3521009f-6fab-4f8e-2375-b9d665a4c70b@redhat.com> <3d8b55a2-a787-3051-b351-ab9b0a24f5e0@redhat.com> <47e22e86-7d7c-606f-1936-346229f39ca2@oracle.com> <9a846161-c8ac-dedf-5952-f457d546fd9a@redhat.com> <4d5e6af8-d975-7803-64c5-7295e0d56154@redhat.com> Message-ID: <13358626-e399-e352-1711-587416621aac@redhat.com> Am 06.07.2017 um 15:48 schrieb Mikael Gerdin: > Hi Roman, > > On 2017-07-05 13:58, Mikael Gerdin wrote: >> Hi Roman, >> >> On 2017-07-03 17:05, Roman Kennke wrote: >>> Am 03.07.2017 um 11:13 schrieb Roman Kennke: >>>> Am 03.07.2017 um 09:35 schrieb Mikael Gerdin: >>>>> Hi Roman, >>>>> >>>>> On 2017-06-30 18:32, Roman Kennke wrote: >>>>>> I came 
across one problem using this approach: We will have 2 >>>>>> instances >>>>>> of CollectedHeap around, where there's usually only 1, and some code >>>>>> expects only 1. For example, in CollectedHeap constructor, we >>>>>> create new >>>>>> PerfData variables, and we now create them 2x, which leads to an >>>>>> assert >>>>>> being thrown. I suspect there is more code like that. >>>>>> >>>>>> I will attempt to refactor this a little more, maybe it's not >>>>>> that bad, >>>>>> but it's probably not worth spending too much time on it. >>>>> I think refactoring the code to not expect a singleton CollectedHeap >>>>> instance is a bit too much. >>>>> Perhaps there is another way to share common code between Serial and >>>>> CMS but that might require a bit more thought. >>>> Yeah, definitely. I hit another difficulty: pretty much the same >>>> issues >>>> that I'm having with GenCollectedHeap/CMSHeap/CollectedHeap now >>>> show up >>>> with Generation and its subclasses.. >>>> >>>> How about we push the original patch that I've posted, and work from >>>> there? In fact, I *have* found some little things I would change (some >>>> more if (UseConcMarkSweepGC) branches in GenCollectedHeap that I have >>>> overlooked in my first pass...) >>> >>> So here's the little change (two more places in genCollectedHeap.hpp >>> where UseConcMarkSweepGC was used to alter behaviour: >>> >>> http://cr.openjdk.java.net/~rkennke/8179387/webrev.02/ >>> >>> >>> Ok to push this? > > I just realized that your change doesn't build on Windows since you > didn't #include "precompiled.hpp" in cmsHeap.cpp. MSVC is really picky > about that. > /Mikael Uhhh. 
Ok, here's revision #3 with precompiled added in: http://cr.openjdk.java.net/~rkennke/8179387/webrev.03/ Roman From igor.veresov at oracle.com Thu Jul 6 16:47:01 2017 From: igor.veresov at oracle.com (Igor Veresov) Date: Thu, 6 Jul 2017 09:47:01 -0700 Subject: RFR: Parallelize safepoint cleanup In-Reply-To: <0e44cc90-b384-0820-93d6-a70d22c501c3@oracle.com> References: <486b5a72-bef8-4ebc-2729-3fe3aa3ab3b9@oracle.com> <5c80f8df-27c9-f9a9-dc6d-47f9c6019a61@redhat.com> <46ad874e-eb41-7927-265a-40dea92dfe1e@oracle.com> <5711258b-99b0-e06f-ba6e-0b6b55d88345@redhat.com> <0e1e2779-9316-b756-6cc8-e0c8add14a94@oracle.com> <1910961c-11bd-0e86-dd03-4fce66b9969f@redhat.com> <2b466176-b688-53a8-bef9-c7ec2c8c745b@oracle.com> <42872a15-d26c-9798-c6a2-f3f7c945baf7@redhat.com> <5e7c7d00-4acd-bea3-3525-33dbd9159efb@oracle.com> <6f2c6de7-298b-bf14-ab1f-430c4acd43c9@redhat.com> <5cd676de-872d-6d4a-691b-da561173f7d0@oracle.com> <61d80e98-275f-b2b8-4ac7-6d5d03b047de@redhat.com> <0e44cc90-b384-0820-93d6-a70d22c501c3@oracle.com> Message-ID: <20E06CEC-38CA-41AE-99DB-17EF22A3C5CC@oracle.com> > On Jul 6, 2017, at 3:14 AM, Tobias Hartmann wrote: > > Hi, > > On 05.07.2017 20:30, Daniel D. Daugherty wrote: >> JDK-8132849 is assigned to Tobias; it would be good to get Tobias' >> review of this fix also. > > Thanks for the notification. The sweeper/safepoint changes look good to me! > >> src/share/vm/runtime/sweeper.cpp >> L205: // TODO: Is this really needed? >> L206: OrderAccess::storestore(); >> That's a good question. 
Looks like that storestore() was >> added by this changeset: >> >> $ hg log -r 5357 src/share/vm/runtime/sweeper.cpp >> changeset: 5357:510fbd28919c >> user: anoll >> date: Fri Sep 27 10:50:55 2013 +0200 >> summary: 8020151: PSR:PERF Large performance regressions when code cache is filled >> >> The changeset is not small and it looks like two >> OrderAccess::storestore() calls were added (and one >> load_ptr_acquire() was deleted): >> >> $ hg diff -r 5356 -r 5357 | grep OrderAccess >> + OrderAccess::storestore(); >> - nmethod *code = (nmethod *)OrderAccess::load_ptr_acquire(&_code); >> + OrderAccess::storestore(); >> >> It could be that the storestore() is matching an existing >> OrderAccess operation or it could have been added in an >> abundance of caution. We definitely need a Compiler team >> person to take a look here. > > Unfortunately, I'm also not sure if that barrier is required. Looking at the old RFR thread: > http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2013-September/011588.html > > It seems that Igor V. suggested this: > "You definitely need a store-store barrier for non-TSO architectures after the mark_as_seen_on_stack() call on line 1360. Otherwise it still can be reordered by the CPU with respect to the following state assignment. Also neither of these state variables are volatile in nmethod, so even the compiler may reorder the stores." > http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2013-September/011729.html > > The requested OrderAccess::storestore() was added to nmethod::make_not_entrant_or_zombie() but seems like Albert also added one to NMethodSweeper::mark_active_nmethods(). > > I'll ping Igor, maybe he knows more. I think the reason is explained in the comment: // Must happen before state change. Otherwise we have a race condition in // nmethod::can_not_entrant_be_converted(). I.e., a method can immediately // transition its state from 'not_entrant' to 'zombie' without having to wait // for stack scanning. 
if (state == not_entrant) { mark_as_seen_on_stack(); OrderAccess::storestore(); } // Change state _state = state; Although can_not_entrant_be_converted() is now called can_convert_to_zombie(). The scenario can go like this: 1. We're setting the state to not_entrant. But the _state assignment happens before setting the traversal count in mark_as_seen_on_stack(). 2. While we're doing this, the sweeper scans nmethods and is in process_compiled_method(): } else if (cm->is_not_entrant()) { // If there are no current activations of this method on the // stack we can safely convert it to a zombie method if (cm->can_convert_to_zombie()) { // Clear ICStubs to prevent back patching stubs of zombie or flushed // nmethods during the next safepoint (see ICStub::finalize). { MutexLocker cl(CompiledIC_lock); cm->clear_ic_stubs(); } // Code cache state change is tracked in make_zombie() cm->make_zombie(); So if the state change happens before setting the traversal mark, the sweeper can go ahead and make it a zombie. Makes sense? Or am I missing something? igor > > Thanks, > Tobias -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From rkennke at redhat.com Thu Jul 6 16:53:48 2017 From: rkennke at redhat.com (Roman Kennke) Date: Thu, 6 Jul 2017 18:53:48 +0200 Subject: RFR: Parallelize safepoint cleanup In-Reply-To: <20E06CEC-38CA-41AE-99DB-17EF22A3C5CC@oracle.com> References: <486b5a72-bef8-4ebc-2729-3fe3aa3ab3b9@oracle.com> <5c80f8df-27c9-f9a9-dc6d-47f9c6019a61@redhat.com> <46ad874e-eb41-7927-265a-40dea92dfe1e@oracle.com> <5711258b-99b0-e06f-ba6e-0b6b55d88345@redhat.com> <0e1e2779-9316-b756-6cc8-e0c8add14a94@oracle.com> <1910961c-11bd-0e86-dd03-4fce66b9969f@redhat.com> <2b466176-b688-53a8-bef9-c7ec2c8c745b@oracle.com> <42872a15-d26c-9798-c6a2-f3f7c945baf7@redhat.com> <5e7c7d00-4acd-bea3-3525-33dbd9159efb@oracle.com> <6f2c6de7-298b-bf14-ab1f-430c4acd43c9@redhat.com> <5cd676de-872d-6d4a-691b-da561173f7d0@oracle.com> <61d80e98-275f-b2b8-4ac7-6d5d03b047de@redhat.com> <0e44cc90-b384-0820-93d6-a70d22c501c3@oracle.com> <20E06CEC-38CA-41AE-99DB-17EF22A3C5CC@oracle.com> Message-ID: <58f2278e-b95c-4ec2-4f7d-9fefa3a281e4@redhat.com> Am 06.07.2017 um 18:47 schrieb Igor Veresov: > >> On Jul 6, 2017, at 3:14 AM, Tobias Hartmann >> > wrote: >> >> Hi, >> >> On 05.07.2017 20:30, Daniel D. Daugherty wrote: >>> JDK-8132849 is assigned to Tobias; it would be good to get Tobias' >>> review of this fix also. >> >> Thanks for the notification. The sweeper/safepoint changes look good >> to me! >> >>> src/share/vm/runtime/sweeper.cpp >>> L205: // TODO: Is this really needed? >>> L206: OrderAccess::storestore(); >>> That's a good question. 
Looks like that storestore() was >>> added by this changeset: >>> >>> $ hg log -r 5357 src/share/vm/runtime/sweeper.cpp >>> changeset: 5357:510fbd28919c >>> user: anoll >>> date: Fri Sep 27 10:50:55 2013 +0200 >>> summary: 8020151: PSR:PERF Large performance regressions >>> when code cache is filled >>> >>> The changeset is not small and it looks like two >>> OrderAccess::storestore() calls were added (and one >>> load_ptr_acquire() was deleted): >>> >>> $ hg diff -r 5356 -r 5357 | grep OrderAccess >>> + OrderAccess::storestore(); >>> - nmethod *code = (nmethod >>> *)OrderAccess::load_ptr_acquire(&_code); >>> + OrderAccess::storestore(); >>> >>> It could be that the storestore() is matching an existing >>> OrderAccess operation or it could have been added in an >>> abundance of caution. We definitely need a Compiler team >>> person to take a look here. >> >> Unfortunately, I'm also not sure if that barrier is required. Looking >> at the old RFR thread: >> http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2013-September/011588.html >> >> It seems that Igor V. suggested this: >> "You definitely need a store-store barrier for non-TSO architectures >> after the mark_as_seen_on_stack() call on line 1360. Otherwise it >> still can be reordered by the CPU with respect to the following state >> assignment. Also neither of these state variables are volatile in >> nmethod, so even the compiler may reorder the stores." >> http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2013-September/011729.html >> >> The requested OrderAccess::storestore() was added to >> nmethod::make_not_entrant_or_zombie() but seems like Albert also >> added one to NMethodSweeper::mark_active_nmethods(). >> >> I'll ping Igor, maybe he knows more. > > > I think the reason is explained in the comment: > > // Must happen before state change. Otherwise we have a race > condition in > // nmethod::can_not_entrant_be_converted(). 
I.e., a method can > immediately > // transition its state from 'not_entrant' to 'zombie' without > having to wait > // for stack scanning. > if (state == not_entrant) { > mark_as_seen_on_stack(); > OrderAccess::storestore(); > } > > // Change state > _state = state; > > Although can_not_entrant_be_converted() is now called > can_convert_to_zombie(). The scenario can go like this: > 1. We're setting the state to not_entrant. But the _state assignment > happens before setting the traversal count in mark_as_seen_on_stack(). > 2. While we're doing this, the sweeper scans nmethods and is in > process_compiled_method(): > > } else if (cm->is_not_entrant()) { > // If there are no current activations of this method on the > // stack we can safely convert it to a zombie method > if (cm->can_convert_to_zombie()) { > // Clear ICStubs to prevent back patching stubs of zombie or flushed > // nmethods during the next safepoint (see ICStub::finalize). > { > MutexLocker cl(CompiledIC_lock); > cm->clear_ic_stubs(); > } > // Code cache state change is tracked in make_zombie() > cm->make_zombie(); > > > So if state change happens before setting the traversal mark, the > sweeper can go ahead and make it a zombie. > > > Makes sense? Or am I missing something? I have probably not fully dug into the code. As far as I can see: - sweeper thread runs outside safepoint - VMThread (which is doing the nmethod marking in the case that I'm looking at) runs while all other threads (incl. the sweeper) are holding still. In between we have a guaranteed fence(). There should be no need for a storestore() (at least in sweeper.cpp... in nmethod.cpp it seems to actually make sense as you pointed out above). *However* it doesn't really hurt to OrderAccess::storestore() there... so play it conservative and leave it in, as RFR'd in my last patch? Roman > > igor > >> >> Thanks, >> Tobias -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From daniel.daugherty at oracle.com Thu Jul 6 17:14:38 2017 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Thu, 6 Jul 2017 11:14:38 -0600 Subject: RFR: Parallelize safepoint cleanup In-Reply-To: <58f2278e-b95c-4ec2-4f7d-9fefa3a281e4@redhat.com> References: <5c80f8df-27c9-f9a9-dc6d-47f9c6019a61@redhat.com> <46ad874e-eb41-7927-265a-40dea92dfe1e@oracle.com> <5711258b-99b0-e06f-ba6e-0b6b55d88345@redhat.com> <0e1e2779-9316-b756-6cc8-e0c8add14a94@oracle.com> <1910961c-11bd-0e86-dd03-4fce66b9969f@redhat.com> <2b466176-b688-53a8-bef9-c7ec2c8c745b@oracle.com> <42872a15-d26c-9798-c6a2-f3f7c945baf7@redhat.com> <5e7c7d00-4acd-bea3-3525-33dbd9159efb@oracle.com> <6f2c6de7-298b-bf14-ab1f-430c4acd43c9@redhat.com> <5cd676de-872d-6d4a-691b-da561173f7d0@oracle.com> <61d80e98-275f-b2b8-4ac7-6d5d03b047de@redhat.com> <0e44cc90-b384-0820-93d6-a70d22c501c3@oracle.com> <20E06CEC-38CA-41AE-99DB-17EF22A3C5CC@oracle.com> <58f2278e-b95c-4ec2-4f7d-9fefa3a281e4@redhat.com> Message-ID: On 7/6/17 10:53 AM, Roman Kennke wrote: > Am 06.07.2017 um 18:47 schrieb Igor Veresov: >> >>> On Jul 6, 2017, at 3:14 AM, Tobias Hartmann >>> > wrote: >>> >>> Hi, >>> >>> On 05.07.2017 20:30, Daniel D. Daugherty wrote: >>>> JDK-8132849 is assigned to Tobias; it would be good to get Tobias' >>>> review of this fix also. >>> >>> Thanks for the notification. The sweeper/safepoint changes look good >>> to me! >>> >>>> src/share/vm/runtime/sweeper.cpp >>>> L205: // TODO: Is this really needed? >>>> L206: OrderAccess::storestore(); >>>> That's a good question. 
Looks like that storestore() was >>>> added by this changeset: >>>> >>>> $ hg log -r 5357 src/share/vm/runtime/sweeper.cpp >>>> changeset: 5357:510fbd28919c >>>> user: anoll >>>> date: Fri Sep 27 10:50:55 2013 +0200 >>>> summary: 8020151: PSR:PERF Large performance regressions >>>> when code cache is filled >>>> >>>> The changeset is not small and it looks like two >>>> OrderAccess::storestore() calls were added (and one >>>> load_ptr_acquire() was deleted): >>>> >>>> $ hg diff -r 5356 -r 5357 | grep OrderAccess >>>> + OrderAccess::storestore(); >>>> - nmethod *code = (nmethod >>>> *)OrderAccess::load_ptr_acquire(&_code); >>>> + OrderAccess::storestore(); >>>> >>>> It could be that the storestore() is matching an existing >>>> OrderAccess operation or it could have been added in an >>>> abundance of caution. We definitely need a Compiler team >>>> person to take a look here. >>> >>> Unfortunately, I'm also not sure if that barrier is required. >>> Looking at the old RFR thread: >>> http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2013-September/011588.html >>> >>> It seems that Igor V. suggested this: >>> "You definitely need a store-store barrier for non-TSO architectures >>> after the mark_as_seen_on_stack() call on line 1360. Otherwise it >>> still can be reordered by the CPU with respect to the following >>> state assignment. Also neither of these state variables are volatile >>> in nmethod, so even the compiler may reorder the stores." >>> http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2013-September/011729.html >>> >>> The requested OrderAccess::storestore() was added to >>> nmethod::make_not_entrant_or_zombie() but seems like Albert also >>> added one to NMethodSweeper::mark_active_nmethods(). >>> >>> I'll ping Igor, maybe he knows more. >> >> >> I think the reason is explained in the comment: >> >> // Must happen before state change. Otherwise we have a race condition in >> // nmethod::can_not_entrant_be_converted(). 
I.e., a method can >> immediately >> // transition its state from 'not_entrant' to 'zombie' without having >> to wait >> // for stack scanning. >> if (state == not_entrant) { >> mark_as_seen_on_stack(); >> OrderAccess::storestore(); >> } >> >> // Change state >> _state = state; >> >> Although can_not_entrant_be_converted() is now called >> can_convert_to_zombie(). The scenario can go like this: >> 1. We're setting the state to not_entrant. But the _state assignment >> happens before setting the traversal count in mark_as_seen_on_stack(). >> 2. While we're doing this, the sweeper scans nmethods and is in >> process_compiled_method(): >> >> } else if (cm->is_not_entrant()) { >> // If there are no current activations of this method on the >> // stack we can safely convert it to a zombie method >> if (cm->can_convert_to_zombie()) { >> // Clear ICStubs to prevent back patching stubs of zombie or flushed >> // nmethods during the next safepoint (see ICStub::finalize). >> { >> MutexLocker cl(CompiledIC_lock); >> cm->clear_ic_stubs(); >> } >> // Code cache state change is tracked in make_zombie() >> cm->make_zombie(); >> >> >> So if state change happens before setting the traversal mark, the >> sweeper can go ahead and make it a zombie. >> >> >> Makes sense? Or am I missing something? > > I have probably not fully dug into the code. As far as I can see: > - sweeper thread runs outside safepoint > - VMThread (which is doing the nmethod marking in the case that I'm > looking at) runs while all other threads (incl. the sweeper) are > holding still. > > In between we have a guaranteed fence(). > > There should be no need for a storestore() (at least in sweeper.cpp... > in nmethod.cpp it seems to actually make sense as you pointed out > above). *However* it doesn't really hurt to OrderAccess::storestore() > there... so play it conservative and leave it in, as RFR'd in my last > patch? 
If we are going to have the OrderAccess::storestore() calls, then we have to have a proper comment explaining why they are needed. Unfortunately, the OrderAccess::storestore() call that was added by anoll to src/share/vm/runtime/sweeper.cpp back in 2013 was not properly documented and we're bumping into that with this review. I'm not happy about this change: + ~ParallelSPCleanupThreadClosure() { + // This is here to be consistent with sweeper.cpp NMethodSweeper::mark_active_nmethods(). + // TODO: Is this really needed? + OrderAccess::storestore(); + } because we're adding an OrderAccess::storestore() to be consistent with an OrderAccess::storestore() that's not properly documented which is only increasing the technical debt. So a couple of things above don't make sense to me: > - sweeper thread runs outside safepoint > - VMThread (which is doing the nmethod marking in the case that > I'm looking at) runs while all other threads (incl. the sweeper) > is holding still. and: > There should be no need for a storestore() (at least in sweeper.cpp... If the sweeper thread is running "outside safepoint", then how is the sweeper thread "holding still" while the VMThread is doing the nmethod marking? Those two points are contradictory. If the sweeper thread is indeed executing outside a safepoint, then a storestore() is needed for its memory changes to be seen by the VMThread which is doing things in parallel. That means that the comment that sweeper.cpp doesn't need the storestore() is also contradictory. So what do you mean by this comment: > - sweeper thread runs outside safepoint and once we know that we can figure out the rest... Dan > > Roman > >> >> igor >> >> >>> >>> Thanks, >>> Tobias >> > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From igor.veresov at oracle.com Thu Jul 6 18:02:27 2017 From: igor.veresov at oracle.com (Igor Veresov) Date: Thu, 6 Jul 2017 11:02:27 -0700 Subject: RFR: Parallelize safepoint cleanup In-Reply-To: <58f2278e-b95c-4ec2-4f7d-9fefa3a281e4@redhat.com> References: <486b5a72-bef8-4ebc-2729-3fe3aa3ab3b9@oracle.com> <5c80f8df-27c9-f9a9-dc6d-47f9c6019a61@redhat.com> <46ad874e-eb41-7927-265a-40dea92dfe1e@oracle.com> <5711258b-99b0-e06f-ba6e-0b6b55d88345@redhat.com> <0e1e2779-9316-b756-6cc8-e0c8add14a94@oracle.com> <1910961c-11bd-0e86-dd03-4fce66b9969f@redhat.com> <2b466176-b688-53a8-bef9-c7ec2c8c745b@oracle.com> <42872a15-d26c-9798-c6a2-f3f7c945baf7@redhat.com> <5e7c7d00-4acd-bea3-3525-33dbd9159efb@oracle.com> <6f2c6de7-298b-bf14-ab1f-430c4acd43c9@redhat.com> <5cd676de-872d-6d4a-691b-da561173f7d0@oracle.com> <61d80e98-275f-b2b8-4ac7-6d5d03b047de@redhat.com> <0e44cc90-b384-0820-93d6-a70d22c501c3@oracle.com> <20E06CEC-38CA-41AE-99DB-17EF22A3C5CC@oracle.com> <58f2278e-b95c-4ec2-4f7d-9fefa3a281e4@redhat.com> Message-ID: > On Jul 6, 2017, at 9:53 AM, Roman Kennke wrote: > > Am 06.07.2017 um 18:47 schrieb Igor Veresov: >> >>> On Jul 6, 2017, at 3:14 AM, Tobias Hartmann > wrote: >>> >>> Hi, >>> >>> On 05.07.2017 20:30, Daniel D. Daugherty wrote: >>>> JDK-8132849 is assigned to Tobias; it would be good to get Tobias' >>>> review of this fix also. >>> >>> Thanks for the notification. The sweeper/safepoint changes look good to me! >>> >>>> src/share/vm/runtime/sweeper.cpp >>>> L205: // TODO: Is this really needed? >>>> L206: OrderAccess::storestore(); >>>> That's a good question. 
Looks like that storestore() was >>>> added by this changeset: >>>> >>>> $ hg log -r 5357 src/share/vm/runtime/sweeper.cpp >>>> changeset: 5357:510fbd28919c >>>> user: anoll >>>> date: Fri Sep 27 10:50:55 2013 +0200 >>>> summary: 8020151: PSR:PERF Large performance regressions when code cache is filled >>>> >>>> The changeset is not small and it looks like two >>>> OrderAccess::storestore() calls were added (and one >>>> load_ptr_acquire() was deleted): >>>> >>>> $ hg diff -r 5356 -r 5357 | grep OrderAccess >>>> + OrderAccess::storestore(); >>>> - nmethod *code = (nmethod *)OrderAccess::load_ptr_acquire(&_code); >>>> + OrderAccess::storestore(); >>>> >>>> It could be that the storestore() is matching an existing >>>> OrderAccess operation or it could have been added in an >>>> abundance of caution. We definitely need a Compiler team >>>> person to take a look here. >>> >>> Unfortunately, I'm also not sure if that barrier is required. Looking at the old RFR thread: >>> http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2013-September/011588.html >>> >>> It seems that Igor V. suggested this: >>> "You definitely need a store-store barrier for non-TSO architectures after the mark_as_seen_on_stack() call on line 1360. Otherwise it still can be reordered by the CPU with respect to the following state assignment. Also neither of these state variables are volatile in nmethod, so even the compiler may reorder the stores." >>> http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2013-September/011729.html >>> >>> The requested OrderAccess::storestore() was added to nmethod::make_not_entrant_or_zombie() but seems like Albert also added one to NMethodSweeper::mark_active_nmethods(). >>> >>> I'll ping Igor, maybe he knows more. >> >> >> I think the reason is explained in the comment: >> >> // Must happen before state change. Otherwise we have a race condition in >> // nmethod::can_not_entrant_be_converted(). 
I.e., a method can immediately >> // transition its state from 'not_entrant' to 'zombie' without having to wait >> // for stack scanning. >> if (state == not_entrant) { >> mark_as_seen_on_stack(); >> OrderAccess::storestore(); >> } >> >> // Change state >> _state = state; >> >> Although can_not_entrant_be_converted() is now called can_convert_to_zombie(). The scenario can go like this: >> 1. We're setting the state to not_entrant. But the _state assignment happens before setting the traversal count in mark_as_seen_on_stack(). >> 2. While we're doing this, the sweeper scans nmethods and is in process_compiled_method(): >> >> } else if (cm->is_not_entrant()) { >> // If there are no current activations of this method on the >> // stack we can safely convert it to a zombie method >> if (cm->can_convert_to_zombie()) { >> // Clear ICStubs to prevent back patching stubs of zombie or flushed >> // nmethods during the next safepoint (see ICStub::finalize). >> { >> MutexLocker cl(CompiledIC_lock); >> cm->clear_ic_stubs(); >> } >> // Code cache state change is tracked in make_zombie() >> cm->make_zombie(); >> >> >> So if state change happens before setting the traversal mark, the sweeper can go ahead and make it a zombie. >> >> >> Makes sense? Or am I missing something? > > I have probably not fully dug into the code. As far as I can see: > - sweeper thread runs outside safepoint > - VMThread (which is doing the nmethod marking in the case that I'm looking at) runs while all other threads (incl. the sweeper) is holding still. > > In between we have a guaranteed fence(). > > There should be no need for a storestore() (at least in sweeper.cpp... in nmethod.cpp it seems to actually make sense as you pointed out above). *However* it doesn't really hurt to OrderAccess::storestore() there... so play it conservative and leave it in, as RFR'd in my last patch? > A method can be made not entrant outside of a safepoint. And as you say sweeper thread runs outside safepoint too. 
That?s why there is a problem. igor > Roman > >> >> igor >> >> >>> >>> Thanks, >>> Tobias >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rkennke at redhat.com Thu Jul 6 18:05:01 2017 From: rkennke at redhat.com (Roman Kennke) Date: Thu, 6 Jul 2017 20:05:01 +0200 Subject: RFR: Parallelize safepoint cleanup In-Reply-To: References: <5c80f8df-27c9-f9a9-dc6d-47f9c6019a61@redhat.com> <46ad874e-eb41-7927-265a-40dea92dfe1e@oracle.com> <5711258b-99b0-e06f-ba6e-0b6b55d88345@redhat.com> <0e1e2779-9316-b756-6cc8-e0c8add14a94@oracle.com> <1910961c-11bd-0e86-dd03-4fce66b9969f@redhat.com> <2b466176-b688-53a8-bef9-c7ec2c8c745b@oracle.com> <42872a15-d26c-9798-c6a2-f3f7c945baf7@redhat.com> <5e7c7d00-4acd-bea3-3525-33dbd9159efb@oracle.com> <6f2c6de7-298b-bf14-ab1f-430c4acd43c9@redhat.com> <5cd676de-872d-6d4a-691b-da561173f7d0@oracle.com> <61d80e98-275f-b2b8-4ac7-6d5d03b047de@redhat.com> <0e44cc90-b384-0820-93d6-a70d22c501c3@oracle.com> <20E06CEC-38CA-41AE-99DB-17EF22A3C5CC@oracle.com> <58f2278e-b95c-4ec2-4f7d-9fefa3a281e4@redhat.com> Message-ID: <623c0dbf-9210-7c63-3546-4314c7d47f85@redhat.com> > > I'm not happy about this change: > > + ~ParallelSPCleanupThreadClosure() { > + // This is here to be consistent with sweeper.cpp > NMethodSweeper::mark_active_nmethods(). > + // TODO: Is this really needed? > + OrderAccess::storestore(); > + } > > because we're adding an OrderAccess::storestore() to be consistent > with an OrderAccess::storestore() that's not properly documented > which is only increasing the technical debt. > > So a couple of things above don't make sense to me: > > > - sweeper thread runs outside safepoint > > - VMThread (which is doing the nmethod marking in the case that > > I'm looking at) runs while all other threads (incl. the sweeper) > > is holding still. > > and: > > > There should be no need for a storestore() (at least in sweeper.cpp... Either one or the other are running. 
Either the VMThread is marking nmethods (during safepoint) or the sweeper threads are running (outside safepoint). Between the two phases, there is a guaranteed OrderAccess::fence() (see safepoint.cpp). Therefore, no storestore() should be necessary. >From Igor's comment I can see how it happened though: Apparently there *is* a race in sweeper's own concurrent processing (concurrent with compiler threads, as far as I understand). And there's a call to nmethod::mark_as_seen_on_stack() after which a storestore() is required (as per Igor's explanation). So the logic probably was: we have mark_as_seen_on_stack() followed by storestore() here, so let's also put a storestore() in the other places that call mark_as_seen_on_stack(), one of which happens to be the safepoint cleanup code that we're discussing. (why the storestore() hasn't been put right into mark_as_seen_on_stack() I don't understand). In short, one storestore() really was necessary, the other looks like it has been put there 'for consistency' or just conservatively. But it shouldn't be necessary in the safepoint cleanup code that we're discussing. So what should we do? Remove the storestore() for good? Refactor the code so that both paths at least call the storestore() in the same place? (E.g. make mark_active_nmethods() use the closure and call storestore() in the dtor as proposed?) Roman > > If the sweeper thread is running "outside safepoint", then how is > the sweeper thread "holding still" while the VMThread is doing the > nmethod marking? Those two points are contradictory. > > If the sweeper thread is indeed executing outside a safepoint, then > a storestore() is needed for its memory changes to be seen by the > VMThread which is doing things in parallel. That means that the > comment that sweeper.cpp doesn't need the storestore() is also > contradictory. > > So what do you mean by this comment: > > > - sweeper thread runs outside safepoint > > and once we know that we can figure out the rest... 
> > Dan > > >> >> Roman >> >>> >>> igor >>> >>> >>>> >>>> Thanks, >>>> Tobias >>> >> > From kim.barrett at oracle.com Thu Jul 6 20:11:47 2017 From: kim.barrett at oracle.com (Kim Barrett) Date: Thu, 6 Jul 2017 16:11:47 -0400 Subject: RFR (S): 8182703: Correct G1 barrier queue lock orderings In-Reply-To: <25F423D9-F8D5-4E62-8300-CCE106E70777@oracle.com> References: <59510D5E.10009@oracle.com> <25F423D9-F8D5-4E62-8300-CCE106E70777@oracle.com> Message-ID: <6D1B2CB2-366E-4DBD-9F2E-672325459343@oracle.com> > On Jul 4, 2017, at 10:00 PM, Kim Barrett wrote: > The lock ranking changes look good. I'm going to retract that. How does these new lock rankings interact with various assertions that rank() == or != Mutex::special? I'm not sure those places handle these new ranks properly. (I'm not sure those places handle Mutex::event rank properly either.) From kim.barrett at oracle.com Thu Jul 6 20:15:43 2017 From: kim.barrett at oracle.com (Kim Barrett) Date: Thu, 6 Jul 2017 16:15:43 -0400 Subject: RFR (S): 8182703: Correct G1 barrier queue lock orderings In-Reply-To: <6D1B2CB2-366E-4DBD-9F2E-672325459343@oracle.com> References: <59510D5E.10009@oracle.com> <25F423D9-F8D5-4E62-8300-CCE106E70777@oracle.com> <6D1B2CB2-366E-4DBD-9F2E-672325459343@oracle.com> Message-ID: <6B5ACCE3-CA0C-41C9-A45A-C79467FB8CE2@oracle.com> > On Jul 6, 2017, at 4:11 PM, Kim Barrett wrote: > >> On Jul 4, 2017, at 10:00 PM, Kim Barrett wrote: >> The lock ranking changes look good. > > I'm going to retract that. > > How does these new lock rankings interact with various assertions that > rank() == or != Mutex::special? I'm not sure those places handle > these new ranks properly. (I'm not sure those places handle > Mutex::event rank properly either.) And maybe this change needs to be discussed on hotspot-dev rather than hotspot-gc-dev. 
From robbin.ehn at oracle.com Thu Jul 6 21:02:50 2017 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Thu, 6 Jul 2017 23:02:50 +0200 Subject: RFR: Parallelize safepoint cleanup In-Reply-To: <623c0dbf-9210-7c63-3546-4314c7d47f85@redhat.com> References: <5c80f8df-27c9-f9a9-dc6d-47f9c6019a61@redhat.com> <46ad874e-eb41-7927-265a-40dea92dfe1e@oracle.com> <5711258b-99b0-e06f-ba6e-0b6b55d88345@redhat.com> <0e1e2779-9316-b756-6cc8-e0c8add14a94@oracle.com> <1910961c-11bd-0e86-dd03-4fce66b9969f@redhat.com> <2b466176-b688-53a8-bef9-c7ec2c8c745b@oracle.com> <42872a15-d26c-9798-c6a2-f3f7c945baf7@redhat.com> <5e7c7d00-4acd-bea3-3525-33dbd9159efb@oracle.com> <6f2c6de7-298b-bf14-ab1f-430c4acd43c9@redhat.com> <5cd676de-872d-6d4a-691b-da561173f7d0@oracle.com> <61d80e98-275f-b2b8-4ac7-6d5d03b047de@redhat.com> <0e44cc90-b384-0820-93d6-a70d22c501c3@oracle.com> <20E06CEC-38CA-41AE-99DB-17EF22A3C5CC@oracle.com> <58f2278e-b95c-4ec2-4f7d-9fefa3a281e4@redhat.com> <623c0dbf-9210-7c63-3546-4314c7d47f85@redhat.com> Message-ID: <29521e46-a5e8-5ff0-23a2-22eeee145389@oracle.com> Hi, Far down -> On 07/06/2017 08:05 PM, Roman Kennke wrote: > >> >> I'm not happy about this change: >> >> + ~ParallelSPCleanupThreadClosure() { >> + // This is here to be consistent with sweeper.cpp >> NMethodSweeper::mark_active_nmethods(). >> + // TODO: Is this really needed? >> + OrderAccess::storestore(); >> + } >> >> because we're adding an OrderAccess::storestore() to be consistent >> with an OrderAccess::storestore() that's not properly documented >> which is only increasing the technical debt. >> >> So a couple of things above don't make sense to me: >> >>> - sweeper thread runs outside safepoint >>> - VMThread (which is doing the nmethod marking in the case that >>> I'm looking at) runs while all other threads (incl. the sweeper) >>> is holding still. >> >> and: >> >>> There should be no need for a storestore() (at least in sweeper.cpp... > > Either one or the other are running. 
Either the VMThread is marking > nmethods (during safepoint) or the sweeper threads are running (outside > safepoint). Between the two phases, there is a guaranteed > OrderAccess::fence() (see safepoint.cpp). Therefore, no storestore() > should be necessary. > > From Igor's comment I can see how it happened though: Apparently there > *is* a race in sweeper's own concurrent processing (concurrent with > compiler threads, as far as I understand). And there's a call to > nmethod::mark_as_seen_on_stack() after which a storestore() is required > (as per Igor's explanation). So the logic probably was: we have > mark_as_seen_on_stack() followed by storestore() here, so let's also put > a storestore() in the other places that call mark_as_seen_on_stack(), > one of which happens to be the safepoint cleanup code that we're > discussing. (why the storestore() hasn't been put right into > mark_as_seen_on_stack() I don't understand). In short, one storestore() > really was necessary, the other looks like it has been put there 'for > consistency' or just conservatively. But it shouldn't be necessary in > the safepoint cleanup code that we're discussing. > > So what should we do? Remove the storestore() for good? Refactor the > code so that both paths at least call the storestore() in the same > place? (E.g. make mark_active_nmethods() use the closure and call > storestore() in the dtor as proposed?) I took a quick look, maybe I'm missing some stuff but: So there is a slight optimization when not running sweeper to skip compiler barrier/fence in stw. 
Don't think that matter, so I propose something like: - long stack_traversal_mark() { return _stack_traversal_mark; } - void set_stack_traversal_mark(long l) { _stack_traversal_mark = l; } + long stack_traversal_mark() { return OrderAccess::load_acquire(&_stack_traversal_mark); } + void set_stack_traversal_mark(long l) { OrderAccess::release_store(&_stack_traversal_mark, l); } Maybe make _stack_traversal_mark volatile also, just as a marking that it is concurrent accessed. And remove both storestore. "Also neither of these state variables are volatile in nmethod, so even the compiler may reorder the stores" Fortunately at least _state is volatile now. I think _state also should use la/rs semantics instead, but that's another story. Thanks, Robbin > > > Roman > > >> >> If the sweeper thread is running "outside safepoint", then how is >> the sweeper thread "holding still" while the VMThread is doing the >> nmethod marking? Those two points are contradictory. >> >> If the sweeper thread is indeed executing outside a safepoint, then >> a storestore() is needed for its memory changes to be seen by the >> VMThread which is doing things in parallel. That means that the >> comment that sweeper.cpp doesn't need the storestore() is also >> contradictory. >> >> So what do you mean by this comment: >> >>> - sweeper thread runs outside safepoint >> >> and once we know that we can figure out the rest... 
>> >> Dan >> >> >>> >>> Roman >>> >>>> >>>> igor >>>> >>>> >>>>> >>>>> Thanks, >>>>> Tobias >>>> >>> >> > From email.sundarms at gmail.com Thu Jul 6 23:03:17 2017 From: email.sundarms at gmail.com (Sundara Mohan M) Date: Thu, 6 Jul 2017 16:03:17 -0700 Subject: Adaptive size policy and MemoryPoolMXBean Message-ID: Hi, I am trying to understand how will be values returned by memorypoolmxbean change when adaptive size is used For ex, CMS GC memory pool mx bean returned values *Name: Par Eden Space, usage: init = 71630848(69952K) used = 2865272(2798K) committed = 71630848(69952K) max = 286326784(279616K)* *Name: Par Survivor Space, usage: init = 8912896(8704K) used = 0(0K) committed = 8912896(8704K) max = 35782656(34944K)* *Name: CMS Old Gen, usage: init = 178978816(174784K) used = 0(0K) committed = 178978816(174784K) max = 715849728(699072K)* *G1GC *memory pool mx bean returned values *Name: G1 Eden Space, usage: init = 27262976(26624K) used = 0(0K) committed = 27262976(26624K) max = -1(-1K)* *Name: G1 Survivor Space, usage: init = 0(0K) used = 0(0K) committed = 0(0K) max = -1(-1K)* *Name: G1 Old Gen, usage: init = 241172480(235520K) used = 0(0K) committed = 241172480(235520K) max = 1073741824(1048576K)* This is the value returned after starting jvm, will this(specifically committed or max) value updated after adaptive size is applied? Motivation: I am trying to understand how i can use this info (got to know about this from https://techblug.wordpress.com/2011/07/21/detecting-low-memory-in-java-part-2/) detect low memory and drop some objects in my map. Thanks, Sundar -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From kim.barrett at oracle.com Thu Jul 6 23:29:47 2017 From: kim.barrett at oracle.com (Kim Barrett) Date: Thu, 6 Jul 2017 19:29:47 -0400 Subject: RFR (M) 8183923: Get rid of FreeBlockDictionary and dithering In-Reply-To: <6ac9125f-87ac-4211-b90c-29526b6aae29@oracle.com> References: <6ac9125f-87ac-4211-b90c-29526b6aae29@oracle.com> Message-ID: > On Jul 6, 2017, at 5:20 AM, Mikael Gerdin wrote: > > Hi all, > > Please review this cleanup inspired by looking at Roman's CMS cleanup :) > > FreeBlockDictionary is an old abstraction for multiple CMS freelist datastructures which never appear to have been implemented, getting rid of it also simplifies some code in Metaspace so it's not all CMS stuff. > > Testing: jprt > Bug: https://bugs.openjdk.java.net/browse/JDK-8183923 > Webrev: http://cr.openjdk.java.net/~mgerdin/8183923/webrev.0/index.html > > Thanks > /Mikael Looks good. From rkennke at redhat.com Fri Jul 7 08:53:44 2017 From: rkennke at redhat.com (Roman Kennke) Date: Fri, 7 Jul 2017 10:53:44 +0200 Subject: RFR: 8179268: Factor out AdaptiveSizePolicy from top-level interfaces CollectorPolicy and CollectedHeap In-Reply-To: <134884f8-7001-f0ee-9e57-9ec0b2520752@oracle.com> References: <134884f8-7001-f0ee-9e57-9ec0b2520752@oracle.com> Message-ID: <8764a8c9-995c-fec2-9c98-e35f43ccd4d6@redhat.com> Am 05.07.2017 um 13:12 schrieb Mikael Gerdin: > Hi Roman, > > On 2017-07-04 20:47, Roman Kennke wrote: >> AdaptiveSizePolicy is not used/called from outside the GCs, and not all >> GCs need them. It makes sense to remove it from the CollectedHeap and >> CollectorPolicy interfaces and move them down to the actual subclasses >> that used them. >> >> I moved AdaptiveSizePolicyOutput to parallelScavengeHeap.hpp, it's only >> used/implemented in the parallel GC. Also, I made this class AllStatic >> (was StackObj) >> >> Tested by running hotspot_gc jtreg tests without regressions. 
>> >> http://cr.openjdk.java.net/~rkennke/8179268/webrev.00/ > > Please correct me if I'm wrong here but it looks like all the non-G1 > collectors set the _should_clear_all_soft_refs based on > gc_overhead_limit_near. > Perhaps the ClearedAllSoftRefs scoped object could be modified to only > work with GenCollectorPolicy derived policies (which include parallel > *shrugs*) and G1 should just stop worrying about _all_soft_refs_clear. > Looking closer, I can't even find G1 code looking at that member so > maybe it, too, should be moved to GenCollectorPolicy? I can't find any place where should_clear_all_soft_refs() would become true for G1. And, as you mention, G1 doesn't even look at all_soft_refs_clear() either. I removed those parts from G1, and moved all soft_refs stuff down to GenCollectorPolicy. I also changed the way the casting accessors as_generation_policy() etc work: the as_* accessors now crash with ShouldNotReachHere() when called for the wrong policy type, and the is_* accessors now return constant true/false based on their type (so that it doesn't crash with ShouldNotReachHere() ..). I think this is more useful than the way it's been done before. http://cr.openjdk.java.net/~rkennke/8179268/webrev.01/ Tested by: hotspot_gc jtreg tests. What do you think? 
Roman From rkennke at redhat.com Fri Jul 7 09:10:46 2017 From: rkennke at redhat.com (Roman Kennke) Date: Fri, 7 Jul 2017 11:10:46 +0200 Subject: RFR: Parallelize safepoint cleanup In-Reply-To: References: <5c80f8df-27c9-f9a9-dc6d-47f9c6019a61@redhat.com> <46ad874e-eb41-7927-265a-40dea92dfe1e@oracle.com> <5711258b-99b0-e06f-ba6e-0b6b55d88345@redhat.com> <0e1e2779-9316-b756-6cc8-e0c8add14a94@oracle.com> <1910961c-11bd-0e86-dd03-4fce66b9969f@redhat.com> <2b466176-b688-53a8-bef9-c7ec2c8c745b@oracle.com> <42872a15-d26c-9798-c6a2-f3f7c945baf7@redhat.com> <5e7c7d00-4acd-bea3-3525-33dbd9159efb@oracle.com> <6f2c6de7-298b-bf14-ab1f-430c4acd43c9@redhat.com> <5cd676de-872d-6d4a-691b-da561173f7d0@oracle.com> <61d80e98-275f-b2b8-4ac7-6d5d03b047de@redhat.com> <0e44cc90-b384-0820-93d6-a70d22c501c3@oracle.com> <20E06CEC-38CA-41AE-99DB-17EF22A3C5CC@oracle.com> <58f2278e-b95c-4ec2-4f7d-9fefa3a281e4@redhat.com> Message-ID: <2b4ea576-5133-4d5e-6fdb-1d60f40ec037@redhat.com> > > I'm not happy about this change: > > + ~ParallelSPCleanupThreadClosure() { > + // This is here to be consistent with sweeper.cpp > NMethodSweeper::mark_active_nmethods(). > + // TODO: Is this really needed? > + OrderAccess::storestore(); > + } > > because we're adding an OrderAccess::storestore() to be consistent > with an OrderAccess::storestore() that's not properly documented > which is only increasing the technical debt. > > So a couple of things above don't make sense to me: > > > - sweeper thread runs outside safepoint > > - VMThread (which is doing the nmethod marking in the case that > > I'm looking at) runs while all other threads (incl. the sweeper) > > is holding still. > > and: > > > There should be no need for a storestore() (at least in sweeper.cpp... Either one or the other are running. Either the VMThread is marking nmethods (during safepoint) or the sweeper threads are running (outside safepoint). Between the two phases, there is a guaranteed OrderAccess::fence() (see safepoint.cpp). 
Therefore, no storestore() should be necessary. >From Igor's comment I can see how it happened though: Apparently there *is* a race in sweeper's own concurrent processing (concurrent with compiler threads, as far as I understand). And there's a call to nmethod::mark_as_seen_on_stack() after which a storestore() is required (as per Igor's explanation). So the logic probably was: we have mark_as_seen_on_stack() followed by storestore() here, so let's also put a storestore() in the other places that call mark_as_seen_on_stack(), one of which happens to be the safepoint cleanup code that we're discussing. (why the storestore() hasn't been put right into mark_as_seen_on_stack() I don't understand). In short, one storestore() really was necessary, the other looks like it has been put there 'for consistency' or just conservatively. But it shouldn't be necessary in the safepoint cleanup code that we're discussing. So what should we do? Remove the storestore() for good? Refactor the code so that both paths at least call the storestore() in the same place? (E.g. make mark_active_nmethods() use the closure and call storestore() in the dtor as proposed?) Roman > > If the sweeper thread is running "outside safepoint", then how is > the sweeper thread "holding still" while the VMThread is doing the > nmethod marking? Those two points are contradictory. > > If the sweeper thread is indeed executing outside a safepoint, then > a storestore() is needed for its memory changes to be seen by the > VMThread which is doing things in parallel. That means that the > comment that sweeper.cpp doesn't need the storestore() is also > contradictory. > > So what do you mean by this comment: > > > - sweeper thread runs outside safepoint > > and once we know that we can figure out the rest... 
> > Dan > > >> >> Roman >> >>> >>> igor >>> >>> >>>> >>>> Thanks, >>>> Tobias >>> >> > From rkennke at redhat.com Fri Jul 7 10:51:38 2017 From: rkennke at redhat.com (Roman Kennke) Date: Fri, 7 Jul 2017 12:51:38 +0200 Subject: RFR: Parallelize safepoint cleanup In-Reply-To: <29521e46-a5e8-5ff0-23a2-22eeee145389@oracle.com> References: <46ad874e-eb41-7927-265a-40dea92dfe1e@oracle.com> <5711258b-99b0-e06f-ba6e-0b6b55d88345@redhat.com> <0e1e2779-9316-b756-6cc8-e0c8add14a94@oracle.com> <1910961c-11bd-0e86-dd03-4fce66b9969f@redhat.com> <2b466176-b688-53a8-bef9-c7ec2c8c745b@oracle.com> <42872a15-d26c-9798-c6a2-f3f7c945baf7@redhat.com> <5e7c7d00-4acd-bea3-3525-33dbd9159efb@oracle.com> <6f2c6de7-298b-bf14-ab1f-430c4acd43c9@redhat.com> <5cd676de-872d-6d4a-691b-da561173f7d0@oracle.com> <61d80e98-275f-b2b8-4ac7-6d5d03b047de@redhat.com> <0e44cc90-b384-0820-93d6-a70d22c501c3@oracle.com> <20E06CEC-38CA-41AE-99DB-17EF22A3C5CC@oracle.com> <58f2278e-b95c-4ec2-4f7d-9fefa3a281e4@redhat.com> <623c0dbf-9210-7c63-3546-4314c7d47f85@redhat.com> <29521e46-a5e8-5ff0-23a2-22eeee145389@oracle.com> Message-ID: <4445a727-060b-70f9-c8db-e9f70faae3d5@redhat.com> Hi Robbin, > > Far down -> > > On 07/06/2017 08:05 PM, Roman Kennke wrote: >> >>> >>> I'm not happy about this change: >>> >>> + ~ParallelSPCleanupThreadClosure() { >>> + // This is here to be consistent with sweeper.cpp >>> NMethodSweeper::mark_active_nmethods(). >>> + // TODO: Is this really needed? >>> + OrderAccess::storestore(); >>> + } >>> >>> because we're adding an OrderAccess::storestore() to be consistent >>> with an OrderAccess::storestore() that's not properly documented >>> which is only increasing the technical debt. >>> >>> So a couple of things above don't make sense to me: >>> >>>> - sweeper thread runs outside safepoint >>>> - VMThread (which is doing the nmethod marking in the case that >>>> I'm looking at) runs while all other threads (incl. the sweeper) >>>> is holding still. 
>>> >>> and: >>> >>>> There should be no need for a storestore() (at least in sweeper.cpp... >> >> Either one or the other are running. Either the VMThread is marking >> nmethods (during safepoint) or the sweeper threads are running (outside >> safepoint). Between the two phases, there is a guaranteed >> OrderAccess::fence() (see safepoint.cpp). Therefore, no storestore() >> should be necessary. >> >> From Igor's comment I can see how it happened though: Apparently there >> *is* a race in sweeper's own concurrent processing (concurrent with >> compiler threads, as far as I understand). And there's a call to >> nmethod::mark_as_seen_on_stack() after which a storestore() is required >> (as per Igor's explanation). So the logic probably was: we have >> mark_as_seen_on_stack() followed by storestore() here, so let's also put >> a storestore() in the other places that call mark_as_seen_on_stack(), >> one of which happens to be the safepoint cleanup code that we're >> discussing. (why the storestore() hasn't been put right into >> mark_as_seen_on_stack() I don't understand). In short, one storestore() >> really was necessary, the other looks like it has been put there 'for >> consistency' or just conservatively. But it shouldn't be necessary in >> the safepoint cleanup code that we're discussing. >> >> So what should we do? Remove the storestore() for good? Refactor the >> code so that both paths at least call the storestore() in the same >> place? (E.g. make mark_active_nmethods() use the closure and call >> storestore() in the dtor as proposed?) > > I took a quick look, maybe I'm missing some stuff but: > > So there is a slight optimization when not running sweeper to skip > compiler barrier/fence in stw. 
> > Don't think that matter, so I propose something like: > - long stack_traversal_mark() { return > _stack_traversal_mark; } > - void set_stack_traversal_mark(long l) { > _stack_traversal_mark = l; } > + long stack_traversal_mark() { return > OrderAccess::load_acquire(&_stack_traversal_mark); } > + void set_stack_traversal_mark(long l) { > OrderAccess::release_store(&_stack_traversal_mark, l); } > > Maybe make _stack_traversal_mark volatile also, just as a marking that > it is concurrent accessed. > And remove both storestore. > > "Also neither of these state variables are volatile in nmethod, so > even the compiler may reorder the stores" > Fortunately at least _state is volatile now. > > I think _state also should use la/rs semantics instead, but that's > another story. Like this? http://cr.openjdk.java.net/~rkennke/8180932/webrev.12/ Roman From erik.helin at oracle.com Fri Jul 7 11:16:40 2017 From: erik.helin at oracle.com (Erik Helin) Date: Fri, 7 Jul 2017 13:16:40 +0200 Subject: RFR (S): 8183226: Remembered set summarization accesses not fully initialized java thread DCQS In-Reply-To: <1499083970.2802.33.camel@oracle.com> References: <1499083970.2802.33.camel@oracle.com> Message-ID: <0fb5d7cf-49b3-84a1-97b4-cdd53f0173e3@oracle.com> On 07/03/2017 02:12 PM, Thomas Schatzl wrote: > Hi all, Hi Thomas, > can I get reviews for the following change that breaks some > dependency cycle in g1remset initialization to fix some (at this time > benign) bug when printing remembered set summarization information? > > The problem is that G1Remset initializes its internal remembered set > summarization helper data structure in the constructor, which accesses > some DCQS members before we call the initialize methods on the various > global DCQS'es in G1CollectedHeap::initialize(). > By splitting the initialization of the remembered set summarization > into an extra method, this one can be called at the very end of > G1CollectedHeap::initialize(), thus breaking the dependency. 
I think there is an easier way to achieve this :) The default
constructor for G1RemSetSummary sets up almost all fields, and if we
make it really set up _all_ fields, then I believe we are good:
- G1RemSetSummary::_num_vtimes can be set up in the constructor,
  because the number of entries only depends on
  ConcurrentG1Refine::thread_num(), which is a static function that
  only returns G1ConcRefinementThreads.
- G1RemSetSummary::_rs_threads_vtimes can be allocated in the
  constructor.
- The value for _rs_threads_vtimes can be initialized to 0, since the
  accumulated virtual time for each concurrent refinement thread
  should be 0 (since they haven't even started yet).
- Same reasoning as above goes for _sampling_thread_vtime.
- _rem_set can be NULL

With the above changes, G1RemSet will call the default constructor
(same as it currently does). The call to
_prev_period_summary.initialize() will be removed from the G1RemSet
constructor.

With the above changes, G1RemSetSummary::G1RemSetSummary() has no
dependencies on any other class, and is still initialized to the
correct values. I think this is all that is needed to solve this
problem.

The rest, below this line, is just existing code that could really
benefit from a cleanup :)

The G1RemSetSummary::initialize method is no longer needed,
G1RemSetSummary can now instead have a constructor taking a G1RemSet*
as argument. That constructor will do what G1RemSetSummary::initialize
does today.
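The constructor-based fix Erik outlines above can be sketched as a small standalone model. This is a simplified illustration, not the HotSpot code: `refinement_thread_num()` is a hypothetical stand-in for ConcurrentG1Refine::thread_num(), and the real class carries more state.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for ConcurrentG1Refine::thread_num(); in HotSpot this
// is a static function that only reads the G1ConcRefinementThreads flag, so
// it is safe to call before the heap/DCQS structures are initialized.
static std::size_t refinement_thread_num() { return 4; }

class G1RemSet; // only referenced via pointer; may be null before init

// Simplified model: the default constructor sets up *all* fields, so no
// separate initialize() step is needed and there is no ordering dependency
// on DCQS/heap initialization.
class G1RemSetSummary {
  const G1RemSet* _rem_set;               // null until a G1RemSet exists
  std::size_t _num_vtimes;                // depends only on a static function
  std::vector<double> _rs_threads_vtimes; // all zero: threads not started yet
  double _sampling_thread_vtime;

public:
  G1RemSetSummary()
    : _rem_set(nullptr),
      _num_vtimes(refinement_thread_num()),
      _rs_threads_vtimes(refinement_thread_num(), 0.0),
      _sampling_thread_vtime(0.0) {}

  std::size_t num_vtimes() const { return _num_vtimes; }
  double rs_thread_vtime(std::size_t i) const { return _rs_threads_vtimes[i]; }
  double sampling_thread_vtime() const { return _sampling_thread_vtime; }
};
```

A default-constructed summary is then valid immediately, which is exactly the property that breaks the initialization cycle.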
In G1RemSet::print_periodic_summary_info, the code can then look like:

G1RemSetSummary current(this);
_prev_period_summary.subtract(&current);

For extra, extra bonus points, we should make
G1RemSetSummary::subtract_from work the other way around, so that the
above code reads:

G1RemSetSummary current(this);
current.subtract(_prev_period_summary); // current -= prev

instead of what the code does today:

prev.subtract_from(current); // prev = current - prev

which to me reads completely backwards :)

Finally, it would be very nice for G1RemSetSummary to get a proper copy
constructor, so that the last line in print_periodic_summary:

_prev_period_summary.set(&current);

can just become:

_prev_period_summary = current;

(G1RemSetSummary::set is just a copy-constructor in disguise)

You don't need to do all the cleanups, but I think having a fully
functioning default constructor is a better way to solve this problem,
rather than shuffling the call to initialize around.

What do you think?

Thanks,
Erik

> Benign because the values accessed at that time have the same values as
> the values after initialization.
>
> This also allows for grouping together the initialization of
> G1RemSet/DCQS/G1ConcurrentRefine related data structures more easily in
> G1CollectedHeap::initialize().
> > CR: > https://bugs.openjdk.java.net/browse/JDK-8183226 > Webrev: > http://cr.openjdk.java.net/~tschatzl/8183226/webrev/ > Testing: > local testing running remembered set summarization manually, jprt > > Thanks, > Thomas > From robbin.ehn at oracle.com Fri Jul 7 11:23:30 2017 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Fri, 7 Jul 2017 13:23:30 +0200 Subject: RFR: Parallelize safepoint cleanup In-Reply-To: <4445a727-060b-70f9-c8db-e9f70faae3d5@redhat.com> References: <5711258b-99b0-e06f-ba6e-0b6b55d88345@redhat.com> <0e1e2779-9316-b756-6cc8-e0c8add14a94@oracle.com> <1910961c-11bd-0e86-dd03-4fce66b9969f@redhat.com> <2b466176-b688-53a8-bef9-c7ec2c8c745b@oracle.com> <42872a15-d26c-9798-c6a2-f3f7c945baf7@redhat.com> <5e7c7d00-4acd-bea3-3525-33dbd9159efb@oracle.com> <6f2c6de7-298b-bf14-ab1f-430c4acd43c9@redhat.com> <5cd676de-872d-6d4a-691b-da561173f7d0@oracle.com> <61d80e98-275f-b2b8-4ac7-6d5d03b047de@redhat.com> <0e44cc90-b384-0820-93d6-a70d22c501c3@oracle.com> <20E06CEC-38CA-41AE-99DB-17EF22A3C5CC@oracle.com> <58f2278e-b95c-4ec2-4f7d-9fefa3a281e4@redhat.com> <623c0dbf-9210-7c63-3546-4314c7d47f85@redhat.com> <29521e46-a5e8-5ff0-23a2-22eeee145389@oracle.com> <4445a727-060b-70f9-c8db-e9f70faae3d5@redhat.com> Message-ID: <37755fec-05b9-8d2c-7eb9-8849393c7485@oracle.com> Hi Roman, On 07/07/2017 12:51 PM, Roman Kennke wrote: > Hi Robbin, > >> >> Far down -> >> >> On 07/06/2017 08:05 PM, Roman Kennke wrote: >>> >>>> >>>> I'm not happy about this change: >>>> >>>> + ~ParallelSPCleanupThreadClosure() { >>>> + // This is here to be consistent with sweeper.cpp >>>> NMethodSweeper::mark_active_nmethods(). >>>> + // TODO: Is this really needed? >>>> + OrderAccess::storestore(); >>>> + } >>>> >>>> because we're adding an OrderAccess::storestore() to be consistent >>>> with an OrderAccess::storestore() that's not properly documented >>>> which is only increasing the technical debt. 
>>>> >>>> So a couple of things above don't make sense to me: >>>> >>>>> - sweeper thread runs outside safepoint >>>>> - VMThread (which is doing the nmethod marking in the case that >>>>> I'm looking at) runs while all other threads (incl. the sweeper) >>>>> is holding still. >>>> >>>> and: >>>> >>>>> There should be no need for a storestore() (at least in sweeper.cpp... >>> >>> Either one or the other are running. Either the VMThread is marking >>> nmethods (during safepoint) or the sweeper threads are running (outside >>> safepoint). Between the two phases, there is a guaranteed >>> OrderAccess::fence() (see safepoint.cpp). Therefore, no storestore() >>> should be necessary. >>> >>> From Igor's comment I can see how it happened though: Apparently there >>> *is* a race in sweeper's own concurrent processing (concurrent with >>> compiler threads, as far as I understand). And there's a call to >>> nmethod::mark_as_seen_on_stack() after which a storestore() is required >>> (as per Igor's explanation). So the logic probably was: we have >>> mark_as_seen_on_stack() followed by storestore() here, so let's also put >>> a storestore() in the other places that call mark_as_seen_on_stack(), >>> one of which happens to be the safepoint cleanup code that we're >>> discussing. (why the storestore() hasn't been put right into >>> mark_as_seen_on_stack() I don't understand). In short, one storestore() >>> really was necessary, the other looks like it has been put there 'for >>> consistency' or just conservatively. But it shouldn't be necessary in >>> the safepoint cleanup code that we're discussing. >>> >>> So what should we do? Remove the storestore() for good? Refactor the >>> code so that both paths at least call the storestore() in the same >>> place? (E.g. make mark_active_nmethods() use the closure and call >>> storestore() in the dtor as proposed?) 
>> >> I took a quick look, maybe I'm missing some stuff but: >> >> So there is a slight optimization when not running sweeper to skip >> compiler barrier/fence in stw. >> >> Don't think that matter, so I propose something like: >> - long stack_traversal_mark() { return >> _stack_traversal_mark; } >> - void set_stack_traversal_mark(long l) { >> _stack_traversal_mark = l; } >> + long stack_traversal_mark() { return >> OrderAccess::load_acquire(&_stack_traversal_mark); } >> + void set_stack_traversal_mark(long l) { >> OrderAccess::release_store(&_stack_traversal_mark, l); } >> >> Maybe make _stack_traversal_mark volatile also, just as a marking that >> it is concurrent accessed. >> And remove both storestore. >> >> "Also neither of these state variables are volatile in nmethod, so >> even the compiler may reorder the stores" >> Fortunately at least _state is volatile now. >> >> I think _state also should use la/rs semantics instead, but that's >> another story. > > Like this? > > http://cr.openjdk.java.net/~rkennke/8180932/webrev.12/ > Yes, exactly, I like this! Dan? Igor ? Tobias? Thanks Roman! BTW I'm going on vacation (5w) in a few hours, but I will follow this thread/changeset to the end! /Robbin > > Roman > From erik.helin at oracle.com Fri Jul 7 12:23:21 2017 From: erik.helin at oracle.com (Erik Helin) Date: Fri, 7 Jul 2017 14:23:21 +0200 Subject: RFR (M) 8183923: Get rid of FreeBlockDictionary and dithering In-Reply-To: <6ac9125f-87ac-4211-b90c-29526b6aae29@oracle.com> References: <6ac9125f-87ac-4211-b90c-29526b6aae29@oracle.com> Message-ID: On 07/06/2017 11:20 AM, Mikael Gerdin wrote: > Hi all, > > Please review this cleanup inspired by looking at Roman's CMS cleanup :) > > FreeBlockDictionary is an old abstraction for multiple CMS freelist > datastructures which never appear to have been implemented, getting rid > of it also simplifies some code in Metaspace so it's not all CMS stuff. 
> > Testing: jprt > Bug: https://bugs.openjdk.java.net/browse/JDK-8183923 > Webrev: http://cr.openjdk.java.net/~mgerdin/8183923/webrev.0/index.html Looks good, Reviewed. Thanks for cleaning this up Mikael! Erik > Thanks > /Mikael From erik.helin at oracle.com Fri Jul 7 12:35:21 2017 From: erik.helin at oracle.com (Erik Helin) Date: Fri, 7 Jul 2017 14:35:21 +0200 Subject: RFR: 8179387: Factor out CMS specific code from GenCollectedHeap into its own subclass In-Reply-To: <13358626-e399-e352-1711-587416621aac@redhat.com> References: <3521009f-6fab-4f8e-2375-b9d665a4c70b@redhat.com> <3d8b55a2-a787-3051-b351-ab9b0a24f5e0@redhat.com> <47e22e86-7d7c-606f-1936-346229f39ca2@oracle.com> <9a846161-c8ac-dedf-5952-f457d546fd9a@redhat.com> <4d5e6af8-d975-7803-64c5-7295e0d56154@redhat.com> <13358626-e399-e352-1711-587416621aac@redhat.com> Message-ID: <27af0ad2-fe78-3536-2143-996dd42583ab@oracle.com> On 07/06/2017 06:23 PM, Roman Kennke wrote: >>>> Ok to push this? >> >> I just realized that your change doesn't build on Windows since you >> didn't #include "precompiled.hpp" in cmsHeap.cpp. MSVC is really picky >> about that. >> /Mikael > > Uhhh. > Ok, here's revision #3 with precompiled added in: > > http://cr.openjdk.java.net/~rkennke/8179387/webrev.03/ > Hi Roman, I just started looking :) I think GenCollectedHeap::gc_prologue and GenCollectedHeap::gc_epilogue should be virtual, and always_do_update_barrier = UseConcMarkSweepGC moved down CMSHeap::gc_epilogue. What do you think? 
Thanks, Erik > Roman > From erik.helin at oracle.com Fri Jul 7 13:07:02 2017 From: erik.helin at oracle.com (Erik Helin) Date: Fri, 7 Jul 2017 15:07:02 +0200 Subject: RFR (XS): 8183397: Ensure consistent closure filtering during evacuation In-Reply-To: <1499329701.2760.3.camel@oracle.com> References: <1499081088.2802.29.camel@oracle.com> <64943738-9d9f-0d88-b44e-9a9ec0812f33@oracle.com> <1499329701.2760.3.camel@oracle.com> Message-ID: <9c538d19-e9e9-cb61-640a-7476d2e0c725@oracle.com> On 07/06/2017 10:28 AM, Thomas Schatzl wrote: > Hi Erik, > > On Thu, 2017-07-06 at 10:20 +0200, Erik Helin wrote: >> On 07/03/2017 01:24 PM, Thomas Schatzl wrote: >>> >>> Hi all, >> Hi Thomas, >> >>> >>> can I have reviews for this change that fixes an observation that >>> has >>> been made recently by Erik, i.e. that the "else" part of several >>> evacuation closures inconsistently filters out non-cross-region >>> references before checking whether the referenced object is a >>> humongous >>> or ext region. >>> >>> This causes somewhat hard to diagnose performance issues, and >>> earlier >>> filtering does not hurt if done anyway. >>> >>> (Note that the current way of checking in all but the UpdateRS >>> closure >>> using HeapRegion::is_in_same_region() seems optimal. The only >>> reason >>> why the other way in the UpdateRS closure is better because the >>> code >>> needs the "to" HeapRegion pointer anyway) >>> >>> CR: >>> https://bugs.openjdk.java.net/browse/JDK-8183397 >>> Webrev: >>> http://cr.openjdk.java.net/~tschatzl/8183397/webrev/ >> - } else if (in_cset_state.is_humongous()) { >> + } else { >> + if (in_cset_state.is_humongous()) { >> >> Why change `else if` to `else { if (...) {` here? Does it result in >> the >> compiler generating faster code for this case? > > no. It only makes this do_oop_*() method look similar in structure to > our do_oop_*() methods in the closures. > > I.e. 
> > if (in_cset.state.is_in_cset()) { > // do stuff for refs into cset > } else { > // expanding handle_non_cset_obj_common() > if (state.is_humongous()) { > } else ... > } > > I felt this improves overall readability, but this may only be because > I have been working in this code a lot recently. I can revert this > change. Yeah, I suspected this was your reasoning. IMO, the code is a bit too spread out for this to work here, a reader of g1ParScanThreadState.inline.hpp might not be aware of the idioms used is g1OopClosures.inline.hpp. So, for me, please use `else if` in g1ParScanThreadState.inline.hpp :) I do not need to re-review that change. Great work Thomas, thanks! Erik > Thanks for your review, > Thomas > From rkennke at redhat.com Fri Jul 7 13:21:06 2017 From: rkennke at redhat.com (Roman Kennke) Date: Fri, 7 Jul 2017 15:21:06 +0200 Subject: RFR: 8179387: Factor out CMS specific code from GenCollectedHeap into its own subclass In-Reply-To: <27af0ad2-fe78-3536-2143-996dd42583ab@oracle.com> References: <3521009f-6fab-4f8e-2375-b9d665a4c70b@redhat.com> <3d8b55a2-a787-3051-b351-ab9b0a24f5e0@redhat.com> <47e22e86-7d7c-606f-1936-346229f39ca2@oracle.com> <9a846161-c8ac-dedf-5952-f457d546fd9a@redhat.com> <4d5e6af8-d975-7803-64c5-7295e0d56154@redhat.com> <13358626-e399-e352-1711-587416621aac@redhat.com> <27af0ad2-fe78-3536-2143-996dd42583ab@oracle.com> Message-ID: <4bc53aaa-b98a-8a61-73bf-d30ac3f402b8@redhat.com> Am 07.07.2017 um 14:35 schrieb Erik Helin: > On 07/06/2017 06:23 PM, Roman Kennke wrote: >>>>> Ok to push this? >>> >>> I just realized that your change doesn't build on Windows since you >>> didn't #include "precompiled.hpp" in cmsHeap.cpp. MSVC is really picky >>> about that. >>> /Mikael >> >> Uhhh. 
>> Ok, here's revision #3 with precompiled added in:
>>
>> http://cr.openjdk.java.net/~rkennke/8179387/webrev.03/
>>
>
> Hi Roman,
>
> I just started looking :) I think GenCollectedHeap::gc_prologue and
> GenCollectedHeap::gc_epilogue should be virtual, and
> always_do_update_barrier = UseConcMarkSweepGC moved down
> CMSHeap::gc_epilogue.
>
> What do you think?

Yes, I have seen that. My original plan was to leave it as is because
I know that Erik Ö. is working on a big barrier set refactoring that
would remove this code anyway. However, it doesn't really matter,
here's the cleaned up patch:

http://cr.openjdk.java.net/~rkennke/8179387/webrev.04/

Roman

From erik.osterlund at oracle.com Fri Jul 7 15:17:39 2017
From: erik.osterlund at oracle.com (=?UTF-8?Q?Erik_=c3=96sterlund?=)
Date: Fri, 7 Jul 2017 17:17:39 +0200
Subject: RFR (S): 8182703: Correct G1 barrier queue lock orderings
In-Reply-To: <6B5ACCE3-CA0C-41C9-A45A-C79467FB8CE2@oracle.com>
References: <59510D5E.10009@oracle.com>
 <25F423D9-F8D5-4E62-8300-CCE106E70777@oracle.com>
 <6D1B2CB2-366E-4DBD-9F2E-672325459343@oracle.com>
 <6B5ACCE3-CA0C-41C9-A45A-C79467FB8CE2@oracle.com>
Message-ID: <595FA613.7090306@oracle.com>

Hi Kim,

Added hotspot-dev as requested.

To answer your worries we must first understand what invariant these
checks for 'special' locks seek to achieve. The invariant is, AFAIK,
that locks with a rank of 'special' and below (now including 'access'
as well as 'event' that was already there before) must *not* check for
safepoints when grabbed from JavaThreads. Safepoint checks translate
to performing ThreadBlockInVM which must *not* be performed in
'special' or below ranked locks. The reason this is necessary is that
those locks must be usable from safepoint-unsafe places, e.g. leaf
calls, where we can not yield to the safepoint synchronizer. I believe
that must have been what the name 'special' originated from, correct
me if I'm wrong.
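As a rough sketch of the invariant just stated, the rule reduces to a small predicate. The rank values below are invented for illustration; HotSpot's actual Mutex ranks differ, and only their relative ordering matters here.

```cpp
#include <cassert>

// Invented rank values; the only property that matters here is the
// ordering access < event < special < leaf < ...
enum Rank { access = 0, event = 1, special = 2, leaf = 3, nonleaf = 4 };

// Monitor::lock() (the path *with* a safepoint check) asserts
// rank() > Mutex::special, so a safepoint-checking acquire from a
// JavaThread is only legal for locks ranked strictly above 'special'.
static bool lock_with_safepoint_check_allowed(Rank r) {
  return r > special;
}

// lock_without_safepoint_check() performs no such check; locks at or
// below 'special' must use this path (or try_lock()) from JavaThreads.
static bool lock_without_safepoint_check_allowed(Rank) {
  return true;
}
```

Under this rule, the new 'access' rank inherits the same restriction that 'special' and 'event' already had.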
Since locking in safepoint-unsafe places potentially blocks the safepoint synchronizer, a deadlock would arise if a thread, 'Thread A', acquires a special lock with Java/VM thread state. But that special lock is held by another thread, 'Thread B', that has blocked on a non-special lock, because it yielded to the safepoint synchronizer on the VM thread, that is concurrently synchronizing a safepoint. However, the safepoint synchronizer is blocked waiting for the thread acquiring the special lock to yield. But it never will. In the end, 'Thread A' waits for 'Thread B' to release the special lock, and 'Thread B' waits for the VMThread that is safepoint synchronizing, and the VMThread is waiting for 'Thread A' - a deadlock. With the provided invariant however, it is impossible to acquire non-safepoint non-special locks while already holding 'special' and below locks. Therefore, by checking for the condition of the invariant, we will be certain such a deadlock can not happen, as that would violate the usual lock rank ordering that we all know about - one can not acquire a lock that has a rank higher than locks already acquired. First of all, before we delve deeper into this rabbit hole. Note that if our new 'access' locks are used properly, e.g. do not perform safepoint checks or call code that needs to have a safepoint-safe state under the lock, we are good. The G1 barrier code conforms to those restrictions and the code has since forever been called in leaf calls while in Java thread state, without oop maps, making it a not safepoint-safe state. So that is all good. The remaining question to answer is, had we done crazy things while holding those access locks, would the deadlock detection system have detected that? To answer that question we must examine whether that invariant holds or not. There are three paths for locking a Mutex/Monitor, try_lock(), lock() and lock_without_safepoint_check(). 
Among these, only lock() violates the invariant for 'special' and below locks taken from a JavaThread. It is the only locking operation that performs a safepoint check. try_lock() instead returns false if the lock could not be acquired, and lock_without_safepoint_check does not check for safepoints. MutexLocker and MutexLockerEx are abstractions around Mutex that boil down to calls to either lock() or lock_without_safepoint_check(). So if lock() catches the broken invariant, all locking in hotspot will somehow catch it. Let's examine what happens in lock(). share/vm/runtime/mutex.cpp:933:21: assert(rank() > Mutex::special, "Potential deadlock with special or lesser rank mutex"); This is the check in Monitor::lock() *with* safepoint check. It definitely catches illegal uses of lock() on 'special', 'event' and 'access' ranked locks on JavaThreads. Since we always catch misuse of special and below ranked locks here, the deadlock detection system works correctly even for 'event' and the new 'access' ranked locks. The other checks are mostly redundant and will eventually boil down to this check. Examples: share/vm/runtime/mutexLocker.hpp:165:29: assert(mutex->rank() != Mutex::special share/vm/runtime/mutexLocker.hpp:173:29: assert(mutex->rank() != Mutex::special, These two are constructors for MutexLocker which will call lock(). Therefore, this check is redundant. The 'event' and 'access' ranks will both miss this redundant check, but subsequently run into the check in lock(), which is the one that matters and still catches the broken invariant. share/vm/runtime/mutexLocker.hpp:208:30: assert(mutex->rank() > Mutex::special || no_safepoint_check, This check in MutexLockerEx checks that either the lock should be over special or it must not check for safepoints. This works as intended with 'event' and 'access' locks. They are forced to perform a lock without safepoint check, and hence not enter the lock() path of Mutex. 
However, if they did, it would still be redundantly sanity checked
there too.

share/vm/runtime/mutex.cpp:1384:23: || rank() == Mutex::special,
"wrong thread state for using locks");
share/vm/runtime/mutex.cpp:1389:30: debug_only(if (rank() !=
Mutex::special) \

These two checks are found in Monitor::check_prelock_state(), which is
called from lock() and try_lock(). As for lock(), we already concluded
that using lock() *at all* on a special lock from a JavaThread will be
found out and an assert will trigger. So these checks for special
locks seem to be a bit redundant.

As for the try_lock() case, they even seem wrong. Checking for valid
safepoint state for a locking operation that can't block seems wrong.
But that is invariant of whether the lock is special or not. It just
should not check for safepoint-safe states on try_lock().

share/vm/runtime/thread.cpp:903:27: cur->rank() == Mutex::special) {

This check is in Thread::check_for_valid_safepoint_state() where it
walks all the monitors and makes sure that we don't have any of
{Threads_lock, Compile_lock, VMOperationRequest_lock,
VMOperationQueue_lock} and allow_vm_block() or it contains any special
lock. This is performed when e.g. allocating etc. This check should
arguably check for special *and below* ranked locks that should not be
acquired while in a safepoint-safe state. However, those access locks
return true on allow_vm_block(), and therefore will correctly be
detected as dangerous had we done crazy things under these locks.

All in all, I believe that the deadlock detection system has some
redundant, and some confusing checks that involve the lock rank
Mutex::special. But I do believe that it works and would detect
deadlocks, but could do with some reworking to make it more explicit.
And that is invariant of the new access rank and applies equally to
the event rank.
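The rank-ordering rule these asserts ultimately back up — one cannot acquire a lock ranked higher than (or equal to) a lock already held — can be modeled in a few lines. This is a toy model with invented integer ranks, not the actual Monitor implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Toy model of lock rank ordering: a thread may only acquire a lock
// whose rank is strictly below the lowest rank it already holds.
// Violating acquires are refused, which is what rules out the cyclic
// wait that produces a deadlock.
struct ToyThread {
  std::vector<int> held_ranks;

  bool try_acquire(int rank) {
    if (!held_ranks.empty() &&
        rank >= *std::min_element(held_ranks.begin(), held_ranks.end())) {
      return false; // would violate rank ordering -> potential deadlock
    }
    held_ranks.push_back(rank);
    return true;
  }
};
```

With this discipline, any set of threads acquires locks in strictly decreasing rank order, so no cycle of "A holds X, waits for Y; B holds Y, waits for X" can form.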
However, since these access locks play well with the current deadlock
detection as they do not do anything illegal, and since even if use of
these locks did indeed do illegal things, it would still be detected
by the deadlock detection system, it is reasonable to say that
refactoring the deadlock detection system is a separate RFE.
Specifically: clarifying the deadlock detection system by removing
redundant checks, not checking for safepoint-safe state in try_lock(),
as well as explicitly listing special and below ranked locks as
illegal when verifying Thread::check_for_valid_safepoint_state(),
regardless of whether allow_vm_block() is true or not.

Sounds like a separate RFE to me!

Thanks,
/Erik

On 2017-07-06 22:15, Kim Barrett wrote:
>> On Jul 6, 2017, at 4:11 PM, Kim Barrett wrote:
>>
>>> On Jul 4, 2017, at 10:00 PM, Kim Barrett wrote:
>>> The lock ranking changes look good.
>> I'm going to retract that.
>>
>> How does these new lock rankings interact with various assertions that
>> rank() == or != Mutex::special? I'm not sure those places handle
>> these new ranks properly. (I'm not sure those places handle
>> Mutex::event rank properly either.)
> And maybe this change needs to be discussed on hotspot-dev rather than hotspot-gc-dev.
> From igor.veresov at oracle.com Fri Jul 7 18:09:01 2017 From: igor.veresov at oracle.com (Igor Veresov) Date: Fri, 7 Jul 2017 11:09:01 -0700 Subject: RFR: Parallelize safepoint cleanup In-Reply-To: <37755fec-05b9-8d2c-7eb9-8849393c7485@oracle.com> References: <5711258b-99b0-e06f-ba6e-0b6b55d88345@redhat.com> <0e1e2779-9316-b756-6cc8-e0c8add14a94@oracle.com> <1910961c-11bd-0e86-dd03-4fce66b9969f@redhat.com> <2b466176-b688-53a8-bef9-c7ec2c8c745b@oracle.com> <42872a15-d26c-9798-c6a2-f3f7c945baf7@redhat.com> <5e7c7d00-4acd-bea3-3525-33dbd9159efb@oracle.com> <6f2c6de7-298b-bf14-ab1f-430c4acd43c9@redhat.com> <5cd676de-872d-6d4a-691b-da561173f7d0@oracle.com> <61d80e98-275f-b2b8-4ac7-6d5d03b047de@redhat.com> <0e44cc90-b384-0820-93d6-a70d22c501c3@oracle.com> <20E06CEC-38CA-41AE-99DB-17EF22A3C5CC@oracle.com> <58f2278e-b95c-4ec2-4f7d-9fefa3a281e4@redhat.com> <623c0dbf-9210-7c63-3546-4314c7d47f85@redhat.com> <29521e46-a5e8-5ff0-23a2-22eeee145389@oracle.com> <4445a727-060b-70f9-c8db-e9f70faae3d5@redhat.com> <37755fec-05b9-8d2c-7eb9-8849393c7485@oracle.com> Message-ID: > On Jul 7, 2017, at 4:23 AM, Robbin Ehn wrote: > > Hi Roman, > > On 07/07/2017 12:51 PM, Roman Kennke wrote: >> Hi Robbin, >>> >>> Far down -> >>> >>> On 07/06/2017 08:05 PM, Roman Kennke wrote: >>>> >>>>> >>>>> I'm not happy about this change: >>>>> >>>>> + ~ParallelSPCleanupThreadClosure() { >>>>> + // This is here to be consistent with sweeper.cpp >>>>> NMethodSweeper::mark_active_nmethods(). >>>>> + // TODO: Is this really needed? >>>>> + OrderAccess::storestore(); >>>>> + } >>>>> >>>>> because we're adding an OrderAccess::storestore() to be consistent >>>>> with an OrderAccess::storestore() that's not properly documented >>>>> which is only increasing the technical debt. 
>>>>> >>>>> So a couple of things above don't make sense to me: >>>>> >>>>>> - sweeper thread runs outside safepoint >>>>>> - VMThread (which is doing the nmethod marking in the case that >>>>>> I'm looking at) runs while all other threads (incl. the sweeper) >>>>>> is holding still. >>>>> >>>>> and: >>>>> >>>>>> There should be no need for a storestore() (at least in sweeper.cpp... >>>> >>>> Either one or the other are running. Either the VMThread is marking >>>> nmethods (during safepoint) or the sweeper threads are running (outside >>>> safepoint). Between the two phases, there is a guaranteed >>>> OrderAccess::fence() (see safepoint.cpp). Therefore, no storestore() >>>> should be necessary. >>>> >>>> From Igor's comment I can see how it happened though: Apparently there >>>> *is* a race in sweeper's own concurrent processing (concurrent with >>>> compiler threads, as far as I understand). And there's a call to >>>> nmethod::mark_as_seen_on_stack() after which a storestore() is required >>>> (as per Igor's explanation). So the logic probably was: we have >>>> mark_as_seen_on_stack() followed by storestore() here, so let's also put >>>> a storestore() in the other places that call mark_as_seen_on_stack(), >>>> one of which happens to be the safepoint cleanup code that we're >>>> discussing. (why the storestore() hasn't been put right into >>>> mark_as_seen_on_stack() I don't understand). In short, one storestore() >>>> really was necessary, the other looks like it has been put there 'for >>>> consistency' or just conservatively. But it shouldn't be necessary in >>>> the safepoint cleanup code that we're discussing. >>>> >>>> So what should we do? Remove the storestore() for good? Refactor the >>>> code so that both paths at least call the storestore() in the same >>>> place? (E.g. make mark_active_nmethods() use the closure and call >>>> storestore() in the dtor as proposed?) 
>>> >>> I took a quick look, maybe I'm missing some stuff but: >>> >>> So there is a slight optimization when not running sweeper to skip >>> compiler barrier/fence in stw. >>> >>> Don't think that matter, so I propose something like: >>> - long stack_traversal_mark() { return >>> _stack_traversal_mark; } >>> - void set_stack_traversal_mark(long l) { >>> _stack_traversal_mark = l; } >>> + long stack_traversal_mark() { return >>> OrderAccess::load_acquire(&_stack_traversal_mark); } >>> + void set_stack_traversal_mark(long l) { >>> OrderAccess::release_store(&_stack_traversal_mark, l); } >>> >>> Maybe make _stack_traversal_mark volatile also, just as a marking that >>> it is concurrent accessed. >>> And remove both storestore. >>> >>> "Also neither of these state variables are volatile in nmethod, so >>> even the compiler may reorder the stores" >>> Fortunately at least _state is volatile now. >>> >>> I think _state also should use la/rs semantics instead, but that's >>> another story. >> Like this? >> http://cr.openjdk.java.net/~rkennke/8180932/webrev.12/ >> > > Yes, exactly, I like this! > Dan? Igor ? Tobias? > That seems correct. igor > Thanks Roman! > > BTW I'm going on vacation (5w) in a few hours, but I will follow this thread/changeset to the end! > > /Robbin > >> Roman -------------- next part -------------- An HTML attachment was scrubbed... URL: From daniel.daugherty at oracle.com Sat Jul 8 02:46:09 2017 From: daniel.daugherty at oracle.com (Daniel D. 
Daugherty) Date: Fri, 7 Jul 2017 20:46:09 -0600 Subject: RFR: Parallelize safepoint cleanup In-Reply-To: References: <1910961c-11bd-0e86-dd03-4fce66b9969f@redhat.com> <2b466176-b688-53a8-bef9-c7ec2c8c745b@oracle.com> <42872a15-d26c-9798-c6a2-f3f7c945baf7@redhat.com> <5e7c7d00-4acd-bea3-3525-33dbd9159efb@oracle.com> <6f2c6de7-298b-bf14-ab1f-430c4acd43c9@redhat.com> <5cd676de-872d-6d4a-691b-da561173f7d0@oracle.com> <61d80e98-275f-b2b8-4ac7-6d5d03b047de@redhat.com> <0e44cc90-b384-0820-93d6-a70d22c501c3@oracle.com> <20E06CEC-38CA-41AE-99DB-17EF22A3C5CC@oracle.com> <58f2278e-b95c-4ec2-4f7d-9fefa3a281e4@redhat.com> <623c0dbf-9210-7c63-3546-4314c7d47f85@redhat.com> <29521e46-a5e8-5ff0-23a2-22eeee145389@oracle.com> <4445a727-060b-70f9-c8db-e9f70faae3d5@redhat.com> <37755fec-05b9-8d2c-7eb9-8849393c7485@oracle.com> Message-ID: <1c976ae5-2893-9e7c-d588-1c1e4da447e4@oracle.com> On 7/7/17 12:09 PM, Igor Veresov wrote: > >> On Jul 7, 2017, at 4:23 AM, Robbin Ehn > > wrote: >> >> Hi Roman, >> >> On 07/07/2017 12:51 PM, Roman Kennke wrote: >>> Hi Robbin, >>>> >>>> Far down -> >>>> >>>> On 07/06/2017 08:05 PM, Roman Kennke wrote: >>>>> >>>>>> >>>>>> I'm not happy about this change: >>>>>> >>>>>> + ~ParallelSPCleanupThreadClosure() { >>>>>> + // This is here to be consistent with sweeper.cpp >>>>>> NMethodSweeper::mark_active_nmethods(). >>>>>> + // TODO: Is this really needed? >>>>>> + OrderAccess::storestore(); >>>>>> + } >>>>>> >>>>>> because we're adding an OrderAccess::storestore() to be consistent >>>>>> with an OrderAccess::storestore() that's not properly documented >>>>>> which is only increasing the technical debt. >>>>>> >>>>>> So a couple of things above don't make sense to me: >>>>>> >>>>>>> - sweeper thread runs outside safepoint >>>>>>> - VMThread (which is doing the nmethod marking in the case that >>>>>>> I'm looking at) runs while all other threads (incl. the sweeper) >>>>>>> is holding still. 
>>>>>> >>>>>> and: >>>>>> >>>>>>> There should be no need for a storestore() (at least in >>>>>>> sweeper.cpp... >>>>> >>>>> Either one or the other are running. Either the VMThread is marking >>>>> nmethods (during safepoint) or the sweeper threads are running >>>>> (outside >>>>> safepoint). Between the two phases, there is a guaranteed >>>>> OrderAccess::fence() (see safepoint.cpp). Therefore, no storestore() >>>>> should be necessary. >>>>> >>>>> From Igor's comment I can see how it happened though: Apparently >>>>> there >>>>> *is* a race in sweeper's own concurrent processing (concurrent with >>>>> compiler threads, as far as I understand). And there's a call to >>>>> nmethod::mark_as_seen_on_stack() after which a storestore() is >>>>> required >>>>> (as per Igor's explanation). So the logic probably was: we have >>>>> mark_as_seen_on_stack() followed by storestore() here, so let's >>>>> also put >>>>> a storestore() in the other places that call mark_as_seen_on_stack(), >>>>> one of which happens to be the safepoint cleanup code that we're >>>>> discussing. (why the storestore() hasn't been put right into >>>>> mark_as_seen_on_stack() I don't understand). In short, one >>>>> storestore() >>>>> really was necessary, the other looks like it has been put there 'for >>>>> consistency' or just conservatively. But it shouldn't be necessary in >>>>> the safepoint cleanup code that we're discussing. >>>>> >>>>> So what should we do? Remove the storestore() for good? Refactor the >>>>> code so that both paths at least call the storestore() in the same >>>>> place? (E.g. make mark_active_nmethods() use the closure and call >>>>> storestore() in the dtor as proposed?) >>>> >>>> I took a quick look, maybe I'm missing some stuff but: >>>> >>>> So there is a slight optimization when not running sweeper to skip >>>> compiler barrier/fence in stw. 
>>>> >>>> Don't think that matter, so I propose something like: >>>> - long stack_traversal_mark() { return >>>> _stack_traversal_mark; } >>>> - void set_stack_traversal_mark(long l) { >>>> _stack_traversal_mark = l; } >>>> + long stack_traversal_mark() { return >>>> OrderAccess::load_acquire(&_stack_traversal_mark); } >>>> + void set_stack_traversal_mark(long l) { >>>> OrderAccess::release_store(&_stack_traversal_mark, l); } >>>> >>>> Maybe make _stack_traversal_mark volatile also, just as a marking that >>>> it is concurrent accessed. >>>> And remove both storestore. >>>> >>>> "Also neither of these state variables are volatile in nmethod, so >>>> even the compiler may reorder the stores" >>>> Fortunately at least _state is volatile now. >>>> >>>> I think _state also should use la/rs semantics instead, but that's >>>> another story. >>> Like this? >>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.12/ >>> >>> >> >> Yes, exactly, I like this! >> Dan? Igor ? Tobias? >> > > That seems correct. > > igor I concur. And it gets rid of my complaint about the uncommented OrderAccess::storestore(). The deltas since webrev.10 (the last one I reviewed fully): src/share/vm/code/nmethod.hpp No comments. src/share/vm/runtime/sweeper.cpp No comments. src/share/vm/runtime/vmStructs.cpp No comments. Thumbs up! Again, very nice work on this change! Dan > >> Thanks Roman! >> >> BTW I'm going on vacation (5w) in a few hours, but I will follow this >> thread/changeset to the end! >> >> /Robbin >> >>> Roman > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From rkennke at redhat.com Mon Jul 10 10:38:50 2017 From: rkennke at redhat.com (Roman Kennke) Date: Mon, 10 Jul 2017 12:38:50 +0200 Subject: RFR: Parallelize safepoint cleanup In-Reply-To: References: <1910961c-11bd-0e86-dd03-4fce66b9969f@redhat.com> <2b466176-b688-53a8-bef9-c7ec2c8c745b@oracle.com> <42872a15-d26c-9798-c6a2-f3f7c945baf7@redhat.com> <5e7c7d00-4acd-bea3-3525-33dbd9159efb@oracle.com> <6f2c6de7-298b-bf14-ab1f-430c4acd43c9@redhat.com> <5cd676de-872d-6d4a-691b-da561173f7d0@oracle.com> <61d80e98-275f-b2b8-4ac7-6d5d03b047de@redhat.com> <0e44cc90-b384-0820-93d6-a70d22c501c3@oracle.com> <20E06CEC-38CA-41AE-99DB-17EF22A3C5CC@oracle.com> <58f2278e-b95c-4ec2-4f7d-9fefa3a281e4@redhat.com> <623c0dbf-9210-7c63-3546-4314c7d47f85@redhat.com> <29521e46-a5e8-5ff0-23a2-22eeee145389@oracle.com> <4445a727-060b-70f9-c8db-e9f70faae3d5@redhat.com> <37755fec-05b9-8d2c-7eb9-8849393c7485@oracle.com> Message-ID: <57cddde0-60e6-366e-489e-f6f9534e3ed9@redhat.com> Ok, so I guess I need a sponsor for this now: http://cr.openjdk.java.net/~rkennke/8180932/webrev.12/ Roman Am 07.07.2017 um 20:09 schrieb Igor Veresov: > >> On Jul 7, 2017, at 4:23 AM, Robbin Ehn > > wrote: >> >> Hi Roman, >> >> On 07/07/2017 12:51 PM, Roman Kennke wrote: >>> Hi Robbin, >>>> >>>> Far down -> >>>> >>>> On 07/06/2017 08:05 PM, Roman Kennke wrote: >>>>> >>>>>> >>>>>> I'm not happy about this change: >>>>>> >>>>>> + ~ParallelSPCleanupThreadClosure() { >>>>>> + // This is here to be consistent with sweeper.cpp >>>>>> NMethodSweeper::mark_active_nmethods(). >>>>>> + // TODO: Is this really needed? >>>>>> + OrderAccess::storestore(); >>>>>> + } >>>>>> >>>>>> because we're adding an OrderAccess::storestore() to be consistent >>>>>> with an OrderAccess::storestore() that's not properly documented >>>>>> which is only increasing the technical debt. 
>>>>>> >>>>>> So a couple of things above don't make sense to me: >>>>>> >>>>>>> - sweeper thread runs outside safepoint >>>>>>> - VMThread (which is doing the nmethod marking in the case that >>>>>>> I'm looking at) runs while all other threads (incl. the sweeper) >>>>>>> is holding still. >>>>>> >>>>>> and: >>>>>> >>>>>>> There should be no need for a storestore() (at least in >>>>>>> sweeper.cpp... >>>>> >>>>> Either one or the other are running. Either the VMThread is marking >>>>> nmethods (during safepoint) or the sweeper threads are running >>>>> (outside >>>>> safepoint). Between the two phases, there is a guaranteed >>>>> OrderAccess::fence() (see safepoint.cpp). Therefore, no storestore() >>>>> should be necessary. >>>>> >>>>> From Igor's comment I can see how it happened though: Apparently >>>>> there >>>>> *is* a race in sweeper's own concurrent processing (concurrent with >>>>> compiler threads, as far as I understand). And there's a call to >>>>> nmethod::mark_as_seen_on_stack() after which a storestore() is >>>>> required >>>>> (as per Igor's explanation). So the logic probably was: we have >>>>> mark_as_seen_on_stack() followed by storestore() here, so let's >>>>> also put >>>>> a storestore() in the other places that call mark_as_seen_on_stack(), >>>>> one of which happens to be the safepoint cleanup code that we're >>>>> discussing. (why the storestore() hasn't been put right into >>>>> mark_as_seen_on_stack() I don't understand). In short, one >>>>> storestore() >>>>> really was necessary, the other looks like it has been put there 'for >>>>> consistency' or just conservatively. But it shouldn't be necessary in >>>>> the safepoint cleanup code that we're discussing. >>>>> >>>>> So what should we do? Remove the storestore() for good? Refactor the >>>>> code so that both paths at least call the storestore() in the same >>>>> place? (E.g. make mark_active_nmethods() use the closure and call >>>>> storestore() in the dtor as proposed?) 
>>>> >>>> I took a quick look, maybe I'm missing some stuff but: >>>> >>>> So there is a slight optimization when not running sweeper to skip >>>> compiler barrier/fence in stw. >>>> >>>> Don't think that matter, so I propose something like: >>>> - long stack_traversal_mark() { return >>>> _stack_traversal_mark; } >>>> - void set_stack_traversal_mark(long l) { >>>> _stack_traversal_mark = l; } >>>> + long stack_traversal_mark() { return >>>> OrderAccess::load_acquire(&_stack_traversal_mark); } >>>> + void set_stack_traversal_mark(long l) { >>>> OrderAccess::release_store(&_stack_traversal_mark, l); } >>>> >>>> Maybe make _stack_traversal_mark volatile also, just as a marking that >>>> it is concurrent accessed. >>>> And remove both storestore. >>>> >>>> "Also neither of these state variables are volatile in nmethod, so >>>> even the compiler may reorder the stores" >>>> Fortunately at least _state is volatile now. >>>> >>>> I think _state also should use la/rs semantics instead, but that's >>>> another story. >>> Like this? >>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.12/ >>> >>> >> >> Yes, exactly, I like this! >> Dan? Igor ? Tobias? >> > > That seems correct. > > igor > >> Thanks Roman! >> >> BTW I'm going on vacation (5w) in a few hours, but I will follow this >> thread/changeset to the end! >> >> /Robbin >> >>> Roman > -------------- next part -------------- An HTML attachment was scrubbed... URL: From thomas.schatzl at oracle.com Mon Jul 10 12:15:47 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Mon, 10 Jul 2017 14:15:47 +0200 Subject: RFR (S): 8183226: Remembered set summarization accesses not fully initialized java thread DCQS In-Reply-To: <0fb5d7cf-49b3-84a1-97b4-cdd53f0173e3@oracle.com> References: <1499083970.2802.33.camel@oracle.com> <0fb5d7cf-49b3-84a1-97b4-cdd53f0173e3@oracle.com> Message-ID: <1499688947.2793.21.camel@oracle.com> Hi Erik (and Stefan), ? thanks for your review. 
On Fri, 2017-07-07 at 13:16 +0200, Erik Helin wrote: > On 07/03/2017 02:12 PM, Thomas Schatzl wrote: > > > > Hi all, > Hi Thomas, > > > > > can I get reviews for the following change that breaks some > > dependency cycle in g1remset initialization to fix some (at this > > time benign) bug when printing remembered set summarization > > information? > > > > The problem is that G1Remset initializes its internal remembered > > set summarization helper data structure in the constructor, which > > accesses some DCQS members before we call the initialize methods on > > the various global DCQS'es in G1CollectedHeap::initialize(). > > By splitting the initialization of the remembered set summarization > > into an extra method, this one can be called at the very end of > > G1CollectedHeap::initialize(), thus breaking the dependency. > I think there is an easier way to achieve this :) The default > constructor for G1RemSetSummary sets up almost all fields, and if we > make > it really set up _all_ fields, then I believe we are good: > - G1RemSetSummary::_num_vtimes can be set up in the constructor, > because the number of entries only depends on > ConcurrentG1Refine::thread_num(), which is a static function that > only returns G1ConcRefinementThreads. > - G1RemSetSummary::_rs_threads_vtimes can be allocated in the > constructor. > - The value for _rs_threads_vtimes can be initialized to 0, since the > accumulated virtual time for each concurrent refinement thread > should be 0 (since they haven't even started yet). > - Same reasoning as above goes for _sampling_thread_vtime. > - _rem_set can be NULL > > With the above changes, G1RemSet will call the default constructor > (same as it currently does). The call to > _prev_period_summary.initialize() will be removed from the G1RemSet > constructor. > > With the above changes, G1RemSetSummary::G1RemSetSummary() has no > dependencies on any other class, and is still initialized to the > correct values. 
I think this is all that is needed to solve this > problem. Fine with me. > > The rest, below this line, is just existing code that could really > benefit from a cleanup :) > > The G1RemSetSummary::initialize method is no longer needed, > G1RemSetSummary can now instead have a constructor taking a G1RemSet* > as argument. That constructor will do what > G1RemSetSummary::initialize does today. Unfortunately, no. We can't pass "this" in the constructor as the compilers will complain about possible use of uninitialized class (or so). But we can always get the G1RemSet pointer from the global variables, as implemented. > In G1RemSet::print_periodic_summary_info, the code can then look > like: > > G1RemSetSummary current(this); > _prev_period_summary.subtract(&current); > > For extra, extra bonus points, we should make > G1RemSetSummary::subtract_from work the other way around, so that the > above code reads: > > G1RemSetSummary current(this); > current.subtract(_prev_period_summary); // current -= prev > > instead of what the code does today: > > prev.subtract_from(current); // prev = current - prev > > which to me reads completely backwards :) > I think there has been some reason I do not remember right now why this has been done that way. But I agree. > Finally, it would be very nice for G1RemSetSummary to get a proper > copy constructor, so that the last line in print_periodic_summary: > > _prev_period_summary.set(&current); > > can just become: > > _prev_period_summary = current; > > (G1RemSetSummary::set is just a copy-constructor in disguise) I remember trying to avoid adding a copy constructor for fear of being "too complicated" and unusual for Hotspot code. > You don't need to do all the cleanups, but I think having a fully > functioning default constructor is a better way to solve this > problem, rather than shuffling the call to initialize around. What do > you think? Let's defer the other suggested cleanups to a different CR. 
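The "current -= prev" direction and the copy assignment that Erik suggests can be illustrated with a deliberately simplified stand-in for G1RemSetSummary; a single field and hypothetical helper replace the real members, so this is only a sketch of the shape, not the webrev code:

```cpp
#include <cassert>

// Minimal sketch: only the forwards-reading subtract and the copy
// assignment that would replace G1RemSetSummary::set() are shown.
class RemSetSummary {
  double _sampling_thread_vtime;
public:
  explicit RemSetSummary(double vtime = 0.0)
    : _sampling_thread_vtime(vtime) {}

  // current.subtract(prev) reads as "current -= prev"
  void subtract(const RemSetSummary& prev) {
    _sampling_thread_vtime -= prev._sampling_thread_vtime;
  }

  double sampling_thread_vtime() const { return _sampling_thread_vtime; }
};

// Hypothetical periodic-summary step: compute the delta since the last
// period and remember the current snapshot for the next round.
double delta_since(RemSetSummary& prev_period, double now_vtime) {
  RemSetSummary current(now_vtime);
  RemSetSummary diff = current;   // implicit copy constructor
  diff.subtract(prev_period);     // diff = current - prev, read forwards
  prev_period = current;          // plain copy assignment replaces set()
  return diff.sampling_thread_vtime();
}
```

With only plain members, the compiler-generated copy constructor and assignment already do the right thing, which is the point of retiring a hand-written set().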
In the following webrev I also added StefanJ's suggestion to extract concurrent refinement initialization into a separate method. (I do not really understand why that method is actually returning an error code: all error conditions in ConcurrentG1Refine call vm_shutdown_during_initialization() anyway - even that seems superfluous: failing to allocate memory shuts down the VM already). Webrevs: http://cr.openjdk.java.net/~tschatzl/8183226/webrev.0_to_1/ (diff) http://cr.openjdk.java.net/~tschatzl/8183226/webrev.1/ (full) Thanks, Thomas From per.liden at oracle.com Mon Jul 10 12:23:47 2017 From: per.liden at oracle.com (Per Liden) Date: Mon, 10 Jul 2017 14:23:47 +0200 Subject: RFR: 8179268: Factor out AdaptiveSizePolicy from top-level interfaces CollectorPolicy and CollectedHeap In-Reply-To: <8764a8c9-995c-fec2-9c98-e35f43ccd4d6@redhat.com> References: <134884f8-7001-f0ee-9e57-9ec0b2520752@oracle.com> <8764a8c9-995c-fec2-9c98-e35f43ccd4d6@redhat.com> Message-ID: <26a925e4-295c-0178-9586-6ddf96a64a54@oracle.com> Hi Roman, On 2017-07-07 10:53, Roman Kennke wrote: > Am 05.07.2017 um 13:12 schrieb Mikael Gerdin: >> Hi Roman, >> >> On 2017-07-04 20:47, Roman Kennke wrote: >>> AdaptiveSizePolicy is not used/called from outside the GCs, and not all >>> GCs need them. It makes sense to remove it from the CollectedHeap and >>> CollectorPolicy interfaces and move them down to the actual subclasses >>> that used them. >>> >>> I moved AdaptiveSizePolicyOutput to parallelScavengeHeap.hpp, it's only >>> used/implemented in the parallel GC. Also, I made this class AllStatic >>> (was StackObj) Thanks for cleaning this up. May I suggest that the changes related to adaptive size policy are kept in one patch and the soft reference clearing stuff in another. >>> >>> Tested by running hotspot_gc jtreg tests without regressions. 
>>> >>> http://cr.openjdk.java.net/~rkennke/8179268/webrev.00/ >> >> Please correct me if I'm wrong here but it looks like all the non-G1 >> collectors set the _should_clear_all_soft_refs based on >> gc_overhead_limit_near. >> Perhaps the ClearedAllSoftRefs scoped object could be modified to only >> work with GenCollectorPolicy derived policies (which include parallel >> *shrugs*) and G1 should just stop worrying about _all_soft_refs_clear. >> Looking closer, I can't even find G1 code looking at that member so >> maybe it, too, should be moved to GenCollectorPolicy? > I can't find any place where should_clear_all_soft_refs() would become > true for G1. For G1 it becomes true when calling WB_FullGC, so your patch changes the behavior for G1 here. WB_FullGC is meant to clear soft refs, but I looked through the tests and can't find any that currently depend on this behavior (but I could have missed it). So, I see two options here: 1) We change the behavior of WB_FullGC to not guarantee any clearing of soft refs, in which case WB_FullGC should never call set_should_clear_all_soft_refs() for any GC. Having WB_FullGC clear soft refs in GCs but not others seems arbitrary and I can't see the value in that. or 2) We keep the current behavior of WB_FullGC (i.e. always clear soft refs). This of course makes the move of set_should_clear_all_soft_refs() to GenCollectorPolicy problematic. We could consider changing CollectedHeap::collect() to also take a "bool clear_soft_ref", or we could say that it's up to each collector to do the right thing when they get called with GCCause::_wb_full_gc. cheers, Per > And, as you mention, G1 doesn't even look at > all_soft_refs_clear() either. I removed those parts from G1, and moved > all soft_refs stuff down to GenCollectorPolicy. 
> > I also changed the way the casting accessors as_generation_policy() etc > work: the as_* accessors now crash with ShouldNotReachHere() when called > for the wrong policy type, and the is_* accessors now return constant > true/false based on their type (so that it doesn't crash with > ShouldNotReachHere() ..). I think this is more useful than the way it's > been done before. > > http://cr.openjdk.java.net/~rkennke/8179268/webrev.01/ > > > > Tested by: hotspot_gc jtreg tests. > > What do you think? > > Roman > From thomas.schatzl at oracle.com Mon Jul 10 12:52:34 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Mon, 10 Jul 2017 14:52:34 +0200 Subject: RFR (S): 8183128: Update RefineCardTableEntryClosure In-Reply-To: <08286762-411b-3079-9802-814c806af946@oracle.com> References: <1499156663.2761.6.camel@oracle.com> <08286762-411b-3079-9802-814c806af946@oracle.com> Message-ID: <1499691154.2793.26.camel@oracle.com> Hi Erik, thanks for your review. On Thu, 2017-07-06 at 14:52 +0200, Erik Helin wrote: > Hi Thomas, > > On 07/04/2017 10:24 AM, Thomas Schatzl wrote: > > > > Hi all, > > > > can I get reviews for this change that renames and cleans up the > > use > > of RefineCardTableEntryClosure in the code? > > > > RefineCardTableEntryClosure is the closure that is applied by the > > concurrent refinement threads. This change renames it slightly to > > indicate its use (G1RefineCardConcurrentlyClosure) and moves it to > > the G1RemSet files close to the closure that we use for > > refinement/Update RS during GC. > great cleanup! Looking at the code, what do you think about moving > G1RefineCardConcurrentlyClosure into concurrentG1RefineThread.cpp > (and make it a private class to ConcurrentG1RefineThread)? AFAICS, > ConcurrentG1RefineThread is the only code using this closure. > There are also other users of that closure, e.g. the DCQS's need a reference to it during initialization. 
However, by moving the G1RefineCardConcurrentlyClosure and some refactoring there are (imho) some gains in encapsulation as we discussed. > If we do it this way, then we can actually make > DirtyCardQueueSet::apply_closure_to_completed_buffer a template > method, taking the Closure as a template parameter, as in: > template <typename Closure> > bool apply_closure_to_completed_buffer(Closure* cl, > uint worker_i, > size_t stop_at, > bool during_pause) > This means that closures could get inlined, which doesn't mean that > much for G1RefineCardConcurrentlyClosure, but could give a small > boost for G1RefineCardClosure (for that to work, > G1CollectedHeap::iterate_dirty_card_closure must take a > G1RefineCardClosure, but that is ok, because that is the only closure > type we pass to that method). > > Also, you do not need the forward declaration in G1CollectedHeap, it > will not make use of this closure then :) > > If you want to "go the extra mile", then you can also pass a > G1RemSet* as an argument to the G1RefineCardConcurrentlyClosure > constructor and store it in a field, to avoid accessing the > G1CollectedHeap via the singleton: > G1CollectedHeap::heap()->g1_rem_set()->refine_card_concurrently(card_ptr, > worker_i); (plus, G1RefineCardConcurrentlyClosure only needs a > G1RemSet* pointer anyway ;)) I think these perf improvements should be targeted in a different CR. :) The change already doubled in size... Webrevs for current changes: http://cr.openjdk.java.net/~tschatzl/8183128/webrev.0_to_1 (diff) http://cr.openjdk.java.net/~tschatzl/8183128/webrev.1 (full) Thanks, 
Thomas From thomas.schatzl at oracle.com Mon Jul 10 13:01:20 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Mon, 10 Jul 2017 15:01:20 +0200 Subject: RFR: 8177544: Restructure G1 Full GC code In-Reply-To: References: <62d1f02b-1fc0-ffcf-b8e0-e88ebacecebe@oracle.com> <1497346566.2829.33.camel@oracle.com> Message-ID: <1499691680.2793.29.camel@oracle.com> Hi Stefan, On Wed, 2017-06-14 at 16:45 +0200, Stefan Johansson wrote: > Thanks Thomas for reviewing, > > On 2017-06-13 11:36, Thomas Schatzl wrote: > > > > Hi, > > > > ???thanks for your hard work on the parallel full gc that starts > > with this refactoring :) > :) > > > > On Thu, 2017-06-08 at 14:35 +0200, Stefan Johansson wrote: > > > > > > Hi, > > > > > > Please review this enhancement: > > > https://bugs.openjdk.java.net/browse/JDK-8177544 > > > > > > Webrev: > > > http://cr.openjdk.java.net/~sjohanss/8177544/hotspot.00/ > > > > > [... lots of suggested changes from me...] Thanks for these changes. > > > > Actually, if it were for me, I would put the whole full gc setup > > and > > teardown into a separate class/file. > > > > Have public gc_prologue()/collect()/gc_epilogue() methods where > > gc_prologue() is the first part of do_full_collection_inner() until > > application of the G1SerialCollector, collect() the instantiation > > and application of G1SerialCollector, and gc_epilogue() the > > remainder. > > > > E.g. in G1CollectedHeap we only have the calls to these three > > methods (there is no need to have all three). > > > > At least I think it would help a lot if all that full gc stuff > > would be separate physically from do-all-G1CollectedHeap. > > With the G1FullGCScope there is almost no reference to > > G1CollectedHeap afaics. > > > > (There is _allocator->init_mutator_alloc_region() call) > I see your point and I think it would be good. But as we discussed > over chat, might be something to look at once everything else in this > area is done. Will create a RFE for this. 
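The gc_prologue()/collect()/gc_epilogue() split that Thomas outlines above could, very roughly, take the shape of a scope object whose constructor and destructor bracket the collection. The names here are hypothetical illustrations, not the G1FullGCScope code from the webrev:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Records the phase order so the bracketing is visible.
std::vector<std::string> g_phases;

// Setup runs in the constructor, teardown in the destructor, so the
// full-GC pre/post work lives outside the heap class proper.
class FullGCScope {
public:
  FullGCScope()  { g_phases.push_back("prologue"); }
  ~FullGCScope() { g_phases.push_back("epilogue"); }
};

void do_full_collection() {
  FullGCScope scope;              // prologue runs here
  g_phases.push_back("collect");  // the serial collector would run here
}                                 // epilogue runs when scope is destroyed
```

Making the pre/post work an RAII scope also guarantees the epilogue runs on every exit path from the collection, which a manually paired prologue/epilogue call would not.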
Yes, that's fine. > > > - g1CollectedHeap.hpp: please try to sort the definitions of the > > new methods in order of calling them. > Done. > > Here are updated webrevs: > Full: http://cr.openjdk.java.net/~sjohanss/8177544/hotspot.01/ > Inc: http://cr.openjdk.java.net/~sjohanss/8177544/hotspot.00-01/ > Looks good to me. Sorry for the late reply. Thanks, Thomas From erik.helin at oracle.com Mon Jul 10 13:13:10 2017 From: erik.helin at oracle.com (Erik Helin) Date: Mon, 10 Jul 2017 15:13:10 +0200 Subject: RFR: 8179387: Factor out CMS specific code from GenCollectedHeap into its own subclass In-Reply-To: <4bc53aaa-b98a-8a61-73bf-d30ac3f402b8@redhat.com> References: <3521009f-6fab-4f8e-2375-b9d665a4c70b@redhat.com> <3d8b55a2-a787-3051-b351-ab9b0a24f5e0@redhat.com> <47e22e86-7d7c-606f-1936-346229f39ca2@oracle.com> <9a846161-c8ac-dedf-5952-f457d546fd9a@redhat.com> <4d5e6af8-d975-7803-64c5-7295e0d56154@redhat.com> <13358626-e399-e352-1711-587416621aac@redhat.com> <27af0ad2-fe78-3536-2143-996dd42583ab@oracle.com> <4bc53aaa-b98a-8a61-73bf-d30ac3f402b8@redhat.com> Message-ID: On 07/07/2017 03:21 PM, Roman Kennke wrote: > Am 07.07.2017 um 14:35 schrieb Erik Helin: >> On 07/06/2017 06:23 PM, Roman Kennke wrote: >>>>>> Ok to push this? >>>> >>>> I just realized that your change doesn't build on Windows since you >>>> didn't #include "precompiled.hpp" in cmsHeap.cpp. MSVC is really picky >>>> about that. >>>> /Mikael >>> >>> Uhhh. >>> Ok, here's revision #3 with precompiled added in: >>> >>> http://cr.openjdk.java.net/~rkennke/8179387/webrev.03/ >>> >> >> Hi Roman, >> >> I just started looking :) I think GenCollectedHeap::gc_prologue and >> GenCollectedHeap::gc_epilogue should be virtual, and >> always_do_update_barrier = UseConcMarkSweepGC moved down >> CMSHeap::gc_epilogue. >> >> What do you think? > > Yes, I have seen that. My original plan was to leave it as is because I > know that Erik Ö. 
is working on a big barrier set refactoring that would > remove this code anyway. However, it doesn't really matter, here's the > cleaned up patch: > > http://cr.openjdk.java.net/~rkennke/8179387/webrev.04/ > A few comments: cmsHeap.hpp: - you are missing quite a few #includes, but it works since genCollectedHeap.hpp #includes a whole lot of stuff. Not necessary to fix now, because the "missing #include" will start to pop up when someone tries to break apart GenCollectedHeap into smaller pieces. - why are gc_prologue and gc_epilogue protected in CMSHeap? Can't they be private in CMSHeap? - there are two `private:` blocks, please use only one `private:` block. - one extra newline here: 32 class CMSHeap : public GenCollectedHeap { 33 - one extra newline here: 46 47 cmsHeap.cpp: - one extra newline here: 36 CMSHeap::CMSHeap(GenCollectorPolicy *policy) : GenCollectedHeap(policy) { 37 - one extra newline here: 65 66 - do you need to use `this` here? 87 this->GenCollectedHeap::print_on_error(st); Isn't it enough to just GenCollectedHeap::print_on_error(st)? - one extra newline here: 92 bool CMSHeap::create_cms_collector() { 93 - this is pre-existing, but since we are copying code, do we want to clean it up? 104 if (collector == NULL || !collector->completed_initialization()) { 105 if (collector) { 106 delete collector; // Be nice in embedded situation 107 } 108 vm_shutdown_during_initialization("Could not create CMS collector"); 109 return false; 110 } The collector == NULL check is not needed here. CMSCollector derives from CHeapObj and CHeapObj::operator new will by default do vm_exit_out_of_memory if the returned memory is NULL. The check can just be: if (!collector->completed_initialization()) { vm_shutdown_during_initialization("Could not create CMS collector"); return false; } return true; - maybe skip the // success comment here: 111 return true; // success - is it possible to end up in CMSHeap::should_do_concurrent_full_gc() if we are not using CMS? 
As in: 123 bool CMSHeap::should_do_concurrent_full_gc(GCCause::Cause cause) { 124 if (!UseConcMarkSweepGC) { 125 return false; 126 } - one extra newline here: 135 136 genCollectedHeap.hpp: - I don't think you have to make _skip_header_HeapWords protected. Instead I think we can make skip_header_HeapWords() virtual, make it return 0 in GenCollectedHeap and return CMSCollector::skip_header_HeapWords in CMSHeap and just remove the _skip_header_HeapWords variable. - do you really need #ifdef ASSERT around check_gen_kinds? - can you make GCH_strong_roots_tasks a protected enum in GenCollectedHeap? As in class GenCollectedHeap : public CollectedHeap { protected: enum StrongRootTasks { GCH_PS_Universe_oops_do, }; }; Have you thought about vmStructs.cpp, does it need any changes? Thanks, Erik > Roman > From rkennke at redhat.com Mon Jul 10 13:36:29 2017 From: rkennke at redhat.com (Roman Kennke) Date: Mon, 10 Jul 2017 15:36:29 +0200 Subject: RFR: 8179268: Factor out AdaptiveSizePolicy from top-level interfaces CollectorPolicy and CollectedHeap In-Reply-To: <26a925e4-295c-0178-9586-6ddf96a64a54@oracle.com> References: <134884f8-7001-f0ee-9e57-9ec0b2520752@oracle.com> <8764a8c9-995c-fec2-9c98-e35f43ccd4d6@redhat.com> <26a925e4-295c-0178-9586-6ddf96a64a54@oracle.com> Message-ID: <0263e88c-0618-1c54-7a51-f8163e4b7e09@redhat.com> Am 10.07.2017 um 14:23 schrieb Per Liden: > Hi Roman, > > On 2017-07-07 10:53, Roman Kennke wrote: >> Am 05.07.2017 um 13:12 schrieb Mikael Gerdin: >>> Hi Roman, >>> >>> On 2017-07-04 20:47, Roman Kennke wrote: >>>> AdaptiveSizePolicy is not used/called from outside the GCs, and not >>>> all >>>> GCs need them. It makes sense to remove it from the CollectedHeap and >>>> CollectorPolicy interfaces and move them down to the actual subclasses >>>> that used them. >>>> >>>> I moved AdaptiveSizePolicyOutput to parallelScavengeHeap.hpp, it's >>>> only >>>> used/implemented in the parallel GC. 
Also, I made this class AllStatic >>>> (was StackObj) > > Thanks for cleaning this up. > > May I suggest that the changes related to adaptive size policy are kept > in one patch and the soft reference clearing stuff in another. Ok... so we can go back to review the first revision of the patch and deal with the softrefs stuff in a followup? http://cr.openjdk.java.net/~rkennke/8179268/webrev.00/ > > For G1 it becomes true when calling WB_FullGC, so your patch changes > the behavior for G1 here. WB_FullGC is meant to clear soft refs, but I > looked through the tests and can't find any that currently depend on > this behavior (but I could have missed it). So, I see two options here: > > 1) We change the behavior of WB_FullGC to not guarantee any clearing > of soft refs, in which case WB_FullGC should never call > set_should_clear_all_soft_refs() for any GC. Having WB_FullGC clear > soft refs in GCs but not others seems arbitrary and I can't see the > value in that. > > or > > 2) We keep the current behavior of WB_FullGC (i.e. always clear soft > refs). This of course makes the move of > set_should_clear_all_soft_refs() to GenCollectorPolicy problematic. We > could consider changing CollectedHeap::collect() to also take a "bool > clear_soft_ref", or we could say that it's up to each collector to do > the right thing when they get called with GCCause::_wb_full_gc. Ok. I'd argue it's up to the GC. I am not totally familiar with the WB stuff, but I'd expect it to do something similar to what would happen if applications call the usual API, which is, in this case, System.gc(), which goes through JVM_GC() which in turn calls heap->collect() *without* setting the set_should_clear_all_soft_refs(). Right? In any case, if we don't want this stuff under this enhancement ID, then we'll discuss it under the followup ID, right? 
Roman From erik.helin at oracle.com Mon Jul 10 13:37:00 2017 From: erik.helin at oracle.com (Erik Helin) Date: Mon, 10 Jul 2017 15:37:00 +0200 Subject: RFR (S): 8183128: Update RefineCardTableEntryClosure In-Reply-To: <1499691154.2793.26.camel@oracle.com> References: <1499156663.2761.6.camel@oracle.com> <08286762-411b-3079-9802-814c806af946@oracle.com> <1499691154.2793.26.camel@oracle.com> Message-ID: <02a0acb5-2632-d7d7-18d6-41242c4d9dac@oracle.com> On 07/10/2017 02:52 PM, Thomas Schatzl wrote: > ... > > I think these perf improvements should be targeted in a different CR. > :) The change already doubled in size... Alright, let me take care of that, once you have pushed this :) > Webrevs for current changes: > http://cr.openjdk.java.net/~tschatzl/8183128/webrev.0_to_1 (diff) > http://cr.openjdk.java.net/~tschatzl/8183128/webrev.1 (full) Looks very good, thank you Thomas! Reviewed! Erik > Thanks, > Thomas > From stefan.johansson at oracle.com Mon Jul 10 13:52:17 2017 From: stefan.johansson at oracle.com (Stefan Johansson) Date: Mon, 10 Jul 2017 15:52:17 +0200 Subject: RFR (S): 8183128: Update RefineCardTableEntryClosure In-Reply-To: <02a0acb5-2632-d7d7-18d6-41242c4d9dac@oracle.com> References: <1499156663.2761.6.camel@oracle.com> <08286762-411b-3079-9802-814c806af946@oracle.com> <1499691154.2793.26.camel@oracle.com> <02a0acb5-2632-d7d7-18d6-41242c4d9dac@oracle.com> Message-ID: <799ec7a0-9c28-4ba8-4541-8611754667c0@oracle.com> Hi Thomas, On 2017-07-10 15:37, Erik Helin wrote: > On 07/10/2017 02:52 PM, Thomas Schatzl wrote: >> ... > > >> I think these perf improvements should be targeted in a different CR. >> :) The change already doubled in size... > > Alright, let me take care of that, once you have pushed this :) > >> Webrevs for current changes: >> http://cr.openjdk.java.net/~tschatzl/8183128/webrev.0_to_1 (diff) >> http://cr.openjdk.java.net/~tschatzl/8183128/webrev.1 (full) > > Looks very good, thank you Thomas! Reviewed! 
Looks good, StefanJ > Erik > >> Thanks, >> Thomas >> From per.liden at oracle.com Mon Jul 10 13:59:31 2017 From: per.liden at oracle.com (Per Liden) Date: Mon, 10 Jul 2017 15:59:31 +0200 Subject: RFR: 8179268: Factor out AdaptiveSizePolicy from top-level interfaces CollectorPolicy and CollectedHeap In-Reply-To: <0263e88c-0618-1c54-7a51-f8163e4b7e09@redhat.com> References: <134884f8-7001-f0ee-9e57-9ec0b2520752@oracle.com> <8764a8c9-995c-fec2-9c98-e35f43ccd4d6@redhat.com> <26a925e4-295c-0178-9586-6ddf96a64a54@oracle.com> <0263e88c-0618-1c54-7a51-f8163e4b7e09@redhat.com> Message-ID: <7c177480-a74c-45b8-af92-47b1d4cbb46e@oracle.com> Hi, On 2017-07-10 15:36, Roman Kennke wrote: > Am 10.07.2017 um 14:23 schrieb Per Liden: >> Hi Roman, >> >> On 2017-07-07 10:53, Roman Kennke wrote: >>> Am 05.07.2017 um 13:12 schrieb Mikael Gerdin: >>>> Hi Roman, >>>> >>>> On 2017-07-04 20:47, Roman Kennke wrote: >>>>> AdaptiveSizePolicy is not used/called from outside the GCs, and not >>>>> all >>>>> GCs need them. It makes sense to remove it from the CollectedHeap and >>>>> CollectorPolicy interfaces and move them down to the actual subclasses >>>>> that used them. >>>>> >>>>> I moved AdaptiveSizePolicyOutput to parallelScavengeHeap.hpp, it's >>>>> only >>>>> used/implemented in the parallel GC. Also, I made this class AllStatic >>>>> (was StackObj) >> >> Thanks for cleaning this up. >> >> May I suggest that the changes related to adaptive size policy is kept >> in one patch and the soft reference clearing stuff in another. > > Ok... so we can go back to review the first revision of the patch and > deal with the softrefs stuff in a followup? Sounds good, I'll reply to your first mail separately. > > http://cr.openjdk.java.net/~rkennke/8179268/webrev.00/ > > > > >> >> For G1 it becomes true when calling WB_FullGC, so your patch changes >> the behavior for G1 here. 
WB_FullGC is meant to clear soft refs, but I >> looked through the tests and can't find any that currently depend on >> this behavior (but I could have missed it). So, I see two options here: >> >> 1) We change the behavior of WB_FullGC to not guarantee any clearing >> of soft refs, in which case WB_FullGC should never call >> set_should_clear_all_soft_refs() for any GC. Having WB_FullGC clear >> soft refs in GCs but not others seems arbitrary and I can't see the >> value in that. >> >> or >> >> 2) We keep the current behavior of WB_FullGC (i.e. always clear soft >> refs). This of course makes the move of >> set_should_clear_all_soft_refs() to GenCollectorPolicy problematic. We >> could consider changing CollectedHeap::collect() to also take a "bool >> clear_soft_ref", or we could say that it's up to each collector to do >> the right thing when they get called with GCCause::_wb_full_gc. > > Ok. > I'd argue it's up to the GC. I'm fine with that, as long as we make sure all GCs actually do the same thing so that the meaning of GCCause::_wb_full_gc doesn't differ from GC to GC. > I am not totally familiar with the WB stuff, > but I'd expect it to do something similar to what would happen if > applications call the usual API, which is, in this case, System.gc(), > which goes through JVM_GC() which in turn calls heap->collect() > *without* setting the set_should_clear_all_soft_refs(). Right? The WB interface is for whitebox testing, i.e. an interface for tests that need to tell the GC to do something more specific than just "System.gc()". For example, "do a young GC" (WB_YoungGC) or "clear all soft refs and do a full GC" (WB_FullGC). > > In any case, if we don't want this stuff under this enhancement ID, then > we'll discuss it under the followup ID, right? Sounds good! Thanks! 
/Per > > Roman > From rkennke at redhat.com Mon Jul 10 14:10:59 2017 From: rkennke at redhat.com (Roman Kennke) Date: Mon, 10 Jul 2017 16:10:59 +0200 Subject: RFR: 8179387: Factor out CMS specific code from GenCollectedHeap into its own subclass In-Reply-To: References: <3521009f-6fab-4f8e-2375-b9d665a4c70b@redhat.com> <3d8b55a2-a787-3051-b351-ab9b0a24f5e0@redhat.com> <47e22e86-7d7c-606f-1936-346229f39ca2@oracle.com> <9a846161-c8ac-dedf-5952-f457d546fd9a@redhat.com> <4d5e6af8-d975-7803-64c5-7295e0d56154@redhat.com> <13358626-e399-e352-1711-587416621aac@redhat.com> <27af0ad2-fe78-3536-2143-996dd42583ab@oracle.com> <4bc53aaa-b98a-8a61-73bf-d30ac3f402b8@redhat.com> Message-ID: <666af7f2-27e9-48c6-91e4-eaefa5289e18@redhat.com> Am 10.07.2017 um 15:13 schrieb Erik Helin: > On 07/07/2017 03:21 PM, Roman Kennke wrote: >> Am 07.07.2017 um 14:35 schrieb Erik Helin: >>> On 07/06/2017 06:23 PM, Roman Kennke wrote: >>>>>>> Ok to push this? >>>>> >>>>> I just realized that your change doesn't build on Windows since you >>>>> didn't #include "precompiled.hpp" in cmsHeap.cpp. MSVC is really >>>>> picky >>>>> about that. >>>>> /Mikael >>>> >>>> Uhhh. >>>> Ok, here's revision #3 with precompiled added in: >>>> >>>> http://cr.openjdk.java.net/~rkennke/8179387/webrev.03/ >>>> >>> >>> Hi Roman, >>> >>> I just started looking :) I think GenCollectedHeap::gc_prologue and >>> GenCollectedHeap::gc_epilogue should be virtual, and >>> always_do_update_barrier = UseConcMarkSweepGC moved down >>> CMSHeap::gc_epilogue. >>> >>> What do you think? >> >> Yes, I have seen that. My original plan was to leave it as is because I >> know that Erik ?. is working on a big barrier set refactoring that would >> remove this code anyway. 
However, it doesn't really matter, here's the >> cleaned up patch: >> >> http://cr.openjdk.java.net/~rkennke/8179387/webrev.04/ >> > > A few comments: > > cmsHeap.hpp: > - you are missing quite a few #includes, but it works since > genCollectedHeap.hpp #includes a whole lot of stuff. Not necessary to > fix now, because the "missing #include" will start to pop up when > someone tries to break apart GenCollectedHeap into smaller pieces. Right. I always try to minimize includes, especially in header files (they are bound to proliferate later anyway). In addition to that, if a class is only referenced as pointer, I avoid includes and use forward class definition instead. > > - why are gc_prologue and gc_epilogue protected in CMSHeap? Can't they > be private in CMSHeap? They are virtual and protected in GenCollectedHeap and called by GenCollectedHeap. Makes sense to also make them protected in CMSHeap? Or am I missing something? > - there are two `private:` blocks, please use only one `private:` > block. > Fixed. > - one extra newline here: > 32 class CMSHeap : public GenCollectedHeap { > 33 > > - one extra newline here: > 46 > 47 > > cmsHeap.cpp: > - one extra newline here: > 36 CMSHeap::CMSHeap(GenCollectorPolicy *policy) : > GenCollectedHeap(policy) { > 37 > > - one extra newline here: > 65 > 66 > Removed all of them. > - do you need to use `this` here? > 87 this->GenCollectedHeap::print_on_error(st); > > Isn't it enough to just GenCollectedHeap::print_on_error(st)? Yes, it is. Just a habit of mine to make it more readable (to me). Fixed it. > - one extra newline here: > 92 bool CMSHeap::create_cms_collector() { > 93 Fixed. > - this is pre-existing, but since we are copying code, do we want to > clean it up? 
> 104 if (collector == NULL || > !collector->completed_initialization()) { > 105 if (collector) { > 106 delete collector; // Be nice in embedded situation > 107 } > 108 vm_shutdown_during_initialization("Could not create CMS > collector"); > 109 return false; > 110 } > > The collector == NULL check is not needed here. CMSCollector derives > from CHeapObj and CHeapObj::operator new will by default do > vm_exit_out_of_memory if the returned memory is NULL. The check can > just be: > > if (!collector->completed_initialization()) { > vm_shutdown_during_initialization("Could not create CMS collector"); > return false; > } > return true; > Ok, good point. Fixed. > - maybe skip the // success comment here: > 111 return true; // success That was probably pre-existing too. Should be thankful that it did not say return true; // return true :-P > - is it possible to end up in CMSHeap::should_do_concurrent_full_gc() > if we are not using CMS? As in: > 123 bool CMSHeap::should_do_concurrent_full_gc(GCCause::Cause cause) { > 124 if (!UseConcMarkSweepGC) { > 125 return false; > 126 } > Duh. Fixed. > - one extra newline here: > 135 > 136 > > genCollectedHeap.hpp: > - I don't think you have to make _skip_header_HeapWords protected. > Instead I think we can make skip_header_HeapWords() virtual, make it > return 0 in GenCollectedHeap and return > CMSCollector::skip_header_HeapWords in CMSHeap and just remove the > _skip_header_HeapWords variable. Great catch! I love it when refactoring leads to simplifications... Fixed. > - do you really need #ifdef ASSERT around check_gen_kinds? > No, not really. > - can you make GCH_strong_roots_tasks a protected enum in > GenCollectedHeap? As in > class GenCollectedHeap : public CollectedHeap { > protected: > enum StrongRootTasks { > GCH_PS_Universe_oops_do, > }; > }; > Good idea. Done. > Have you thought about vmStructs.cpp, does it need any changes? No. I don't really know what needs to go in there.
I added: declare_constant(CollectedHeap::CMSHeap) \ just so that it's there next to the other heap types. Not sure what else may be needed, if anything? http://cr.openjdk.java.net/~rkennke/8179387/webrev.05/ Better now? Roman From per.liden at oracle.com Mon Jul 10 14:54:04 2017 From: per.liden at oracle.com (Per Liden) Date: Mon, 10 Jul 2017 16:54:04 +0200 Subject: RFR: 8179268: Factor out AdaptiveSizePolicy from top-level interfaces CollectorPolicy and CollectedHeap In-Reply-To: References: Message-ID: <50cb4b58-623c-04c2-f6c5-cfb1bd0a3b1f@oracle.com> Hi, On 2017-07-04 20:47, Roman Kennke wrote: > AdaptiveSizePolicy is not used/called from outside the GCs, and not all > GCs need them. It makes sense to remove it from the CollectedHeap and > CollectorPolicy interfaces and move them down to the actual subclasses > that used them. > > I moved AdaptiveSizePolicyOutput to parallelScavengeHeap.hpp, it's only > used/implemented in the parallel GC. Also, I made this class AllStatic > (was StackObj) AdaptiveSizePolicyOutput::print() is actually called from runtime/java.cpp also, so it's used outside of ParallelGC. I'm fine with moving it, but we should have the proper #includes in java.cpp. (Your patch doesn't actually build in its current form. I suspect you're using precompiled headers which have a tendency to hide a lot of errors caused by missing includes) > > Tested by running hotspot_gc jtreg tests without regressions. > > http://cr.openjdk.java.net/~rkennke/8179268/webrev.00/ collectorPolicy.hpp: -------------------- 258 void cleared_all_soft_refs(); Please declare this virtual too (that's the best we can do to signal intent until we have C++11/override) collectorPolicy.cpp: -------------------- 224 this->CollectorPolicy::cleared_all_soft_refs(); Please remove "this->" to match the super-call style used in other places in this file. Btw, I can sponsor the patch if you want. 
cheers, Per > > > Roman > From rkennke at redhat.com Mon Jul 10 16:35:40 2017 From: rkennke at redhat.com (Roman Kennke) Date: Mon, 10 Jul 2017 18:35:40 +0200 Subject: RFR: 8179268: Factor out AdaptiveSizePolicy from top-level interfaces CollectorPolicy and CollectedHeap In-Reply-To: <50cb4b58-623c-04c2-f6c5-cfb1bd0a3b1f@oracle.com> References: <50cb4b58-623c-04c2-f6c5-cfb1bd0a3b1f@oracle.com> Message-ID: <073ad956-f475-f3c4-cac8-42bfa1329565@redhat.com> Hi Per, thanks for the review! > >> AdaptiveSizePolicy is not used/called from outside the GCs, and not all >> GCs need them. It makes sense to remove it from the CollectedHeap and >> CollectorPolicy interfaces and move them down to the actual subclasses >> that used them. >> >> I moved AdaptiveSizePolicyOutput to parallelScavengeHeap.hpp, it's only >> used/implemented in the parallel GC. Also, I made this class AllStatic >> (was StackObj) > > AdaptiveSizePolicyOutput::print() is actually called from > runtime/java.cpp also, so it's used outside of ParallelGC. I'm fine > with moving it, but we should have the proper #includes in java.cpp. > > (Your patch doesn't actually build in its current form. I suspect > you're using precompiled headers which have a tendency to hide a lot > of errors caused by missing includes) > I added the include. >> >> Tested by running hotspot_gc jtreg tests without regressions. >> >> http://cr.openjdk.java.net/~rkennke/8179268/webrev.00/ > > collectorPolicy.hpp: > -------------------- > 258 void cleared_all_soft_refs(); > > Please declare this virtual too (that's the best we can do to signal > intent until we have C++11/override) > Ok. > > collectorPolicy.cpp: > -------------------- > 224 this->CollectorPolicy::cleared_all_soft_refs(); > > Please remove "this->" to match the super-call style used in other > places in this file. ok. > > Btw, I can sponsor the patch if you want. 
Find the updated webrev here: http://cr.openjdk.java.net/~rkennke/8179268/webrev.03/ Cheers, Roman > > cheers, > Per > >> >> >> Roman >> From robbin.ehn at oracle.com Mon Jul 10 18:50:15 2017 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Mon, 10 Jul 2017 20:50:15 +0200 Subject: RFR: Parallelize safepoint cleanup In-Reply-To: <57cddde0-60e6-366e-489e-f6f9534e3ed9@redhat.com> References: <2b466176-b688-53a8-bef9-c7ec2c8c745b@oracle.com> <42872a15-d26c-9798-c6a2-f3f7c945baf7@redhat.com> <5e7c7d00-4acd-bea3-3525-33dbd9159efb@oracle.com> <6f2c6de7-298b-bf14-ab1f-430c4acd43c9@redhat.com> <5cd676de-872d-6d4a-691b-da561173f7d0@oracle.com> <61d80e98-275f-b2b8-4ac7-6d5d03b047de@redhat.com> <0e44cc90-b384-0820-93d6-a70d22c501c3@oracle.com> <20E06CEC-38CA-41AE-99DB-17EF22A3C5CC@oracle.com> <58f2278e-b95c-4ec2-4f7d-9fefa3a281e4@redhat.com> <623c0dbf-9210-7c63-3546-4314c7d47f85@redhat.com> <29521e46-a5e8-5ff0-23a2-22eeee145389@oracle.com> <4445a727-060b-70f9-c8db-e9f70faae3d5@redhat.com> <37755fec-05b9-8d2c-7eb9-8849393c7485@oracle.com> <57cddde0-60e6-366e-489e-f6f9534e3ed9@redhat.com> Message-ID: <6fa761bc-8feb-74e6-9a54-8a65ab81203b@oracle.com> I'll start a push now. /Robbin On 2017-07-10 12:38, Roman Kennke wrote: > Ok, so I guess I need a sponsor for this now: > > http://cr.openjdk.java.net/~rkennke/8180932/webrev.12/ > > > Roman > > Am 07.07.2017 um 20:09 schrieb Igor Veresov: >> >>> On Jul 7, 2017, at 4:23 AM, Robbin Ehn >> > wrote: >>> >>> Hi Roman, >>> >>> On 07/07/2017 12:51 PM, Roman Kennke wrote: >>>> Hi Robbin, >>>>> >>>>> Far down -> >>>>> >>>>> On 07/06/2017 08:05 PM, Roman Kennke wrote: >>>>>> >>>>>>> >>>>>>> I'm not happy about this change: >>>>>>> >>>>>>> + ~ParallelSPCleanupThreadClosure() { >>>>>>> + // This is here to be consistent with sweeper.cpp >>>>>>> NMethodSweeper::mark_active_nmethods(). >>>>>>> + // TODO: Is this really needed? 
>>>>>>> + OrderAccess::storestore(); >>>>>>> + } >>>>>>> >>>>>>> because we're adding an OrderAccess::storestore() to be consistent >>>>>>> with an OrderAccess::storestore() that's not properly documented >>>>>>> which is only increasing the technical debt. >>>>>>> >>>>>>> So a couple of things above don't make sense to me: >>>>>>> >>>>>>>> - sweeper thread runs outside safepoint >>>>>>>> - VMThread (which is doing the nmethod marking in the case that >>>>>>>> I'm looking at) runs while all other threads (incl. the sweeper) >>>>>>>> is holding still. >>>>>>> >>>>>>> and: >>>>>>> >>>>>>>> There should be no need for a storestore() (at least in >>>>>>>> sweeper.cpp... >>>>>> >>>>>> Either one or the other are running. Either the VMThread is marking >>>>>> nmethods (during safepoint) or the sweeper threads are running >>>>>> (outside >>>>>> safepoint). Between the two phases, there is a guaranteed >>>>>> OrderAccess::fence() (see safepoint.cpp). Therefore, no storestore() >>>>>> should be necessary. >>>>>> >>>>>> From Igor's comment I can see how it happened though: Apparently >>>>>> there >>>>>> *is* a race in sweeper's own concurrent processing (concurrent with >>>>>> compiler threads, as far as I understand). And there's a call to >>>>>> nmethod::mark_as_seen_on_stack() after which a storestore() is >>>>>> required >>>>>> (as per Igor's explanation). So the logic probably was: we have >>>>>> mark_as_seen_on_stack() followed by storestore() here, so let's >>>>>> also put >>>>>> a storestore() in the other places that call mark_as_seen_on_stack(), >>>>>> one of which happens to be the safepoint cleanup code that we're >>>>>> discussing. (why the storestore() hasn't been put right into >>>>>> mark_as_seen_on_stack() I don't understand). In short, one >>>>>> storestore() >>>>>> really was necessary, the other looks like it has been put there 'for >>>>>> consistency' or just conservatively. 
But it shouldn't be necessary in >>>>>> the safepoint cleanup code that we're discussing. >>>>>> >>>>>> So what should we do? Remove the storestore() for good? Refactor the >>>>>> code so that both paths at least call the storestore() in the same >>>>>> place? (E.g. make mark_active_nmethods() use the closure and call >>>>>> storestore() in the dtor as proposed?) >>>>> >>>>> I took a quick look, maybe I'm missing some stuff but: >>>>> >>>>> So there is a slight optimization when not running sweeper to skip >>>>> compiler barrier/fence in stw. >>>>> >>>>> Don't think that matter, so I propose something like: >>>>> - long stack_traversal_mark() { return >>>>> _stack_traversal_mark; } >>>>> - void set_stack_traversal_mark(long l) { >>>>> _stack_traversal_mark = l; } >>>>> + long stack_traversal_mark() { return >>>>> OrderAccess::load_acquire(&_stack_traversal_mark); } >>>>> + void set_stack_traversal_mark(long l) { >>>>> OrderAccess::release_store(&_stack_traversal_mark, l); } >>>>> >>>>> Maybe make _stack_traversal_mark volatile also, just as a marking that >>>>> it is concurrent accessed. >>>>> And remove both storestore. >>>>> >>>>> "Also neither of these state variables are volatile in nmethod, so >>>>> even the compiler may reorder the stores" >>>>> Fortunately at least _state is volatile now. >>>>> >>>>> I think _state also should use la/rs semantics instead, but that's >>>>> another story. >>>> Like this? >>>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.12/ >>>> >>>> >>> >>> Yes, exactly, I like this! >>> Dan? Igor ? Tobias? >>> >> >> That seems correct. >> >> igor >> >>> Thanks Roman! >>> >>> BTW I'm going on vacation (5w) in a few hours, but I will follow this >>> thread/changeset to the end! 
>>> >>> /Robbin >>> >>>> Roman >> > From robbin.ehn at oracle.com Mon Jul 10 19:22:59 2017 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Mon, 10 Jul 2017 21:22:59 +0200 Subject: RFR: Parallelize safepoint cleanup In-Reply-To: <6fa761bc-8feb-74e6-9a54-8a65ab81203b@oracle.com> References: <2b466176-b688-53a8-bef9-c7ec2c8c745b@oracle.com> <42872a15-d26c-9798-c6a2-f3f7c945baf7@redhat.com> <5e7c7d00-4acd-bea3-3525-33dbd9159efb@oracle.com> <6f2c6de7-298b-bf14-ab1f-430c4acd43c9@redhat.com> <5cd676de-872d-6d4a-691b-da561173f7d0@oracle.com> <61d80e98-275f-b2b8-4ac7-6d5d03b047de@redhat.com> <0e44cc90-b384-0820-93d6-a70d22c501c3@oracle.com> <20E06CEC-38CA-41AE-99DB-17EF22A3C5CC@oracle.com> <58f2278e-b95c-4ec2-4f7d-9fefa3a281e4@redhat.com> <623c0dbf-9210-7c63-3546-4314c7d47f85@redhat.com> <29521e46-a5e8-5ff0-23a2-22eeee145389@oracle.com> <4445a727-060b-70f9-c8db-e9f70faae3d5@redhat.com> <37755fec-05b9-8d2c-7eb9-8849393c7485@oracle.com> <57cddde0-60e6-366e-489e-f6f9534e3ed9@redhat.com> <6fa761bc-8feb-74e6-9a54-8a65ab81203b@oracle.com> Message-ID: <5af9855a-652e-64f0-af83-e8f5962247ca@oracle.com> Hi, unfortunately the push failed on 32-bit. (looks like _stack_traversal_mark should be jlong, I feel a bit guilty) I do not have anytime to look at this, so here is the error. 
/Robbin make[3]: Leaving directory '/opt/jprt/T/P1/185117.rehn/s/hotspot/make' make/Main.gmk:263: recipe for target 'hotspot-client-libs' failed In file included from /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29:0, from /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33, from /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28: /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp: In member function 'long int nmethod::stack_traversal_mark()': /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:399:108: error: call of overloaded 'load_acquire(volatile long int*)' is ambiguous /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:399:108: note: candidates are: In file included from /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/typeArrayOop.hpp:30:0, from /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/constantPool.hpp:32, from /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/method.hpp:34, from /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/frame.hpp:28, from /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/codeBlob.hpp:31, from /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/compiledMethod.hpp:28, from /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:28, from /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29, from /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33, from /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28: /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:57:17: note: static jint OrderAccess::load_acquire(const volatile jint*) /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:57:17: note: no known conversion for argument 1 from 'volatile long int*' to 'const volatile jint* {aka const volatile int*}' In file included 
from /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/typeArrayOop.hpp:30:0, from /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/constantPool.hpp:32, from /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/method.hpp:34, from /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/frame.hpp:28, from /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/codeBlob.hpp:31, from /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/compiledMethod.hpp:28, from /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:28, from /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29, from /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33, from /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28: /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:63:17: note: static juint OrderAccess::load_acquire(const volatile juint*) /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:63:17: note: no known conversion for argument 1 from 'volatile long int*' to 'const volatile juint* {aka const volatile unsigned int*}' In file included from /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29:0, from /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33, from /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28: /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp: In member function 'void nmethod::set_stack_traversal_mark(long int)': /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:400:105: error: call of overloaded 'release_store(volatile long int*, long int&)' is ambiguous /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:400:105: note: candidates are: In file included from /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/typeArrayOop.hpp:30:0, from 
/opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/constantPool.hpp:32, from /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/method.hpp:34, from /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/frame.hpp:28, from /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/codeBlob.hpp:31, from /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/compiledMethod.hpp:28, from /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:28, from /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29, from /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33, from /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28: /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:71:17: note: static void OrderAccess::release_store(volatile jint*, jint) /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:71:17: note: no known conversion for argument 1 from 'volatile long int*' to 'volatile jint* {aka volatile int*}' /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:77:17: note: static void OrderAccess::release_store(volatile juint*, juint) /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:77:17: note: no known conversion for argument 1 from 'volatile long int*' to 'volatile juint* {aka volatile unsigned int*}' On 2017-07-10 20:50, Robbin Ehn wrote: > I'll start a push now. 
> > /Robbin > > On 2017-07-10 12:38, Roman Kennke wrote: >> Ok, so I guess I need a sponsor for this now: >> >> http://cr.openjdk.java.net/~rkennke/8180932/webrev.12/ >> >> >> Roman >> >> Am 07.07.2017 um 20:09 schrieb Igor Veresov: >>> >>>> On Jul 7, 2017, at 4:23 AM, Robbin Ehn >>> > wrote: >>>> >>>> Hi Roman, >>>> >>>> On 07/07/2017 12:51 PM, Roman Kennke wrote: >>>>> Hi Robbin, >>>>>> >>>>>> Far down -> >>>>>> >>>>>> On 07/06/2017 08:05 PM, Roman Kennke wrote: >>>>>>> >>>>>>>> >>>>>>>> I'm not happy about this change: >>>>>>>> >>>>>>>> + ~ParallelSPCleanupThreadClosure() { >>>>>>>> + // This is here to be consistent with sweeper.cpp >>>>>>>> NMethodSweeper::mark_active_nmethods(). >>>>>>>> + // TODO: Is this really needed? >>>>>>>> + OrderAccess::storestore(); >>>>>>>> + } >>>>>>>> >>>>>>>> because we're adding an OrderAccess::storestore() to be consistent >>>>>>>> with an OrderAccess::storestore() that's not properly documented >>>>>>>> which is only increasing the technical debt. >>>>>>>> >>>>>>>> So a couple of things above don't make sense to me: >>>>>>>> >>>>>>>>> - sweeper thread runs outside safepoint >>>>>>>>> - VMThread (which is doing the nmethod marking in the case that >>>>>>>>> I'm looking at) runs while all other threads (incl. the >>>>>>>>> sweeper) >>>>>>>>> is holding still. >>>>>>>> >>>>>>>> and: >>>>>>>> >>>>>>>>> There should be no need for a storestore() (at least in >>>>>>>>> sweeper.cpp... >>>>>>> >>>>>>> Either one or the other are running. Either the VMThread is marking >>>>>>> nmethods (during safepoint) or the sweeper threads are running >>>>>>> (outside >>>>>>> safepoint). Between the two phases, there is a guaranteed >>>>>>> OrderAccess::fence() (see safepoint.cpp). Therefore, no storestore() >>>>>>> should be necessary. 
>>>>>>> >>>>>>> From Igor's comment I can see how it happened though: Apparently >>>>>>> there >>>>>>> *is* a race in sweeper's own concurrent processing (concurrent with >>>>>>> compiler threads, as far as I understand). And there's a call to >>>>>>> nmethod::mark_as_seen_on_stack() after which a storestore() is >>>>>>> required >>>>>>> (as per Igor's explanation). So the logic probably was: we have >>>>>>> mark_as_seen_on_stack() followed by storestore() here, so let's >>>>>>> also put >>>>>>> a storestore() in the other places that call >>>>>>> mark_as_seen_on_stack(), >>>>>>> one of which happens to be the safepoint cleanup code that we're >>>>>>> discussing. (why the storestore() hasn't been put right into >>>>>>> mark_as_seen_on_stack() I don't understand). In short, one >>>>>>> storestore() >>>>>>> really was necessary, the other looks like it has been put there >>>>>>> 'for >>>>>>> consistency' or just conservatively. But it shouldn't be >>>>>>> necessary in >>>>>>> the safepoint cleanup code that we're discussing. >>>>>>> >>>>>>> So what should we do? Remove the storestore() for good? Refactor the >>>>>>> code so that both paths at least call the storestore() in the same >>>>>>> place? (E.g. make mark_active_nmethods() use the closure and call >>>>>>> storestore() in the dtor as proposed?) >>>>>> >>>>>> I took a quick look, maybe I'm missing some stuff but: >>>>>> >>>>>> So there is a slight optimization when not running sweeper to skip >>>>>> compiler barrier/fence in stw. 
>>>>>> >>>>>> Don't think that matter, so I propose something like: >>>>>> - long stack_traversal_mark() { return >>>>>> _stack_traversal_mark; } >>>>>> - void set_stack_traversal_mark(long l) { >>>>>> _stack_traversal_mark = l; } >>>>>> + long stack_traversal_mark() { return >>>>>> OrderAccess::load_acquire(&_stack_traversal_mark); } >>>>>> + void set_stack_traversal_mark(long l) { >>>>>> OrderAccess::release_store(&_stack_traversal_mark, l); } >>>>>> >>>>>> Maybe make _stack_traversal_mark volatile also, just as a marking >>>>>> that >>>>>> it is concurrent accessed. >>>>>> And remove both storestore. >>>>>> >>>>>> "Also neither of these state variables are volatile in nmethod, so >>>>>> even the compiler may reorder the stores" >>>>>> Fortunately at least _state is volatile now. >>>>>> >>>>>> I think _state also should use la/rs semantics instead, but that's >>>>>> another story. >>>>> Like this? >>>>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.12/ >>>>> >>>>> >>>> >>>> Yes, exactly, I like this! >>>> Dan? Igor ? Tobias? >>>> >>> >>> That seems correct. >>> >>> igor >>> >>>> Thanks Roman! >>>> >>>> BTW I'm going on vacation (5w) in a few hours, but I will follow this >>>> thread/changeset to the end! 
>>>> >>>> /Robbin >>>> >>>>> Roman >>> >> From rkennke at redhat.com Mon Jul 10 20:07:59 2017 From: rkennke at redhat.com (Roman Kennke) Date: Mon, 10 Jul 2017 22:07:59 +0200 Subject: RFR: Parallelize safepoint cleanup In-Reply-To: <5af9855a-652e-64f0-af83-e8f5962247ca@oracle.com> References: <42872a15-d26c-9798-c6a2-f3f7c945baf7@redhat.com> <5e7c7d00-4acd-bea3-3525-33dbd9159efb@oracle.com> <6f2c6de7-298b-bf14-ab1f-430c4acd43c9@redhat.com> <5cd676de-872d-6d4a-691b-da561173f7d0@oracle.com> <61d80e98-275f-b2b8-4ac7-6d5d03b047de@redhat.com> <0e44cc90-b384-0820-93d6-a70d22c501c3@oracle.com> <20E06CEC-38CA-41AE-99DB-17EF22A3C5CC@oracle.com> <58f2278e-b95c-4ec2-4f7d-9fefa3a281e4@redhat.com> <623c0dbf-9210-7c63-3546-4314c7d47f85@redhat.com> <29521e46-a5e8-5ff0-23a2-22eeee145389@oracle.com> <4445a727-060b-70f9-c8db-e9f70faae3d5@redhat.com> <37755fec-05b9-8d2c-7eb9-8849393c7485@oracle.com> <57cddde0-60e6-366e-489e-f6f9534e3ed9@redhat.com> <6fa761bc-8feb-74e6-9a54-8a65ab81203b@oracle.com> <5af9855a-652e-64f0-af83-e8f5962247ca@oracle.com> Message-ID: <266bd634-b1a5-0f93-733a-22faf5e785f3@redhat.com> Ugh. I changed the field and accessors and a few related entries (vmStructs..) to jlong. I am doing this blindly... I have no way to test 32bit here. It does build for me ;-) http://cr.openjdk.java.net/~rkennke/8180932/webrev.13/ Roman Am 10.07.2017 um 21:22 schrieb Robbin Ehn: > Hi, unfortunately the push failed on 32-bit. > > (looks like _stack_traversal_mark should be jlong, I feel a bit guilty) > > I do not have anytime to look at this, so here is the error. 
> > /Robbin > > make[3]: Leaving directory '/opt/jprt/T/P1/185117.rehn/s/hotspot/make' > make/Main.gmk:263: recipe for target 'hotspot-client-libs' failed > In file included from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29:0, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28: > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp: In > member function 'long int nmethod::stack_traversal_mark()': > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:399:108: > error: call of overloaded 'load_acquire(volatile long int*)' is ambiguous > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:399:108: > note: candidates are: > In file included from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/typeArrayOop.hpp:30:0, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/constantPool.hpp:32, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/method.hpp:34, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/frame.hpp:28, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/codeBlob.hpp:31, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/compiledMethod.hpp:28, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:28, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28: > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:57:17: > note: static jint OrderAccess::load_acquire(const volatile jint*) > > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:57:17: > note: no known conversion for argument 1 from 
'volatile long int*' > to 'const volatile jint* {aka const volatile int*}' > In file included from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/typeArrayOop.hpp:30:0, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/constantPool.hpp:32, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/method.hpp:34, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/frame.hpp:28, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/codeBlob.hpp:31, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/compiledMethod.hpp:28, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:28, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28: > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:63:17: > note: static juint OrderAccess::load_acquire(const volatile juint*) > > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:63:17: > note: no known conversion for argument 1 from 'volatile long int*' > to 'const volatile juint* {aka const volatile unsigned int*}' > In file included from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29:0, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28: > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp: In > member function 'void nmethod::set_stack_traversal_mark(long int)': > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:400:105: > error: call of overloaded 'release_store(volatile long int*, long > int&)' is ambiguous > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:400:105: 
> note: candidates are: > In file included from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/typeArrayOop.hpp:30:0, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/constantPool.hpp:32, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/method.hpp:34, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/frame.hpp:28, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/codeBlob.hpp:31, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/compiledMethod.hpp:28, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:28, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28: > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:71:17: > note: static void OrderAccess::release_store(volatile jint*, jint) > > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:71:17: > note: no known conversion for argument 1 from 'volatile long int*' > to 'volatile jint* {aka volatile int*}' > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:77:17: > note: static void OrderAccess::release_store(volatile juint*, juint) > > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:77:17: > note: no known conversion for argument 1 from 'volatile long int*' > to 'volatile juint* {aka volatile unsigned int*}' > > On 2017-07-10 20:50, Robbin Ehn wrote: >> I'll start a push now. 
>> >> /Robbin >> >> On 2017-07-10 12:38, Roman Kennke wrote: >>> Ok, so I guess I need a sponsor for this now: >>> >>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.12/ >>> >>> >>> Roman >>> >>> Am 07.07.2017 um 20:09 schrieb Igor Veresov: >>>> >>>>> On Jul 7, 2017, at 4:23 AM, Robbin Ehn >>>> > wrote: >>>>> >>>>> Hi Roman, >>>>> >>>>> On 07/07/2017 12:51 PM, Roman Kennke wrote: >>>>>> Hi Robbin, >>>>>>> >>>>>>> Far down -> >>>>>>> >>>>>>> On 07/06/2017 08:05 PM, Roman Kennke wrote: >>>>>>>> >>>>>>>>> >>>>>>>>> I'm not happy about this change: >>>>>>>>> >>>>>>>>> + ~ParallelSPCleanupThreadClosure() { >>>>>>>>> + // This is here to be consistent with sweeper.cpp >>>>>>>>> NMethodSweeper::mark_active_nmethods(). >>>>>>>>> + // TODO: Is this really needed? >>>>>>>>> + OrderAccess::storestore(); >>>>>>>>> + } >>>>>>>>> >>>>>>>>> because we're adding an OrderAccess::storestore() to be >>>>>>>>> consistent >>>>>>>>> with an OrderAccess::storestore() that's not properly documented >>>>>>>>> which is only increasing the technical debt. >>>>>>>>> >>>>>>>>> So a couple of things above don't make sense to me: >>>>>>>>> >>>>>>>>>> - sweeper thread runs outside safepoint >>>>>>>>>> - VMThread (which is doing the nmethod marking in the case that >>>>>>>>>> I'm looking at) runs while all other threads (incl. the >>>>>>>>>> sweeper) >>>>>>>>>> is holding still. >>>>>>>>> >>>>>>>>> and: >>>>>>>>> >>>>>>>>>> There should be no need for a storestore() (at least in >>>>>>>>>> sweeper.cpp... >>>>>>>> >>>>>>>> Either one or the other are running. Either the VMThread is >>>>>>>> marking >>>>>>>> nmethods (during safepoint) or the sweeper threads are running >>>>>>>> (outside >>>>>>>> safepoint). Between the two phases, there is a guaranteed >>>>>>>> OrderAccess::fence() (see safepoint.cpp). Therefore, no >>>>>>>> storestore() >>>>>>>> should be necessary. 
>>>>>>>> >>>>>>>> From Igor's comment I can see how it happened though: Apparently >>>>>>>> there >>>>>>>> *is* a race in sweeper's own concurrent processing (concurrent >>>>>>>> with >>>>>>>> compiler threads, as far as I understand). And there's a call to >>>>>>>> nmethod::mark_as_seen_on_stack() after which a storestore() is >>>>>>>> required >>>>>>>> (as per Igor's explanation). So the logic probably was: we have >>>>>>>> mark_as_seen_on_stack() followed by storestore() here, so let's >>>>>>>> also put >>>>>>>> a storestore() in the other places that call >>>>>>>> mark_as_seen_on_stack(), >>>>>>>> one of which happens to be the safepoint cleanup code that we're >>>>>>>> discussing. (why the storestore() hasn't been put right into >>>>>>>> mark_as_seen_on_stack() I don't understand). In short, one >>>>>>>> storestore() >>>>>>>> really was necessary, the other looks like it has been put there >>>>>>>> 'for >>>>>>>> consistency' or just conservatively. But it shouldn't be >>>>>>>> necessary in >>>>>>>> the safepoint cleanup code that we're discussing. >>>>>>>> >>>>>>>> So what should we do? Remove the storestore() for good? >>>>>>>> Refactor the >>>>>>>> code so that both paths at least call the storestore() in the same >>>>>>>> place? (E.g. make mark_active_nmethods() use the closure and call >>>>>>>> storestore() in the dtor as proposed?) >>>>>>> >>>>>>> I took a quick look, maybe I'm missing some stuff but: >>>>>>> >>>>>>> So there is a slight optimization when not running sweeper to skip >>>>>>> compiler barrier/fence in stw. 
>>>>>>> >>>>>>> Don't think that matter, so I propose something like: >>>>>>> - long stack_traversal_mark() { return >>>>>>> _stack_traversal_mark; } >>>>>>> - void set_stack_traversal_mark(long l) { >>>>>>> _stack_traversal_mark = l; } >>>>>>> + long stack_traversal_mark() { return >>>>>>> OrderAccess::load_acquire(&_stack_traversal_mark); } >>>>>>> + void set_stack_traversal_mark(long l) { >>>>>>> OrderAccess::release_store(&_stack_traversal_mark, l); } >>>>>>> >>>>>>> Maybe make _stack_traversal_mark volatile also, just as a marking >>>>>>> that >>>>>>> it is concurrent accessed. >>>>>>> And remove both storestore. >>>>>>> >>>>>>> "Also neither of these state variables are volatile in nmethod, so >>>>>>> even the compiler may reorder the stores" >>>>>>> Fortunately at least _state is volatile now. >>>>>>> >>>>>>> I think _state also should use la/rs semantics instead, but that's >>>>>>> another story. >>>>>> Like this? >>>>>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.12/ >>>>>> >>>>>> >>>>> >>>>> Yes, exactly, I like this! >>>>> Dan? Igor ? Tobias? >>>>> >>>> >>>> That seems correct. >>>> >>>> igor >>>> >>>>> Thanks Roman! >>>>> >>>>> BTW I'm going on vacation (5w) in a few hours, but I will follow this >>>>> thread/changeset to the end! >>>>> >>>>> /Robbin >>>>> >>>>>> Roman >>>> >>> From shade at redhat.com Mon Jul 10 20:14:07 2017 From: shade at redhat.com (Aleksey Shipilev) Date: Mon, 10 Jul 2017 22:14:07 +0200 Subject: RFC: Epsilon GC JEP Message-ID: <67f6d4a2-d129-1491-4906-473586dc6680@redhat.com> Hi, I would like to solicit feedback on Epsilon GC JEP: https://bugs.openjdk.java.net/browse/JDK-8174901 http://openjdk.java.net/jeps/8174901 The JEP text should be pretty self-contained, but we can certainly add more points after the discussion happens. For the last few months, there were quite a few instances where Epsilon proved a good vehicle to do GC performance research, especially on object locality and code generation fronts. 
I think it also serves as the trivial target for Erik's/Roman's GC interface work. The implementation and tests are there in the Sandbox, for those who are curious. Thanks, -Aleksey From kim.barrett at oracle.com Mon Jul 10 21:33:09 2017 From: kim.barrett at oracle.com (Kim Barrett) Date: Mon, 10 Jul 2017 17:33:09 -0400 Subject: RFR (S): 8182703: Correct G1 barrier queue lock orderings In-Reply-To: <595FA613.7090306@oracle.com> References: <59510D5E.10009@oracle.com> <25F423D9-F8D5-4E62-8300-CCE106E70777@oracle.com> <6D1B2CB2-366E-4DBD-9F2E-672325459343@oracle.com> <6B5ACCE3-CA0C-41C9-A45A-C79467FB8CE2@oracle.com> <595FA613.7090306@oracle.com> Message-ID: <5B0584AA-C49D-4426-AD02-3D23AF45F0CD@oracle.com> On 2017-07-06 22:15, Kim Barrett wrote: >> On Jul 6, 2017, at 4:11 PM, Kim Barrett wrote: >> >>> On Jul 4, 2017, at 10:00 PM, Kim Barrett wrote: >>> The lock ranking changes look good. >> I'm going to retract that. >> >> How do these new lock rankings interact with various assertions that >> rank() == or != Mutex::special? I'm not sure those places handle >> these new ranks properly. (I'm not sure those places handle >> Mutex::event rank properly either.) > On Jul 7, 2017, at 11:17 AM, Erik Österlund wrote: > > [...] > All in all, I believe that the deadlock detection system has some redundant, and some confusing checks that involve the lock rank Mutex::special. But I do believe that it works and would detect deadlocks, but could do with some reworking to make it more explicit. And that is invariant of the new access rank and applies equally to the event rank. 
> However, since these access locks play well with the current deadlock detection as they do not do anything illegal, and even if use of these locks did indeed do illegal things, it would still be detected by the deadlock detection system, it is reasonable to say that refactoring the deadlock detection system is a separate RFE? > > Specifically, clarifying the deadlock detection system by removing redundant checks, not checking for safepoint-safe state in try_lock as well as explicitly listing special and below locks as illegal when verifying Thread::check_for_valid_safepoint_state(), regardless of whether allow_vm_block() is true or not. Sounds like a separate RFE to me! Thanks for the additional analysis. I agree that so long as one does what one is supposed to (e.g. these locks always need to avoid safepoint checks), there won't be any undesired assertions. And I also agree there won't be any bad consequences (e.g. incorrect code possibly slipping through) from misuse, though the indicative failures might not always be where one might prefer. I don't think the redundant checks are necessarily bad, as they make it more obvious to future readers what the requirements are at various levels. However, I agree it should be a separate RFE to do some cleanup in this area, particularly where [non-]equality with Mutex::special ought to be an ordered comparison. 
From kim.barrett at oracle.com Tue Jul 11 02:19:04 2017 From: kim.barrett at oracle.com (Kim Barrett) Date: Mon, 10 Jul 2017 22:19:04 -0400 Subject: RFR (S): 8182703: Correct G1 barrier queue lock orderings In-Reply-To: <595CBE40.5050603@oracle.com> References: <59510D5E.10009@oracle.com> <25F423D9-F8D5-4E62-8300-CCE106E70777@oracle.com> <595CBE40.5050603@oracle.com> Message-ID: <6FFC2106-D260-481D-B8C3-DDA849926F23@oracle.com> > On Jul 5, 2017, at 6:24 AM, Erik Österlund wrote: > On 2017-07-05 04:00, Kim Barrett wrote: >>> On Jun 26, 2017, at 9:34 AM, Erik Österlund wrote: >>> >>> Hi, >>> >>> Webrev: http://cr.openjdk.java.net/~eosterlund/8182703/webrev.02/ >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8182703 >> ------------------------------------------------------------------------------ >> src/share/vm/gc/g1/ptrQueue.cpp >> Removing unlock / relock around >> 78 qset()->enqueue_complete_buffer(node); >> >> I would prefer that this part of this changeset not be made at this >> time. >> >> This part isn't necessary for the main point of this changeset. It's >> a cleanup that is enabled by the lock rank changes, where the rank >> changes are required for other reasons. > > Okay. > >> It also at least conflicts with, and probably breaks, a pending change >> of mine. (I have a largish stack of patches in this area that didn't >> quite make it into JDK 9 before the original FC date, and which I've >> been (all too slowly) trying to work my way through and bring into JDK >> 10.) > > I agree that it would be possible to just correct the ranks while allowing the spaghetti synchronization code to remain in the code base. Here are some comments about that to me not so attractive idea: > 1) I would really like to get rid of that code, because I think it is poor synchronization practice and its stated reason for existence is gone now. 
> 2) I have to do *something* about that part in the current change, otherwise the comment motivating its existence will be incorrect after my lock rank change. There is no longer a legitimate motivation for doing that unlock and re-lock. So we have the choice of making a new made up motivation why we do this anyway, or to remove it. For me the choice is easily to remove it. Or leave it be for now, to avoid knowingly creating more work for someone else by inflicting merge conflicts or other breakage on them. (But see below.) If the occasional out of date comment was the worst of the problems we faced, that would be pretty fabulous. > 3) If some new actual motivation for dropping that lock arises later on down the road (which I am dubious about), then I do not see an issue with simply re-adding it then, when/if that becomes necessary again, with a new corresponding motivation added in appropriately. > > As far as your new changes go, I am curious what they do to motivate unlocking/re-locking this shared queue lock again. As outlined in my recent email to Thomas, we do not hold either of these queue locks when concurrent refinement helper code is called from GC barriers invoked from JavaThreads, even with my new changes. If it is in this code path that you will perform more work (just speculating), then that should be invariant of this cleanup. One possibility I was thinking of was the buffer filtering step. I mis-remembered and thought that wasn't done for the (locked) shared queues, and that one of my pending changes was to change that. (It's been over a year since I worked on those changes, and haven't had time to really page them back in.) But I now see that we already do the filtering of the shared SATB queue (dirty card queues don't presently have any filtering, but might in the future) while holding its lock. This suggests a potential (though seemingly hard to avoid) fragility resulting from the lowered lock rank. 
The present SATB filtering doesn't seem to acquire any locks, but it's a non-trivial amount of code spread over multiple files, so would be easy to miss something or break it in that respect. Reducing the lock ranks requires being very careful with the SATB filtering code. The "mutator" help for dirty card queue processing is not presently done for the shared queue, but I think it could be today. I'm less sure about that with lowered queue lock ranks; I *think* there aren't any relevant locks there (other than the very rare shared queue lock in refine_card_concurrently), but that's a substantially larger and more complex amount of code than SATB queue filtering. It looks like something along this line is part of my pending changes. That would certainly be broken by the proposed removal of the temporary unlocking. At the time I was working on it, it seemed like having that little unlocking dance simplified things elsewhere. I can cope with the merge conflict (especially since it *is* a merge conflict and not silent breakage that I may have forgotten about by the time I get back to it), though I would prefer not to have to. (I can also think of some reasons why this might not be worth doing or even a bad idea, and don't recall right now what I may have done to address those.) But while looking at the mutator helper, I realized there may be a different problem. Lowering these lock ranks may not be sufficient to allow enqueue in "arbitrary" lock contexts. The difficulty is that in the mutator help case (only applies for dirty card queue right now, and currently only for a Java thread dealing with its thread-local queue), the allocation of the temporary worker_id is done under the CBL lock (which is ok), but if there isn't a free worker_id, it *waits* for one, and that's not ok in an arbitrary lock context. 
Right now, we should not be able to hit that wait while not holding "critical" locks, because the present CBL rank is too high to (safely) be in enqueue in such a context. But lowering the CBL rank is not sufficient to enqueue while holding critical locks; that potential wait also needs to be eliminated. (This is assuming there's a place where a Java thread can need an enqueue while holding a critical lock. I don't have such a place in mind, but proving it can never happen now or in the future seems hard, and contrary to the intent of the proposed lock rank changes.) Eliminating that wait doesn't need to be part of this change, but seems like it might be required before taking advantage of the change to move some potentially enqueuing operations. It shouldn't be too hard to eliminate the wait, but it's a somewhat fundamental behavioral change. The present mechanism places a real choke hold on the mutator when concurrent refinement can't keep up. Without a blocking operation in there, the mutator could overwhelm concurrent refinement, leading to longer pauses. Not that said choke hold is all that pleasant either. From per.liden at oracle.com Tue Jul 11 06:34:21 2017 From: per.liden at oracle.com (Per Liden) Date: Tue, 11 Jul 2017 08:34:21 +0200 Subject: RFR: 8179268: Factor out AdaptiveSizePolicy from top-level interfaces CollectorPolicy and CollectedHeap In-Reply-To: <073ad956-f475-f3c4-cac8-42bfa1329565@redhat.com> References: <50cb4b58-623c-04c2-f6c5-cfb1bd0a3b1f@oracle.com> <073ad956-f475-f3c4-cac8-42bfa1329565@redhat.com> Message-ID: Hi, On 2017-07-10 18:35, Roman Kennke wrote: > Hi Per, > > thanks for the review! > >> >>> AdaptiveSizePolicy is not used/called from outside the GCs, and not all >>> GCs need them. It makes sense to remove it from the CollectedHeap and >>> CollectorPolicy interfaces and move them down to the actual subclasses >>> that used them. 
>>> >>> I moved AdaptiveSizePolicyOutput to parallelScavengeHeap.hpp, it's only >>> used/implemented in the parallel GC. Also, I made this class AllStatic >>> (was StackObj) >> >> AdaptiveSizePolicyOutput::print() is actually called from >> runtime/java.cpp also, so it's used outside of ParallelGC. I'm fine >> with moving it, but we should have the proper #includes in java.cpp. >> >> (Your patch doesn't actually build in its current form. I suspect >> you're using precompiled headers which have a tendency to hide a lot >> of errors caused by missing includes) >> > I added the include. > >>> >>> Tested by running hotspot_gc jtreg tests without regressions. >>> >>> http://cr.openjdk.java.net/~rkennke/8179268/webrev.00/ >> >> collectorPolicy.hpp: >> -------------------- >> 258 void cleared_all_soft_refs(); >> >> Please declare this virtual too (that's the best we can do to signal >> intent until we have C++11/override) >> > Ok. > >> >> collectorPolicy.cpp: >> -------------------- >> 224 this->CollectorPolicy::cleared_all_soft_refs(); >> >> Please remove "this->" to match the super-call style used in other >> places in this file. > > ok. > > >> >> Btw, I can sponsor the patch if you want. > > Find the updated webrev here: > > http://cr.openjdk.java.net/~rkennke/8179268/webrev.03/ > Looks good! 
(Awaiting a second review before I can push) cheers, Per > > Cheers, > Roman > >> >> cheers, >> Per >> >>> >>> >>> Roman >>> > From thomas.schatzl at oracle.com Tue Jul 11 07:27:54 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Tue, 11 Jul 2017 09:27:54 +0200 Subject: RFR (S): 8183226: Remembered set summarization accesses not fully initialized java thread DCQS In-Reply-To: <1499688947.2793.21.camel@oracle.com> References: <1499083970.2802.33.camel@oracle.com> <0fb5d7cf-49b3-84a1-97b4-cdd53f0173e3@oracle.com> <1499688947.2793.21.camel@oracle.com> Message-ID: <1499758074.3483.4.camel@oracle.com> Hi again, On Mon, 2017-07-10 at 14:15 +0200, Thomas Schatzl wrote: > Hi Erik (and Stefan), > > thanks for your review. > > On Fri, 2017-07-07 at 13:16 +0200, Erik Helin wrote: > > > > On 07/03/2017 02:12 PM, Thomas Schatzl wrote: > > > can I get reviews for the following change that breaks some > > > dependency cycle in g1remset initialization to fix some (at this > > > time benign) bug when printing remembered set summarization > > > information? > > > > > > The problem is that G1Remset initializes its internal remembered > > > [...] > > You don't need to do all the cleanups, but I think having a fully > > functioning default constructor is a better way to solve this > > problem, rather than shuffling the call to initialize around. What > > do > > you think? > Let's defer the other suggested cleanups to a different CR. > > In the following webrev I also added StefanJ's suggestion to extract > concurrent refinement initialization into a separate method. > (I do not really understand why that method is actually returning an > error code: all error conditions in ConcurrentG1Refine call > vm_shutdown_during_initialization() anyway - even that seems > superfluous: failing to allocate memory shuts down the VM already). 
> > Webrevs: > http://cr.openjdk.java.net/~tschatzl/8183226/webrev.0_to_1/ (diff) > http://cr.openjdk.java.net/~tschatzl/8183226/webrev.1/ (full) > Erik pointed out that by having two constructors, one taking a G1RemSet, we can save a few more lines of code, avoiding the G1RemSetSummary::initialize() method completely. :) Here is an implementation of this idea. Webrevs: http://cr.openjdk.java.net/~tschatzl/8183226/webrev.1_to_2/ (diff) http://cr.openjdk.java.net/~tschatzl/8183226/webrev.2/ (full) Thanks, Thomas From stefan.johansson at oracle.com Tue Jul 11 08:05:00 2017 From: stefan.johansson at oracle.com (Stefan Johansson) Date: Tue, 11 Jul 2017 10:05:00 +0200 Subject: RFR (S): 8183226: Remembered set summarization accesses not fully initialized java thread DCQS In-Reply-To: <1499758074.3483.4.camel@oracle.com> References: <1499083970.2802.33.camel@oracle.com> <0fb5d7cf-49b3-84a1-97b4-cdd53f0173e3@oracle.com> <1499688947.2793.21.camel@oracle.com> <1499758074.3483.4.camel@oracle.com> Message-ID: On 2017-07-11 09:27, Thomas Schatzl wrote: > Hi again, > > On Mon, 2017-07-10 at 14:15 +0200, Thomas Schatzl wrote: >> Hi Erik (and Stefan), >> >> thanks for your review. >> >> On Fri, 2017-07-07 at 13:16 +0200, Erik Helin wrote: >>> On 07/03/2017 02:12 PM, Thomas Schatzl wrote: >>>> can I get reviews for the following change that breaks some >>>> dependency cycle in g1remset initialization to fix some (at this >>>> time benign) bug when printing remembered set summarization >>>> information? >>>> >>>> The problem is that G1Remset initializes its internal remembered >>>> [...] >>> You don't need to do all the cleanups, but I think having a fully >>> functioning default constructor is a better way to solve this >>> problem, rather than shuffling the call to initialize around. What >>> do >>> you think? >> Let's defer the other suggested cleanups to a different CR. 
>> >> In the following webrev I also added StefanJ's suggestion to extract >> concurrent refinement initialization into a separate method. >> (I do not really understand why that method is actually returning an >> error code: all error conditions in ConcurrentG1Refine call >> vm_shutdown_during_initialization() anyway - even that seems >> superfluous: failing to allocate memory shuts down the VM already). >> >> Webrevs: >> http://cr.openjdk.java.net/~tschatzl/8183226/webrev.0_to_1/ (diff) >> http://cr.openjdk.java.net/~tschatzl/8183226/webrev.1/ (full) >> > Erik pointed out that by having two constructors, one taking a > G1RemSet, we can save a few more lines of code, avoiding the > G1RemSetSummary::initialize() method completely. :) > > Here is an implementation of this idea. > > Webrevs: > http://cr.openjdk.java.net/~tschatzl/8183226/webrev.1_to_2/ (diff) > http://cr.openjdk.java.net/~tschatzl/8183226/webrev.2/ (full) Looks good, StefanJ > Thanks, > Thomas > From erik.osterlund at oracle.com Tue Jul 11 10:28:44 2017 From: erik.osterlund at oracle.com (Erik Österlund) Date: Tue, 11 Jul 2017 12:28:44 +0200 Subject: RFR (S): 8182703: Correct G1 barrier queue lock orderings In-Reply-To: <5B0584AA-C49D-4426-AD02-3D23AF45F0CD@oracle.com> References: <59510D5E.10009@oracle.com> <25F423D9-F8D5-4E62-8300-CCE106E70777@oracle.com> <6D1B2CB2-366E-4DBD-9F2E-672325459343@oracle.com> <6B5ACCE3-CA0C-41C9-A45A-C79467FB8CE2@oracle.com> <595FA613.7090306@oracle.com> <5B0584AA-C49D-4426-AD02-3D23AF45F0CD@oracle.com> Message-ID: <5964A85C.6080807@oracle.com> On 2017-07-10 23:33, Kim Barrett wrote: > On 2017-07-06 22:15, Kim Barrett wrote: >>> On Jul 6, 2017, at 4:11 PM, Kim Barrett wrote: >>> >>>> On Jul 4, 2017, at 10:00 PM, Kim Barrett wrote: >>>> The lock ranking changes look good. >>> I'm going to retract that. >>> >>> How do these new lock rankings interact with various assertions that >>> rank() == or != Mutex::special? 
I'm not sure those places handle >>> these new ranks properly. (I'm not sure those places handle >>> Mutex::event rank properly either.) >> On Jul 7, 2017, at 11:17 AM, Erik Österlund wrote: >> >> [...] >> All in all, I believe that the deadlock detection system has some redundant, and some confusing checks that involve the lock rank Mutex::special. But I do believe that it works and would detect deadlocks, but could do with some reworking to make it more explicit. And that is invariant of the new access rank and applies equally to the event rank. >> >> However, since these access locks play well with the current deadlock detection as they do not do anything illegal, and even if use of these locks did indeed do illegal things, it would still be detected by the deadlock detection system, it is reasonable to say that refactoring the deadlock detection system is a separate RFE? >> >> Specifically, clarifying the deadlock detection system by removing redundant checks, not checking for safepoint-safe state in try_lock as well as explicitly listing special and below locks as illegal when verifying Thread::check_for_valid_safepoint_state(), regardless of whether allow_vm_block() is true or not. Sounds like a separate RFE to me! > Thanks for the additional analysis. I agree that so long as one does > what one is supposed to (e.g. these locks always need to avoid > safepoint checks), there won't be any undesired assertions. And I > also agree there won't be any bad consequences (e.g. incorrect code > possibly slipping through) from misuse, though the indicative failures > might not always be where one might prefer. > > I don't think the redundant checks are necessarily bad, as they make > it more obvious to future readers what the requirements are at various > levels. However, I agree it should be a separate RFE to do some > cleanup in this area, particularly where [non-]equality with > Mutex::special ought to be an ordered comparison. I am glad we agree in this area. 
Thanks for reading through the analysis. /Erik From erik.osterlund at oracle.com Tue Jul 11 12:07:55 2017 From: erik.osterlund at oracle.com (Erik Österlund) Date: Tue, 11 Jul 2017 14:07:55 +0200 Subject: RFR (S): 8182703: Correct G1 barrier queue lock orderings In-Reply-To: <6FFC2106-D260-481D-B8C3-DDA849926F23@oracle.com> References: <59510D5E.10009@oracle.com> <25F423D9-F8D5-4E62-8300-CCE106E70777@oracle.com> <595CBE40.5050603@oracle.com> <6FFC2106-D260-481D-B8C3-DDA849926F23@oracle.com> Message-ID: <5964BF9B.4010309@oracle.com> On 2017-07-11 04:19, Kim Barrett wrote: >> On Jul 5, 2017, at 6:24 AM, Erik Österlund wrote: >> On 2017-07-05 04:00, Kim Barrett wrote: >>>> On Jun 26, 2017, at 9:34 AM, Erik Österlund wrote: >>>> >>>> Hi, >>>> >>>> Webrev: http://cr.openjdk.java.net/~eosterlund/8182703/webrev.02/ >>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8182703 >>> ------------------------------------------------------------------------------ >>> src/share/vm/gc/g1/ptrQueue.cpp >>> Removing unlock / relock around >>> 78 qset()->enqueue_complete_buffer(node); >>> >>> I would prefer that this part of this changeset not be made at this >>> time. >>> >>> This part isn't necessary for the main point of this changeset. It's >>> a cleanup that is enabled by the lock rank changes, where the rank >>> changes are required for other reasons. >> Okay. >> >>> It also at least conflicts with, and probably breaks, a pending change >>> of mine. (I have a largish stack of patches in this area that didn't >>> quite make it into JDK 9 before the original FC date, and which I've >>> been (all too slowly) trying to work my way through and bring into JDK >>> 10.) >> I agree that it would be possible to just correct the ranks while allowing the spaghetti synchronization code to remain in the code base. 
Here are some comments about that to me not so attractive idea: >> 1) I would really like to get rid of that code, because I think it is poor synchronization practice and its stated reason for existence is gone now. >> 2) I have to do *something* about that part in the current change, otherwise the comment motivating its existence will be incorrect after my lock rank change. There is no longer a legitimate motivation for doing that unlock and re-lock. So we have the choice of making a new made up motivation why we do this anyway, or to remove it. For me the choice is easily to remove it. > Or leave it be for now, to avoid knowingly creating more work for > someone else by inflicting merge conflicts or other breakage on them. > (But see below.) If the occasional out of date comment was the worst > of the problems we faced, that would be pretty fabulous. A two line merge conflict after over a year of dormancy though... ;) >> 3) If some new actual motivation for dropping that lock arises later on down the road (which I am dubious about), then I do not see an issue with simply re-adding it then, when/if that becomes necessary again, with a new corresponding motivation added in appropriately. >> >> As far as your new changes go, I am curious what they do to motivate unlocking/re-locking this shared queue lock again. As outlined in my recent email to Thomas, we do not hold either of these queue locks when concurrent refinement helper code is called from GC barriers invoked from JavaThreads, even with my new changes. If it is in this code path that you will perform more work (just speculating), then that should be invariant of this cleanup. > One possibility I was thinking of was the buffer filtering step. I > mis-remembered and thought that wasn't done for the (locked) shared > queues, and that one of my pending changes was to change that. (It's > been over a year since I worked on those changes, and haven't had time > to really page them back in.) 
But I now see that we already do the > filtering of the shared SATB queue (dirty card queues don't presently > have any filtering, but might in the future) while holding its lock. > > This suggests a potential (though seemingly hard to avoid) fragility > resulting from the lowered lock rank. Note that this does not matter for JavaThreads (including compiler threads), for concurrent refinement threads or concurrent marking threads, nor does it matter for any thread when marking is not active. So it seems to me that the worst consequence of this is possibly worse latency for operations coinciding in time with concurrent marking, that have large amounts of mutations or resurrections, and are not performed by JavaThreads (including compiler threads) or GC threads (that are performing the concurrent marking) or concurrent refinement threads (that have nothing to do with SATB), that are running concurrently with each other. That does not seem to be a huge problem in my book. If it was, and an unknown bunch of non-JavaThreads are heavily mutating or resurrecting objects concurrent to marking, such that contention is inflicted on the shared queue lock for the shared SATB queue, then the right solution for that seems to be to give such threads their own local queue, rather than to reduce the time spent under the surprisingly hot shared queue lock. > > The present SATB filtering doesn't seem to acquire any locks, but it's > a non-trivial amount of code spread over multiple files, so would be > easy to miss something or break it in that respect. Reducing the lock > ranks requires being very careful with the SATB filtering code. IMO, adding any lock into the SATB barrier which is used all over hotspot in some very shady places arguably requires being very careful regardless of my changes. So I am going to assume whoever does that for whatever reason is going to be careful. 
> The "mutator" help for dirty card queue processing is not presently > done for the shared queue, but I think could be today. I'm less sure > about that with lowered queue lock ranks; I *think* there aren't any > relevant locks there (other than the very rare shared queue lock in > refine_card_concurrently), but that's a substantially larger and more > complex amount of code than SATB queue filtering. As discussed with Thomas earlier in this thread, there are indeed locks blocking this. The HeapRegionRemSet::_m lock is currently a leaf lock. If collaborative refinement was to be performed on non-Java threads (and non-concurrent refinement threads), then this lock would have to decrease to the access rank first. But we concluded that warrants a new RFE with separate analysis. As with the SATB queues though, I do not know what threads would be causing such trouble? It is not JavaThreads (including compiler threads), concurrent refinement threads, concurrent marking threads. That does not leave us with a whole lot of threads to cause that contention on the shared queue lock. And as with the SATB queues, if there are such threads that cause such contention on the shared queue lock, then the right fix seems to be to give them their own local queue and stop taking the shared queue lock in the first place. > It looks like > something along this line is part of my pending changes. That would > certainly be broken by the proposed removal of the temporary > unlocking. At the time I was working on it, it seemed like having > that little unlocking dance simplified things elsewhere. I can cope > with the merge conflict (especially since it *is* a merge conflict and > not silent breakage that I may have forgotten about by the time I get > back to it), though I would prefer not to have to. (I can also think > of some reasons why this might not be worth doing or even a bad idea, > and don't recall right now what I may have done to address those.) 
This is why I wanted to know if you are certain this is truly going to be a problem or not. Since this all seems rather uncertain, would you say it is reasonable that you take that two-line merge conflict down the road if you eventually require putting the unlock/lock sequence back again? > But while looking at the mutator helper, I realized there may be a > different problem. Lowering these lock ranks may not be sufficient to > allow enqueue in "arbitrary" lock contexts. The difficulty is that in > the mutator help case (only applies for dirty card queue right now, > and currently only if a Java thread dealing with its thread-local > queue), the allocation of the temporary worker_id is done under the > CBL lock (which is ok), but if there isn't a free worker_id, it > *waits* for one, and that's not ok in an arbitrary lock context. > Right now, we should not be able to hit that wait while not holding > "critical" locks, because the present CBL rank is too high to (safely) > be in enqueue in such a context. But lowering the CBL rank is not > sufficient to enqueue while holding critical locks; that potential > wait also needs to be eliminated. (This is assuming there's a place > where a Java thread can need an enqueue while holding a critical lock. > I don't have such a place in mind, but proving it can never happen now > or in the future seems hard, and contrary to the intent of the > proposed lock rank changes.) I agree that in order to be able to freely perform object stores under special locks, we would have to stop waiting on the cbl monitor when claiming worker IDs in the helper part of the post write barrier. That is a good observation. On the same list of requirements for that to happen, the HeapRegionRemSet::_m monitor would have to change rank to "access" as previously mentioned. 
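Stopping the wait on the monitor when claiming worker IDs could look roughly like the following non-blocking claim. This is purely illustrative — `WorkerIdSet` and `try_claim` are hypothetical names, not the DirtyCardQueueSet API — and the caller that gets -1 back would simply skip helping and just enqueue its buffer instead of blocking:

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Sketch: claim a temporary refinement worker id without ever blocking.
// Instead of waiting on a monitor when all ids are taken, the caller gets
// -1 back and skips the "mutator help" entirely.
class WorkerIdSet {
public:
  explicit WorkerIdSet(int num_ids) : _mask(0), _num_ids(num_ids) {}

  // Returns a claimed id in [0, _num_ids), or -1 if none is free.
  int try_claim() {
    uint64_t cur = _mask.load(std::memory_order_relaxed);
    for (;;) {
      int id = first_zero_bit(cur);
      if (id < 0 || id >= _num_ids) return -1;  // all ids busy: do not wait
      uint64_t next = cur | (uint64_t(1) << id);
      // CAS loop: on failure 'cur' is refreshed and we recompute the id.
      if (_mask.compare_exchange_weak(cur, next)) return id;
    }
  }

  void release(int id) {
    _mask.fetch_and(~(uint64_t(1) << id));
  }

private:
  static int first_zero_bit(uint64_t m) {
    for (int i = 0; i < 64; i++) {
      if ((m & (uint64_t(1) << i)) == 0) return i;
    }
    return -1;
  }

  std::atomic<uint64_t> _mask;  // one bit per worker id
  const int _num_ids;
};
```

The design trade-off mentioned below still applies: without any blocking, the mutator loses the natural backpressure that the wait provided.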
> Eliminating that wait doesn't need to be part of this change, but > seems like it might be required before taking advantage of the change > to move some potentially enqueuing operations. Agreed. > It shouldn't be too hard to eliminate the wait, but it's a somewhat > fundamental behavioral change. The present mechanism places a real > choke hold on the mutator when concurrent refinement can't keep up. > Without a blocking operation in there, the mutator could overwhelm > concurrent refinement, leading to longer pauses. Not that said choke > hold is all that pleasant either. Yes this mechanism seems to need some love indeed. Thanks for reviewing! /Erik From stefan.johansson at oracle.com Tue Jul 11 14:17:19 2017 From: stefan.johansson at oracle.com (Stefan Johansson) Date: Tue, 11 Jul 2017 16:17:19 +0200 Subject: RFR: 8179268: Factor out AdaptiveSizePolicy from top-level interfaces CollectorPolicy and CollectedHeap In-Reply-To: References: <50cb4b58-623c-04c2-f6c5-cfb1bd0a3b1f@oracle.com> <073ad956-f475-f3c4-cac8-42bfa1329565@redhat.com> Message-ID: <5cf7afd1-5328-2411-1c23-5e73ae230069@oracle.com> Hi Roman, On 2017-07-11 08:34, Per Liden wrote: > Hi, > > On 2017-07-10 18:35, Roman Kennke wrote: >> Hi Per, >> >> thanks for the review! >> >>> >>>> AdaptiveSizePolicy is not used/called from outside the GCs, and not >>>> all >>>> GCs need them. It makes sense to remove it from the CollectedHeap and >>>> CollectorPolicy interfaces and move them down to the actual subclasses >>>> that used them. >>>> >>>> I moved AdaptiveSizePolicyOutput to parallelScavengeHeap.hpp, it's >>>> only >>>> used/implemented in the parallel GC. Also, I made this class AllStatic >>>> (was StackObj) >>> >>> AdaptiveSizePolicyOutput::print() is actually called from >>> runtime/java.cpp also, so it's used outside of ParallelGC. I'm fine >>> with moving it, but we should have the proper #includes in java.cpp. >>> >>> (Your patch doesn't actually build in its current form. 
I suspect >>> you're using precompiled headers which have a tendency to hide a lot >>> of errors caused by missing includes) >>> >> I added the include. >> >>>> >>>> Tested by running hotspot_gc jtreg tests without regressions. >>>> >>>> http://cr.openjdk.java.net/~rkennke/8179268/webrev.00/ >>> >>> collectorPolicy.hpp: >>> -------------------- >>> 258 void cleared_all_soft_refs(); >>> >>> Please declare this virtual too (that's the best we can do to signal >>> intent until we have C++11/override) >>> >> Ok. >> >>> >>> collectorPolicy.cpp: >>> -------------------- >>> 224 this->CollectorPolicy::cleared_all_soft_refs(); >>> >>> Please remove "this->" to match the super-call style used in other >>> places in this file. >> >> ok. >> >> >>> >>> Btw, I can sponsor the patch if you want. >> >> Find the updated webrev here: >> >> http://cr.openjdk.java.net/~rkennke/8179268/webrev.03/ >> > > > Looks good! > This looks good to me too, Stefan > (Awaiting a second review before I can push) > > cheers, > Per > >> >> Cheers, >> Roman >> >>> >>> cheers, >>> Per >>> >>>> >>>> >>>> Roman >>>> >> From stefan.johansson at oracle.com Tue Jul 11 14:37:22 2017 From: stefan.johansson at oracle.com (Stefan Johansson) Date: Tue, 11 Jul 2017 16:37:22 +0200 Subject: RFR: 8177544: Restructure G1 Full GC code In-Reply-To: <1499691680.2793.29.camel@oracle.com> References: <62d1f02b-1fc0-ffcf-b8e0-e88ebacecebe@oracle.com> <1497346566.2829.33.camel@oracle.com> <1499691680.2793.29.camel@oracle.com> Message-ID: On 2017-07-10 15:01, Thomas Schatzl wrote: > ... >> I see your point and I think it would be good. But as we discussed >> over chat, might be something to look at once everything else in this >> area is done. Will create a RFE for this. > Yes, that's fine. > >>> - g1CollectedHeap.hpp: please try to sort the definitions of the >>> new methods in order of calling them. >> Done. 
>> >> Here are updated webrevs: >> Full: http://cr.openjdk.java.net/~sjohanss/8177544/hotspot.01/ >> Inc: http://cr.openjdk.java.net/~sjohanss/8177544/hotspot.00-01/ >> > Looks good to me. Sorry for the late reply. Thanks for reviewing Thomas! No problem, I might not push this before getting back from vacation anyways. Thanks, Stefan > > Thanks, > Thomas > From kishor.kharbas at intel.com Wed Jul 12 01:40:18 2017 From: kishor.kharbas at intel.com (Kharbas, Kishor) Date: Wed, 12 Jul 2017 01:40:18 +0000 Subject: RFR(M): 8171181: Supporting heap allocation on alternative memory devices In-Reply-To: References: Message-ID: Greetings, I have an updated patch for JEP https://bugs.openjdk.java.net/browse/JDK-8171181 at http://cr.openjdk.java.net/~kkharbas/8171181/webrev.05 This patch fixes the bugs pointed earlier and other suggestions to make the code less intrusive. I have also sent this to 'hotspot-runtime-dev' mailing list (included below). I would appreciate comments and feedback. Thanks Kishor From: Kharbas, Kishor Sent: Monday, July 10, 2017 1:53 PM To: hotspot-runtime-dev at openjdk.java.net Cc: Kharbas, Kishor Subject: RFR(M): 8171181: Supporting heap allocation on alternative memory devices Hello all! I have an updated patch for https://bugs.openjdk.java.net/browse/JDK-8171181 at http://cr.openjdk.java.net/~kkharbas/8171181/webrev.05 I have lost the old email chain so had to start a fresh one. The archived conversation can be found at - http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2017-March/022733.html 1. I have worked on all the comments and fixed the bugs. Mainly bugs fixed are related to sigprocmask() and changed the implementation such that 'fd' is not passed all the way down the call stack. Thus minimizing function signature changes. 2. Patch supports all OS'es. Consolidated all Posix compliant OS's implementation in os_posix.cpp. 3. The patch is tested on Windows and Linux. Working on testing it on other OS'es. 
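For readers unfamiliar with the JEP's mechanism, the core POSIX idea (consistent with the consolidation into os_posix.cpp mentioned above) is to back the heap reservation with a file descriptor from a file system mounted on the target memory device. The sketch below is hypothetical — `reserve_heap_on_device` is a made-up helper, and /tmp stands in for the device's mount point, which in the real patch would come from a JVM flag:

```cpp
#include <cstdlib>
#include <cstring>
#include <sys/mman.h>
#include <unistd.h>

// Sketch of the basic mechanism: create a file on the (hypothetical)
// memory-device mount point and map it MAP_SHARED, so accesses to the
// returned region go to that device instead of ordinary RAM.
static void* reserve_heap_on_device(const char* path_template,
                                    size_t bytes, int* fd_out) {
  char path[256];
  std::strncpy(path, path_template, sizeof(path) - 1);
  path[sizeof(path) - 1] = '\0';
  int fd = mkstemp(path);            // file on the device's file system
  if (fd < 0) return nullptr;
  unlink(path);                      // keep it anonymous once mapped
  if (ftruncate(fd, (off_t)bytes) != 0) { close(fd); return nullptr; }
  void* base = mmap(nullptr, bytes, PROT_READ | PROT_WRITE,
                    MAP_SHARED, fd, 0);
  if (base == MAP_FAILED) { close(fd); return nullptr; }
  *fd_out = fd;                      // kept open for the mapping's lifetime
  return base;
}
```

Returning the fd through an out-parameter mirrors the review point above about not threading the fd through the whole call stack: the caller stores it once and the rest of the code only sees the mapped address range.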
Let me know if this version looks clean and correct. Thanks Kishor From per.liden at oracle.com Wed Jul 12 06:44:43 2017 From: per.liden at oracle.com (Per Liden) Date: Wed, 12 Jul 2017 08:44:43 +0200 Subject: RFR: 8179268: Factor out AdaptiveSizePolicy from top-level interfaces CollectorPolicy and CollectedHeap In-Reply-To: <5cf7afd1-5328-2411-1c23-5e73ae230069@oracle.com> References: <50cb4b58-623c-04c2-f6c5-cfb1bd0a3b1f@oracle.com> <073ad956-f475-f3c4-cac8-42bfa1329565@redhat.com> <5cf7afd1-5328-2411-1c23-5e73ae230069@oracle.com> Message-ID: Hi Roman, On 2017-07-11 16:17, Stefan Johansson wrote: > Hi Roman, > > On 2017-07-11 08:34, Per Liden wrote: >> Hi, >> >> On 2017-07-10 18:35, Roman Kennke wrote: >>> Hi Per, >>> >>> thanks for the review! >>> >>>> >>>>> AdaptiveSizePolicy is not used/called from outside the GCs, and not >>>>> all >>>>> GCs need them. It makes sense to remove it from the CollectedHeap and >>>>> CollectorPolicy interfaces and move them down to the actual subclasses >>>>> that used them. >>>>> >>>>> I moved AdaptiveSizePolicyOutput to parallelScavengeHeap.hpp, it's >>>>> only >>>>> used/implemented in the parallel GC. Also, I made this class AllStatic >>>>> (was StackObj) >>>> >>>> AdaptiveSizePolicyOutput::print() is actually called from >>>> runtime/java.cpp also, so it's used outside of ParallelGC. I'm fine >>>> with moving it, but we should have the proper #includes in java.cpp. I just realized that this doesn't build on linux-i586, which builds a minimal JVM where INCLUDE_ALL_GCS isn't defined (and will thus not include parallelScavangeHeap.hpp). Rather than having some ugly #ifdef INCLUDE_ALL_GCS at the AdaptiveSizePolicyOutput::print() call site I suggest we keep AdaptiveSizePolicyOutput in adaptiveSizePolicy.hpp for now. 
(Use --with-target-bits=32 --with-jvm-variants=minimal when test building for linux-i586) cheers, Per >>>> >>>> (Your patch doesn't actually build in its current form. I suspect >>>> you're using precompiled headers which have a tendency to hide a lot >>>> of errors caused by missing includes) >>>> >>> I added the include. >>> >>>>> >>>>> Tested by running hotspot_gc jtreg tests without regressions. >>>>> >>>>> http://cr.openjdk.java.net/~rkennke/8179268/webrev.00/ >>>> >>>> collectorPolicy.hpp: >>>> -------------------- >>>> 258 void cleared_all_soft_refs(); >>>> >>>> Please declare this virtual too (that's the best we can do to signal >>>> intent until we have C++11/override) >>>> >>> Ok. >>> >>>> >>>> collectorPolicy.cpp: >>>> -------------------- >>>> 224 this->CollectorPolicy::cleared_all_soft_refs(); >>>> >>>> Please remove "this->" to match the super-call style used in other >>>> places in this file. >>> >>> ok. >>> >>> >>>> >>>> Btw, I can sponsor the patch if you want. >>> >>> Find the updated webrev here: >>> >>> http://cr.openjdk.java.net/~rkennke/8179268/webrev.03/ >>> >> >> >> Looks good! 
>> > This looks good to me too, > Stefan >> (Awaiting a second review before I can push) >> >> cheers, >> Per >> >>> >>> Cheers, >>> Roman >>> >>>> >>>> cheers, >>>> Per >>>> >>>>> >>>>> >>>>> Roman >>>>> >>> > From erik.helin at oracle.com Wed Jul 12 10:09:16 2017 From: erik.helin at oracle.com (Erik Helin) Date: Wed, 12 Jul 2017 12:09:16 +0200 Subject: RFR (S): 8183226: Remembered set summarization accesses not fully initialized java thread DCQS In-Reply-To: <1499758074.3483.4.camel@oracle.com> References: <1499083970.2802.33.camel@oracle.com> <0fb5d7cf-49b3-84a1-97b4-cdd53f0173e3@oracle.com> <1499688947.2793.21.camel@oracle.com> <1499758074.3483.4.camel@oracle.com> Message-ID: <362056fe-f621-5e9d-a16b-13b51d9a550b@oracle.com> On 07/11/2017 09:27 AM, Thomas Schatzl wrote: > Webrevs: > http://cr.openjdk.java.net/~tschatzl/8183226/webrev.1_to_2/ (diff) > http://cr.openjdk.java.net/~tschatzl/8183226/webrev.2/ (full) Looks good, Reviewed. Thanks, Erik > Thanks, > Thomas > From rkennke at redhat.com Wed Jul 12 10:47:41 2017 From: rkennke at redhat.com (Roman Kennke) Date: Wed, 12 Jul 2017 12:47:41 +0200 Subject: RFR: 8179268: Factor out AdaptiveSizePolicy from top-level interfaces CollectorPolicy and CollectedHeap In-Reply-To: References: <50cb4b58-623c-04c2-f6c5-cfb1bd0a3b1f@oracle.com> <073ad956-f475-f3c4-cac8-42bfa1329565@redhat.com> <5cf7afd1-5328-2411-1c23-5e73ae230069@oracle.com> Message-ID: <7e05603a-7254-b6d0-bc6d-35fdf498b16f@redhat.com> Am 12.07.2017 um 08:44 schrieb Per Liden: > Hi Roman, > > On 2017-07-11 16:17, Stefan Johansson wrote: >> Hi Roman, >> >> On 2017-07-11 08:34, Per Liden wrote: >>> Hi, >>> >>> On 2017-07-10 18:35, Roman Kennke wrote: >>>> Hi Per, >>>> >>>> thanks for the review! >>>> >>>>> >>>>>> AdaptiveSizePolicy is not used/called from outside the GCs, and not >>>>>> all >>>>>> GCs need them. 
It makes sense to remove it from the CollectedHeap >>>>>> and >>>>>> CollectorPolicy interfaces and move them down to the actual >>>>>> subclasses >>>>>> that used them. >>>>>> >>>>>> I moved AdaptiveSizePolicyOutput to parallelScavengeHeap.hpp, it's >>>>>> only >>>>>> used/implemented in the parallel GC. Also, I made this class >>>>>> AllStatic >>>>>> (was StackObj) >>>>> >>>>> AdaptiveSizePolicyOutput::print() is actually called from >>>>> runtime/java.cpp also, so it's used outside of ParallelGC. I'm fine >>>>> with moving it, but we should have the proper #includes in java.cpp. > > I just realized that this doesn't build on linux-i586, which builds a > minimal JVM where INCLUDE_ALL_GCS isn't defined (and will thus not > include parallelScavangeHeap.hpp). Rather than having some ugly #ifdef > INCLUDE_ALL_GCS at the AdaptiveSizePolicyOutput::print() call site I > suggest we keep AdaptiveSizePolicyOutput in adaptiveSizePolicy.hpp for > now. I tried that. Unfortunately, it also requires #ifdef INCLUDE_ALL_GCS to be able to call ParallelScavengeHeap::heap(), or else defeats the purpose of this patch by requiring CollectedHeap to still carry size_policy().. which we don't want. In addition to that, if I try to include parallelScavengeHeap.hpp in adaptiveSizePolicy.hpp, I am getting freaky circular dependency problems. #ifdef INCLUDE_ALL_GCS in java.cpp around AdaptiveSizePolicyOutput seems like the lesser evil... done so here: http://cr.openjdk.java.net/~rkennke/8179268/webrev.04/ Ok? 
The incremental diff between 03 and 04: diff --git a/src/share/vm/runtime/java.cpp b/src/share/vm/runtime/java.cpp --- a/src/share/vm/runtime/java.cpp +++ b/src/share/vm/runtime/java.cpp @@ -487,7 +487,10 @@ ClassLoaderDataGraph::dump_on(log.trace_stream()); } } + +#if INCLUDE_ALL_GCS AdaptiveSizePolicyOutput::print(); +#endif if (PrintBytecodeHistogram) { BytecodeHistogram::print(); Roman From thomas.schatzl at oracle.com Wed Jul 12 12:13:03 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Wed, 12 Jul 2017 14:13:03 +0200 Subject: RFR (S): 8183121: Add information about scanned and skipped cards during UpdateRS Message-ID: <1499861583.6693.3.camel@oracle.com> Hi all, can I have reviews for this small change that adds some information about how many cards were scanned/skipped during Update RS. This information is much better than just the number of processed buffers, although I kept them for now. This change is based on Erik's changes for JDK-8183539. CR: https://bugs.openjdk.java.net/browse/JDK-8183121 Webrev: http://cr.openjdk.java.net/~tschatzl/8183121/webrev Testing: jprt, test case Thanks, Thomas From thomas.schatzl at oracle.com Wed Jul 12 12:15:47 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Wed, 12 Jul 2017 14:15:47 +0200 Subject: RFR (XS): 8183538: UpdateRS phase should claim cards Message-ID: <1499861747.6693.6.camel@oracle.com> Hi all, please review this small change that adds claiming of cards in the update rs phase so that scan rs does not rescan them. CR: https://bugs.openjdk.java.net/browse/JDK-8183538 Webrev: http://cr.openjdk.java.net/~tschatzl/8183538/webrev/ Testing: jprt Thanks, 
Thomas From thomas.schatzl at oracle.com Wed Jul 12 12:16:13 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Wed, 12 Jul 2017 14:16:13 +0200 Subject: RFR (S): 8183226: Remembered set summarization accesses not fully initialized java thread DCQS In-Reply-To: <362056fe-f621-5e9d-a16b-13b51d9a550b@oracle.com> References: <1499083970.2802.33.camel@oracle.com> <0fb5d7cf-49b3-84a1-97b4-cdd53f0173e3@oracle.com> <1499688947.2793.21.camel@oracle.com> <1499758074.3483.4.camel@oracle.com> <362056fe-f621-5e9d-a16b-13b51d9a550b@oracle.com> Message-ID: <1499861773.6693.7.camel@oracle.com> Hi Erik, Stefan, On Wed, 2017-07-12 at 12:09 +0200, Erik Helin wrote: > On 07/11/2017 09:27 AM, Thomas Schatzl wrote: > > > > Webrevs: > > http://cr.openjdk.java.net/~tschatzl/8183226/webrev.1_to_2/ (diff) > > http://cr.openjdk.java.net/~tschatzl/8183226/webrev.2/ (full) > Looks good, Reviewed. > thanks for your reviews. Thomas From stefan.johansson at oracle.com Wed Jul 12 12:20:08 2017 From: stefan.johansson at oracle.com (Stefan Johansson) Date: Wed, 12 Jul 2017 14:20:08 +0200 Subject: RFR: 8179268: Factor out AdaptiveSizePolicy from top-level interfaces CollectorPolicy and CollectedHeap In-Reply-To: <7e05603a-7254-b6d0-bc6d-35fdf498b16f@redhat.com> References: <50cb4b58-623c-04c2-f6c5-cfb1bd0a3b1f@oracle.com> <073ad956-f475-f3c4-cac8-42bfa1329565@redhat.com> <5cf7afd1-5328-2411-1c23-5e73ae230069@oracle.com> <7e05603a-7254-b6d0-bc6d-35fdf498b16f@redhat.com> Message-ID: <7701ab0e-7a75-baec-434f-62978a90e8e9@oracle.com> Hi Roman, On 2017-07-12 12:47, Roman Kennke wrote: > Am 12.07.2017 um 08:44 schrieb Per Liden: >> Hi Roman, >> >> On 2017-07-11 16:17, Stefan Johansson wrote: >>> Hi Roman, >>> >>> On 2017-07-11 08:34, Per Liden wrote: >>>> Hi, >>>> >>>> On 2017-07-10 18:35, Roman Kennke wrote: >>>>> Hi Per, >>>>> >>>>> thanks for the review! >>>>> >>>>>> >>>>>>> AdaptiveSizePolicy is not used/called from outside the GCs, and not >>>>>>> all >>>>>>> GCs need them. 
It makes sense to remove it from the CollectedHeap >>>>>>> and >>>>>>> CollectorPolicy interfaces and move them down to the actual >>>>>>> subclasses >>>>>>> that used them. >>>>>>> >>>>>>> I moved AdaptiveSizePolicyOutput to parallelScavengeHeap.hpp, it's >>>>>>> only >>>>>>> used/implemented in the parallel GC. Also, I made this class >>>>>>> AllStatic >>>>>>> (was StackObj) >>>>>> AdaptiveSizePolicyOutput::print() is actually called from >>>>>> runtime/java.cpp also, so it's used outside of ParallelGC. I'm fine >>>>>> with moving it, but we should have the proper #includes in java.cpp. >> I just realized that this doesn't build on linux-i586, which builds a >> minimal JVM where INCLUDE_ALL_GCS isn't defined (and will thus not >> include parallelScavangeHeap.hpp). Rather than having some ugly #ifdef >> INCLUDE_ALL_GCS at the AdaptiveSizePolicyOutput::print() call site I >> suggest we keep AdaptiveSizePolicyOutput in adaptiveSizePolicy.hpp for >> now. > I tried that. Unfortunately, it also requires #ifdef INCLUDE_ALL_GCS to > be able to call ParallelScavengeHeap::heap(), or else defeats the > purpose of this patch by requiring CollectedHeap to still carry > size_policy().. which we don't want. In addition to that, if I try to > include parallelScavengeHeap.hpp in adaptiveSizePolicy.hpp, I am getting > freaky circular dependency problems. #ifdef INCLUDE_ALL_GCS in java.cpp > around AdaptiveSizePolicyOutput seems like the lesser evil... done so here: > > http://cr.openjdk.java.net/~rkennke/8179268/webrev.04/ > > > Ok? I'm no big fan of having #if INCLUDE_ALL_GCS if it can be avoided. A few lines below the call to AdaptiveSizePolicyOutput::print(), we call Universe::heap()->print_tracing_info(). I think we could move AdaptiveSizePolicyOutput::print() into ParallelScavengeHeap::print_tracing_info() without running into any problems. What do you think about that solution? 
Thanks, Stefan > > The incremental diff between 03 and 04: > > diff --git a/src/share/vm/runtime/java.cpp b/src/share/vm/runtime/java.cpp > --- a/src/share/vm/runtime/java.cpp > +++ b/src/share/vm/runtime/java.cpp > @@ -487,7 +487,10 @@ > ClassLoaderDataGraph::dump_on(log.trace_stream()); > } > } > + > +#if INCLUDE_ALL_GCS > AdaptiveSizePolicyOutput::print(); > +#endif > > if (PrintBytecodeHistogram) { > BytecodeHistogram::print(); > > Roman From per.liden at oracle.com Wed Jul 12 12:48:03 2017 From: per.liden at oracle.com (Per Liden) Date: Wed, 12 Jul 2017 14:48:03 +0200 Subject: RFR: 8179268: Factor out AdaptiveSizePolicy from top-level interfaces CollectorPolicy and CollectedHeap In-Reply-To: <7701ab0e-7a75-baec-434f-62978a90e8e9@oracle.com> References: <50cb4b58-623c-04c2-f6c5-cfb1bd0a3b1f@oracle.com> <073ad956-f475-f3c4-cac8-42bfa1329565@redhat.com> <5cf7afd1-5328-2411-1c23-5e73ae230069@oracle.com> <7e05603a-7254-b6d0-bc6d-35fdf498b16f@redhat.com> <7701ab0e-7a75-baec-434f-62978a90e8e9@oracle.com> Message-ID: <0481fae0-e5bb-1b51-a37e-b5b40f4cbaec@oracle.com> On 2017-07-12 14:20, Stefan Johansson wrote: > Hi Roman, > > On 2017-07-12 12:47, Roman Kennke wrote: >> Am 12.07.2017 um 08:44 schrieb Per Liden: >>> Hi Roman, >>> >>> On 2017-07-11 16:17, Stefan Johansson wrote: >>>> Hi Roman, >>>> >>>> On 2017-07-11 08:34, Per Liden wrote: >>>>> Hi, >>>>> >>>>> On 2017-07-10 18:35, Roman Kennke wrote: >>>>>> Hi Per, >>>>>> >>>>>> thanks for the review! >>>>>> >>>>>>>> AdaptiveSizePolicy is not used/called from outside the GCs, and not >>>>>>>> all >>>>>>>> GCs need them. It makes sense to remove it from the CollectedHeap >>>>>>>> and >>>>>>>> CollectorPolicy interfaces and move them down to the actual >>>>>>>> subclasses >>>>>>>> that used them. >>>>>>>> >>>>>>>> I moved AdaptiveSizePolicyOutput to parallelScavengeHeap.hpp, it's >>>>>>>> only >>>>>>>> used/implemented in the parallel GC. 
Also, I made this class >>>>>>>> AllStatic >>>>>>>> (was StackObj) >>>>>>> AdaptiveSizePolicyOutput::print() is actually called from >>>>>>> runtime/java.cpp also, so it's used outside of ParallelGC. I'm fine >>>>>>> with moving it, but we should have the proper #includes in java.cpp. >>> I just realized that this doesn't build on linux-i586, which builds a >>> minimal JVM where INCLUDE_ALL_GCS isn't defined (and will thus not >>> include parallelScavangeHeap.hpp). Rather than having some ugly #ifdef >>> INCLUDE_ALL_GCS at the AdaptiveSizePolicyOutput::print() call site I >>> suggest we keep AdaptiveSizePolicyOutput in adaptiveSizePolicy.hpp for >>> now. >> I tried that. Unfortunately, it also requires #ifdef INCLUDE_ALL_GCS to >> be able to call ParallelScavengeHeap::heap(), or else defeats the >> purpose of this patch by requiring CollectedHeap to still carry >> size_policy().. which we don't want. In addition to that, if I try to >> include parallelScavengeHeap.hpp in adaptiveSizePolicy.hpp, I am getting >> freaky circular dependency problems. #ifdef INCLUDE_ALL_GCS in java.cpp >> around AdaptiveSizePolicyOutput seems like the lesser evil... done so >> here: >> >> http://cr.openjdk.java.net/~rkennke/8179268/webrev.04/ >> >> >> Ok? > I'm no big fan of having #if INCLUDE_ALL_GCS if it can be avoided. A few > lines below the call to AdaptiveSizePolicyOutput::print(), we call > Universe::heap()->print_tracing_info(). I think we could move > AdaptiveSizePolicyOutput::print() into > ParallelScavengeHeap::print_tracing_info() without running into any > problems. > > What do you think about that solution? That sounds like a slightly better approach. 
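The suggestion to move AdaptiveSizePolicyOutput::print() into ParallelScavengeHeap::print_tracing_info() amounts to trading a call-site #ifdef for ordinary virtual dispatch. A toy sketch — simplified, with string-returning stand-ins for the real void printing methods:

```cpp
#include <cassert>
#include <string>

// Sketch: instead of guarding the call site in java.cpp with
// #if INCLUDE_ALL_GCS, let the GC-specific heap subclass fold the
// size-policy output into the virtual hook it already implements.
struct CollectedHeap {
  virtual ~CollectedHeap() {}
  virtual std::string print_tracing_info() const { return "heap tracing"; }
};

struct ParallelScavengeHeap : CollectedHeap {
  // Only this subclass knows about the size-policy output, so the shared
  // shutdown path never needs to mention it (or include its header).
  std::string print_tracing_info() const override {
    return "heap tracing + adaptive size policy output";
  }
};

// The shutdown path stays GC-agnostic and builds in a minimal JVM:
std::string before_exit(const CollectedHeap& heap) {
  return heap.print_tracing_info();
}
```

The shared code compiles without any reference to the parallel GC, which is exactly what the minimal (no INCLUDE_ALL_GCS) build needs.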
cheers, Per > > Thanks, > Stefan >> >> The incremental diff between 03 and 04: >> >> diff --git a/src/share/vm/runtime/java.cpp >> b/src/share/vm/runtime/java.cpp >> --- a/src/share/vm/runtime/java.cpp >> +++ b/src/share/vm/runtime/java.cpp >> @@ -487,7 +487,10 @@ >> ClassLoaderDataGraph::dump_on(log.trace_stream()); >> } >> } >> + >> +#if INCLUDE_ALL_GCS >> AdaptiveSizePolicyOutput::print(); >> +#endif >> if (PrintBytecodeHistogram) { >> BytecodeHistogram::print(); >> >> Roman > From rkennke at redhat.com Wed Jul 12 13:32:47 2017 From: rkennke at redhat.com (Roman Kennke) Date: Wed, 12 Jul 2017 15:32:47 +0200 Subject: RFR: Parallelize safepoint cleanup In-Reply-To: <5af9855a-652e-64f0-af83-e8f5962247ca@oracle.com> References: <42872a15-d26c-9798-c6a2-f3f7c945baf7@redhat.com> <5e7c7d00-4acd-bea3-3525-33dbd9159efb@oracle.com> <6f2c6de7-298b-bf14-ab1f-430c4acd43c9@redhat.com> <5cd676de-872d-6d4a-691b-da561173f7d0@oracle.com> <61d80e98-275f-b2b8-4ac7-6d5d03b047de@redhat.com> <0e44cc90-b384-0820-93d6-a70d22c501c3@oracle.com> <20E06CEC-38CA-41AE-99DB-17EF22A3C5CC@oracle.com> <58f2278e-b95c-4ec2-4f7d-9fefa3a281e4@redhat.com> <623c0dbf-9210-7c63-3546-4314c7d47f85@redhat.com> <29521e46-a5e8-5ff0-23a2-22eeee145389@oracle.com> <4445a727-060b-70f9-c8db-e9f70faae3d5@redhat.com> <37755fec-05b9-8d2c-7eb9-8849393c7485@oracle.com> <57cddde0-60e6-366e-489e-f6f9534e3ed9@redhat.com> <6fa761bc-8feb-74e6-9a54-8a65ab81203b@oracle.com> <5af9855a-652e-64f0-af83-e8f5962247ca@oracle.com> Message-ID: Hi Robbin and all, I fixed the 32bit failures by using jlong in all relevant places: http://cr.openjdk.java.net/~rkennke/8180932/webrev.14.diff/ then Robbin found another problem. SafepointCleanupTest started to fail, because "mark nmethods" is no longer printed. This made me think that we're not measuring the conflated (and possibly parallelized) deflate-idle-monitors+mark-nmethods pass. I added a TraceTime with "safepoint cleanup tasks" which measures the total duration of safepoint cleanup. 
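The total-duration measurement added with that TraceTime makes the conflated pass costable by subtraction; in plain form (hypothetical field names, not TraceTime itself):

```cpp
#include <cassert>

// Sketch: when one cleanup pass conflates deflation with nmethod marking
// (and may run in parallel), only the total is directly measurable; the
// conflated part falls out as total minus the separately timed sub-phases.
struct CleanupTimes {
  double total_ms;         // whole safepoint cleanup
  double deopt_ms;         // example sub-phase timers (illustrative names)
  double string_table_ms;
  double symbol_table_ms;

  double conflated_ms() const {
    return total_ms - (deopt_ms + string_table_ms + symbol_table_ms);
  }
};
```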
We can't reasonably measure a possibly parallel and conflated pass standalone, but we can measure all and by subtracting all the other subphases, get an idea how long deflation and nmethod marking take up. http://cr.openjdk.java.net/~rkennke/8180932/webrev.15.diff/ The full webrev is now: http://cr.openjdk.java.net/~rkennke/8180932/webrev.15/ Hope that's all ;-) Roman Am 10.07.2017 um 21:22 schrieb Robbin Ehn: > Hi, unfortunately the push failed on 32-bit. > > (looks like _stack_traversal_mark should be jlong, I feel a bit guilty) > > I do not have any time to look at this, so here is the error. > > /Robbin > > make[3]: Leaving directory '/opt/jprt/T/P1/185117.rehn/s/hotspot/make' > make/Main.gmk:263: recipe for target 'hotspot-client-libs' failed > In file included from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29:0, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28: > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp: In > member function 'long int nmethod::stack_traversal_mark()': > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:399:108: > error: call of overloaded 'load_acquire(volatile long int*)' is ambiguous > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:399:108: > note: candidates are: > In file included from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/typeArrayOop.hpp:30:0, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/constantPool.hpp:32, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/method.hpp:34, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/frame.hpp:28, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/codeBlob.hpp:31, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/compiledMethod.hpp:28, > from > 
/opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:28, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28: > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:57:17: > note: static jint OrderAccess::load_acquire(const volatile jint*) > > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:57:17: > note: no known conversion for argument 1 from 'volatile long int*' > to 'const volatile jint* {aka const volatile int*}' > In file included from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/typeArrayOop.hpp:30:0, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/constantPool.hpp:32, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/method.hpp:34, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/frame.hpp:28, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/codeBlob.hpp:31, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/compiledMethod.hpp:28, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:28, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28: > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:63:17: > note: static juint OrderAccess::load_acquire(const volatile juint*) > > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:63:17: > note: no known conversion for argument 1 from 'volatile long int*' > to 'const volatile juint* {aka const volatile unsigned int*}' > In file included from > 
/opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29:0, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28: > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp: In > member function 'void nmethod::set_stack_traversal_mark(long int)': > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:400:105: > error: call of overloaded 'release_store(volatile long int*, long > int&)' is ambiguous > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:400:105: > note: candidates are: > In file included from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/typeArrayOop.hpp:30:0, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/constantPool.hpp:32, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/method.hpp:34, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/frame.hpp:28, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/codeBlob.hpp:31, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/compiledMethod.hpp:28, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:28, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28: > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:71:17: > note: static void OrderAccess::release_store(volatile jint*, jint) > > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:71:17: > note: no known conversion for argument 1 from 'volatile long int*' > to 'volatile jint* {aka volatile int*}' > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:77:17: > note: 
static void OrderAccess::release_store(volatile juint*, juint) > > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:77:17: > note: no known conversion for argument 1 from 'volatile long int*' > to 'volatile juint* {aka volatile unsigned int*}' > > On 2017-07-10 20:50, Robbin Ehn wrote: >> I'll start a push now. >> >> /Robbin >> >> On 2017-07-10 12:38, Roman Kennke wrote: >>> Ok, so I guess I need a sponsor for this now: >>> >>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.12/ >>> >>> >>> Roman >>> >>> Am 07.07.2017 um 20:09 schrieb Igor Veresov: >>>> >>>>> On Jul 7, 2017, at 4:23 AM, Robbin Ehn >>>> > wrote: >>>>> >>>>> Hi Roman, >>>>> >>>>> On 07/07/2017 12:51 PM, Roman Kennke wrote: >>>>>> Hi Robbin, >>>>>>> >>>>>>> Far down -> >>>>>>> >>>>>>> On 07/06/2017 08:05 PM, Roman Kennke wrote: >>>>>>>> >>>>>>>>> >>>>>>>>> I'm not happy about this change: >>>>>>>>> >>>>>>>>> + ~ParallelSPCleanupThreadClosure() { >>>>>>>>> + // This is here to be consistent with sweeper.cpp >>>>>>>>> NMethodSweeper::mark_active_nmethods(). >>>>>>>>> + // TODO: Is this really needed? >>>>>>>>> + OrderAccess::storestore(); >>>>>>>>> + } >>>>>>>>> >>>>>>>>> because we're adding an OrderAccess::storestore() to be >>>>>>>>> consistent >>>>>>>>> with an OrderAccess::storestore() that's not properly documented >>>>>>>>> which is only increasing the technical debt. >>>>>>>>> >>>>>>>>> So a couple of things above don't make sense to me: >>>>>>>>> >>>>>>>>>> - sweeper thread runs outside safepoint >>>>>>>>>> - VMThread (which is doing the nmethod marking in the case that >>>>>>>>>> I'm looking at) runs while all other threads (incl. the >>>>>>>>>> sweeper) >>>>>>>>>> is holding still. >>>>>>>>> >>>>>>>>> and: >>>>>>>>> >>>>>>>>>> There should be no need for a storestore() (at least in >>>>>>>>>> sweeper.cpp... >>>>>>>> >>>>>>>> Either one or the other are running. 
Either the VMThread is >>>>>>>> marking >>>>>>>> nmethods (during safepoint) or the sweeper threads are running >>>>>>>> (outside >>>>>>>> safepoint). Between the two phases, there is a guaranteed >>>>>>>> OrderAccess::fence() (see safepoint.cpp). Therefore, no >>>>>>>> storestore() >>>>>>>> should be necessary. >>>>>>>> >>>>>>>> From Igor's comment I can see how it happened though: Apparently >>>>>>>> there >>>>>>>> *is* a race in sweeper's own concurrent processing (concurrent >>>>>>>> with >>>>>>>> compiler threads, as far as I understand). And there's a call to >>>>>>>> nmethod::mark_as_seen_on_stack() after which a storestore() is >>>>>>>> required >>>>>>>> (as per Igor's explanation). So the logic probably was: we have >>>>>>>> mark_as_seen_on_stack() followed by storestore() here, so let's >>>>>>>> also put >>>>>>>> a storestore() in the other places that call >>>>>>>> mark_as_seen_on_stack(), >>>>>>>> one of which happens to be the safepoint cleanup code that we're >>>>>>>> discussing. (why the storestore() hasn't been put right into >>>>>>>> mark_as_seen_on_stack() I don't understand). In short, one >>>>>>>> storestore() >>>>>>>> really was necessary, the other looks like it has been put there >>>>>>>> 'for >>>>>>>> consistency' or just conservatively. But it shouldn't be >>>>>>>> necessary in >>>>>>>> the safepoint cleanup code that we're discussing. >>>>>>>> >>>>>>>> So what should we do? Remove the storestore() for good? >>>>>>>> Refactor the >>>>>>>> code so that both paths at least call the storestore() in the same >>>>>>>> place? (E.g. make mark_active_nmethods() use the closure and call >>>>>>>> storestore() in the dtor as proposed?) >>>>>>> >>>>>>> I took a quick look, maybe I'm missing some stuff but: >>>>>>> >>>>>>> So there is a slight optimization when not running sweeper to skip >>>>>>> compiler barrier/fence in stw. 
>>>>>>> >>>>>>> Don't think that matter, so I propose something like: >>>>>>> - long stack_traversal_mark() { return >>>>>>> _stack_traversal_mark; } >>>>>>> - void set_stack_traversal_mark(long l) { >>>>>>> _stack_traversal_mark = l; } >>>>>>> + long stack_traversal_mark() { return >>>>>>> OrderAccess::load_acquire(&_stack_traversal_mark); } >>>>>>> + void set_stack_traversal_mark(long l) { >>>>>>> OrderAccess::release_store(&_stack_traversal_mark, l); } >>>>>>> >>>>>>> Maybe make _stack_traversal_mark volatile also, just as a marking >>>>>>> that >>>>>>> it is concurrent accessed. >>>>>>> And remove both storestore. >>>>>>> >>>>>>> "Also neither of these state variables are volatile in nmethod, so >>>>>>> even the compiler may reorder the stores" >>>>>>> Fortunately at least _state is volatile now. >>>>>>> >>>>>>> I think _state also should use la/rs semantics instead, but that's >>>>>>> another story. >>>>>> Like this? >>>>>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.12/ >>>>>> >>>>>> >>>>> >>>>> Yes, exactly, I like this! >>>>> Dan? Igor ? Tobias? >>>>> >>>> >>>> That seems correct. >>>> >>>> igor >>>> >>>>> Thanks Roman! >>>>> >>>>> BTW I'm going on vacation (5w) in a few hours, but I will follow this >>>>> thread/changeset to the end! 
>>>>> >>>>> /Robbin >>>>> >>>>>> Roman >>>> >>> From rkennke at redhat.com Wed Jul 12 13:58:12 2017 From: rkennke at redhat.com (Roman Kennke) Date: Wed, 12 Jul 2017 15:58:12 +0200 Subject: RFR: 8179268: Factor out AdaptiveSizePolicy from top-level interfaces CollectorPolicy and CollectedHeap In-Reply-To: <7701ab0e-7a75-baec-434f-62978a90e8e9@oracle.com> References: <50cb4b58-623c-04c2-f6c5-cfb1bd0a3b1f@oracle.com> <073ad956-f475-f3c4-cac8-42bfa1329565@redhat.com> <5cf7afd1-5328-2411-1c23-5e73ae230069@oracle.com> <7e05603a-7254-b6d0-bc6d-35fdf498b16f@redhat.com> <7701ab0e-7a75-baec-434f-62978a90e8e9@oracle.com> Message-ID: <95b50437-3638-2fb7-56a0-349c918b3475@redhat.com> Am 12.07.2017 um 14:20 schrieb Stefan Johansson: > Hi Roman, > > On 2017-07-12 12:47, Roman Kennke wrote: >> Am 12.07.2017 um 08:44 schrieb Per Liden: >>> Hi Roman, >>> >>> On 2017-07-11 16:17, Stefan Johansson wrote: >>>> Hi Roman, >>>> >>>> On 2017-07-11 08:34, Per Liden wrote: >>>>> Hi, >>>>> >>>>> On 2017-07-10 18:35, Roman Kennke wrote: >>>>>> Hi Per, >>>>>> >>>>>> thanks for the review! >>>>>> >>>>>>>> AdaptiveSizePolicy is not used/called from outside the GCs, and >>>>>>>> not >>>>>>>> all >>>>>>>> GCs need them. It makes sense to remove it from the CollectedHeap >>>>>>>> and >>>>>>>> CollectorPolicy interfaces and move them down to the actual >>>>>>>> subclasses >>>>>>>> that used them. >>>>>>>> >>>>>>>> I moved AdaptiveSizePolicyOutput to parallelScavengeHeap.hpp, it's >>>>>>>> only >>>>>>>> used/implemented in the parallel GC. Also, I made this class >>>>>>>> AllStatic >>>>>>>> (was StackObj) >>>>>>> AdaptiveSizePolicyOutput::print() is actually called from >>>>>>> runtime/java.cpp also, so it's used outside of ParallelGC. I'm fine >>>>>>> with moving it, but we should have the proper #includes in >>>>>>> java.cpp. 
>>> I just realized that this doesn't build on linux-i586, which builds a >>> minimal JVM where INCLUDE_ALL_GCS isn't defined (and will thus not >>> include parallelScavangeHeap.hpp). Rather than having some ugly #ifdef >>> INCLUDE_ALL_GCS at the AdaptiveSizePolicyOutput::print() call site I >>> suggest we keep AdaptiveSizePolicyOutput in adaptiveSizePolicy.hpp for >>> now. >> I tried that. Unfortunately, it also requires #ifdef INCLUDE_ALL_GCS to >> be able to call ParallelScavengeHeap::heap(), or else defeats the >> purpose of this patch by requiring CollectedHeap to still carry >> size_policy().. which we don't want. In addition to that, if I try to >> include parallelScavengeHeap.hpp in adaptiveSizePolicy.hpp, I am getting >> freaky circular dependency problems. #ifdef INCLUDE_ALL_GCS in java.cpp >> around AdaptiveSizePolicyOutput seems like the lesser evil... done so >> here: >> >> http://cr.openjdk.java.net/~rkennke/8179268/webrev.04/ >> >> >> Ok? > I'm no big fan of having #if INCLUDE_ALL_GCS if it can be avoided. A > few lines below the call to AdaptiveSizePolicyOutput::print(), we call > Universe::heap()->print_tracing_info(). I think we could move > AdaptiveSizePolicyOutput::print() into > ParallelScavengeHeap::print_tracing_info() without running into any > problems. > > What do you think about that solution? That's a very good idea!! It alters behaviour slightly (will print adaptive size policy stuff in hs_err now) but I think that's for the better. Incremental: http://cr.openjdk.java.net/~rkennke/8179268/webrev.05.diff/ Full: http://cr.openjdk.java.net/~rkennke/8179268/webrev.05 Good now? Thanks, Roman -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From per.liden at oracle.com Wed Jul 12 14:19:44 2017 From: per.liden at oracle.com (Per Liden) Date: Wed, 12 Jul 2017 16:19:44 +0200 Subject: RFR: 8179268: Factor out AdaptiveSizePolicy from top-level interfaces CollectorPolicy and CollectedHeap In-Reply-To: <95b50437-3638-2fb7-56a0-349c918b3475@redhat.com> References: <50cb4b58-623c-04c2-f6c5-cfb1bd0a3b1f@oracle.com> <073ad956-f475-f3c4-cac8-42bfa1329565@redhat.com> <5cf7afd1-5328-2411-1c23-5e73ae230069@oracle.com> <7e05603a-7254-b6d0-bc6d-35fdf498b16f@redhat.com> <7701ab0e-7a75-baec-434f-62978a90e8e9@oracle.com> <95b50437-3638-2fb7-56a0-349c918b3475@redhat.com> Message-ID: Hi, On 2017-07-12 15:58, Roman Kennke wrote: > Am 12.07.2017 um 14:20 schrieb Stefan Johansson: >> Hi Roman, >> >> On 2017-07-12 12:47, Roman Kennke wrote: >>> Am 12.07.2017 um 08:44 schrieb Per Liden: >>>> Hi Roman, >>>> >>>> On 2017-07-11 16:17, Stefan Johansson wrote: >>>>> Hi Roman, >>>>> >>>>> On 2017-07-11 08:34, Per Liden wrote: >>>>>> Hi, >>>>>> >>>>>> On 2017-07-10 18:35, Roman Kennke wrote: >>>>>>> Hi Per, >>>>>>> >>>>>>> thanks for the review! >>>>>>> >>>>>>>>> AdaptiveSizePolicy is not used/called from outside the GCs, and >>>>>>>>> not >>>>>>>>> all >>>>>>>>> GCs need them. It makes sense to remove it from the CollectedHeap >>>>>>>>> and >>>>>>>>> CollectorPolicy interfaces and move them down to the actual >>>>>>>>> subclasses >>>>>>>>> that used them. >>>>>>>>> >>>>>>>>> I moved AdaptiveSizePolicyOutput to parallelScavengeHeap.hpp, it's >>>>>>>>> only >>>>>>>>> used/implemented in the parallel GC. Also, I made this class >>>>>>>>> AllStatic >>>>>>>>> (was StackObj) >>>>>>>> AdaptiveSizePolicyOutput::print() is actually called from >>>>>>>> runtime/java.cpp also, so it's used outside of ParallelGC. I'm fine >>>>>>>> with moving it, but we should have the proper #includes in >>>>>>>> java.cpp. 
>>>> I just realized that this doesn't build on linux-i586, which builds a >>>> minimal JVM where INCLUDE_ALL_GCS isn't defined (and will thus not >>>> include parallelScavangeHeap.hpp). Rather than having some ugly #ifdef >>>> INCLUDE_ALL_GCS at the AdaptiveSizePolicyOutput::print() call site I >>>> suggest we keep AdaptiveSizePolicyOutput in adaptiveSizePolicy.hpp for >>>> now. >>> I tried that. Unfortunately, it also requires #ifdef INCLUDE_ALL_GCS to >>> be able to call ParallelScavengeHeap::heap(), or else defeats the >>> purpose of this patch by requiring CollectedHeap to still carry >>> size_policy().. which we don't want. In addition to that, if I try to >>> include parallelScavengeHeap.hpp in adaptiveSizePolicy.hpp, I am getting >>> freaky circular dependency problems. #ifdef INCLUDE_ALL_GCS in java.cpp >>> around AdaptiveSizePolicyOutput seems like the lesser evil... done so >>> here: >>> >>> http://cr.openjdk.java.net/~rkennke/8179268/webrev.04/ >>> >>> >>> Ok? >> I'm no big fan of having #if INCLUDE_ALL_GCS if it can be avoided. A >> few lines below the call to AdaptiveSizePolicyOutput::print(), we call >> Universe::heap()->print_tracing_info(). I think we could move >> AdaptiveSizePolicyOutput::print() into >> ParallelScavengeHeap::print_tracing_info() without running into any >> problems. >> >> What do you think about that solution? > > That's a very good idea!! It alters behaviour slightly (will print > adaptive size policy stuff in hs_err now) but I think that's for the better. > > Incremental: > http://cr.openjdk.java.net/~rkennke/8179268/webrev.05.diff/ > > Full: > http://cr.openjdk.java.net/~rkennke/8179268/webrev.05 > > > Good now? Looks good. Not sure I follow your comment on hs_err. The adaptive size policy stuff prints to the normal log (with gc+ergo=debug). Before pushing I'll take the liberty of removing the extra space you added to the end of ParallelScavengeHeap::print_tracing_info(). 
586 AdaptiveSizePolicyOutput::print(); 587 588 } Stefan, ok to push? cheers, Per > > Thanks, > Roman > From stefan.johansson at oracle.com Wed Jul 12 14:24:59 2017 From: stefan.johansson at oracle.com (Stefan Johansson) Date: Wed, 12 Jul 2017 16:24:59 +0200 Subject: RFR: 8179268: Factor out AdaptiveSizePolicy from top-level interfaces CollectorPolicy and CollectedHeap In-Reply-To: References: <50cb4b58-623c-04c2-f6c5-cfb1bd0a3b1f@oracle.com> <073ad956-f475-f3c4-cac8-42bfa1329565@redhat.com> <5cf7afd1-5328-2411-1c23-5e73ae230069@oracle.com> <7e05603a-7254-b6d0-bc6d-35fdf498b16f@redhat.com> <7701ab0e-7a75-baec-434f-62978a90e8e9@oracle.com> <95b50437-3638-2fb7-56a0-349c918b3475@redhat.com> Message-ID: <5a3ad9f8-38d6-43a2-e4ec-65b225fd6492@oracle.com> On 2017-07-12 16:19, Per Liden wrote: > Hi, > > On 2017-07-12 15:58, Roman Kennke wrote: >> >> That's a very good idea!! It alters behaviour slightly (will print >> adaptive size policy stuff in hs_err now) but I think that's for the >> better. >> >> Incremental: >> http://cr.openjdk.java.net/~rkennke/8179268/webrev.05.diff/ >> >> Full: >> http://cr.openjdk.java.net/~rkennke/8179268/webrev.05 >> >> >> Good now? > > Looks good. Not sure I follow your comment on hs_err. The adaptive > size policy stuff prints to the normal log (with gc+ergo=debug). > > Before pushing I'll take the liberty of removing the extra space you > added to the end of ParallelScavengeHeap::print_tracing_info(). > > 586 AdaptiveSizePolicyOutput::print(); > 587 > 588 } > > Stefan, ok to push? > Yes, this looks great! 
Thanks for cleaning this up Roman, Stefan > cheers, > Per > >> >> Thanks, >> Roman >> From per.liden at oracle.com Wed Jul 12 20:15:41 2017 From: per.liden at oracle.com (Per Liden) Date: Wed, 12 Jul 2017 22:15:41 +0200 Subject: RFR: 8179268: Factor out AdaptiveSizePolicy from top-level interfaces CollectorPolicy and CollectedHeap In-Reply-To: <5a3ad9f8-38d6-43a2-e4ec-65b225fd6492@oracle.com> References: <50cb4b58-623c-04c2-f6c5-cfb1bd0a3b1f@oracle.com> <073ad956-f475-f3c4-cac8-42bfa1329565@redhat.com> <5cf7afd1-5328-2411-1c23-5e73ae230069@oracle.com> <7e05603a-7254-b6d0-bc6d-35fdf498b16f@redhat.com> <7701ab0e-7a75-baec-434f-62978a90e8e9@oracle.com> <95b50437-3638-2fb7-56a0-349c918b3475@redhat.com> <5a3ad9f8-38d6-43a2-e4ec-65b225fd6492@oracle.com> Message-ID: <9a257b72-2303-4f1b-7db3-9657f32e1e5f@oracle.com> This has now been pushed to jdk10/hs. cheers, Per On 07/12/2017 04:24 PM, Stefan Johansson wrote: > > > On 2017-07-12 16:19, Per Liden wrote: >> Hi, >> >> On 2017-07-12 15:58, Roman Kennke wrote: >>> >>> That's a very good idea!! It alters behaviour slightly (will print >>> adaptive size policy stuff in hs_err now) but I think that's for the >>> better. >>> >>> Incremental: >>> http://cr.openjdk.java.net/~rkennke/8179268/webrev.05.diff/ >>> >>> Full: >>> http://cr.openjdk.java.net/~rkennke/8179268/webrev.05 >>> >>> >>> Good now? >> >> Looks good. Not sure I follow your comment on hs_err. The adaptive >> size policy stuff prints to the normal log (with gc+ergo=debug). >> >> Before pushing I'll take the liberty of removing the extra space you >> added to the end of ParallelScavengeHeap::print_tracing_info(). >> >> 586 AdaptiveSizePolicyOutput::print(); >> 587 >> 588 } >> >> Stefan, ok to push? >> > Yes, this looks great! 
> > Thanks for cleaning this up Roman, > Stefan >> cheers, >> Per >> >>> >>> Thanks, >>> Roman >>> > From robbin.ehn at oracle.com Wed Jul 12 20:39:59 2017 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Wed, 12 Jul 2017 22:39:59 +0200 Subject: RFR: Parallelize safepoint cleanup In-Reply-To: References: <5e7c7d00-4acd-bea3-3525-33dbd9159efb@oracle.com> <6f2c6de7-298b-bf14-ab1f-430c4acd43c9@redhat.com> <5cd676de-872d-6d4a-691b-da561173f7d0@oracle.com> <61d80e98-275f-b2b8-4ac7-6d5d03b047de@redhat.com> <0e44cc90-b384-0820-93d6-a70d22c501c3@oracle.com> <20E06CEC-38CA-41AE-99DB-17EF22A3C5CC@oracle.com> <58f2278e-b95c-4ec2-4f7d-9fefa3a281e4@redhat.com> <623c0dbf-9210-7c63-3546-4314c7d47f85@redhat.com> <29521e46-a5e8-5ff0-23a2-22eeee145389@oracle.com> <4445a727-060b-70f9-c8db-e9f70faae3d5@redhat.com> <37755fec-05b9-8d2c-7eb9-8849393c7485@oracle.com> <57cddde0-60e6-366e-489e-f6f9534e3ed9@redhat.com> <6fa761bc-8feb-74e6-9a54-8a65ab81203b@oracle.com> <5af9855a-652e-64f0-af83-e8f5962247ca@oracle.com> Message-ID: <743ff172-88b6-30dc-8808-a4a97be4a571@oracle.com> On 2017-07-12 15:32, Roman Kennke wrote: > Hi Robbin and all, > > I fixed the 32bit failures by using jlong in all relevant places: > > http://cr.openjdk.java.net/~rkennke/8180932/webrev.14.diff/ > Looks good! > > then Robbin found another problem. SafepointCleanupTest started to fail, > because "mark nmethods" is no longer printed. This made me think that > we're not measuring the conflated (and possibly parallelized) > deflate-idle-monitors+mark-nmethods pass. I added a TraceTime with > "safepoint cleanup tasks" which measures the total duration of safepoint > cleanup. We can't reasonably measure a possibly parallel and conflated > pass standalone, but we can measure all and by subtrating all the other > subphases, get an idea how long deflation and nmethod marking take up. 
> > http://cr.openjdk.java.net/~rkennke/8180932/webrev.15.diff/ > Looks good and thanks for fixing It's time to ship this, can we have a second review please! /Robbin > > The full webrev is now: > > http://cr.openjdk.java.net/~rkennke/8180932/webrev.15/ > > > Hope that's all ;-) > > Roman > > Am 10.07.2017 um 21:22 schrieb Robbin Ehn: >> Hi, unfortunately the push failed on 32-bit. >> >> (looks like _stack_traversal_mark should be jlong, I feel a bit guilty) >> >> I do not have anytime to look at this, so here is the error. >> >> /Robbin >> >> make[3]: Leaving directory '/opt/jprt/T/P1/185117.rehn/s/hotspot/make' >> make/Main.gmk:263: recipe for target 'hotspot-client-libs' failed >> In file included from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29:0, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28: >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp: In >> member function 'long int nmethod::stack_traversal_mark()': >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:399:108: >> error: call of overloaded 'load_acquire(volatile long int*)' is ambiguous >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:399:108: >> note: candidates are: >> In file included from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/typeArrayOop.hpp:30:0, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/constantPool.hpp:32, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/method.hpp:34, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/frame.hpp:28, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/codeBlob.hpp:31, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/compiledMethod.hpp:28, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:28, >> from 
>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28: >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:57:17: >> note: static jint OrderAccess::load_acquire(const volatile jint*) >> >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:57:17: >> note: no known conversion for argument 1 from 'volatile long int*' >> to 'const volatile jint* {aka const volatile int*}' >> In file included from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/typeArrayOop.hpp:30:0, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/constantPool.hpp:32, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/method.hpp:34, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/frame.hpp:28, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/codeBlob.hpp:31, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/compiledMethod.hpp:28, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:28, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28: >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:63:17: >> note: static juint OrderAccess::load_acquire(const volatile juint*) >> >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:63:17: >> note: no known conversion for argument 1 from 'volatile long int*' >> to 'const volatile juint* {aka const volatile unsigned int*}' >> In file included from >> 
/opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29:0, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28: >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp: In >> member function 'void nmethod::set_stack_traversal_mark(long int)': >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:400:105: >> error: call of overloaded 'release_store(volatile long int*, long >> int&)' is ambiguous >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:400:105: >> note: candidates are: >> In file included from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/typeArrayOop.hpp:30:0, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/constantPool.hpp:32, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/method.hpp:34, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/frame.hpp:28, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/codeBlob.hpp:31, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/compiledMethod.hpp:28, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:28, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28: >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:71:17: >> note: static void OrderAccess::release_store(volatile jint*, jint) >> >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:71:17: >> note: no known conversion for argument 1 from 'volatile long int*' >> to 'volatile jint* {aka volatile int*}' >> 
/opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:77:17: >> note: static void OrderAccess::release_store(volatile juint*, juint) >> >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:77:17: >> note: no known conversion for argument 1 from 'volatile long int*' >> to 'volatile juint* {aka volatile unsigned int*}' >> >> On 2017-07-10 20:50, Robbin Ehn wrote: >>> I'll start a push now. >>> >>> /Robbin >>> >>> On 2017-07-10 12:38, Roman Kennke wrote: >>>> Ok, so I guess I need a sponsor for this now: >>>> >>>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.12/ >>>> >>>> >>>> Roman >>>> >>>> Am 07.07.2017 um 20:09 schrieb Igor Veresov: >>>>> >>>>>> On Jul 7, 2017, at 4:23 AM, Robbin Ehn >>>>> > wrote: >>>>>> >>>>>> Hi Roman, >>>>>> >>>>>> On 07/07/2017 12:51 PM, Roman Kennke wrote: >>>>>>> Hi Robbin, >>>>>>>> >>>>>>>> Far down -> >>>>>>>> >>>>>>>> On 07/06/2017 08:05 PM, Roman Kennke wrote: >>>>>>>>> >>>>>>>>>> >>>>>>>>>> I'm not happy about this change: >>>>>>>>>> >>>>>>>>>> + ~ParallelSPCleanupThreadClosure() { >>>>>>>>>> + // This is here to be consistent with sweeper.cpp >>>>>>>>>> NMethodSweeper::mark_active_nmethods(). >>>>>>>>>> + // TODO: Is this really needed? >>>>>>>>>> + OrderAccess::storestore(); >>>>>>>>>> + } >>>>>>>>>> >>>>>>>>>> because we're adding an OrderAccess::storestore() to be >>>>>>>>>> consistent >>>>>>>>>> with an OrderAccess::storestore() that's not properly documented >>>>>>>>>> which is only increasing the technical debt. >>>>>>>>>> >>>>>>>>>> So a couple of things above don't make sense to me: >>>>>>>>>> >>>>>>>>>>> - sweeper thread runs outside safepoint >>>>>>>>>>> - VMThread (which is doing the nmethod marking in the case that >>>>>>>>>>> I'm looking at) runs while all other threads (incl. the >>>>>>>>>>> sweeper) >>>>>>>>>>> is holding still. 
>>>>>>>>>> >>>>>>>>>> and: >>>>>>>>>> >>>>>>>>>>> There should be no need for a storestore() (at least in >>>>>>>>>>> sweeper.cpp... >>>>>>>>> >>>>>>>>> Either one or the other are running. Either the VMThread is >>>>>>>>> marking >>>>>>>>> nmethods (during safepoint) or the sweeper threads are running >>>>>>>>> (outside >>>>>>>>> safepoint). Between the two phases, there is a guaranteed >>>>>>>>> OrderAccess::fence() (see safepoint.cpp). Therefore, no >>>>>>>>> storestore() >>>>>>>>> should be necessary. >>>>>>>>> >>>>>>>>> From Igor's comment I can see how it happened though: Apparently >>>>>>>>> there >>>>>>>>> *is* a race in sweeper's own concurrent processing (concurrent >>>>>>>>> with >>>>>>>>> compiler threads, as far as I understand). And there's a call to >>>>>>>>> nmethod::mark_as_seen_on_stack() after which a storestore() is >>>>>>>>> required >>>>>>>>> (as per Igor's explanation). So the logic probably was: we have >>>>>>>>> mark_as_seen_on_stack() followed by storestore() here, so let's >>>>>>>>> also put >>>>>>>>> a storestore() in the other places that call >>>>>>>>> mark_as_seen_on_stack(), >>>>>>>>> one of which happens to be the safepoint cleanup code that we're >>>>>>>>> discussing. (why the storestore() hasn't been put right into >>>>>>>>> mark_as_seen_on_stack() I don't understand). In short, one >>>>>>>>> storestore() >>>>>>>>> really was necessary, the other looks like it has been put there >>>>>>>>> 'for >>>>>>>>> consistency' or just conservatively. But it shouldn't be >>>>>>>>> necessary in >>>>>>>>> the safepoint cleanup code that we're discussing. >>>>>>>>> >>>>>>>>> So what should we do? Remove the storestore() for good? >>>>>>>>> Refactor the >>>>>>>>> code so that both paths at least call the storestore() in the same >>>>>>>>> place? (E.g. make mark_active_nmethods() use the closure and call >>>>>>>>> storestore() in the dtor as proposed?) 
>>>>>>>> >>>>>>>> I took a quick look, maybe I'm missing some stuff but: >>>>>>>> >>>>>>>> So there is a slight optimization when not running sweeper to skip >>>>>>>> compiler barrier/fence in stw. >>>>>>>> >>>>>>>> Don't think that matter, so I propose something like: >>>>>>>> - long stack_traversal_mark() { return >>>>>>>> _stack_traversal_mark; } >>>>>>>> - void set_stack_traversal_mark(long l) { >>>>>>>> _stack_traversal_mark = l; } >>>>>>>> + long stack_traversal_mark() { return >>>>>>>> OrderAccess::load_acquire(&_stack_traversal_mark); } >>>>>>>> + void set_stack_traversal_mark(long l) { >>>>>>>> OrderAccess::release_store(&_stack_traversal_mark, l); } >>>>>>>> >>>>>>>> Maybe make _stack_traversal_mark volatile also, just as a marking >>>>>>>> that >>>>>>>> it is concurrent accessed. >>>>>>>> And remove both storestore. >>>>>>>> >>>>>>>> "Also neither of these state variables are volatile in nmethod, so >>>>>>>> even the compiler may reorder the stores" >>>>>>>> Fortunately at least _state is volatile now. >>>>>>>> >>>>>>>> I think _state also should use la/rs semantics instead, but that's >>>>>>>> another story. >>>>>>> Like this? >>>>>>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.12/ >>>>>>> >>>>>>> >>>>>> >>>>>> Yes, exactly, I like this! >>>>>> Dan? Igor ? Tobias? >>>>>> >>>>> >>>>> That seems correct. >>>>> >>>>> igor >>>>> >>>>>> Thanks Roman! >>>>>> >>>>>> BTW I'm going on vacation (5w) in a few hours, but I will follow this >>>>>> thread/changeset to the end! 
>>>>>> >>>>>> /Robbin >>>>>>> Roman >>>>> >>>> > From email.sundarms at gmail.com Thu Jul 13 02:11:41 2017 From: email.sundarms at gmail.com (Sundara Mohan M) Date: Wed, 12 Jul 2017 19:11:41 -0700 Subject: High Reference Processing/Object Copy time Message-ID: Hi, I am observing an odd behaviour (very high ref proc time once) with G1GC. gc log snippet, flags used: Java HotSpot(TM) 64-Bit Server VM (25.112-b15) for linux-amd64 JRE (1.8.0_112-b15), built on Sep 22 2016 21:10:53 by "java_re" with gcc 4.3.0 20080428 (Red Hat 4.3.0-8) Memory: 4k page, physical 132290100k(1065596k free), swap 132120572k(131992000k free) CommandLine flags: -XX:G1MaxNewSizePercent=30 -XX:G1OldCSetRegionThresholdPercent=20 -XX:GCLogFileSize=20971520 -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=out-of-memory-heap-dump -XX:InitialHeapSize=33285996544 -XX:MaxGCPauseMillis=500 -XX:MaxHeapSize=33285996544 -XX:MetaspaceSize=536870912 -XX:NumberOfGCLogFiles=20 -XX:+ParallelRefProcEnabled -XX:+PrintAdaptiveSizePolicy -XX:+PrintGC -XX:+PrintGCApplicationStoppedTime -XX:+PrintGCDateStamps -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintTenuringDistribution -XX:+UnlockExperimentalVMOptions -XX:+UseCompressedClassPointers -XX:+UseCompressedOops -XX:+UseG1GC -XX:+UseGCLogFileRotation -XX:+UseStringDeduplication .... 
2017-07-12T17:02:40.227+0000: 77743.943: [GC pause (G1 Evacuation Pause) (young) Desired survivor size 104857600 bytes, new threshold 2 (max 15) - age 1: 38456192 bytes, 38456192 total - age 2: 86746408 bytes, 125202600 total 77743.943: [G1Ergonomics (CSet Construction) start choosing CSet, _pending_cards: 149039, predicted base time: 374.57 ms, remaining time: 125.43 ms, target pause time: 50 0.00 ms] 77743.943: [G1Ergonomics (CSet Construction) add young regions to CSet, eden: 174 regions, survivors: 24 regions, predicted young region time: 1277.98 ms] 77743.943: [G1Ergonomics (CSet Construction) finish choosing CSet, eden: 174 regions, survivors: 24 regions, old: 0 regions, predicted pause time: 1652.55 ms, target paus e time: 500.00 ms] 77751.132: [G1Ergonomics (Concurrent Cycles) request concurrent cycle initiation, reason: occupancy higher than threshold, occupancy: 21147680768 bytes, allocation reques t: 0 bytes, threshold: 14978698425 bytes (45.00 %), source: end of GC] , 7.1891696 secs] [Parallel Time: 2253.1 ms, GC Workers: 13] [GC Worker Start (ms): Min: 77743943.2, Avg: 77743943.3, Max: 77743943.4, Diff: 0.2] [Ext Root Scanning (ms): Min: 1.7, Avg: 3.5, Max: 6.5, Diff: 4.8, Sum: 44.9] [Update RS (ms): Min: 39.2, Avg: 42.4, Max: 45.1, Diff: 5.9, Sum: 551.8] [Processed Buffers: Min: 26, Avg: 57.4, Max: 78, Diff: 52, Sum: 746] [Scan RS (ms): Min: 1.8, Avg: 3.7, Max: 4.5, Diff: 2.7, Sum: 47.5] [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0] * [Object Copy (ms): Min: 2198.1, Avg: 2198.7, Max: 2202.7, Diff: 4.6, Sum: 28583.3]* [Termination (ms): Min: 0.0, Avg: 4.5, Max: 4.9, Diff: 4.9, Sum: 58.4] [Termination Attempts: Min: 1, Avg: 16.7, Max: 28, Diff: 27, Sum: 217] [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.4] [GC Worker Total (ms): Min: 2252.7, Avg: 2252.8, Max: 2252.9, Diff: 0.2, Sum: 29286.3] [GC Worker End (ms): Min: 77746196.1, Avg: 77746196.1, Max: 77746196.1, Diff: 0.0] [Code Root Fixup: 0.1 ms] 
[Code Root Purge: 0.0 ms] [String Dedup Fixup: 167.7 ms, GC Workers: 13] [Queue Fixup (ms): Min: 0.0, Avg: 0.4, Max: 1.2, Diff: 1.2, Sum: 5.1] [Table Fixup (ms): Min: 165.5, Avg: 165.9, Max: 166.3, Diff: 0.9, Sum: 2156.9] [Clear CT: 1.5 ms] [Other: 4766.8 ms] [Choose CSet: 0.0 ms] * [Ref Proc: 4763.9 ms]* [Ref Enq: 0.8 ms] [Redirty Cards: 0.7 ms] [Humongous Register: 0.2 ms] [Humongous Reclaim: 0.1 ms] [Free CSet: 0.4 ms] * [Eden: 1392.0M(1392.0M)->0.0B(1440.0M) Survivors: 192.0M->144.0M Heap: 20.8G(31.0G)->19.6G(31.0G)]* * [Times: user=22.82 sys=13.83, real=7.19 secs]* *Question* 1. Is there a way to find out why Ref Proc took 4.7 s at this instance only? All other instances it was less than a second. 2. Why did object copy take 2.1 s even though the young gen size is only 1.3 G in this case, and not much garbage was collected? 3. Why is this happening occasionally, and is there a way to enable more logs when it happens? Thanks, Sundar -------------- next part -------------- An HTML attachment was scrubbed... URL: From ecki at zusammenkunft.net Thu Jul 13 04:58:51 2017 From: ecki at zusammenkunft.net (Bernd Eckenfels) Date: Thu, 13 Jul 2017 04:58:51 +0000 Subject: High Reference Processing/Object Copy time In-Reply-To: References: Message-ID: The sys time is very high in this snippet; how do the other snippets compare? Did you turn off transparent huge pages (THP) in your OS, and is there no swapping happening? BTW: this is more a discussion for the user mailing list. 
Gruss Bernd -- http://bernd.eckenfels.net ________________________________ From: hotspot-gc-dev on behalf of Sundara Mohan M Sent: Thursday, July 13, 2017 4:11:41 AM To: hotspot-gc-dev at openjdk.java.net Subject: High Reference Processing/Object Copy time Hi, I am observing a odd behaviour (very high ref proc time once) with G1GC gc log snippet, flags used Java HotSpot(TM) 64-Bit Server VM (25.112-b15) for linux-amd64 JRE (1.8.0_112-b15), built on Sep 22 2016 21:10:53 by "java_re" with gcc 4.3.0 20080428 (Red Hat 4.3.0-8) Memory: 4k page, physical 132290100k(1065596k free), swap 132120572k(131992000k free) CommandLine flags: -XX:G1MaxNewSizePercent=30 -XX:G1OldCSetRegionThresholdPercent=20 -XX:GCLogFileSize=20971520 -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=out-of-memory-heap-dump -XX:InitialHeapSize=33285996544 -XX:MaxGCPauseMillis=500 -XX:MaxHeapSize=33285996544 -XX:MetaspaceSize=536870912 -XX:NumberOfGCLogFiles=20 -XX:+ParallelRefProcEnabled -XX:+PrintAdaptiveSizePolicy -XX:+PrintGC -XX:+PrintGCApplicationStoppedTime -XX:+PrintGCDateStamps -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintTenuringDistribution -XX:+UnlockExperimentalVMOptions -XX:+UseCompressedClassPointers -XX:+UseCompressedOops -XX:+UseG1GC -XX:+UseGCLogFileRotation -XX:+UseStringDeduplication .... 
2017-07-12T17:02:40.227+0000: 77743.943: [GC pause (G1 Evacuation Pause) (young)
Desired survivor size 104857600 bytes, new threshold 2 (max 15)
- age   1:   38456192 bytes,   38456192 total
- age   2:   86746408 bytes,  125202600 total
 77743.943: [G1Ergonomics (CSet Construction) start choosing CSet, _pending_cards: 149039, predicted base time: 374.57 ms, remaining time: 125.43 ms, target pause time: 500.00 ms]
 77743.943: [G1Ergonomics (CSet Construction) add young regions to CSet, eden: 174 regions, survivors: 24 regions, predicted young region time: 1277.98 ms]
 77743.943: [G1Ergonomics (CSet Construction) finish choosing CSet, eden: 174 regions, survivors: 24 regions, old: 0 regions, predicted pause time: 1652.55 ms, target pause time: 500.00 ms]
 77751.132: [G1Ergonomics (Concurrent Cycles) request concurrent cycle initiation, reason: occupancy higher than threshold, occupancy: 21147680768 bytes, allocation request: 0 bytes, threshold: 14978698425 bytes (45.00 %), source: end of GC]
, 7.1891696 secs]
   [Parallel Time: 2253.1 ms, GC Workers: 13]
      [GC Worker Start (ms): Min: 77743943.2, Avg: 77743943.3, Max: 77743943.4, Diff: 0.2]
      [Ext Root Scanning (ms): Min: 1.7, Avg: 3.5, Max: 6.5, Diff: 4.8, Sum: 44.9]
      [Update RS (ms): Min: 39.2, Avg: 42.4, Max: 45.1, Diff: 5.9, Sum: 551.8]
         [Processed Buffers: Min: 26, Avg: 57.4, Max: 78, Diff: 52, Sum: 746]
      [Scan RS (ms): Min: 1.8, Avg: 3.7, Max: 4.5, Diff: 2.7, Sum: 47.5]
      [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
      [Object Copy (ms): Min: 2198.1, Avg: 2198.7, Max: 2202.7, Diff: 4.6, Sum: 28583.3]
      [Termination (ms): Min: 0.0, Avg: 4.5, Max: 4.9, Diff: 4.9, Sum: 58.4]
         [Termination Attempts: Min: 1, Avg: 16.7, Max: 28, Diff: 27, Sum: 217]
      [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.4]
      [GC Worker Total (ms): Min: 2252.7, Avg: 2252.8, Max: 2252.9, Diff: 0.2, Sum: 29286.3]
      [GC Worker End (ms): Min: 77746196.1, Avg: 77746196.1, Max: 77746196.1, Diff: 0.0]
   [Code Root Fixup: 0.1 ms]
[Code Root Purge: 0.0 ms]
   [String Dedup Fixup: 167.7 ms, GC Workers: 13]
      [Queue Fixup (ms): Min: 0.0, Avg: 0.4, Max: 1.2, Diff: 1.2, Sum: 5.1]
      [Table Fixup (ms): Min: 165.5, Avg: 165.9, Max: 166.3, Diff: 0.9, Sum: 2156.9]
   [Clear CT: 1.5 ms]
   [Other: 4766.8 ms]
      [Choose CSet: 0.0 ms]
      [Ref Proc: 4763.9 ms]
      [Ref Enq: 0.8 ms]
      [Redirty Cards: 0.7 ms]
      [Humongous Register: 0.2 ms]
      [Humongous Reclaim: 0.1 ms]
      [Free CSet: 0.4 ms]
   [Eden: 1392.0M(1392.0M)->0.0B(1440.0M) Survivors: 192.0M->144.0M Heap: 20.8G(31.0G)->19.6G(31.0G)]
 [Times: user=22.82 sys=13.83, real=7.19 secs]

Questions
1. Is there a way to find out why Ref Proc took 4.7 s at this instance only? In all other instances it was less than a second.
2. Why did object copy take 2.1 s even though the young gen region size is 1.3G in this case and there was not much garbage collected?
3. Why is this happening occasionally, and is there a way to enable more logs when this happens?

Thanks,
Sundar
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From thomas.schatzl at oracle.com  Thu Jul 13 08:06:34 2017
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Thu, 13 Jul 2017 10:06:34 +0200
Subject: High Reference Processing/Object Copy time
In-Reply-To: 
References: 
Message-ID: <1499933194.2815.11.camel@oracle.com>

Hi,

On Thu, 2017-07-13 at 04:58 +0000, Bernd Eckenfels wrote:
> The sys time is very high in this snippet; how do the other snippets
> compare? Did you turn off transparent huge pages (THP) in your OS and
> is there no swapping happening?

The documentation offers some more potential issues when there is high system time: https://docs.oracle.com/javase/9/gctuning/garbage-first-garbage-collector-tuning.htm#GUID-8D9B2530-E370-4B8B-8ADD-A43674FC6658 (The section is applicable to both JDK8 and 9).

The VM/garbage collector is a user-level program.
High system time (at least as high as in your snippet) strongly indicates a problem in the environment (or in the interaction with your environment, i.e. memory or I/O related).

> BTW: this is more a discussion for the user mailing list.

Agree, please move to the hotspot-gc-use list, which is more appropriate.

Thanks,
  Thomas

From erik.helin at oracle.com  Thu Jul 13 11:09:24 2017
From: erik.helin at oracle.com (Erik Helin)
Date: Thu, 13 Jul 2017 13:09:24 +0200
Subject: RFR (XS): 8183538: UpdateRS phase should claim cards
In-Reply-To: <1499861747.6693.6.camel@oracle.com>
References: <1499861747.6693.6.camel@oracle.com>
Message-ID: <83250fb8-84aa-5764-bd52-5a5dccfd2e49@oracle.com>

Hi Thomas,

On 07/12/2017 02:15 PM, Thomas Schatzl wrote:
> Hi all,
>
> please review this small change that adds claiming of cards in the
> update rs phase so that scan rs does not rescan them.
>
> CR:
> https://bugs.openjdk.java.net/browse/JDK-8183538
> Webrev:
> http://cr.openjdk.java.net/~tschatzl/8183538/webrev/

looks good, Reviewed.

I was trying to find a way where we could utilize the claim_card function, but could not come up with a good approach. Push this and then we can see if we can reduce the slight code/logic duplication later.

Thanks,
Erik

> Testing:
> jprt
>
> Thanks,
> Thomas

From thomas.schatzl at oracle.com  Thu Jul 13 11:35:12 2017
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Thu, 13 Jul 2017 13:35:12 +0200
Subject: RFR (XS): 8183538: UpdateRS phase should claim cards
In-Reply-To: <83250fb8-84aa-5764-bd52-5a5dccfd2e49@oracle.com>
References: <1499861747.6693.6.camel@oracle.com> <83250fb8-84aa-5764-bd52-5a5dccfd2e49@oracle.com>
Message-ID: <1499945712.2756.2.camel@oracle.com>

Hi,

On Thu, 2017-07-13 at 13:09 +0200, Erik Helin wrote:
> Hi Thomas,
>
> On 07/12/2017 02:15 PM, Thomas Schatzl wrote:
> >
> > Hi all,
> >
> > please review this small change that adds claiming of cards in the
> > update rs phase so that scan rs does not rescan them.
> > CR:
> > https://bugs.openjdk.java.net/browse/JDK-8183538
> > Webrev:
> > http://cr.openjdk.java.net/~tschatzl/8183538/webrev/
>
> looks good, Reviewed.
>
> I was trying to find a way where we could utilize the claim_card
> function, but could not come up with a good approach. Push this and
> then we can see if we can reduce the slight code/logic duplication
> later.

Yes, me too :) All variants I could think of would penalize one or the other phase.

Thanks for your review.

Thanks,
  Thomas

From erik.helin at oracle.com  Thu Jul 13 14:53:12 2017
From: erik.helin at oracle.com (Erik Helin)
Date: Thu, 13 Jul 2017 16:53:12 +0200
Subject: RFR: 8183539: Remove G1RemSet::_into_cset_dirty_card_queue_set
In-Reply-To: 
References: <1390aea1-d25a-a465-b0bf-c66490cf682a@oracle.com>
Message-ID: 

On 07/04/2017 02:17 PM, Mikael Gerdin wrote:
> Hi Erik,
>
> Do you know if any of the tests actually would have failed if rem set
> reconstruction after evacuation failure didn't work properly?
>
> I'd feel safer with this change if you ran with some verification code
> to ensure that the into_cset queue was always useless when evac failure
> occurs.

Good point, I have now run GCBasher for a very long time with:
-XX:+G1EvacuationFailALot -XX:G1EvacuationFailureALotCount=5 -XX:+VerifyBeforeGC -XX:+VerifyAfterGC

This means that GCBasher encounters a (forced) evacuation failure every fifth GC and also runs full verification for every GC. So far it has been working fine.

I have also run all tests in the JTReg group hotspot_gc with G1EvacuationFailALot set to true (in g1_globals.hpp) and G1EvacuationFailureALotCount set to 5 (also in g1_globals.hpp). This means that all GC tests (including the stress tests) encountered an evacuation failure every fifth GC. This also worked fine.

I also wrote a new patch against tip (where _into_cset_dcqs is still present) to do some custom verification.
The contents of G1RemSet::_into_cset_dcqs and G1CollectedHeap::_dirty_card_queue_set should be identical after a collection. This sort-of worked :) The queues are *very* similar (often around 98% of the cards in G1RemSet::_into_cset_dcqs are found in G1CollectedHeap::_dirty_card_queue_set). The reason for the "missing cards" is that cards in G1RemSet::_into_cset_dcqs comes from the post-write barrier, and the post-write barrier dirties the card that contains the object header (except for arrays, where it dirties the field/slot). The cards in G1CollectedHeap::_dirty_card_queue_set comes from G1ParScanThreadState::update_rs, and update_rs always dirties the card that contains the field (*not* the header). Hence, if an object crosses card boundaries, then the post-write barrier and update_rs will dirty different cards. This has no impact on correctness, it is like this for performance reasons (dirtying the card that contains the object header leads to fewer dirty cards, but we don't have quick access to the object header in update_rs). So, with the above, I'm fairly confident (famous last words) that this patch is working :) I also rebased this patch on top of all the latest changes: - http://cr.openjdk.java.net/~ehelin/8183539/01/ (it is the same patch, just rebased) Thanks, Erik > Thanks > /Mikael > >> >> Thanks, >> Erik From thomas.schatzl at oracle.com Thu Jul 13 15:06:57 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Thu, 13 Jul 2017 17:06:57 +0200 Subject: RFR: 8183539: Remove G1RemSet::_into_cset_dirty_card_queue_set In-Reply-To: References: <1390aea1-d25a-a465-b0bf-c66490cf682a@oracle.com> Message-ID: <1499958417.2756.4.camel@oracle.com> Hi Erik, On Thu, 2017-07-13 at 16:53 +0200, Erik Helin wrote: > On 07/04/2017 02:17 PM, Mikael Gerdin wrote: > > > > Hi Erik, > > > > Do you know if any of the tests actually would have failed if rem > > set > > reconstruction after evacuation failure didn't work properly? 
> > > > I'd feel safer with this change if you ran with some verification > > code to ensure that the into_cset queue was always useless when > > evac failure occurs. > > Good point, I have now run GCBasher for a very long time with: > -XX:+G1EvacuationFailALot -XX:G1EvacuationFailureALotCount=5? > -XX:+VerifyBeforeGC -XX:+VerifyAfterGC > > This mean that GCBasher encounters a (forced) evacuation failure > every fifth GC and also runs full verification for every GC. So far > it has been working fine. > > I have also run all tests in the JTReg group hotspot_gc with? > G1EvacuationFailALot set to true (in g1_globals.hpp) and? > G1EvacuationFailureALotCount set to 5 (also in g1_globals.hpp). This? > mean that all GC tests (including the stress tests) encountered an? > evacuation failure every fifth GC. This also worked fine. > > I also wrote a new patch against tip (where _into_cset_dcqs is still? > present) to do some custom verification. The contents of? > G1RemSet::_into_cset_dcqs and G1CollectedHeap::_dirty_card_queue_set? > should be identical after a collection. This sort-of worked :) > > The queues are *very* similar (often around 98% of the cards in? > G1RemSet::_into_cset_dcqs are found in > G1CollectedHeap::_dirty_card_queue_set). The reason for the "missing > cards" is that cards in G1RemSet::_into_cset_dcqs comes from the? > post-write barrier, and the post-write barrier dirties the card that? > contains the object header (except for arrays, where it dirties the? > field/slot). The cards in G1CollectedHeap::_dirty_card_queue_set > comes from G1ParScanThreadState::update_rs, and update_rs always > dirties the card that contains the field (*not* the header). Hence, > if an object crosses card boundaries, then the post-write barrier and > update_rs will dirty different cards. 
> This has no impact on correctness, it is like this for performance
> reasons (dirtying the card that contains the object header leads to
> fewer dirty cards, but we don't have quick access to the object header
> in update_rs).
>
> So, with the above, I'm fairly confident (famous last words) that
> this patch is working :)

Thanks for this thorough investigation, sounds good.

Ship it.

Thomas

From thomas.schatzl at oracle.com  Fri Jul 14 09:35:04 2017
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Fri, 14 Jul 2017 11:35:04 +0200
Subject: RFR (M): 8184346: Clean up G1CMBitmap
Message-ID: <1500024904.3458.8.camel@oracle.com>

Hi all,

  can I have reviews for this change that tries to clean up (and only clean up) the G1CMBitMap class (and the surrounding helper classes) a bit?

What has been done:
- fix naming
- improve visibility of methods
- remove superfluous code
- make G1CMBitMapClosure pass a HeapWord* instead of a bitmap index,
  avoiding that the user code is cluttered with conversions from bitmap
  indices to HeapWords
- remove inheritance between G1CMBitMap and G1CMBitMapRO, similar to
  the BitMap class make G1CMBitMapRO a "view" of G1CMBitMap.
- remove unused code in G1CMBitMapRO
- move method implementations into .inline.hpp file

The next CR JDK-8184347 will deal with moving G1CMBitmap* into separate files.

CR:
https://bugs.openjdk.java.net/browse/JDK-8184346
Webrev:
http://cr.openjdk.java.net/~tschatzl/8184346/webrev/
Testing:
jprt

Thanks,
  Thomas

From rkennke at redhat.com  Fri Jul 14 09:53:16 2017
From: rkennke at redhat.com (Roman Kennke)
Date: Fri, 14 Jul 2017 11:53:16 +0200
Subject: RFR (M): 8184346: Clean up G1CMBitmap
In-Reply-To: <1500024904.3458.8.camel@oracle.com>
References: <1500024904.3458.8.camel@oracle.com>
Message-ID: <6c0d1cca-6c08-0bda-f980-d3fe20e663ff@redhat.com>

Hi Thomas,

> Hi all,
>
> can I have reviews for this change that tries to clean up (and only
> clean up) the G1CMBitMap class (and the surrounding helper classes) a
> bit?
>
> What has been done:
> - fix naming
> - improve visibility of methods
> - remove superfluous code
> - make G1CMBitMapClosure pass a HeapWord* instead of a bitmap index,
>   avoiding that the user code is cluttered with conversions from bitmap
>   indices to HeapWords
> - remove inheritance between G1CMBitMap and G1CMBitMapRO, similar to
>   the BitMap class make G1CMBitMapRO a "view" of G1CMBitMap.
> - remove unused code in G1CMBitMapRO
> - move method implementations into .inline.hpp file

The changes look good to me.

+ return _cm->nextMarkBitMap()->is_marked((HeapWord *)obj);

I'd write that as

(HeapWord*) obj

but I'm never quite sure what style is preferable in Hotspot ;-)

Are changes in g1FromCardCache.cpp/.hpp unrelated?

> The next CR JDK-8184347 will deal with moving G1CMBitmap* into separate
> files.

And while you're at it, you may want to rename it to something like MarkBitmap and move it to gc/shared?
https://bugs.openjdk.java.net/browse/JDK-8180193

Best regards,
Roman (not official reviewer)

-------------- next part --------------
An HTML attachment was scrubbed...
URL: From erik.helin at oracle.com Fri Jul 14 10:21:54 2017 From: erik.helin at oracle.com (Erik Helin) Date: Fri, 14 Jul 2017 12:21:54 +0200 Subject: RFR: 8179387: Factor out CMS specific code from GenCollectedHeap into its own subclass In-Reply-To: <666af7f2-27e9-48c6-91e4-eaefa5289e18@redhat.com> References: <3521009f-6fab-4f8e-2375-b9d665a4c70b@redhat.com> <3d8b55a2-a787-3051-b351-ab9b0a24f5e0@redhat.com> <47e22e86-7d7c-606f-1936-346229f39ca2@oracle.com> <9a846161-c8ac-dedf-5952-f457d546fd9a@redhat.com> <4d5e6af8-d975-7803-64c5-7295e0d56154@redhat.com> <13358626-e399-e352-1711-587416621aac@redhat.com> <27af0ad2-fe78-3536-2143-996dd42583ab@oracle.com> <4bc53aaa-b98a-8a61-73bf-d30ac3f402b8@redhat.com> <666af7f2-27e9-48c6-91e4-eaefa5289e18@redhat.com> Message-ID: <3ec8a6a3-5a4b-a910-f6ec-ed1c0dad4cad@oracle.com> On 07/10/2017 04:10 PM, Roman Kennke wrote: > Am 10.07.2017 um 15:13 schrieb Erik Helin: >> On 07/07/2017 03:21 PM, Roman Kennke wrote: >>> Am 07.07.2017 um 14:35 schrieb Erik Helin: >>>> On 07/06/2017 06:23 PM, Roman Kennke wrote: >>>>>>>> Ok to push this? >>>>>> >>>>>> I just realized that your change doesn't build on Windows since you >>>>>> didn't #include "precompiled.hpp" in cmsHeap.cpp. MSVC is really >>>>>> picky >>>>>> about that. >>>>>> /Mikael >>>>> >>>>> Uhhh. >>>>> Ok, here's revision #3 with precompiled added in: >>>>> >>>>> http://cr.openjdk.java.net/~rkennke/8179387/webrev.03/ >>>>> >>>> >>>> Hi Roman, >>>> >>>> I just started looking :) I think GenCollectedHeap::gc_prologue and >>>> GenCollectedHeap::gc_epilogue should be virtual, and >>>> always_do_update_barrier = UseConcMarkSweepGC moved down >>>> CMSHeap::gc_epilogue. >>>> >>>> What do you think? >>> >>> Yes, I have seen that. My original plan was to leave it as is because I >>> know that Erik ?. is working on a big barrier set refactoring that would >>> remove this code anyway. 
However, it doesn't really matter, here's the >>> cleaned up patch: >>> >>> http://cr.openjdk.java.net/~rkennke/8179387/webrev.04/ >>> >> >> A few comments: >> >> cmsHeap.hpp: >> - you are missing quite a few #includes, but it works since >> genCollectedHeap.hpp #includes a whole lot of stuff. Not necessary to >> fix now, because the "missing #include" will start to pop up when >> someone tries to break apart GenCollectedHeap into smaller pieces. > Right. > I always try to minimize includes, especially in header files (they are > bound to proliferate later anyway). In addition to that, if a class is > only referenced as pointer, I avoid includes and use forward class > definition instead. I think that we in general try to include what is needed, not what only what makes the code compile (header guards will of course ensure that the header files are only parsed once). So in cmsHeap.hpp, at least I would have added: #include "gc/cms/concurrentMarkSweepGeneration.hpp" #include "gc/shared/collectedHeap.hpp" #include "gc/shared/gcCause.hpp" and forward declared: class CLDClosure; class OopsInGenClosure; class outputStream; class StrongRootsScope; class ThreadClosure; >> >> - why are gc_prologue and gc_epilogue protected in CMSHeap? Can't they >> be private in CMSHeap? > They are virtual and protected in GenCollectedHeap and called by > GenCollectedHeap. Makes sense to also make them protected in CMSHeap? Or > am I missing something? > >> - there are two `private:` blocks, please use only one `private:` >> block. >> > Fixed. And now there is two `protected:` blocks, immediately after each other: 86 protected: 87 void gc_prologue(bool full); 88 void gc_epilogue(bool full); 89 90 protected: 91 // Accessor for memory state verification support 92 NOT_PRODUCT( 93 virtual size_t skip_header_HeapWords() { return CMSCollector::skip_header_HeapWords(); } 94 ) IMO, I would just make the three functions above private. 
I know they are protected in GenCollectedHeap, but it should be fine to have them private in CMSHeap. Having them protected signals, at least to me, that this class could be considered as a base class (protected to me reads "this can be accessed by classes inheriting from this class), and we don't want any class to inherit from CMSHeap. >> - one extra newline here: >> 32 class CMSHeap : public GenCollectedHeap { >> 33 >> >> - one extra newline here: >> 46 >> 47 >> >> cmsHeap.cpp: >> - one extra newline here: >> 36 CMSHeap::CMSHeap(GenCollectorPolicy *policy) : >> GenCollectedHeap(policy) { >> 37 >> >> - one extra newline here: >> 65 >> 66 >> > Removed all of them. > >> - do you need to use `this` here? >> 87 this->GenCollectedHeap::print_on_error(st); >> >> Isn't it enough to just GenCollectedHeap::print_on_error(st)? > Yes, it is. Just a habit of mine to make it more readable (to me). Fixed it. >> - one extra newline here: >> 92 bool CMSHeap::create_cms_collector() { >> 93 > Fixed. >> - this is pre-existing, but since we are copying code, do we want to >> clean it up? >> 104 if (collector == NULL || >> !collector->completed_initialization()) { >> 105 if (collector) { >> 106 delete collector; // Be nice in embedded situation >> 107 } >> 108 vm_shutdown_during_initialization("Could not create CMS >> collector"); >> 109 return false; >> 110 } >> >> The collector == NULL check is not needed here. CMSCollector derives >> from CHeapObj and CHeapObj::operator new will by default do >> vm_exit_out_of_memory if the returned memory is NULL. The check can >> just be: >> >> if (!collector->completed_initialization()) { >> vm_shutdown_during_initialization("Could not create CMS collector"); >> return false; >> } >> return true; >> > Ok, good point. Fixed. Sorry, reading the code again it is obvious that create_cms_collector never can return false. It either returns true or calls vm_shutdown_during_initialization (which will not return). 
So, I would just make create_cms_collector void, the if branch below is dead code:

51 if (!success) return JNI_ENOMEM;

Btw, this code looks really fishy :) The CMSCollector is created with new but the pointer (collector) is never stored anywhere. It works, because the constructor for CMSCollector sets a static variable in ConcurrentMarkSweepGeneration, but it isn't exactly beautiful :) Don't change this now, I just wanted to point it out, since the code looks a bit mysterious.

>> - maybe skip the // success comment here:
>> 111 return true; // success
> That was probably pre-existing too. Should be thankful that it did not
> say return true; // return true :-P

>> - is it possible to end up in CMSHeap::should_do_concurrent_full_gc()
>> if we are not using CMS? As in:
>> 123 bool CMSHeap::should_do_concurrent_full_gc(GCCause::Cause cause) {
>> 124   if (!UseConcMarkSweepGC) {
>> 125     return false;
>> 126   }
> Duh. Fixed.

>> - one extra newline here:
>> 135
>> 136

>> genCollectedHeap.hpp:
>> - I don't think you have to make _skip_header_HeapWords protected.
>>   Instead I think we can make skip_header_HeapWords() virtual, make it
>>   return 0 in GenCollectedHeap and return
>>   CMSCollector::skip_header_HeapWords in CMSHeap and just remove the
>>   _skip_header_HeapWords variable.
> Great catch! I love it when refactoring leads to simplifications...
> Fixed.

>> - do you really need #ifdef ASSERT around check_gen_kinds?
> No, not really.

>> - can you make GCH_strong_roots_tasks a protected enum in
>> GenCollectedHeap? As in
>> class GenCollectedHeap : public CollectedHeap {
>>  protected:
>>   enum StrongRootTasks {
>>     GCH_PS_Universe_oops_do,
>>   };
>> };
> Good idea. Done.

>> Have you thought about vmStructs.cpp, does it need any changes?
> No. I don't really know what needs to go in there. I added:
>
> declare_constant(CollectedHeap::CMSHeap) \
>
> just so that it's there next to the other heap types. Not sure what else
> may be needed, if anything?
This is for the serviceability agent. You will have to poke around in hotspot/src/jdk.hotspot.agent and see how GenCollectedHeap is used. Unfortunately I'm not that familiar with the agent, perhaps someone else can chime in here? Thanks, Erik > http://cr.openjdk.java.net/~rkennke/8179387/webrev.05/ > > > Better now? > > Roman > From thomas.schatzl at oracle.com Fri Jul 14 10:58:32 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Fri, 14 Jul 2017 12:58:32 +0200 Subject: RFR (M): 8184346: Clean up G1CMBitmap In-Reply-To: <6c0d1cca-6c08-0bda-f980-d3fe20e663ff@redhat.com> References: <1500024904.3458.8.camel@oracle.com> <6c0d1cca-6c08-0bda-f980-d3fe20e663ff@redhat.com> Message-ID: <1500029912.3458.26.camel@oracle.com> Hi Roman, On Fri, 2017-07-14 at 11:53 +0200, Roman Kennke wrote: > Hi Thomas, > > > Hi all, > > > > ? can I have reviews for this change that tries to clean up (and > > only clean up) the G1CMBitMap class (and the surrounding helper > > classes) a bit? > > > > What has been done: > > - fix naming > > - improve visibility of methods > > - remove superfluous code > > - make G1CMBitMapClosure pass a HeapWord* instead of a bitmap > > index, avoiding that the user code is cluttered with conversions > > from bitmap indices to HeapWords > > - remove inheritance between G1CMBitMap and G1CMBitMapRO, similar > > to the BitMap class make G1CMBitMapRO a "view" of G1CMBitMap. > > - remove unused code in G1CMBitMapRO > > - move method implementations into .inline.hpp file > ?The changes look good to me. Thanks for your review. > + return _cm->nextMarkBitMap()->is_marked((HeapWord *)obj); > > I'd write that as > > (HeapWord*) obj > > but I'm never quite sure what style is preferable in Hotspot ;-) I do not know either :) I would kind of prefer no space between cast and the variable, as casts to me are something like unary operators where we do not add a space between operator and variable either. I removed the space between the type and the star at least. 
> Are changes in g1FromCardCache.cpp/.hpp unrelated?

Yes, sorry. I will remove those and send out an extra RFR. I forgot to split them out.

> > The next CR JDK-8184347 will deal with moving G1CMBitmap* into
> > separate files.
> And while you're at it, you may want to move it to gc/shared and
> rename it to something like MarkBitmap?
> https://bugs.openjdk.java.net/browse/JDK-8180193

Not particularly against this change, but I think we should do the move and renaming separately when the change is actually required, i.e. just before there is another dependency on it.

Also, G1CMBitMap has hard dependencies on several other G1 specific classes, so I think it is too early to move it to the shared directory from that POV too.

New webrevs:
http://cr.openjdk.java.net/~tschatzl/8184346/webrev.1/ (full)
http://cr.openjdk.java.net/~tschatzl/8184346/webrev.0_to_1/ (diff)

Thanks,
  Thomas

From thomas.schatzl at oracle.com  Fri Jul 14 11:04:57 2017
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Fri, 14 Jul 2017 13:04:57 +0200
Subject: RFR (XS): 8184452: Add bounds checking for FromCardCache
Message-ID: <1500030297.3458.29.camel@oracle.com>

Hi all,

  can I have reviews for this change that adds asserts/bounds checking to the FromCardCache methods?

This helped me a lot to find crashes in some upcoming change, and I think it is useful to have. If you think it is not worth the trouble, feel free to tell me and I will retract the change.

CR:
https://bugs.openjdk.java.net/browse/JDK-8184452
Webrev:
http://cr.openjdk.java.net/~tschatzl/8184452/webrev/
Testing:
jprt

Thanks,
  Thomas

From shade at redhat.com  Fri Jul 14 11:12:18 2017
From: shade at redhat.com (Aleksey Shipilev)
Date: Fri, 14 Jul 2017 13:12:18 +0200
Subject: RFR (M): 8184346: Clean up G1CMBitmap
In-Reply-To: <1500029912.3458.26.camel@oracle.com>
References: <1500024904.3458.8.camel@oracle.com> <6c0d1cca-6c08-0bda-f980-d3fe20e663ff@redhat.com> <1500029912.3458.26.camel@oracle.com>
Message-ID: 

Hi Thomas,

On 07/14/2017 12:58 PM, Thomas Schatzl wrote:
>>> The next CR JDK-8184347 will deal with moving G1CMBitmap* into
>>> separate files.
>> And while you're at it, you may want to move it to gc/shared and
>> rename it to something like MarkBitmap?
>> https://bugs.openjdk.java.net/browse/JDK-8180193
>
> Not particularly against this change, but I think we should do the move
> and renaming separately when the change is actually required, i.e. just
> before there is another dependency on it.

I think this would be inconvenient, because when "another dependency" would come in a large webrev, it would have to include the CMBitmap move too, complicating reviews. It seems pulling the actual non-G1-specific parts to shared is good to minimize those changes.

Would you like us to take the CMBitmap rename and move to shared/ then, after you do the G1-local move?

Thanks,
-Aleksey

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: OpenPGP digital signature URL: From rkennke at redhat.com Fri Jul 14 11:12:37 2017 From: rkennke at redhat.com (Roman Kennke) Date: Fri, 14 Jul 2017 13:12:37 +0200 Subject: RFR (M): 8184346: Clean up G1CMBitmap In-Reply-To: <1500029912.3458.26.camel@oracle.com> References: <1500024904.3458.8.camel@oracle.com> <6c0d1cca-6c08-0bda-f980-d3fe20e663ff@redhat.com> <1500029912.3458.26.camel@oracle.com> Message-ID: <5bdc1a11-1159-7e81-ca31-fd96719f89c0@redhat.com> Hi Thomas, > Hi Roman, > > On Fri, 2017-07-14 at 11:53 +0200, Roman Kennke wrote: >> Hi Thomas, >> >>> Hi all, >>> >>> can I have reviews for this change that tries to clean up (and >>> only clean up) the G1CMBitMap class (and the surrounding helper >>> classes) a bit? >>> >>> What has been done: >>> - fix naming >>> - improve visibility of methods >>> - remove superfluous code >>> - make G1CMBitMapClosure pass a HeapWord* instead of a bitmap >>> index, avoiding that the user code is cluttered with conversions >>> from bitmap indices to HeapWords >>> - remove inheritance between G1CMBitMap and G1CMBitMapRO, similar >>> to the BitMap class make G1CMBitMapRO a "view" of G1CMBitMap. >>> - remove unused code in G1CMBitMapRO >>> - move method implementations into .inline.hpp file >> The changes look good to me. > Thanks for your review. > >> + return _cm->nextMarkBitMap()->is_marked((HeapWord *)obj); >> >> I'd write that as >> >> (HeapWord*) obj >> >> but I'm never quite sure what style is preferable in Hotspot ;-) > I do not know either :) I would kind of prefer no space between cast > and the variable, as casts to me are something like unary operators > where we do not add a space between operator and variable either. > > I removed the space between the type and the star at least. Fine for me. >> Are changes in g1FromCardCache.cpp/.hpp unrelated? > Yes, sorry. I will remove those and send out an extra RFR. I forgot to > split them out. Ok. 
>>> The next CR JDK-8184347 will deal with moving G1CMBitmap* into
>>> separate files.
>> And while you're at it, you may want to move it to gc/shared and
>> rename it to something like MarkBitmap?
>> https://bugs.openjdk.java.net/browse/JDK-8180193
>
> Not particularly against this change, but I think we should do the move
> and renaming separately when the change is actually required, i.e. just
> before there is another dependency on it.

That's fine for me. Just wanted to point out that this is going to come :-)

> New webrevs:
> http://cr.openjdk.java.net/~tschatzl/8184346/webrev.1/ (full)
> http://cr.openjdk.java.net/~tschatzl/8184346/webrev.0_to_1/ (diff)

Good for me.

Roman

From rkennke at redhat.com  Fri Jul 14 11:15:09 2017
From: rkennke at redhat.com (Roman Kennke)
Date: Fri, 14 Jul 2017 13:15:09 +0200
Subject: RFR (XS): 8184452: Add bounds checking for FromCardCache
In-Reply-To: <1500030297.3458.29.camel@oracle.com>
References: <1500030297.3458.29.camel@oracle.com>
Message-ID: 

Am 14.07.2017 um 13:04 schrieb Thomas Schatzl:
> Hi all,
>
> can I have reviews for this change that adds asserts/bounds checking
> to the FromCardCache methods?
>
> This helped me a lot to find crashes in some upcoming change, and I
> think it is useful to have. If you think it is not worth the trouble,
> feel free to tell me and I will retract the change.
>
> CR:
> https://bugs.openjdk.java.net/browse/JDK-8184452
> Webrev:
> http://cr.openjdk.java.net/~tschatzl/8184452/webrev/
> Testing:
> jprt
>
> Thanks,
> Thomas

I'm all for more asserts if it helps to figure out bugs, so yes. Change looks good too.

Roman (not official reviewer)

From thomas.schatzl at oracle.com  Fri Jul 14 11:19:18 2017
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Fri, 14 Jul 2017 13:19:18 +0200
Subject: RFR (S/M): 8184348: Merge G1ConcurrentMark::par_mark() and G1ConcurrentMark::grayRoot()
Message-ID: <1500031158.3458.41.camel@oracle.com>

Hi all,
  could I get reviews for this refactoring change that merges G1ConcurrentMark::par_mark() and G1ConcurrentMark::grayRoot() into one single method G1ConcurrentMark::mark_in_next_bitmap() that factors out all code that is otherwise done multiple times separately. I.e. checking that the given address is smaller than nTAMS, asserts, dirty card mark check and the actual card mark into a single file that is then called everywhere.

I also removed some superfluous asserts that are subsumed in previous asserts or methods used.

It can also be seen as start of cleaning up G1ConcurrentMark.

I intentionally left both G1ParCopyHelper::mark_object() and G1ParCopyHelper::mark_forwarded_object(), although they look like they could be merged. First, it does not seem worthwhile because their semantics and asserts seem to be separate enough, second, some probably not-so-distant future change will need them separate again :P If you really want I could do that nevertheless.

Note that this change depends on the recent G1CMBitMap cleanup in JDK-8184346 (but not on the move of G1CMBitMap into separate files that is referenced in the webrev - the file touched just happens to be in the mq stack).

CR:
https://bugs.openjdk.java.net/browse/JDK-8184348
Webrev:
http://cr.openjdk.java.net/~tschatzl/8184348/webrev/
Testing:
jprt, some additional local hotspot test runs

Thanks,
  Thomas

From shade at redhat.com  Fri Jul 14 11:20:44 2017
From: shade at redhat.com (Aleksey Shipilev)
Date: Fri, 14 Jul 2017 13:20:44 +0200
Subject: RFR (XS): 8184452: Add bounds checking for FromCardCache
In-Reply-To: <1500030297.3458.29.camel@oracle.com>
References: <1500030297.3458.29.camel@oracle.com>
Message-ID: 

On 07/14/2017 01:04 PM, Thomas Schatzl wrote:
> Webrev:
> http://cr.openjdk.java.net/~tschatzl/8184452/webrev/

Looks good.

-Aleksey

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: OpenPGP digital signature URL: From rkennke at redhat.com Fri Jul 14 11:22:08 2017 From: rkennke at redhat.com (Roman Kennke) Date: Fri, 14 Jul 2017 13:22:08 +0200 Subject: RFR: 8179387: Factor out CMS specific code from GenCollectedHeap into its own subclass In-Reply-To: <3ec8a6a3-5a4b-a910-f6ec-ed1c0dad4cad@oracle.com> References: <3521009f-6fab-4f8e-2375-b9d665a4c70b@redhat.com> <3d8b55a2-a787-3051-b351-ab9b0a24f5e0@redhat.com> <47e22e86-7d7c-606f-1936-346229f39ca2@oracle.com> <9a846161-c8ac-dedf-5952-f457d546fd9a@redhat.com> <4d5e6af8-d975-7803-64c5-7295e0d56154@redhat.com> <13358626-e399-e352-1711-587416621aac@redhat.com> <27af0ad2-fe78-3536-2143-996dd42583ab@oracle.com> <4bc53aaa-b98a-8a61-73bf-d30ac3f402b8@redhat.com> <666af7f2-27e9-48c6-91e4-eaefa5289e18@redhat.com> <3ec8a6a3-5a4b-a910-f6ec-ed1c0dad4cad@oracle.com> Message-ID: <5417889c-5289-37cd-eb31-a2b55f70e85e@redhat.com> Hi Erik, > On 07/10/2017 04:10 PM, Roman Kennke wrote: >> Am 10.07.2017 um 15:13 schrieb Erik Helin: >>> On 07/07/2017 03:21 PM, Roman Kennke wrote: >>>> Am 07.07.2017 um 14:35 schrieb Erik Helin: >>>>> On 07/06/2017 06:23 PM, Roman Kennke wrote: >>>>>>>>> Ok to push this? >>>>>>> >>>>>>> I just realized that your change doesn't build on Windows since you >>>>>>> didn't #include "precompiled.hpp" in cmsHeap.cpp. MSVC is really >>>>>>> picky >>>>>>> about that. >>>>>>> /Mikael >>>>>> >>>>>> Uhhh. >>>>>> Ok, here's revision #3 with precompiled added in: >>>>>> >>>>>> http://cr.openjdk.java.net/~rkennke/8179387/webrev.03/ >>>>>> >>>>> >>>>> Hi Roman, >>>>> >>>>> I just started looking :) I think GenCollectedHeap::gc_prologue and >>>>> GenCollectedHeap::gc_epilogue should be virtual, and >>>>> always_do_update_barrier = UseConcMarkSweepGC moved down >>>>> CMSHeap::gc_epilogue. >>>>> >>>>> What do you think? >>>> >>>> Yes, I have seen that. My original plan was to leave it as is >>>> because I >>>> know that Erik ?. 
is working on a big barrier set refactoring that >>>> would >>>> remove this code anyway. However, it doesn't really matter, here's the >>>> cleaned up patch: >>>> >>>> http://cr.openjdk.java.net/~rkennke/8179387/webrev.04/ >>>> >>> >>> A few comments: >>> >>> cmsHeap.hpp: >>> - you are missing quite a few #includes, but it works since >>> genCollectedHeap.hpp #includes a whole lot of stuff. Not >>> necessary to >>> fix now, because the "missing #include" will start to pop up when >>> someone tries to break apart GenCollectedHeap into smaller pieces. >> Right. >> I always try to minimize includes, especially in header files (they are >> bound to proliferate later anyway). In addition to that, if a class is >> only referenced as pointer, I avoid includes and use forward class >> definition instead. > > I think that we in general try to include what is needed, not what > only what makes the code compile (header guards will of course ensure > that the header files are only parsed once). So in cmsHeap.hpp, at > least I would have added: > > #include "gc/cms/concurrentMarkSweepGeneration.hpp" > #include "gc/shared/collectedHeap.hpp" > #include "gc/shared/gcCause.hpp" > > and forward declared: > > class CLDClosure; > class OopsInGenClosure; > class outputStream; > class StrongRootsScope; > class ThreadClosure; Ok, added those and some more that I found. Not sure why we'd need #include "gc/cms/concurrentMarkSweepGeneration.hpp" ? Left that out for now. >>> >>> - why are gc_prologue and gc_epilogue protected in CMSHeap? Can't they >>> be private in CMSHeap? >> They are virtual and protected in GenCollectedHeap and called by >> GenCollectedHeap. Makes sense to also make them protected in CMSHeap? Or >> am I missing something? >> >>> - there are two `private:` blocks, please use only one `private:` >>> block. >>> >> Fixed. > > And now there is two `protected:` blocks, immediately after each other: > Duh. Fixed. > IMO, I would just make the three functions above private. 
I know they > are protected in GenCollectedHeap, but it should be fine to have them > private in CMSHeap. Having them protected signals, at least to me, > that this class could be considered as a base class (protected to me > reads "this can be accessed by classes inheriting from this class), > and we don't want any class to inherit from CMSHeap. How can they be called from the superclass if they are private in the subclass? Would that work in C++? protected (to me) means visibility between super and subclasses. If I'd want to signal that I intend that to be overridden, I'd say 'virtual'. > Sorry, reading the code again it is obvious that create_cms_collector > never can return false. It either returns true or calls > vm_shutdown_during_initialization (which will not return). So, I would > just make create_cms_collctor void, the if branch below is dead code: > > 51 if (!success) return JNI_ENOMEM; > Right! Very good catch! Changed that. > Btw, this code looks really fishy :) Err, yep. I'll make a note somewhere (in bugs.o.j.n) to fix that later. > This is for the serviceability agent. You will have to poke around in > hotspot/src/jdk.hotspot.agent and see how GenCollectedHeap is used. > Unfortunately I'm not that familiar with the agent, perhaps someone > else can chime in here? Considering that the remaining references to GenCollectedHeap in vmStructs.cpp don't look like related to CMSHeap, I'd argue that what I did is all that's needed for now. Do you agree? http://cr.openjdk.java.net/~rkennke/8179387/webrev.06.diff/ http://cr.openjdk.java.net/~rkennke/8179387/webrev.06/ Thanks for reviewing! 
Roman From rkennke at redhat.com Fri Jul 14 11:24:47 2017 From: rkennke at redhat.com (Roman Kennke) Date: Fri, 14 Jul 2017 13:24:47 +0200 Subject: RFR (M): 8184346: Clean up G1CMBitmap In-Reply-To: References: <1500024904.3458.8.camel@oracle.com> <6c0d1cca-6c08-0bda-f980-d3fe20e663ff@redhat.com> <1500029912.3458.26.camel@oracle.com> Message-ID: <990f6578-14d4-322d-7f51-9b93d92f8b20@redhat.com> Am 14.07.2017 um 13:12 schrieb Aleksey Shipilev: > Hi Thomas, > > On 07/14/2017 12:58 PM, Thomas Schatzl wrote: >>>> The next CR JDK-8184347 will deal with moving G1CMBitmap* into >>>> separate >>>> files. >>> And while you're at it, you may want to move it to gc/shared and >>> renamed it to something like MarkBitmap? >>> https://bugs.openjdk.java.net/browse/JDK-8180193 >>> >> Not particularly against this change, but I think we should do the move >> and renaming separately when the change is actually required, i.e. just >> before there is another dependency on it. > I think this would be inconvenient, because when "another dependency" would come > in a large webrev, it would have to include the CMBitmap move too, complicating > reviews. I understood it such that we would post the moving around of gc/g1 files to gc/shared right before we'd post Shenandoah (in the not-so-distant future, hopefully). That would work for me. I wouldn't like to include everything in a giant webrev :-P Roman From shade at redhat.com Fri Jul 14 11:25:53 2017 From: shade at redhat.com (Aleksey Shipilev) Date: Fri, 14 Jul 2017 13:25:53 +0200 Subject: RFR (S/M): 8184348: Merge G1ConcurrentMark::par_mark() and G1ConcurrentMark::grayRoot() In-Reply-To: <1500031158.3458.41.camel@oracle.com> References: <1500031158.3458.41.camel@oracle.com> Message-ID: On 07/14/2017 01:19 PM, Thomas Schatzl wrote: > Webrev: > http://cr.openjdk.java.net/~tschatzl/8184348/webrev/ *) I'd probably split the assert with newlines. Makes webrevs tidier! 
*) This is not needed, because par_mark already has the optimistic check, down below in Bitmap::par_set_bit? 54 // Dirty read to avoid CAS. 55 if (_nextMarkBitMap->is_marked(obj_addr)) { 56 return false; 57 } *) So, mark_reference_grey used to be called from G1CMSATBBufferClosure on objects below TAMS, but now it would get called on objects past TAMS too? Doesn't G1 verify there are no bits set in bitmap past TAMS (G1HeapVerifier::verify_no_bits_over_tams)? Thanks, -Aleksey -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: OpenPGP digital signature URL: From erik.helin at oracle.com Fri Jul 14 12:00:14 2017 From: erik.helin at oracle.com (Erik Helin) Date: Fri, 14 Jul 2017 14:00:14 +0200 Subject: RFR: 8179387: Factor out CMS specific code from GenCollectedHeap into its own subclass In-Reply-To: <5417889c-5289-37cd-eb31-a2b55f70e85e@redhat.com> References: <3521009f-6fab-4f8e-2375-b9d665a4c70b@redhat.com> <3d8b55a2-a787-3051-b351-ab9b0a24f5e0@redhat.com> <47e22e86-7d7c-606f-1936-346229f39ca2@oracle.com> <9a846161-c8ac-dedf-5952-f457d546fd9a@redhat.com> <4d5e6af8-d975-7803-64c5-7295e0d56154@redhat.com> <13358626-e399-e352-1711-587416621aac@redhat.com> <27af0ad2-fe78-3536-2143-996dd42583ab@oracle.com> <4bc53aaa-b98a-8a61-73bf-d30ac3f402b8@redhat.com> <666af7f2-27e9-48c6-91e4-eaefa5289e18@redhat.com> <3ec8a6a3-5a4b-a910-f6ec-ed1c0dad4cad@oracle.com> <5417889c-5289-37cd-eb31-a2b55f70e85e@redhat.com> Message-ID: On 07/14/2017 01:22 PM, Roman Kennke wrote: > Hi Erik, > >> On 07/10/2017 04:10 PM, Roman Kennke wrote: >>> Am 10.07.2017 um 15:13 schrieb Erik Helin: >>>> On 07/07/2017 03:21 PM, Roman Kennke wrote: >>>>> Am 07.07.2017 um 14:35 schrieb Erik Helin: >>>>>> On 07/06/2017 06:23 PM, Roman Kennke wrote: >>>>>>>>>> Ok to push this? 
>>>>>>>> >>>>>>>> I just realized that your change doesn't build on Windows since you >>>>>>>> didn't #include "precompiled.hpp" in cmsHeap.cpp. MSVC is really >>>>>>>> picky >>>>>>>> about that. >>>>>>>> /Mikael >>>>>>> >>>>>>> Uhhh. >>>>>>> Ok, here's revision #3 with precompiled added in: >>>>>>> >>>>>>> http://cr.openjdk.java.net/~rkennke/8179387/webrev.03/ >>>>>>> >>>>>> >>>>>> Hi Roman, >>>>>> >>>>>> I just started looking :) I think GenCollectedHeap::gc_prologue and >>>>>> GenCollectedHeap::gc_epilogue should be virtual, and >>>>>> always_do_update_barrier = UseConcMarkSweepGC moved down >>>>>> CMSHeap::gc_epilogue. >>>>>> >>>>>> What do you think? >>>>> >>>>> Yes, I have seen that. My original plan was to leave it as is >>>>> because I >>>>> know that Erik ?. is working on a big barrier set refactoring that >>>>> would >>>>> remove this code anyway. However, it doesn't really matter, here's the >>>>> cleaned up patch: >>>>> >>>>> http://cr.openjdk.java.net/~rkennke/8179387/webrev.04/ >>>>> >>>> >>>> A few comments: >>>> >>>> cmsHeap.hpp: >>>> - you are missing quite a few #includes, but it works since >>>> genCollectedHeap.hpp #includes a whole lot of stuff. Not >>>> necessary to >>>> fix now, because the "missing #include" will start to pop up when >>>> someone tries to break apart GenCollectedHeap into smaller pieces. >>> Right. >>> I always try to minimize includes, especially in header files (they are >>> bound to proliferate later anyway). In addition to that, if a class is >>> only referenced as pointer, I avoid includes and use forward class >>> definition instead. >> >> I think that we in general try to include what is needed, not what >> only what makes the code compile (header guards will of course ensure >> that the header files are only parsed once). 
So in cmsHeap.hpp, at >> least I would have added: >> >> #include "gc/cms/concurrentMarkSweepGeneration.hpp" >> #include "gc/shared/collectedHeap.hpp" >> #include "gc/shared/gcCause.hpp" >> >> and forward declared: >> >> class CLDClosure; >> class OopsInGenClosure; >> class outputStream; >> class StrongRootsScope; >> class ThreadClosure; > Ok, added those and some more that I found. Not sure why we'd need > #include "gc/cms/concurrentMarkSweepGeneration.hpp" ? Left that out for now. Because you are accessing CMSCollector in: 99 NOT_PRODUCT( 100 virtual size_t skip_header_HeapWords() { return CMSCollector::skip_header_HeapWords(); } 101 ) and CMSCollector is declared in concurrentMarkSweepGeneration.hpp. An alternative would of course be to just declare skip_header_HeapWords() in cmsHeap.hpp and define skip_header_HeapWords in cmsHeap.cpp, then you only need to include concurrentMarkSweepGeneration.hpp in cmsHeap.cpp. >>>> >>>> - why are gc_prologue and gc_epilogue protected in CMSHeap? Can't they >>>> be private in CMSHeap? >>> They are virtual and protected in GenCollectedHeap and called by >>> GenCollectedHeap. Makes sense to also make them protected in CMSHeap? Or >>> am I missing something? >>> >>>> - there are two `private:` blocks, please use only one `private:` >>>> block. >>>> >>> Fixed. >> >> And now there are two `protected:` blocks, immediately after each other: >> > Duh. Fixed. > >> IMO, I would just make the three functions above private. I know they >> are protected in GenCollectedHeap, but it should be fine to have them >> private in CMSHeap. Having them protected signals, at least to me, >> that this class could be considered as a base class (protected to me >> reads "this can be accessed by classes inheriting from this class), >> and we don't want any class to inherit from CMSHeap. > > How can they be called from the superclass if they are private in the > subclass? Would that work in C++? 
> > protected (to me) means visibility between super and subclasses. If I'd > want to signal that I intend that to be overridden, I'd say 'virtual'. It is perfectly fine to have private virtual methods in C++ (see for example https://stackoverflow.com/questions/2170688/private-virtual-method-in-c). A virtual function only needs to be protected if a "child class" needs to access the function in the "parent class". For both gc_prologue and gc_epilogue, this is the case, which is why they have to be 'protected' in GenCollectedHeap. But, no class is going to derive from CMSHeap, so they can be private in CMSHeap. skip_header_HeapWords needs to be virtual, since classes inheriting from GenCollectedHeap might want to change its behavior. However, no class inheriting from GenCollectedHeap (only CMSHeap so far) needs to call GenCollectedHeap::skip_header_HeapWords, so it can actually be private virtual in GenCollectedHeap. But, in order to not confuse readers, it might better to keep it protected virtual in GenCollectedHeap. There is no reason to have skip_header_HeapWords protected in CMSHeap though, there it should be declared private (and potentially virtual, since override comes first in C++11). >> Sorry, reading the code again it is obvious that create_cms_collector >> never can return false. It either returns true or calls >> vm_shutdown_during_initialization (which will not return). So, I would >> just make create_cms_collctor void, the if branch below is dead code: >> >> 51 if (!success) return JNI_ENOMEM; >> > Right! Very good catch! Changed that. > >> Btw, this code looks really fishy :) > Err, yep. I'll make a note somewhere (in bugs.o.j.n) to fix that later. >> This is for the serviceability agent. You will have to poke around in >> hotspot/src/jdk.hotspot.agent and see how GenCollectedHeap is used. >> Unfortunately I'm not that familiar with the agent, perhaps someone >> else can chime in here? 
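The dispatch rule described above - a private override in the subclass is still reached by virtual calls made from the superclass, because access control is checked against the static type at the call site - can be shown with a minimal standalone sketch (BaseHeap/DerivedHeap and the return values are made up for illustration, not the actual HotSpot classes):

```cpp
#include <cassert>

class BaseHeap {
public:
  // Template-method style: the base class drives the sequence and calls
  // the virtual hooks; derived classes only customize behavior.
  int collect() { return gc_prologue() + gc_epilogue(); }
protected:
  // Protected here because subclasses of BaseHeap may want to call the
  // base implementations.
  virtual int gc_prologue() { return 1; }
  virtual int gc_epilogue() { return 2; }
};

class DerivedHeap : public BaseHeap {
private:
  // Private overrides are legal: the call site in BaseHeap::collect()
  // is access-checked against BaseHeap, where the methods are visible,
  // yet virtual dispatch still lands here at run time.
  virtual int gc_prologue() { return 10; }
  virtual int gc_epilogue() { return 20; }
};

int run() {
  DerivedHeap h;
  BaseHeap* base = &h;
  return base->collect();  // dispatches to DerivedHeap's private overrides
}
```

Making the overrides private in the leaf class then simply documents that nothing is expected to inherit from it.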
> > Considering that the remaining references to GenCollectedHeap in > vmStructs.cpp don't look like related to CMSHeap, I'd argue that what I > did is all that's needed for now. Do you agree? Honestly, I don't know, that is why I asked if someone else with more knowledge in this area can comment. Have you tried building and using the SA agent with your change? You can also ask around on hotspot-rt-dev and or serviceability-dev. Thanks, Erik > http://cr.openjdk.java.net/~rkennke/8179387/webrev.06.diff/ > > http://cr.openjdk.java.net/~rkennke/8179387/webrev.06/ > > > Thanks for reviewing! > Roman > From thomas.schatzl at oracle.com Fri Jul 14 12:09:40 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Fri, 14 Jul 2017 14:09:40 +0200 Subject: RFR (M): 8184346: Clean up G1CMBitmap In-Reply-To: <990f6578-14d4-322d-7f51-9b93d92f8b20@redhat.com> References: <1500024904.3458.8.camel@oracle.com> <6c0d1cca-6c08-0bda-f980-d3fe20e663ff@redhat.com> <1500029912.3458.26.camel@oracle.com> <990f6578-14d4-322d-7f51-9b93d92f8b20@redhat.com> Message-ID: <1500034180.3458.67.camel@oracle.com> Hi Roman, On Fri, 2017-07-14 at 13:24 +0200, Roman Kennke wrote: > Am 14.07.2017 um 13:12 schrieb Aleksey Shipilev: > > > > Hi Thomas, > > > > On 07/14/2017 12:58 PM, Thomas Schatzl wrote: > > > > > > > > > > > > > > > > > The next CR JDK-8184347 will deal with moving G1CMBitmap* > > > > > into separate files. > > > > ?And while you're at it, you may want to move it to gc/shared > > > > and renamed it to something like MarkBitmap? > > > > https://bugs.openjdk.java.net/browse/JDK-8180193 > > > > > > > Not particularly against this change, but I think we should do > > > the move and renaming separately when the change is actually > > > required, i.e. just before there is another dependency on it. > > I think this would be inconvenient, because when "another > > dependency" would come in a large webrev, it would have to include > > the CMBitmap move too, complicating reviews. 
> I understood it such that we would post the moving around of gc/g1 > files to gc/shared right before we'd post Shenandoah (in the not-so- > distant future, hopefully). That would work for me. I wouldn't like > to include everything in a giant webrev :-P that is exactly what I meant - thanks for your understanding. Thomas From thomas.schatzl at oracle.com Fri Jul 14 12:20:00 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Fri, 14 Jul 2017 14:20:00 +0200 Subject: RFR (S/M): 8184348: Merge G1ConcurrentMark::par_mark() and G1ConcurrentMark::grayRoot() In-Reply-To: References: <1500031158.3458.41.camel@oracle.com> Message-ID: <1500034800.3458.75.camel@oracle.com> Hi Aleksey, thanks for looking into this. On Fri, 2017-07-14 at 13:25 +0200, Aleksey Shipilev wrote: > On 07/14/2017 01:19 PM, Thomas Schatzl wrote: > > > > Webrev: > > http://cr.openjdk.java.net/~tschatzl/8184348/webrev/ > *) I'd probably split the assert with newlines. Makes webrevs tidier! Not completely sure what you are referring to, but I split some very long asserts across lines. As for the asserts themselves, I tend to group them together in blocks separate from actual code with newlines. But there are (often) no newlines between subsequent asserts. > *) This is not needed, because par_mark already has the optimistic > check, down > below in Bitmap::par_set_bit?
>
>   54   // Dirty read to avoid CAS.
>   55   if (_nextMarkBitMap->is_marked(obj_addr)) {
>   56     return false;
>   57   }
Thanks for catching this, I simply copied this check from the former grayRoot() method... :) > *) So, mark_reference_grey used to be called from > G1CMSATBBufferClosure on > objects below TAMS, but now it would get called on objects past TAMS > too? 
CMTask::make_reference_grey() now calls G1ConcurrentMark::mark_in_next_bitmap(), not ConcurrentMark::par_mark() which does not exist any more: G1ConcurrentMark::mark_in_next_bitmap() in the first check filters out marking attempts above nTAMS (g1ConcurrentMark.inline.hpp:47 now), returning false, which makes make_reference_grey() exit immediately in that case. This seems to achieve the same effect. See the comment in g1ConcurrentMark.inline.hpp:51 too, which refers to that issue. (The documentation of G1ConcurrentMark::mark_in_next_bitmap() also mentions that: "Mark the given object on the next bitmap if it is below nTAMS") Indeed, I tripped over this when trying to refactor this, and I did do runs of some gc stress applications with verification on (actually that issue is also caught during check-in by jprt tests). :) If you are worried whether there is a performance difference because maybe now we do more work in some cases, all paths previously leading to the former G1ConcurrentMark::par_mark() did the nTAMS check in one way or another already (of course in inconsistent fashion) so there should be no change here. There may be some further optimizations to be done here (like for marking during initial mark pause, as e.g. survivor region nTAMS == bottom so we will never put a mark for them), but those I would prefer to do in an extra CR, unless they are dead simple like the duplicated marking check. But please feel free to mention them, I may pick them up immediately afterwards :) > Doesn't G1 verify there are no bits set in bitmap past TAMS > (G1HeapVerifier::verify_no_bits_over_tams)? It does, but as mentioned above, these mark attempts past nTAMS should be filtered out as expected. New webrevs: http://cr.openjdk.java.net/~tschatzl/8184348/webrev.1/ (full) http://cr.openjdk.java.net/~tschatzl/8184348/webrev.0_to_1/ (diff) Thanks a lot, 
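The control flow described above (filter above nTAMS first, then let par_set_bit do the cheap racy read before the CAS) can be modeled with a small standalone sketch; the types below are simplified stand-ins invented for illustration, not the real G1 classes:

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Simplified stand-in: a real G1CMBitMap covers the whole heap, one bit
// per possible object start; a 64-bit word suffices to show the idea.
struct MarkBitMap {
  std::atomic<uint64_t> _bits{0};

  // The "dirty read" lives inside par_set_bit, so callers do not need to
  // duplicate it (the duplicated check was what the review removed).
  bool par_set_bit(unsigned bit) {
    uint64_t old_bits = _bits.load(std::memory_order_relaxed);
    while (!(old_bits & (1ull << bit))) {  // racy check: skip CAS if set
      if (_bits.compare_exchange_weak(old_bits, old_bits | (1ull << bit))) {
        return true;                       // this thread set the bit
      }                                    // on failure, old_bits reloads
    }
    return false;                          // some other thread set it
  }
};

struct Region {
  unsigned _ntams;  // "next top-at-mark-start": addresses at/above it
                    // are allocated during marking and implicitly live
};

// Model of mark_in_next_bitmap(): returns true only if *this* caller
// actually marked the object.
bool mark_in_next_bitmap(MarkBitMap& bm, const Region& r, unsigned obj) {
  if (obj >= r._ntams) {
    return false;  // filtered: never set bits past nTAMS
  }
  return bm.par_set_bit(obj);
}
```

Returning false for addresses past nTAMS is what lets the caller bail out early, and it also keeps a no-bits-over-TAMS verification pass happy, matching the discussion above.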
Thomas From shade at redhat.com Fri Jul 14 13:18:43 2017 From: shade at redhat.com (Aleksey Shipilev) Date: Fri, 14 Jul 2017 15:18:43 +0200 Subject: RFR (S/M): 8184348: Merge G1ConcurrentMark::par_mark() and G1ConcurrentMark::grayRoot() In-Reply-To: <1500034800.3458.75.camel@oracle.com> References: <1500031158.3458.41.camel@oracle.com> <1500034800.3458.75.camel@oracle.com> Message-ID: On 07/14/2017 02:20 PM, Thomas Schatzl wrote: > Not completely sure what you are referring to, but I split some very > long asserts across lines. Yes, I meant that, sorry for not being clear. Any webrev that requires me to scroll horizontally on 2560-pixel wide screen triggers me! >> *) So, mark_reference_grey used to be called from >> G1CMSATBBufferClosure on >> objects below TAMS, but now it would get called on objects past TAMS >> too? > > CMTask::make_reference_grey() now calls > G1ConcurrentMark::mark_in_next_bitmap(), not ConcurrentMark::par_mark() > which does not exist any more: G1ConcurrentMark::mark_in_next_bitmap() > in the first check filters out marking attempts above nTAMS > (g1ConcurrentMark.inline.hpp:47 now), returning false, which makes > make_reference_grey() exit immediately in that case. This seems to > achieve the same effect. Ah, I missed that part! I agree this part is fine then. > If you are worried whether there is a performance difference because > maybe now we do more work in some cases, all paths previously leading > to the former G1ConcurrentMark::par_mark() did the nTAMS check in one > way or another already (of course in inconsistent fashion) so there > should be no change here. No, I am not worried. SATB-heavy workloads have problems way beyond bitmap marking :) > New webrevs: > http://cr.openjdk.java.net/~tschatzl/8184348/webrev.1/ (full) Looks good to me. -Aleksey -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: OpenPGP digital signature URL: From thomas.schatzl at oracle.com Fri Jul 14 14:34:30 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Fri, 14 Jul 2017 16:34:30 +0200 Subject: RFR (S/M): 8184348: Merge G1ConcurrentMark::par_mark() and G1ConcurrentMark::grayRoot() In-Reply-To: References: <1500031158.3458.41.camel@oracle.com> <1500034800.3458.75.camel@oracle.com> Message-ID: <1500042870.3458.84.camel@oracle.com> Hi again, On Fri, 2017-07-14 at 15:18 +0200, Aleksey Shipilev wrote: > On 07/14/2017 02:20 PM, Thomas Schatzl wrote: > > > > Not completely sure what you are referring to, but I split some > > very > > long asserts across lines. > Yes, I meant that, sorry for not being clear. Any webrev that > requires me to scroll horizontally on 2560-pixel wide screen triggers > me! I noticed that too :) > > > > > > *) So, mark_reference_grey used to be called from > > > G1CMSATBBufferClosure on > > > objects below TAMS, but now it would get called on objects past > > > TAMS > > > too? > > CMTask::make_reference_grey() now calls > > G1ConcurrentMark::mark_in_next_bitmap(), not > > ConcurrentMark::par_mark() > > which does not exist any more: > > G1ConcurrentMark::mark_in_next_bitmap() > > in the first check filters out marking attempts above nTAMS > > (g1ConcurrentMark.inline.hpp:47 now), returning false, which makes > > make_reference_grey() exit immediately in that case. This seems to > > achieve the same effect. > Ah, I missed that part! I agree this part is fine then. > > > > > If you are worried whether there is a performance difference > > because maybe now we do more work in some cases, all paths > > previously leading to the former G1ConcurrentMark::par_mark() did > > the nTAMS check in one way or another already (of course in > > inconsistent fashion) so there should be no change here. > No, I am not worried. 
SATB-heavy workloads have problems way beyond > bitmap marking :) > > > > New webrevs: > > http://cr.openjdk.java.net/~tschatzl/8184348/webrev.1/ (full) > Looks good to me. Thanks. Unfortunately, after re-applying and fixing other changes based on this one I noticed that I missed one opportunity to refactor in G1CMTask::deal_with_reference(). I would like to add this to this changeset still... sorry. There is some note about some perf optimization that mentions that it is advantageous to do the nTAMS check before determining the heap region; however I do not think this is an issue. Quickly comparing runs of a fairly large and reference-intensive workload (BigRAMTester with 20g heap, e.g. attached to JDK-8152438), marking cycles with the latest webrev.2 are at least as fast as without any of this RFR's changes. New webrevs: http://cr.openjdk.java.net/~tschatzl/8184348/webrev.1_to_2 (diff) http://cr.openjdk.java.net/~tschatzl/8184348/webrev.2 (full) Thanks, Thomas From daniel.daugherty at oracle.com Fri Jul 14 22:48:34 2017 From: daniel.daugherty at oracle.com (Daniel D. 
Daugherty) Date: Fri, 14 Jul 2017 16:48:34 -0600 Subject: RFR: Parallelize safepoint cleanup In-Reply-To: <743ff172-88b6-30dc-8808-a4a97be4a571@oracle.com> References: <6f2c6de7-298b-bf14-ab1f-430c4acd43c9@redhat.com> <5cd676de-872d-6d4a-691b-da561173f7d0@oracle.com> <61d80e98-275f-b2b8-4ac7-6d5d03b047de@redhat.com> <0e44cc90-b384-0820-93d6-a70d22c501c3@oracle.com> <20E06CEC-38CA-41AE-99DB-17EF22A3C5CC@oracle.com> <58f2278e-b95c-4ec2-4f7d-9fefa3a281e4@redhat.com> <623c0dbf-9210-7c63-3546-4314c7d47f85@redhat.com> <29521e46-a5e8-5ff0-23a2-22eeee145389@oracle.com> <4445a727-060b-70f9-c8db-e9f70faae3d5@redhat.com> <37755fec-05b9-8d2c-7eb9-8849393c7485@oracle.com> <57cddde0-60e6-366e-489e-f6f9534e3ed9@redhat.com> <6fa761bc-8feb-74e6-9a54-8a65ab81203b@oracle.com> <5af9855a-652e-64f0-af83-e8f5962247ca@oracle.com> <743ff172-88b6-30dc-8808-a4a97be4a571@oracle.com> Message-ID: On 7/12/17 2:39 PM, Robbin Ehn wrote: > On 2017-07-12 15:32, Roman Kennke wrote: >> Hi Robbin and all, >> >> I fixed the 32bit failures by using jlong in all relevant places: >> >> http://cr.openjdk.java.net/~rkennke/8180932/webrev.14.diff/ >> > > Looks good! > >> >> then Robbin found another problem. SafepointCleanupTest started to fail, >> because "mark nmethods" is no longer printed. This made me think that >> we're not measuring the conflated (and possibly parallelized) >> deflate-idle-monitors+mark-nmethods pass. I added a TraceTime with >> "safepoint cleanup tasks" which measures the total duration of safepoint >> cleanup. We can't reasonably measure a possibly parallel and conflated >> pass standalone, but we can measure all and by subtrating all the other >> subphases, get an idea how long deflation and nmethod marking take up. >> >> http://cr.openjdk.java.net/~rkennke/8180932/webrev.15.diff/ >> > > Looks good and thanks for fixing > > It's time to ship this, can we have a second review please! 
> http://cr.openjdk.java.net/~rkennke/8180932/webrev.15/ src/share/vm/code/nmethod.hpp b/src/share/vm/code/nmethod.hpp No comments. src/share/vm/runtime/safepoint.cpp No comments. src/share/vm/runtime/safepoint.cpp No comments. test/runtime/logging/SafepointCleanupTest.java No comments. Thumbs up. Only looked at the files that changed relative to the last version that I reviewed (webrev.12, I think)... Dan > > /Robbin > >> >> The full webrev is now: >> >> http://cr.openjdk.java.net/~rkennke/8180932/webrev.15/ >> >> >> Hope that's all ;-) >> >> Roman >> >> Am 10.07.2017 um 21:22 schrieb Robbin Ehn: >>> Hi, unfortunately the push failed on 32-bit. >>> >>> (looks like _stack_traversal_mark should be jlong, I feel a bit guilty) >>> >>> I do not have anytime to look at this, so here is the error. >>> >>> /Robbin >>> >>> make[3]: Leaving directory '/opt/jprt/T/P1/185117.rehn/s/hotspot/make' >>> make/Main.gmk:263: recipe for target 'hotspot-client-libs' failed >>> In file included from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29:0, >>> >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33, >>> >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28: >>> >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp: In >>> member function 'long int nmethod::stack_traversal_mark()': >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:399:108: >>> >>> error: call of overloaded 'load_acquire(volatile long int*)' is >>> ambiguous >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:399:108: >>> >>> note: candidates are: >>> In file included from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/typeArrayOop.hpp:30:0, >>> >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/constantPool.hpp:32, >>> >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/method.hpp:34, >>> from >>> 
/opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/frame.hpp:28, >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/codeBlob.hpp:31, >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/compiledMethod.hpp:28, >>> >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:28, >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29, >>> >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33, >>> >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28: >>> >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:57:17: >>> >>> note: static jint OrderAccess::load_acquire(const volatile jint*) >>> >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:57:17: >>> >>> note: no known conversion for argument 1 from 'volatile long int*' >>> to 'const volatile jint* {aka const volatile int*}' >>> In file included from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/typeArrayOop.hpp:30:0, >>> >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/constantPool.hpp:32, >>> >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/method.hpp:34, >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/frame.hpp:28, >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/codeBlob.hpp:31, >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/compiledMethod.hpp:28, >>> >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:28, >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29, >>> >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33, >>> >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28: >>> >>> 
/opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:63:17: >>> >>> note: static juint OrderAccess::load_acquire(const volatile juint*) >>> >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:63:17: >>> >>> note: no known conversion for argument 1 from 'volatile long int*' >>> to 'const volatile juint* {aka const volatile unsigned int*}' >>> In file included from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29:0, >>> >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33, >>> >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28: >>> >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp: In >>> member function 'void nmethod::set_stack_traversal_mark(long int)': >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:400:105: >>> >>> error: call of overloaded 'release_store(volatile long int*, long >>> int&)' is ambiguous >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:400:105: >>> >>> note: candidates are: >>> In file included from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/typeArrayOop.hpp:30:0, >>> >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/constantPool.hpp:32, >>> >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/method.hpp:34, >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/frame.hpp:28, >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/codeBlob.hpp:31, >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/compiledMethod.hpp:28, >>> >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:28, >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29, >>> >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33, >>> >>> from >>> 
/opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28: >>> >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:71:17: >>> >>> note: static void OrderAccess::release_store(volatile jint*, jint) >>> >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:71:17: >>> >>> note: no known conversion for argument 1 from 'volatile long int*' >>> to 'volatile jint* {aka volatile int*}' >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:77:17: >>> >>> note: static void OrderAccess::release_store(volatile juint*, juint) >>> >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:77:17: >>> >>> note: no known conversion for argument 1 from 'volatile long int*' >>> to 'volatile juint* {aka volatile unsigned int*}' >>> >>> On 2017-07-10 20:50, Robbin Ehn wrote: >>>> I'll start a push now. >>>> >>>> /Robbin >>>> >>>> On 2017-07-10 12:38, Roman Kennke wrote: >>>>> Ok, so I guess I need a sponsor for this now: >>>>> >>>>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.12/ >>>>> >>>>> >>>>> Roman >>>>> >>>>> Am 07.07.2017 um 20:09 schrieb Igor Veresov: >>>>>> >>>>>>> On Jul 7, 2017, at 4:23 AM, Robbin Ehn >>>>>> > wrote: >>>>>>> >>>>>>> Hi Roman, >>>>>>> >>>>>>> On 07/07/2017 12:51 PM, Roman Kennke wrote: >>>>>>>> Hi Robbin, >>>>>>>>> >>>>>>>>> Far down -> >>>>>>>>> >>>>>>>>> On 07/06/2017 08:05 PM, Roman Kennke wrote: >>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> I'm not happy about this change: >>>>>>>>>>> >>>>>>>>>>> + ~ParallelSPCleanupThreadClosure() { >>>>>>>>>>> + // This is here to be consistent with sweeper.cpp >>>>>>>>>>> NMethodSweeper::mark_active_nmethods(). >>>>>>>>>>> + // TODO: Is this really needed? 
>>>>>>>>>>> + OrderAccess::storestore(); >>>>>>>>>>> + } >>>>>>>>>>> >>>>>>>>>>> because we're adding an OrderAccess::storestore() to be >>>>>>>>>>> consistent >>>>>>>>>>> with an OrderAccess::storestore() that's not properly >>>>>>>>>>> documented >>>>>>>>>>> which is only increasing the technical debt. >>>>>>>>>>> >>>>>>>>>>> So a couple of things above don't make sense to me: >>>>>>>>>>> >>>>>>>>>>>> - sweeper thread runs outside safepoint >>>>>>>>>>>> - VMThread (which is doing the nmethod marking in the case >>>>>>>>>>>> that >>>>>>>>>>>> I'm looking at) runs while all other threads (incl. the >>>>>>>>>>>> sweeper) >>>>>>>>>>>> is holding still. >>>>>>>>>>> >>>>>>>>>>> and: >>>>>>>>>>> >>>>>>>>>>>> There should be no need for a storestore() (at least in >>>>>>>>>>>> sweeper.cpp... >>>>>>>>>> >>>>>>>>>> Either one or the other are running. Either the VMThread is >>>>>>>>>> marking >>>>>>>>>> nmethods (during safepoint) or the sweeper threads are running >>>>>>>>>> (outside >>>>>>>>>> safepoint). Between the two phases, there is a guaranteed >>>>>>>>>> OrderAccess::fence() (see safepoint.cpp). Therefore, no >>>>>>>>>> storestore() >>>>>>>>>> should be necessary. >>>>>>>>>> >>>>>>>>>> From Igor's comment I can see how it happened though: >>>>>>>>>> Apparently >>>>>>>>>> there >>>>>>>>>> *is* a race in sweeper's own concurrent processing (concurrent >>>>>>>>>> with >>>>>>>>>> compiler threads, as far as I understand). And there's a call to >>>>>>>>>> nmethod::mark_as_seen_on_stack() after which a storestore() is >>>>>>>>>> required >>>>>>>>>> (as per Igor's explanation). So the logic probably was: we have >>>>>>>>>> mark_as_seen_on_stack() followed by storestore() here, so let's >>>>>>>>>> also put >>>>>>>>>> a storestore() in the other places that call >>>>>>>>>> mark_as_seen_on_stack(), >>>>>>>>>> one of which happens to be the safepoint cleanup code that we're >>>>>>>>>> discussing. 
(why the storestore() hasn't been put right into >>>>>>>>>> mark_as_seen_on_stack() I don't understand). In short, one >>>>>>>>>> storestore() >>>>>>>>>> really was necessary, the other looks like it has been put there >>>>>>>>>> 'for >>>>>>>>>> consistency' or just conservatively. But it shouldn't be >>>>>>>>>> necessary in >>>>>>>>>> the safepoint cleanup code that we're discussing. >>>>>>>>>> >>>>>>>>>> So what should we do? Remove the storestore() for good? >>>>>>>>>> Refactor the >>>>>>>>>> code so that both paths at least call the storestore() in the >>>>>>>>>> same >>>>>>>>>> place? (E.g. make mark_active_nmethods() use the closure and >>>>>>>>>> call >>>>>>>>>> storestore() in the dtor as proposed?) >>>>>>>>> >>>>>>>>> I took a quick look, maybe I'm missing some stuff but: >>>>>>>>> >>>>>>>>> So there is a slight optimization when not running sweeper to >>>>>>>>> skip >>>>>>>>> compiler barrier/fence in stw. >>>>>>>>> >>>>>>>>> Don't think that matter, so I propose something like: >>>>>>>>> - long stack_traversal_mark() { return >>>>>>>>> _stack_traversal_mark; } >>>>>>>>> - void set_stack_traversal_mark(long l) { >>>>>>>>> _stack_traversal_mark = l; } >>>>>>>>> + long stack_traversal_mark() { return >>>>>>>>> OrderAccess::load_acquire(&_stack_traversal_mark); } >>>>>>>>> + void set_stack_traversal_mark(long l) { >>>>>>>>> OrderAccess::release_store(&_stack_traversal_mark, l); } >>>>>>>>> >>>>>>>>> Maybe make _stack_traversal_mark volatile also, just as a marking >>>>>>>>> that >>>>>>>>> it is concurrent accessed. >>>>>>>>> And remove both storestore. >>>>>>>>> >>>>>>>>> "Also neither of these state variables are volatile in >>>>>>>>> nmethod, so >>>>>>>>> even the compiler may reorder the stores" >>>>>>>>> Fortunately at least _state is volatile now. >>>>>>>>> >>>>>>>>> I think _state also should use la/rs semantics instead, but >>>>>>>>> that's >>>>>>>>> another story. >>>>>>>> Like this? 
>>>>>>>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.12/ >>>>>>>> >>>>>>>> >>>>>>> >>>>>>> Yes, exactly, I like this! >>>>>>> Dan? Igor ? Tobias? >>>>>>> >>>>>> >>>>>> That seems correct. >>>>>> >>>>>> igor >>>>>> >>>>>>> Thanks Roman! >>>>>>> >>>>>>> BTW I'm going on vacation (5w) in a few hours, but I will follow >>>>>>> this >>>>>>> thread/changeset to the end! >>>>>>> >>>>>>> /Robbin >>>>>>> >>>>>>>> Roman >>>>>> >>>>> >> From robbin.ehn at oracle.com Sun Jul 16 08:25:14 2017 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Sun, 16 Jul 2017 10:25:14 +0200 Subject: RFR: Parallelize safepoint cleanup In-Reply-To: References: <5e7c7d00-4acd-bea3-3525-33dbd9159efb@oracle.com> <6f2c6de7-298b-bf14-ab1f-430c4acd43c9@redhat.com> <5cd676de-872d-6d4a-691b-da561173f7d0@oracle.com> <61d80e98-275f-b2b8-4ac7-6d5d03b047de@redhat.com> <0e44cc90-b384-0820-93d6-a70d22c501c3@oracle.com> <20E06CEC-38CA-41AE-99DB-17EF22A3C5CC@oracle.com> <58f2278e-b95c-4ec2-4f7d-9fefa3a281e4@redhat.com> <623c0dbf-9210-7c63-3546-4314c7d47f85@redhat.com> <29521e46-a5e8-5ff0-23a2-22eeee145389@oracle.com> <4445a727-060b-70f9-c8db-e9f70faae3d5@redhat.com> <37755fec-05b9-8d2c-7eb9-8849393c7485@oracle.com> <57cddde0-60e6-366e-489e-f6f9534e3ed9@redhat.com> <6fa761bc-8feb-74e6-9a54-8a65ab81203b@oracle.com> <5af9855a-652e-64f0-af83-e8f5962247ca@oracle.com> Message-ID: <07a5bf0a-02fa-7a8c-35be-813f5207cb0c@oracle.com> Hi Roman, On 2017-07-12 15:32, Roman Kennke wrote: > Hi Robbin and all, > > I fixed the 32bit failures by using jlong in all relevant places: > > http://cr.openjdk.java.net/~rkennke/8180932/webrev.14.diff/ > > > then Robbin found another problem. SafepointCleanupTest started to fail, > because "mark nmethods" is no longer printed. This made me think that > we're not measuring the conflated (and possibly parallelized) > deflate-idle-monitors+mark-nmethods pass. I added a TraceTime with > "safepoint cleanup tasks" which measures the total duration of safepoint > cleanup. 
We can't reasonably measure a possibly parallel and conflated
> pass standalone, but we can measure all and by subtracting all the other
> subphases, get an idea how long deflation and nmethod marking take up.
>
> http://cr.openjdk.java.net/~rkennke/8180932/webrev.15.diff/
>
>
> The full webrev is now:
>
> http://cr.openjdk.java.net/~rkennke/8180932/webrev.15/
>
>
> Hope that's all ;-)

With this changeset something always pops up.

Failure reason: Targets failed. Target macosx_x64_10.9-fastdebug FAILED.

/opt/jprt/jib-data/install/jpg/infra/builddeps/devkit-macosx_x64/Xcode6.3-MacOSX10.9+1.0/devkit-macosx_x64-Xcode6.3-MacOSX10.9+1.0.tar.gz/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/clang++ -m64 -fPIC -D_GNU_SOURCE -flimit-debug-info -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -D__STDC_CONSTANT_MACROS -D_ALLBSD_SOURCE -D_DARWIN_C_SOURCE -D_XOPEN_SOURCE -fno-rtti -fno-exceptions -fvisibility=hidden -mno-omit-leaf-frame-pointer -mstack-alignment=16 -pipe -fno-strict-aliasing -DMAC_OS_X_VERSION_MAX_ALLOWED=1070 -mmacosx-version-min=10.7.0 -fno-omit-frame-pointer -DVM_LITTLE_ENDIAN -D_LP64=1 -Wno-deprecated -Wpointer-arith -Wsign-compare -Wundef -Wunused-function -Wformat=2 -DASSERT -DCHECK_UNHANDLED_OOPS -DTARGET_ARCH_x86 -DINCLUDE_SUFFIX_OS=_bsd -DINCLUDE_SUFFIX_CPU=_x86 -DINCLUDE_SUFFIX_COMPILER=_gcc -DTARGET_COMPILER_gcc -DAMD64 -DHOTSPOT_LIB_ARCH='"amd64"' -DCOMPILER1 -DCOMPILER2 -DDTRACE_ENABLED -DINCLUDE_AOT -I/opt/jprt/T/P1/193338.rehn/s/hotspot/src/closed/share/vm -I/opt/j/opt/jprt/T/P1/193338.rehn/s/hotspot/src/share/vm/runtime/safepoint.cpp:654:22: error: variable has incomplete type 'StrongRootsScope'
StrongRootsScope srs(num_cleanup_workers);
^
/opt/jprt/T/P1/193338.rehn/s/hotspot/src/share/vm/gc/shared/genCollectedHeap.hpp:33:7: note: forward declaration of 'StrongRootsScope'
class StrongRootsScope;
^
/opt/jprt/T/P1/193338.rehn/s/hotspot/src/share/vm/runtime/safepoint.cpp:659:22: error: variable has incomplete type
'StrongRootsScope' StrongRootsScope srs(1); ^ /opt/jprt/T/P1/193338.rehn/s/hotspot/src/share/vm/gc/shared/genCollectedHeap.hpp:33:7: note: forward declaration of 'StrongRootsScope' class StrongRootsScope; ^ 2 errors generated. make[3]: *** [/opt/jprt/T/P1/193338.rehn/s/build/macosx-x64-debug/hotspot/variant-server/libjvm/objs/safepoint.o] Error 1 make[3]: *** Waiting for unfinished jobs.... make[2]: *** [hotspot-server-libs] Error 2 Send me the new webrev and I'll test it before the 16th round of review :) /Robbin > > Roman > > Am 10.07.2017 um 21:22 schrieb Robbin Ehn: >> Hi, unfortunately the push failed on 32-bit. >> >> (looks like _stack_traversal_mark should be jlong, I feel a bit guilty) >> >> I do not have anytime to look at this, so here is the error. >> >> /Robbin >> >> make[3]: Leaving directory '/opt/jprt/T/P1/185117.rehn/s/hotspot/make' >> make/Main.gmk:263: recipe for target 'hotspot-client-libs' failed >> In file included from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29:0, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28: >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp: In >> member function 'long int nmethod::stack_traversal_mark()': >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:399:108: >> error: call of overloaded 'load_acquire(volatile long int*)' is ambiguous >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:399:108: >> note: candidates are: >> In file included from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/typeArrayOop.hpp:30:0, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/constantPool.hpp:32, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/method.hpp:34, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/frame.hpp:28, >> from >> 
/opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/codeBlob.hpp:31, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/compiledMethod.hpp:28, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:28, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28: >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:57:17: >> note: static jint OrderAccess::load_acquire(const volatile jint*) >> >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:57:17: >> note: no known conversion for argument 1 from 'volatile long int*' >> to 'const volatile jint* {aka const volatile int*}' >> In file included from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/typeArrayOop.hpp:30:0, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/constantPool.hpp:32, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/method.hpp:34, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/frame.hpp:28, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/codeBlob.hpp:31, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/compiledMethod.hpp:28, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:28, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28: >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:63:17: >> note: static juint OrderAccess::load_acquire(const volatile juint*) >> >> 
/opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:63:17: >> note: no known conversion for argument 1 from 'volatile long int*' >> to 'const volatile juint* {aka const volatile unsigned int*}' >> In file included from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29:0, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28: >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp: In >> member function 'void nmethod::set_stack_traversal_mark(long int)': >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:400:105: >> error: call of overloaded 'release_store(volatile long int*, long >> int&)' is ambiguous >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:400:105: >> note: candidates are: >> In file included from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/typeArrayOop.hpp:30:0, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/constantPool.hpp:32, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/method.hpp:34, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/frame.hpp:28, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/codeBlob.hpp:31, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/compiledMethod.hpp:28, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:28, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28: >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:71:17: >> note: static void OrderAccess::release_store(volatile jint*, jint) >> >> 
/opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:71:17: >> note: no known conversion for argument 1 from 'volatile long int*' >> to 'volatile jint* {aka volatile int*}' >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:77:17: >> note: static void OrderAccess::release_store(volatile juint*, juint) >> >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:77:17: >> note: no known conversion for argument 1 from 'volatile long int*' >> to 'volatile juint* {aka volatile unsigned int*}' >> >> On 2017-07-10 20:50, Robbin Ehn wrote: >>> I'll start a push now. >>> >>> /Robbin >>> >>> On 2017-07-10 12:38, Roman Kennke wrote: >>>> Ok, so I guess I need a sponsor for this now: >>>> >>>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.12/ >>>> >>>> >>>> Roman >>>> >>>> Am 07.07.2017 um 20:09 schrieb Igor Veresov: >>>>> >>>>>> On Jul 7, 2017, at 4:23 AM, Robbin Ehn >>>>> > wrote: >>>>>> >>>>>> Hi Roman, >>>>>> >>>>>> On 07/07/2017 12:51 PM, Roman Kennke wrote: >>>>>>> Hi Robbin, >>>>>>>> >>>>>>>> Far down -> >>>>>>>> >>>>>>>> On 07/06/2017 08:05 PM, Roman Kennke wrote: >>>>>>>>> >>>>>>>>>> >>>>>>>>>> I'm not happy about this change: >>>>>>>>>> >>>>>>>>>> + ~ParallelSPCleanupThreadClosure() { >>>>>>>>>> + // This is here to be consistent with sweeper.cpp >>>>>>>>>> NMethodSweeper::mark_active_nmethods(). >>>>>>>>>> + // TODO: Is this really needed? >>>>>>>>>> + OrderAccess::storestore(); >>>>>>>>>> + } >>>>>>>>>> >>>>>>>>>> because we're adding an OrderAccess::storestore() to be >>>>>>>>>> consistent >>>>>>>>>> with an OrderAccess::storestore() that's not properly documented >>>>>>>>>> which is only increasing the technical debt. 
>>>>>>>>>> >>>>>>>>>> So a couple of things above don't make sense to me: >>>>>>>>>> >>>>>>>>>>> - sweeper thread runs outside safepoint >>>>>>>>>>> - VMThread (which is doing the nmethod marking in the case that >>>>>>>>>>> I'm looking at) runs while all other threads (incl. the >>>>>>>>>>> sweeper) >>>>>>>>>>> is holding still. >>>>>>>>>> >>>>>>>>>> and: >>>>>>>>>> >>>>>>>>>>> There should be no need for a storestore() (at least in >>>>>>>>>>> sweeper.cpp... >>>>>>>>> >>>>>>>>> Either one or the other are running. Either the VMThread is >>>>>>>>> marking >>>>>>>>> nmethods (during safepoint) or the sweeper threads are running >>>>>>>>> (outside >>>>>>>>> safepoint). Between the two phases, there is a guaranteed >>>>>>>>> OrderAccess::fence() (see safepoint.cpp). Therefore, no >>>>>>>>> storestore() >>>>>>>>> should be necessary. >>>>>>>>> >>>>>>>>> From Igor's comment I can see how it happened though: Apparently >>>>>>>>> there >>>>>>>>> *is* a race in sweeper's own concurrent processing (concurrent >>>>>>>>> with >>>>>>>>> compiler threads, as far as I understand). And there's a call to >>>>>>>>> nmethod::mark_as_seen_on_stack() after which a storestore() is >>>>>>>>> required >>>>>>>>> (as per Igor's explanation). So the logic probably was: we have >>>>>>>>> mark_as_seen_on_stack() followed by storestore() here, so let's >>>>>>>>> also put >>>>>>>>> a storestore() in the other places that call >>>>>>>>> mark_as_seen_on_stack(), >>>>>>>>> one of which happens to be the safepoint cleanup code that we're >>>>>>>>> discussing. (why the storestore() hasn't been put right into >>>>>>>>> mark_as_seen_on_stack() I don't understand). In short, one >>>>>>>>> storestore() >>>>>>>>> really was necessary, the other looks like it has been put there >>>>>>>>> 'for >>>>>>>>> consistency' or just conservatively. But it shouldn't be >>>>>>>>> necessary in >>>>>>>>> the safepoint cleanup code that we're discussing. >>>>>>>>> >>>>>>>>> So what should we do? 
Remove the storestore() for good? >>>>>>>>> Refactor the >>>>>>>>> code so that both paths at least call the storestore() in the same >>>>>>>>> place? (E.g. make mark_active_nmethods() use the closure and call >>>>>>>>> storestore() in the dtor as proposed?) >>>>>>>> >>>>>>>> I took a quick look, maybe I'm missing some stuff but: >>>>>>>> >>>>>>>> So there is a slight optimization when not running sweeper to skip >>>>>>>> compiler barrier/fence in stw. >>>>>>>> >>>>>>>> Don't think that matter, so I propose something like: >>>>>>>> - long stack_traversal_mark() { return >>>>>>>> _stack_traversal_mark; } >>>>>>>> - void set_stack_traversal_mark(long l) { >>>>>>>> _stack_traversal_mark = l; } >>>>>>>> + long stack_traversal_mark() { return >>>>>>>> OrderAccess::load_acquire(&_stack_traversal_mark); } >>>>>>>> + void set_stack_traversal_mark(long l) { >>>>>>>> OrderAccess::release_store(&_stack_traversal_mark, l); } >>>>>>>> >>>>>>>> Maybe make _stack_traversal_mark volatile also, just as a marking >>>>>>>> that >>>>>>>> it is concurrent accessed. >>>>>>>> And remove both storestore. >>>>>>>> >>>>>>>> "Also neither of these state variables are volatile in nmethod, so >>>>>>>> even the compiler may reorder the stores" >>>>>>>> Fortunately at least _state is volatile now. >>>>>>>> >>>>>>>> I think _state also should use la/rs semantics instead, but that's >>>>>>>> another story. >>>>>>> Like this? >>>>>>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.12/ >>>>>>> >>>>>>> >>>>>> >>>>>> Yes, exactly, I like this! >>>>>> Dan? Igor ? Tobias? >>>>>> >>>>> >>>>> That seems correct. >>>>> >>>>> igor >>>>> >>>>>> Thanks Roman! >>>>>> >>>>>> BTW I'm going on vacation (5w) in a few hours, but I will follow this >>>>>> thread/changeset to the end! 
>>>>>> >>>>>> /Robbin >>>>>> >>>>>>> Roman >>>>> >>>> >
From kim.barrett at oracle.com  Mon Jul 17 00:33:38 2017
From: kim.barrett at oracle.com (Kim Barrett)
Date: Sun, 16 Jul 2017 20:33:38 -0400
Subject: RFR (S): 8182703: Correct G1 barrier queue lock orderings
In-Reply-To: <5964BF9B.4010309@oracle.com>
References: <59510D5E.10009@oracle.com> <25F423D9-F8D5-4E62-8300-CCE106E70777@oracle.com> <595CBE40.5050603@oracle.com> <6FFC2106-D260-481D-B8C3-DDA849926F23@oracle.com> <5964BF9B.4010309@oracle.com>
Message-ID: 

> On Jul 11, 2017, at 8:07 AM, Erik Österlund wrote:
>> This suggests a potential (though seemingly hard to avoid) fragility
>> resulting from the lowered lock rank.
>
> Note that this does not matter for JavaThreads (including compiler threads), for concurrent refinement threads or concurrent marking threads, nor does it matter for any thread when marking is not active.
>
> So it seems to me that the worst consequence of this is possibly worse latency for operations coinciding in time with concurrent marking, that have large amounts of mutations or resurrections, and are not performed by JavaThreads (including compiler threads) or GC threads (that are performing the concurrent marking) or concurrent refinement threads (that have nothing to do with SATB), that are running concurrently with each other.
>
> That does not seem to be a huge problem in my book. If it was, and an unknown bunch of non-JavaThreads are heavily mutating or resurrecting objects concurrent to marking, such that contention is inflicted on the shared queue lock for the shared SATB queue, then the right solution for that seems to be to give such threads their own local queue, rather than to reduce the time spent under the surprisingly hot shared queue lock.

I think this part of the reply misses my point, though later discussion is on the right track.
The rank for any locks in the filtering or mutator assist code can be anything not higher than the CBL lock ranks, since filtering and mutator assist are invoked in related contexts. Any locks in the filtering code must be lower than the shared queue lock ranks.

Reducing the CBL and shared queue ranks to allow them to be locked in more contexts implicitly imposes additional requirements on the filtering and mutator assist code, especially the latter, which is not presently invoked while holding the shared queue lock. Code which would have been "easily" safe before this change may now be not so easy, or may even be broken. In this discussion we've already identified two places that require further repair before we can start taking advantage of these reduced lock ranks. And future changes in those areas may be more difficult than with the old lock ranks.

But since I agree with the rationale for reducing the ranks of these locks, it seems we need to accept these additional costs (some known additional work needed, and restrictions on future changes). But we should remember these costs exist (RFEs for the additional work, maybe some comments on the filtering and mutator assist API functions discussing the issue).

>> The present SATB filtering doesn't seem to acquire any locks, but it's
>> a non-trivial amount of code spread over multiple files, so would be
>> easy to miss something or break it in that respect. Reducing the lock
>> ranks requires being very careful with the SATB filtering code.
>
> IMO, adding any lock into the SATB barrier which is used all over hotspot in some very shady places arguably requires being very careful regardless of my changes. So I am going to assume whoever does that for whatever reason is going to be careful.
>
>> The "mutator" help for dirty card queue processing is not presently
>> done for the shared queue, but I think it could be today.
I'm less sure >> about that with lowered queue lock ranks; I *think* there aren't any >> relevant locks there (other than the very rare shared queue lock in >> refine_card_concurrently), but that's a substantially larger and more >> complex amount of code than SATB queue filtering. > > As discussed with Thomas earlier in this thread, there are indeed locks blocking this. The HeapRegionRemSet::_m lock is currently a leaf lock. If collaborative refinement was to be performed on non-Java threads (and non-concurrent refinement threads), then this lock would have to decrease to the access rank first. But we concluded that warrants a new RFE with separate analysis. > > As with the SATB queues though, I do not know what threads would be causing such trouble? It is not JavaThreads (including compiler threads), concurrent refinement threads, concurrent marking threads. That does not leave us with a whole lot of threads to cause that contention on the shared queue lock. And as with the SATB queues, if there are such threads that cause such contention on the shared queue lock, then the right fix seems to be to give them their own local queue and stop taking the shared queue lock in the first place. A native thread copying a jweak to a (strong) jobject uses the shared queue. I don't think we're going to fix that by giving native threads their own queues. A Java thread calls into C++, takes a low-rank lock, and while holding that lock touches a queue. Everything in the queue touching needs to be ranked lower than that lock, including filter and mutator assist code. That this isn't permitted today is beside the point; this seems to me to be exactly the sort of situation this change is intended to permit. Since I think the rank reductions are a necessary (though not sufficient) step, call it Reviewed. 
From shade at redhat.com  Mon Jul 17 07:23:04 2017
From: shade at redhat.com (Aleksey Shipilev)
Date: Mon, 17 Jul 2017 09:23:04 +0200
Subject: RFR (S/M): 8184348: Merge G1ConcurrentMark::par_mark() and G1ConcurrentMark::grayRoot()
In-Reply-To: <1500042870.3458.84.camel@oracle.com>
References: <1500031158.3458.41.camel@oracle.com> <1500034800.3458.75.camel@oracle.com> <1500042870.3458.84.camel@oracle.com>
Message-ID: 

On 07/14/2017 04:34 PM, Thomas Schatzl wrote:
> Thanks. Unfortunately, after re-applying and fixing other changes based
> on this one I noticed that I missed one opportunity to refactor in
> G1CMTask::deal_with_reference(). I would like to add this to this
> changeset still... sorry.
>
> There is some note about some perf optimization that mentions that it
> is advantageous to do the nTAMS check before determining the heap
> region; however I do not think this is an issue.
>
> Quickly comparing runs of a fairly large and reference-intensive
> workload (BigRAMTester with 20g heap e.g. attached to JDK-8152438),
> marking cycles with the latest webrev.2 are at least as fast as without
> any of this RFR's changes.
>
> New webrevs:
> http://cr.openjdk.java.net/~tschatzl/8184348/webrev.1_to_2 (diff)
> http://cr.openjdk.java.net/~tschatzl/8184348/webrev.2 (full)

Looks good.

I wonder what this was about in the old code:

 187 if (_g1h->is_in_g1_reserved(objAddr)) {

New code properly asserts the object is in reserved. Did we ever have oops stored
outside of reserved? That would be surprising!

Thanks,
-Aleksey

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: OpenPGP digital signature URL: From mikael.gerdin at oracle.com Mon Jul 17 08:07:37 2017 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Mon, 17 Jul 2017 10:07:37 +0200 Subject: RFR (XS): 8183538: UpdateRS phase should claim cards In-Reply-To: <1499945712.2756.2.camel@oracle.com> References: <1499861747.6693.6.camel@oracle.com> <83250fb8-84aa-5764-bd52-5a5dccfd2e49@oracle.com> <1499945712.2756.2.camel@oracle.com> Message-ID: Hi Thomas, On 2017-07-13 13:35, Thomas Schatzl wrote: > Hi, > > On Thu, 2017-07-13 at 13:09 +0200, Erik Helin wrote: >> Hi Thomas, >> >> On 07/12/2017 02:15 PM, Thomas Schatzl wrote: >>> >>> Hi all, >>> >>> please review this small change that adds claiming of cards in >>> the >>> update rs phase so that scan rs does not rescan them. >>> >>> CR: >>> https://bugs.openjdk.java.net/browse/JDK-8183538 >>> Webrev: >>> http://cr.openjdk.java.net/~tschatzl/8183538/webrev/ >> looks good, Reviewed. Looks good to me as well. /Mikael >> >> I was trying to find a way where we could utilize the claim_card >> function, but could not come up with a good approach. Push this and >> then we can see if we can reduce the slight code/logic duplication >> later. > > yes, me too :) All variants I could think of would penalize one or > the other phase. > > Thanks for your review. 
> > Thanks,
> > Thomas
>
From thomas.schatzl at oracle.com  Mon Jul 17 08:23:37 2017
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Mon, 17 Jul 2017 10:23:37 +0200
Subject: RFR (XS): 8183538: UpdateRS phase should claim cards
In-Reply-To: 
References: <1499861747.6693.6.camel@oracle.com> <83250fb8-84aa-5764-bd52-5a5dccfd2e49@oracle.com> <1499945712.2756.2.camel@oracle.com>
Message-ID: <1500279817.2845.7.camel@oracle.com>

Hi Mikael,

On Mon, 2017-07-17 at 10:07 +0200, Mikael Gerdin wrote:
> Hi Thomas,
>
> On 2017-07-13 13:35, Thomas Schatzl wrote:
> >
> > Hi,
> >
> > On Thu, 2017-07-13 at 13:09 +0200, Erik Helin wrote:
> > >
> > > Hi Thomas,
> > >
> > > On 07/12/2017 02:15 PM, Thomas Schatzl wrote:
> > > >
> > > >
> > > > Hi all,
> > > >
> > > > please review this small change that adds claiming of cards
> > > > in the update rs phase so that scan rs does not rescan them.
> > > >
> > > > CR:
> > > > https://bugs.openjdk.java.net/browse/JDK-8183538
> > > > Webrev:
> > > > http://cr.openjdk.java.net/~tschatzl/8183538/webrev/
> > > looks good, Reviewed.
> Looks good to me as well.
> /Mikael

thanks for your review.

Thomas

From thomas.schatzl at oracle.com  Mon Jul 17 08:25:25 2017
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Mon, 17 Jul 2017 10:25:25 +0200
Subject: RFR (S/M): 8184348: Merge G1ConcurrentMark::par_mark() and G1ConcurrentMark::grayRoot()
In-Reply-To: 
References: <1500031158.3458.41.camel@oracle.com> <1500034800.3458.75.camel@oracle.com> <1500042870.3458.84.camel@oracle.com>
Message-ID: <1500279925.2845.8.camel@oracle.com>

Hi Aleksey,

On Mon, 2017-07-17 at 09:23 +0200, Aleksey Shipilev wrote:
> On 07/14/2017 04:34 PM, Thomas Schatzl wrote:
> >
> > Thanks. Unfortunately, after re-applying and fixing other changes
> > based
> > on this one I noticed that I missed one opportunity to refactor in
> > G1CMTask::deal_with_reference(). I would like to add this to this
> > changeset still... sorry.
> > There is some note about some perf optimization that mentions that
> > it
> > is advantageous to do the nTAMS check before determining the heap
> > region; however I do not think this is an issue.
> >
> > Quickly comparing runs of a fairly large and reference-intensive
> > workload (BigRAMTester with 20g heap e.g. attached to JDK-8152438),
> > marking cycles with the latest webrev.2 are at least as fast as
> > without
> > any of this RFR's changes.
> >
> > New webrevs:
> > http://cr.openjdk.java.net/~tschatzl/8184348/webrev.1_to_2 (diff)
> > http://cr.openjdk.java.net/~tschatzl/8184348/webrev.2 (full)
> Looks good.
>
> I wonder what this was about in the old code:
>
> 187 if (_g1h->is_in_g1_reserved(objAddr)) {
>
> New code properly asserts the object is in reserved. Did we ever have
> oops stored
> outside of reserved? That would be surprising!

the reference can be NULL here. The is_in_g1_reserved() check also
filters those, in a bit of a crude way. So I changed this to an
explicit NULL check, and let it run into the assert (in
ConcurrentMark::mark_in_next_bitmap()) in other cases.

I have not seen any issues in my testing of the changes I extracted
these from. There should obviously be no oops referencing anything
outside of the heap.

Thanks for your review.

Thanks,
Thomas From shade at redhat.com Mon Jul 17 08:29:32 2017 From: shade at redhat.com (Aleksey Shipilev) Date: Mon, 17 Jul 2017 10:29:32 +0200 Subject: RFR (S/M): 8184348: Merge G1ConcurrentMark::par_mark() and G1ConcurrentMark::grayRoot() In-Reply-To: <1500279925.2845.8.camel@oracle.com> References: <1500031158.3458.41.camel@oracle.com> <1500034800.3458.75.camel@oracle.com> <1500042870.3458.84.camel@oracle.com> <1500279925.2845.8.camel@oracle.com> Message-ID: <3c8fd54d-bce5-229b-38d9-f9ede82e2c54@redhat.com> On 07/17/2017 10:25 AM, Thomas Schatzl wrote: >>> New webrevs: >>> http://cr.openjdk.java.net/~tschatzl/8184348/webrev.1_to_2 (diff) >>> http://cr.openjdk.java.net/~tschatzl/8184348/webrev.2 (full) >> Looks good. >> >> I wonder what this was about in the old code: >> >> 187 if (_g1h->is_in_g1_reserved(objAddr)) { >> >> New code properly asserts the object is in reserved. Did we ever had >> oops stored >> outside of reserved? That would be surprising! > > the reference can be NULL here. The is_in_g1_reserved() check also > filters those, in a bit of a crude way. So I changed this to an > explicit NULL check, and let it run into the assert (in > ConcurrentMark::mark_in_next_bitmap()) in other cases. That explains it, thanks. Go! -Aleksey -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: OpenPGP digital signature URL: From thomas.schatzl at oracle.com Mon Jul 17 08:33:10 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Mon, 17 Jul 2017 10:33:10 +0200 Subject: RFR (XS): 8184452: Add bounds checking for FromCardCache In-Reply-To: References: <1500030297.3458.29.camel@oracle.com> Message-ID: <1500280390.2845.11.camel@oracle.com> Hi Roman, Aleksey, On Fri, 2017-07-14 at 13:20 +0200, Aleksey Shipilev wrote: > On 07/14/2017 01:04 PM, Thomas Schatzl wrote: > > > > Webrev: > > http://cr.openjdk.java.net/~tschatzl/8184452/webrev/ > Looks good. 
On Fri, 2017-07-14 at 13:15 +0200, Roman Kennke wrote: > Am 14.07.2017 um 13:04 schrieb Thomas Schatzl: > >? > > Hi all, > >? > >???can I have reviews for this change that adds asserts/bounds > > checking to the FromCardCache methods? > > [...] > > Webrev: > > http://cr.openjdk.java.net/~tschatzl/8184452/webrev/ > > Testing: > > jprt > >? > > Thanks, > >???Thomas >? > I'm all for more asserts if it helps to figure out bugs, so yes. > Change looks good too. >? > Roman (not official reviewer) ? thanks for your reviews! Thanks, ? Thomas From erik.osterlund at oracle.com Mon Jul 17 08:49:45 2017 From: erik.osterlund at oracle.com (=?UTF-8?Q?Erik_=c3=96sterlund?=) Date: Mon, 17 Jul 2017 10:49:45 +0200 Subject: RFR (S): 8182703: Correct G1 barrier queue lock orderings In-Reply-To: References: <59510D5E.10009@oracle.com> <25F423D9-F8D5-4E62-8300-CCE106E70777@oracle.com> <595CBE40.5050603@oracle.com> <6FFC2106-D260-481D-B8C3-DDA849926F23@oracle.com> <5964BF9B.4010309@oracle.com> Message-ID: <596C7A29.3000602@oracle.com> Hi Kim, Thank you for the review! I have some comments though... On 2017-07-17 02:33, Kim Barrett wrote: >> On Jul 11, 2017, at 8:07 AM, Erik ?sterlund wrote: >>> This suggests a potential (though seemingly hard to avoid) fragility >>> resulting from the lowered lock rank. >> Note that this does not matter for JavaThreads (including compiler threads), for concurrent refinement threads or concurrent marking threads, nor does it matter for any thread when marking is not active. >> >> So it seems to me that the worst consequence of this is possibly worse latency for operations coinciding in time with concurrent marking, that have large amounts of mutations or resurrections, and are not performed by JavaThreads (including compiler threads) or GC threads (that are performing the concurrent marking) or concurrent refinement threads (that have nothing to do with SATB), that are running concurrently with each other. 
>> >> That does not seem to be a huge problem in my book. If it was, and an unknown bunch of non-JavaThreads are heavily mutating or resurrecting objects concurrent to marking, such that contention is inflicted on the shared queue lock for the shared SATB queue, then the right solution for that seems to be to give such threads their own local queue, rather than to reduce the time spent under the surprisingly hot shared queue lock. > I think this part of the reply misses my point, though later > discussion is on the right track. > > The rank for any locks in the filtering or mutator assist code can be > anything not higher than the CBL lock ranks, since filtering and > mutator assist are invoked in related contexts. Any locks in the > filtering code must be lower than the shared queue lock ranks. > > Reducing the CBL and shared queue ranks to allow them to be locked in > more contexts implicitly imposes additional requirements on the > filtering and mutator assist code, especially the latter, which is not > presently invoked while holding the shared queue lock. Code which > would have been "easily" safe before this change may now be not so > easy, or may even be broken. In this discussion we've already > identified two places that require further repair before we can start > taking advantage of these reduced lock ranks. And future changes in > those areas may be more difficult than with the old lock ranks. 1) I agree - more work is needed to free the unethically caged heap oop store. Some constraints have been removed, but there are a few more. 2) I disagree that we can not already take immediate advantage of this. My main problem is the SATB queues required for the weak oop load barriers in hotspot. They are now free, and therefore I can take immediate advantage of these changes. 3) I think that to the greatest extent possible, lock ranks should follow the way we intend to lock, rather than letting the ranks affect the way we lock. 
The deadlock detection system was designed to have false positives. Therefore we should first figure out if we have a true possible deadlock or a false positive. In case of false positives, I think we should try pretty hard not to compromise solid locking schemes in order to fight false positives of the deadlock detection system. I understand this is sometimes difficult, but I think it is a good idea in general. > But since I agree with the rationale for reducing the ranks of these > locks, it seems we need to accept these additional costs (some known > additional work needed, and restrictions on future changes). But we > should remember these costs exist (RFEs for the additional work, maybe > some comments on the filtering and mutator assist API functions > discussing the issue). I am glad we agree here. I will file RFEs. >>> The present SATB filtering doesn't seem to acquire any locks, but it's >>> a non-trivial amount of code spread over multiple files, so would be >>> easy to miss something or break it in that respect. Reducing the lock >>> ranks requires being very careful with the SATB filtering code. >> IMO, adding any lock into the SATB barrier which is used all over hotspot in some very shady places arguably requires being very careful regardless of my changes. So I am going to assume whoever does that for whatever reason is going to be careful. >> >>> The "mutator" help for dirty card queue processing is not presently >>> done for the shared queue, but I think could be today. I'm less sure >>> about that with lowered queue lock ranks; I *think* there aren't any >>> relevant locks there (other than the very rare shared queue lock in >>> refine_card_concurrently), but that's a substantially larger and more >>> complex amount of code than SATB queue filtering. >> As discussed with Thomas earlier in this thread, there are indeed locks blocking this. The HeapRegionRemSet::_m lock is currently a leaf lock. 
If collaborative refinement was to be performed on non-Java threads (and non-concurrent refinement threads), then this lock would have to decrease to the access rank first. But we concluded that warrants a new RFE with separate analysis. >> >> As with the SATB queues though, I do not know what threads would be causing such trouble? It is not JavaThreads (including compiler threads), concurrent refinement threads, concurrent marking threads. That does not leave us with a whole lot of threads to cause that contention on the shared queue lock. And as with the SATB queues, if there are such threads that cause such contention on the shared queue lock, then the right fix seems to be to give them their own local queue and stop taking the shared queue lock in the first place. > A native thread copying a jweak to a (strong) jobject uses the shared > queue. I don't think we're going to fix that by giving native threads > their own queues. I am not sure what threads you are referring to here. But I guess that is okay. > A Java thread calls into C++, takes a low-rank lock, and while holding > that lock touches a queue. Everything in the queue touching needs to > be ranked lower than that lock, including filter and mutator assist > code. That this isn't permitted today is beside the point; this seems > to me to be exactly the sort of situation this change is intended to > permit. As mentioned earlier, I specifically need the SATB enqueue barriers to be free. I want the heap oop store to be free too, but that is not blocking me. > Since I think the rank reductions are a necessary (though not sufficient) > step, call it Reviewed. Thank you for the review. 
/Erik From mikael.gerdin at oracle.com Mon Jul 17 08:57:21 2017 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Mon, 17 Jul 2017 10:57:21 +0200 Subject: RFR: 8183539: Remove G1RemSet::_into_cset_dirty_card_queue_set In-Reply-To: <1499958417.2756.4.camel@oracle.com> References: <1390aea1-d25a-a465-b0bf-c66490cf682a@oracle.com> <1499958417.2756.4.camel@oracle.com> Message-ID: <8cb3ed9f-e520-4f10-6d6e-fdbb7560859e@oracle.com> Hi Erik, On 2017-07-13 17:06, Thomas Schatzl wrote: > Hi Erik, > > On Thu, 2017-07-13 at 16:53 +0200, Erik Helin wrote: >> On 07/04/2017 02:17 PM, Mikael Gerdin wrote: >>> >>> Hi Erik, >>> >>> Do you know if any of the tests actually would have failed if rem >>> set >>> reconstruction after evacuation failure didn't work properly? >>> >>> I'd feel safer with this change if you ran with some verification >>> code to ensure that the into_cset queue was always useless when >>> evac failure occurs. >> >> Good point, I have now run GCBasher for a very long time with: >> -XX:+G1EvacuationFailALot -XX:G1EvacuationFailureALotCount=5 >> -XX:+VerifyBeforeGC -XX:+VerifyAfterGC >> >> This mean that GCBasher encounters a (forced) evacuation failure >> every fifth GC and also runs full verification for every GC. So far >> it has been working fine. >> >> I have also run all tests in the JTReg group hotspot_gc with >> G1EvacuationFailALot set to true (in g1_globals.hpp) and >> G1EvacuationFailureALotCount set to 5 (also in g1_globals.hpp). This >> mean that all GC tests (including the stress tests) encountered an >> evacuation failure every fifth GC. This also worked fine. >> >> I also wrote a new patch against tip (where _into_cset_dcqs is still >> present) to do some custom verification. The contents of >> G1RemSet::_into_cset_dcqs and G1CollectedHeap::_dirty_card_queue_set >> should be identical after a collection. 
This sort-of worked :) >> >> The queues are *very* similar (often around 98% of the cards in >> G1RemSet::_into_cset_dcqs are found in >> G1CollectedHeap::_dirty_card_queue_set). The reason for the "missing >> cards" is that cards in G1RemSet::_into_cset_dcqs come from the >> post-write barrier, and the post-write barrier dirties the card that >> contains the object header (except for arrays, where it dirties the >> field/slot). The cards in G1CollectedHeap::_dirty_card_queue_set >> come from G1ParScanThreadState::update_rs, and update_rs always >> dirties the card that contains the field (*not* the header). Hence, >> if an object crosses card boundaries, then the post-write barrier and >> update_rs will dirty different cards. This has no impact on >> correctness, it is like this for performance reasons (dirtying the >> card that contains the object header leads to fewer dirty cards, but >> we don't have quick access to the object header in update_rs). >> >> So, with the above, I'm fairly confident (famous last words) that >> this patch is working :) > > Thanks for this thorough investigation, sounds good. > > Ship it.
> > +1 Thanks Thomas and Mikael for reviewing! Erik > /Mikael >> >> Thomas >> From rkennke at redhat.com Mon Jul 17 12:07:21 2017 From: rkennke at redhat.com (Roman Kennke) Date: Mon, 17 Jul 2017 14:07:21 +0200 Subject: RFR: 8179387: Factor out CMS specific code from GenCollectedHeap into its own subclass In-Reply-To: References: <3521009f-6fab-4f8e-2375-b9d665a4c70b@redhat.com> <3d8b55a2-a787-3051-b351-ab9b0a24f5e0@redhat.com> <47e22e86-7d7c-606f-1936-346229f39ca2@oracle.com> <9a846161-c8ac-dedf-5952-f457d546fd9a@redhat.com> <4d5e6af8-d975-7803-64c5-7295e0d56154@redhat.com> <13358626-e399-e352-1711-587416621aac@redhat.com> <27af0ad2-fe78-3536-2143-996dd42583ab@oracle.com> <4bc53aaa-b98a-8a61-73bf-d30ac3f402b8@redhat.com> <666af7f2-27e9-48c6-91e4-eaefa5289e18@redhat.com> <3ec8a6a3-5a4b-a910-f6ec-ed1c0dad4cad@oracle.com> <5417889c-5289-37cd-eb31-a2b55f70e85e@redhat.com> Message-ID: <088d467c-8038-60bc-1eab-b34061ad20d9@redhat.com> (I included hotspot-runtime-dev and serviceability-dev to review vmStructs.cpp changes. see below) Hi Erik, >> Ok, added those and some more that I found. Not sure why we'd need >> #include "gc/cms/concurrentMarkSweepGeneration.hpp" ? Left that out >> for now. > > Because you are accessing CMSCollcetor in: > > 99 NOT_PRODUCT( > 100 virtual size_t skip_header_HeapWords() { return > CMSCollector::skip_header_HeapWords(); } > 101 ) > > and CMSCollector is declared in concurrentMarkSweepGeneration.hpp. An > alternative would of course be to just declare skip_header_HeapWords() > in cmsHeap.hpp and define skip_header_HeapWords in cmsHeap.cpp, then > you only need to include concurrentMarkSweeoGeneration.hpp in > cmsHeap.cpp. Ah ok, I've missed that one. Added it now. >>> IMO, I would just make the three functions above private. I know they >>> are protected in GenCollectedHeap, but it should be fine to have them >>> private in CMSHeap. 
Having them protected signals, at least to me, >>> that this class could be considered as a base class (protected to me >>> reads "this can be accessed by classes inheriting from this class), >>> and we don't want any class to inherit from CMSHeap. >> >> How can they be called from the superclass if they are private in the >> subclass? Would that work in C++? >> >> protected (to me) means visibility between super and subclasses. If I'd >> want to signal that I intend that to be overridden, I'd say 'virtual'. > > It is perfectly fine to have private virtual methods in C++ (see for > example > https://stackoverflow.com/questions/2170688/private-virtual-method-in-c). > A virtual function only needs to be protected if a "child class" needs > to access the function in the "parent class". For both gc_prologue and > gc_epilogue, this is the case, which is why they have to be > 'protected' in GenCollectedHeap. But, no class is going to derive from > CMSHeap, so they can be private in CMSHeap. Cool. Learned something new :-) It actually makes sense. I've moved all 3 methods into the private block in CMSHeap. I left them virtual (because of missing override), and I also left them in protected in GenCollectedHeap (prologue/epilogue because we need to, skip_header_HeapWords() to not confuse readers.) >>> This is for the serviceability agent. You will have to poke around in >>> hotspot/src/jdk.hotspot.agent and see how GenCollectedHeap is used. >>> Unfortunately I'm not that familiar with the agent, perhaps someone >>> else can chime in here? >> >> Considering that the remaining references to GenCollectedHeap in >> vmStructs.cpp don't look like related to CMSHeap, I'd argue that what I >> did is all that's needed for now. Do you agree? > > Honestly, I don't know, that is why I asked if someone else with more > knowledge in this area can comment. Have you tried building and using > the SA agent with your change? 
You can also ask around on > hotspot-rt-dev and/or serviceability-dev. I haven't tried building SA. I poked around hotspot/src/jdk.hotspot.agent and I think it should be ok. Can somebody who knows about it confirm this? Differential webrev: http://cr.openjdk.java.net/~rkennke/8179387/webrev.07.diff/ Full webrev: http://cr.openjdk.java.net/~rkennke/8179387/webrev.07/ Roman From mikael.gerdin at oracle.com Mon Jul 17 14:12:28 2017 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Mon, 17 Jul 2017 16:12:28 +0200 Subject: RFR(XS) 8183935: G1GCPhaseTimes and G1RootProcessor do not include macros.hpp Message-ID: Hi, Please review this trivial change to add includes of macros.hpp to G1GCPhaseTimes and G1RootProcessor. They both check the value of INCLUDE_AOT and as such should explicitly include the proper header to ensure that it is set to the correct value. Bug: https://bugs.openjdk.java.net/browse/JDK-8183935 Webrev: http://cr.openjdk.java.net/~mgerdin/8183935/webrev.0/ Testing: JPRT build-only Thanks /Mikael From thomas.schatzl at oracle.com Mon Jul 17 14:21:37 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Mon, 17 Jul 2017 16:21:37 +0200 Subject: RFR(XS) 8183935: G1GCPhaseTimes and G1RootProcessor do not include macros.hpp In-Reply-To: References: Message-ID: <1500301297.2845.22.camel@oracle.com> Hi Mikael, On Mon, 2017-07-17 at 16:12 +0200, Mikael Gerdin wrote: > Hi, > > Please review this trivial change to add includes of macros.hpp to > G1GCPhaseTimes and G1RootProcessor. They both check the value of > INCLUDE_AOT > and as such should explicitly include the proper header to ensure > that > it is set to the correct value. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8183935 > Webrev: http://cr.openjdk.java.net/~mgerdin/8183935/webrev.0/ > Testing: JPRT build-only ship it :) Although I do not think it is necessary to include macros.hpp both in the hpp and cpp file, but not sure. It won't hurt. Thanks,
Thomas From mikael.gerdin at oracle.com Mon Jul 17 14:22:47 2017 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Mon, 17 Jul 2017 16:22:47 +0200 Subject: RFR(XS) 8183935: G1GCPhaseTimes and G1RootProcessor do not include macros.hpp In-Reply-To: <1500301297.2845.22.camel@oracle.com> References: <1500301297.2845.22.camel@oracle.com> Message-ID: <7e3a87b0-e00b-3726-1835-346f902ab336@oracle.com> Hi Thomas, On 2017-07-17 16:21, Thomas Schatzl wrote: > Hi Mikael, > > On Mon, 2017-07-17 at 16:12 +0200, Mikael Gerdin wrote: >> Hi, >> >> Please review this trivial change to add includes of macros.hpp to >> G1GCPhaseTimes and G1RootProcessor. They both the value of >> INCLUDE_AOT >> and as such should explicitly include the proper header to ensure >> that >> it is set to the correct value. >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8183935 >> Webrev: http://cr.openjdk.java.net/~mgerdin/8183935/webrev.0/ >> Testing: JPRT build-only > > ship it :) Although I do not think it is necessary to include > macros.hpp both in the hpp and cpp file, but not sure. It won't hurt. I sort of agree but I think it's a nice convention to always #include macros in files which look at the INCLUDE_* macros. Thanks for the review! /Mikael > > Thanks, > Thomas > From erik.helin at oracle.com Mon Jul 17 14:28:47 2017 From: erik.helin at oracle.com (Erik Helin) Date: Mon, 17 Jul 2017 16:28:47 +0200 Subject: RFR(XS) 8183935: G1GCPhaseTimes and G1RootProcessor do not include macros.hpp In-Reply-To: References: Message-ID: Reviewed. Thanks, Erik On 07/17/2017 04:12 PM, Mikael Gerdin wrote: > Hi, > > Please review this trivial change to add includes of macros.hpp to > G1GCPhaseTimes and G1RootProcessor. They both the value of INCLUDE_AOT > and as such should explicitly include the proper header to ensure that > it is set to the correct value. 
> > Bug: https://bugs.openjdk.java.net/browse/JDK-8183935 > Webrev: http://cr.openjdk.java.net/~mgerdin/8183935/webrev.0/ > Testing: JPRT build-only > > Thanks > /Mikael From shade at redhat.com Tue Jul 18 08:55:26 2017 From: shade at redhat.com (Aleksey Shipilev) Date: Tue, 18 Jul 2017 10:55:26 +0200 Subject: RFC: Epsilon GC JEP In-Reply-To: <67f6d4a2-d129-1491-4906-473586dc6680@redhat.com> References: <67f6d4a2-d129-1491-4906-473586dc6680@redhat.com> Message-ID: No comments? I'll ask OpenJDK Lead to move this JEP to Candidate soon then. Thanks, -Aleksey On 07/10/2017 10:14 PM, Aleksey Shipilev wrote: > Hi, > > I would like to solicit feedback on Epsilon GC JEP: > https://bugs.openjdk.java.net/browse/JDK-8174901 > http://openjdk.java.net/jeps/8174901 > > The JEP text should be pretty self-contained, but we can certainly add more > points after the discussion happens. > > For the last few months, there were quite a few instances where Epsilon proved a > good vehicle to do GC performance research, especially on object locality and > code generation fronts. I think it also serves as the trivial target for > Erik's/Roman's GC interface work. > > The implementation and tests are there in the Sandbox, for those who are curious. > > Thanks, > -Aleksey > -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: OpenPGP digital signature URL: From erik.helin at oracle.com Tue Jul 18 10:09:47 2017 From: erik.helin at oracle.com (Erik Helin) Date: Tue, 18 Jul 2017 12:09:47 +0200 Subject: RFC: Epsilon GC JEP In-Reply-To: References: <67f6d4a2-d129-1491-4906-473586dc6680@redhat.com> Message-ID: <621d6f35-617c-d603-3159-cd537831e66e@oracle.com> Hi Aleksey, first of all, thanks for trying this out and starting a discussion. Regarding the JEP, I have a few questions/comments: - the JEP specifies "last-drop performance improvements" as a motivation. 
However, I think you also know that taking a pause and compacting a heap that is mostly filled with garbage most likely results in higher throughput*. So are you thinking in terms of pauses here when you say performance? - why do you think Epsilon GC is a good baseline? IMHO, no barriers is not the perfect baseline, since it is just a theoretical exercise. Just cranking up the heap and using Serial is more realistic baseline, but even using that as a baseline is questionable. - the JEP specifies this as an experimental feature, meaning that you intend non-JVM developers to be able to run this. Have you considered the cost of supporting this option? You say "New jtreg tests under hotspot/gc/epsilon would be enough to assert correctness". For which platforms? How often should these tests be run, every night? Whenever we want to do large changes, like updating logging, tracing, etc, will we have to take Epsilon GC into account? Will there be serviceability support for Epsilon GC, like jstat, MXBeans, perf counters etc? - You quote "The experience, however, tells that many players in the Java ecosystem already did this exercise with expunging GC from their custom-built JVMs". So it seems that those users that want something like Epsilon GC are fine with building OpenJDK themselves? Having -XX:+UseEpsilonGC as a developer flag is much different compared to exposing it (and supporting, even if in experimental mode) to users. Please recall that even removing/changing an experimental flag requires a CSR request and careful motivation as why you want to remove it. I guess most of my question can be summarized as: this seems like it perhaps could be useful tool for JVM GC developers, why do you want to expose the flag to non-JVM developers (given all the work/support/maintenance that comes with that)? It is _great_ that you are experimenting and trying out new ideas in the VM, please continue doing that! 
Please don't interpret my questions/comments as to grumpy, this is just my experience from maintaining 5-6 different GC algorithms for more than five years that is speaking. There is _always_ a maintenance cost :) Thanks, Erik * almost always. There will of course be scenarios where the throughput could be higher without compacting. On 07/18/2017 10:55 AM, Aleksey Shipilev wrote: > No comments? I'll ask OpenJDK Lead to move this JEP to Candidate soon then. > > Thanks, > -Aleksey > > On 07/10/2017 10:14 PM, Aleksey Shipilev wrote: >> Hi, >> >> I would like to solicit feedback on Epsilon GC JEP: >> https://bugs.openjdk.java.net/browse/JDK-8174901 >> http://openjdk.java.net/jeps/8174901 >> >> The JEP text should be pretty self-contained, but we can certainly add more >> points after the discussion happens. >> >> For the last few months, there were quite a few instances where Epsilon proved a >> good vehicle to do GC performance research, especially on object locality and >> code generation fronts. I think it also serves as the trivial target for >> Erik's/Roman's GC interface work. >> >> The implementation and tests are there in the Sandbox, for those who are curious. >> >> Thanks, >> -Aleksey >> > > From shade at redhat.com Tue Jul 18 11:23:46 2017 From: shade at redhat.com (Aleksey Shipilev) Date: Tue, 18 Jul 2017 13:23:46 +0200 Subject: RFC: Epsilon GC JEP In-Reply-To: <621d6f35-617c-d603-3159-cd537831e66e@oracle.com> References: <67f6d4a2-d129-1491-4906-473586dc6680@redhat.com> <621d6f35-617c-d603-3159-cd537831e66e@oracle.com> Message-ID: <858737aa-b8b1-dfdf-a099-1e0decb706ab@redhat.com> Hi Erik, Thanks for looking into this! On 07/18/2017 12:09 PM, Erik Helin wrote: > first of all, thanks for trying this out and starting a discussion. Regarding > the JEP, I have a few questions/comments: > - the JEP specifies "last-drop performance improvements" as a > motivation. 
However, I think you also know that taking a pause and > compacting a heap that is mostly filled with garbage most likely > results in higher throughput*. So are you thinking in terms of pauses > here when you say performance? This cuts both ways: while it is true that moving GC improves locality [1], it is also true that the runtime overhead from barriers can be quite high [2, 3, 4]. So, "performance" in that section is tied to both throughput (no barriers) and pauses (no pauses). [1] https://shipilev.net/jvm-anatomy-park/11-moving-gc-locality [2] https://shipilev.net/jvm-anatomy-park/13-intergenerational-barriers [3] Also, remember the reason for UseCondCardMark [4] Also, remember the whole thing about G1 barriers > - why do you think Epsilon GC is a good baseline? IMHO, no barriers is > not the perfect baseline, since it is just a theoretical exercise. > Just cranking up the heap and using Serial is more realistic > baseline, but even using that as a baseline is questionable. It sometimes is. Non-generational GC is a good baseline for some workloads. Even Serial does not cut it, because even if you crank up old and trim down young, there is no way to disable reference write barrier store that maintains card tables. > - the JEP specifies this as an experimental feature, meaning that you > intend non-JVM developers to be able to run this. Have you considered > the cost of supporting this option? You say "New jtreg tests under > hotspot/gc/epsilon would be enough to assert correctness". For which > platforms? How often should these tests be run, every night? I think for all platforms, somewhere in hs-tier3? IMO, current test set in hotspot/gc/epsilon is fairly complete, and it takes less than a minute on my 4-core i7. > Whenever we want to do large changes, like updating logging, tracing, etc, > will we have to take Epsilon GC into account? Will there be serviceability > support for Epsilon GC, like jstat, MXBeans, perf counters etc? 
I tried to address the maintenance costs in the JEP? It is unlikely to cause trouble, since it mostly calls into the shared code. And GC interface work would hopefully make BarrierSet into more shareable chunk of interface, which makes the whole thing even more self-contained. There is some new code in MemoryPools that handles the minimal diagnostics. MXBeans still work, at least ThreadMXBean that reports allocation pressure, although I'd need to add a test to assert that. To me, if the no-op GC requires much maintenance whenever something in JVM is changing, that points to the insanity of GC interface. No-op GC is a good canary in the coalmine for this. This is why one of the motivations is seeing what exactly a minimal GC should support to be functional. > - You quote "The experience, however, tells that many players in the > Java ecosystem already did this exercise with expunging GC from their > custom-built JVMs". So it seems that those users that want something > like Epsilon GC are fine with building OpenJDK themselves? Having > -XX:+UseEpsilonGC as a developer flag is much different compared to > exposing it (and supporting, even if in experimental mode) to users. There is a fair share of survivorship bias: we know about people who succeeded, do we know how many failed or given up? I think developers who do day-to-day Hotspot development grossly underestimate the effort required to even build a custom JVM. Most power users I know have did this exercise with great pains. I used to sing the same song to them: just build OpenJDK yourself, but then pesky details pour in. Like: oh, Windows, oh, Cygwin, oh MacOS, oh XCode, oh FreeType, oh new compilers that build OpenJDK with warnings and build does treat warnings as errors, oh actual API mismatches against msvcrt, glibc, whatever, etc. etc. etc. 
As much as OpenJDK build improved over the years, I am not audacious enough to claim it would ever be a completely smooth experience :) Now I am just willingly hand them binary builds. So I think having the experimental feature available in the actual product build extends the feature exposure. For example, suppose you are the academic writing a paper on GC, would you accept custom-build JVM into your results, or would you rather pick up the "gold" binary build from a standard distribution and run with it? > I guess most of my question can be summarized as: this seems like it perhaps > could be useful tool for JVM GC developers, why do you want to expose the flag > to non-JVM developers (given all the work/support/maintenance that comes with > that)? My initial thought was that the discussion about the costs should involve discussing the actual code. This is why there is a complete implementation in the Sandbox, and also the webrev posted. In the months following my initial (crazy) experiments, I had multiple people coming to me and asking when Epsilon is going to be in JDK, because they want to use it. And those were the ultra-power-users who actually know what they are doing with their garbage-free applications. So the short answer about why Epsilon is good to have in product is because the cost seems low, the benefits are present, and so cost/benefit is still low. > It is _great_ that you are experimenting and trying out new ideas in the VM, > please continue doing that! Please don't interpret my questions/comments as > to grumpy, this is just my experience from maintaining 5-6 different GC > algorithms for more than five years that is speaking. There is _always_ a > maintenance cost :) Yeah, I know how that feels. Look at the actual Epsilon changes, do they look scary to you, given your experience maintaining the related code? Thanks, -Aleksey -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: OpenPGP digital signature URL: From erik.helin at oracle.com Tue Jul 18 12:37:19 2017 From: erik.helin at oracle.com (Erik Helin) Date: Tue, 18 Jul 2017 14:37:19 +0200 Subject: RFC: Epsilon GC JEP In-Reply-To: <858737aa-b8b1-dfdf-a099-1e0decb706ab@redhat.com> References: <67f6d4a2-d129-1491-4906-473586dc6680@redhat.com> <621d6f35-617c-d603-3159-cd537831e66e@oracle.com> <858737aa-b8b1-dfdf-a099-1e0decb706ab@redhat.com> Message-ID: <8f9b4995-f687-47c8-30e0-5cae513b8947@oracle.com> On 07/18/2017 01:23 PM, Aleksey Shipilev wrote: > Hi Erik, > > Thanks for looking into this! > > On 07/18/2017 12:09 PM, Erik Helin wrote: >> first of all, thanks for trying this out and starting a discussion. Regarding >> the JEP, I have a few questions/comments: >> - the JEP specifies "last-drop performance improvements" as a >> motivation. However, I think you also know that taking a pause and >> compacting a heap that is mostly filled with garbage most likely >> results in higher throughput*. So are you thinking in terms of pauses >> here when you say performance? > > This cuts both ways: while it is true that moving GC improves locality [1], it > is also true that the runtime overhead from barriers can be quite high [2, 3, > 4]. So, "performance" in that section is tied to both throughput (no barriers) > and pauses (no pauses). > > [1] https://shipilev.net/jvm-anatomy-park/11-moving-gc-locality > [2] https://shipilev.net/jvm-anatomy-park/13-intergenerational-barriers > [3] Also, remember the reason for UseCondCardMark > [4] Also, remember the whole thing about G1 barriers Absolutely, barriers can come with an overhead. But a barrier that consists of dirtying a card does not come with a quite high overhead. In fact, it comes with a very low overhead :) >> - why do you think Epsilon GC is a good baseline? IMHO, no barriers is >> not the perfect baseline, since it is just a theoretical exercise. 
>> Just cranking up the heap and using Serial is more realistic >> baseline, but even using that as a baseline is questionable. > > It sometimes is. Non-generational GC is a good baseline for some workloads. Even > Serial does not cut it, because even if you crank up old and trim down young, > there is no way to disable reference write barrier store that maintains card tables. I will still point out, though, that a GC without barriers is just a theoretical baseline. One could imagine a single-gen mark-compact GC for OpenJDK (that would require no barriers), but AFAIK almost all users prefer the slight overhead of dirtying a card (and in return get a generational GC) for the use cases where a single-gen mark-compact algorithm would be applicable. >> - the JEP specifies this as an experimental feature, meaning that you >> intend non-JVM developers to be able to run this. Have you considered >> the cost of supporting this option? You say "New jtreg tests under >> hotspot/gc/epsilon would be enough to assert correctness". For which >> platforms? How often should these tests be run, every night? > I think for all platforms, somewhere in hs-tier3? IMO, current test set in > hotspot/gc/epsilon is fairly complete, and it takes less than a minute on my > 4-core i7. > >> Whenever we want to do large changes, like updating logging, tracing, etc, >> will we have to take Epsilon GC into account? Will there be serviceability >> support for Epsilon GC, like jstat, MXBeans, perf counters etc? > I tried to address the maintenance costs in the JEP? It is unlikely to cause > trouble, since it mostly calls into the shared code. And GC interface work would > hopefully make BarrierSet into more shareable chunk of interface, which makes > the whole thing even more self-contained. There is some new code in MemoryPools > that handles the minimal diagnostics. MXBeans still work, at least ThreadMXBean > that reports allocation pressure, although I'd need to add a test to assert that. 
> > To me, if the no-op GC requires much maintenance whenever something in JVM is > changing, that points to the insanity of GC interface. No-op GC is a good canary > in the coalmine for this. This is why one of the motivations is seeing what > exactly a minimal GC should support to be functional. Again, our opinions differ on this. Am I all for changing the GC interface? Yes, I have expressed nothing but full support of the great work that Roman is doing. Do I think we need something like a canary in the coalmine for JVM internal, GC internal, code? No. If you want anything resembling a canary, write a unit test using googletest that exercises the interface. However, again, this might be useful for someone who wants to try out some changes to the JVM GC code. But that, to me, is not enough to expose it to non-JVM developers. It could be useful to have in the source code though, maybe like a --with-jvm-feature kind of thing? >> - You quote "The experience, however, tells that many players in the >> Java ecosystem already did this exercise with expunging GC from their >> custom-built JVMs". So it seems that those users that want something >> like Epsilon GC are fine with building OpenJDK themselves? Having >> -XX:+UseEpsilonGC as a developer flag is much different compared to >> exposing it (and supporting, even if in experimental mode) to users. > There is a fair share of survivorship bias: we know about people who succeeded, > do we know how many failed or given up? I think developers who do day-to-day > Hotspot development grossly underestimate the effort required to even build a > custom JVM. Most power users I know have did this exercise with great pains. I > used to sing the same song to them: just build OpenJDK yourself, but then pesky > details pour in. 
Like: oh, Windows, oh, Cygwin, oh MacOS, oh XCode, oh FreeType, > oh new compilers that build OpenJDK with warnings and build does treat warnings > as errors, oh actual API mismatches against msvcrt, glibc, whatever, etc. etc. > etc. As much as OpenJDK build improved over the years, I am not audacious enough > to claim it would ever be a completely smooth experience :) Now I am just > willingly hand them binary builds. Such users will still be able to get binary builds if someone is willing to produce them with Epsilon GC. There are plenty of OpenJDK binary builds available from various organizations/companies. > So I think having the experimental feature available in the actual product build > extends the feature exposure. For example, suppose you are the academic writing > a paper on GC, would you accept custom-build JVM into your results, or would you > rather pick up the "gold" binary build from a standard distribution and run with it? I guess such a researcher would be producing a build from the same source as the one they made changes to? How could they otherwise do any kind of reasonable comparison? >> I guess most of my question can be summarized as: this seems like it perhaps >> could be useful tool for JVM GC developers, why do you want to expose the flag >> to non-JVM developers (given all the work/support/maintenance that comes with >> that)? > My initial thought was that the discussion about the costs should involve > discussing the actual code. This is why there is a complete implementation in > the Sandbox, and also the webrev posted. > > In the months following my initial (crazy) experiments, I had multiple people > coming to me and asking when Epsilon is going to be in JDK, because they want to > use it. And those were the ultra-power-users who actually know what they are > doing with their garbage-free applications. 
> > So the short answer about why Epsilon is good to have in product is because the > cost seems low, the benefits are present, and so cost/benefit is still low. And it is here that our opinions differ :) For you the maintenance cost is low, whereas for me, having yet another command-line flag, yet another code path, gets in the way. You have to respect that we have different backgrounds and experiences here. >> It is _great_ that you are experimenting and trying out new ideas in the VM, >> please continue doing that! Please don't interpret my questions/comments as >> to grumpy, this is just my experience from maintaining 5-6 different GC >> algorithms for more than five years that is speaking. There is _always_ a >> maintenance cost :) > Yeah, I know how that feels. Look at the actual Epsilon changes, do they look > scary to you, given your experience maintaining the related code? I don't like taking the role of the grumpy open source maintainer :) No, the code is not scary, code is rarely scary IMO, it is just code. Running tests, fixing that a test with -Xmx1g isn't run on a RPi, having additional code paths, more cases to take into consideration when refactoring, is burdensome. And the benefits of benchmarking against Epsilon vs benchmarking against Serial/Parallel aren't that high to me. But, I can understand that it is useful when trying to evaluate for example the cost of stores into a HashMap. Which is why I'm not against the code, but I'm not keen on exposing this to non-JVM developers. 
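For concreteness, the HashMap-store comparison mentioned above could be sketched as a tiny standalone kernel: run unchanged under different collectors, the timing delta isolates the GC-induced cost of the stores. All class/method names and sizes below are invented for illustration, and a real measurement would use a proper harness such as JMH rather than System.nanoTime():

```java
import java.util.HashMap;

// Hypothetical kernel for the HashMap-store example. Under a generational GC
// every reference store may dirty a card; under a no-op GC such as the
// proposed Epsilon, the same loop runs with no barrier at all.
public class MapStoreKernel {

    // Performs keys * iterations map stores and returns the store count.
    public static long run(int keys, int iterations) {
        HashMap<Integer, Object> map = new HashMap<>(keys * 2);
        Object payload = new Object();
        long stores = 0;
        for (int it = 0; it < iterations; it++) {
            for (int k = 0; k < keys; k++) {
                map.put(k, payload); // reference store; any barrier cost lands here
                stores++;
            }
        }
        return stores;
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        long stores = run(100_000, 100);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        // Compare the same run under e.g. -XX:+UseSerialGC versus the JEP's
        // proposed -XX:+UnlockExperimentalVMOptions -XX:+UseEpsilonGC.
        System.out.println(stores + " stores in " + elapsedMs + " ms");
    }
}
```

Boxing of the Integer keys still allocates, so this only narrows, not eliminates, the non-barrier differences between collectors.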
Thanks, Erik > Thanks, > -Aleksey > From rkennke at redhat.com Tue Jul 18 12:45:25 2017 From: rkennke at redhat.com (Roman Kennke) Date: Tue, 18 Jul 2017 14:45:25 +0200 Subject: RFC: Epsilon GC JEP In-Reply-To: <858737aa-b8b1-dfdf-a099-1e0decb706ab@redhat.com> References: <67f6d4a2-d129-1491-4906-473586dc6680@redhat.com> <621d6f35-617c-d603-3159-cd537831e66e@oracle.com> <858737aa-b8b1-dfdf-a099-1e0decb706ab@redhat.com> Message-ID: <03eb1ee9-d022-18b7-4f91-c9ead4922c60@redhat.com> Hi Aleksey, what speaks against doing full GCs when memory runs out? I can imagine scenarios when it could be useful to allow full-GCs: 1. Allow full-GCs only on System.gc()... for testing? Or for control fanatics? 2. Allow full-GCs only on OOM... for containerized apps or as replacement for letting the process die and respawn (i.e. don't care at all about pauses, but care about throughput and absolutely-no-barriers) 3. Allow full-GCs in both cases I can see this enabled/disabled selectively by flags. Yes, I know, complexity, maintenance, etc blah blah ;-) But it should be very simple to do. Reusing markSweep.cpp should do it. Basically serial GC without the generational barriers. What do you think? Roman On 18.07.2017 at 13:23, Aleksey Shipilev wrote: > Hi Erik, > > Thanks for looking into this! > > On 07/18/2017 12:09 PM, Erik Helin wrote: >> first of all, thanks for trying this out and starting a discussion. Regarding >> the JEP, I have a few questions/comments: >> - the JEP specifies "last-drop performance improvements" as a >> motivation. However, I think you also know that taking a pause and >> compacting a heap that is mostly filled with garbage most likely >> results in higher throughput*. So are you thinking in terms of pauses >> here when you say performance? > This cuts both ways: while it is true that moving GC improves locality [1], it > is also true that the runtime overhead from barriers can be quite high [2, 3, > 4]. 
So, "performance" in that section is tied to both throughput (no barriers) > and pauses (no pauses). > > [1] https://shipilev.net/jvm-anatomy-park/11-moving-gc-locality > [2] https://shipilev.net/jvm-anatomy-park/13-intergenerational-barriers > [3] Also, remember the reason for UseCondCardMark > [4] Also, remember the whole thing about G1 barriers > >> - why do you think Epsilon GC is a good baseline? IMHO, no barriers is >> not the perfect baseline, since it is just a theoretical exercise. >> Just cranking up the heap and using Serial is more realistic >> baseline, but even using that as a baseline is questionable. > It sometimes is. Non-generational GC is a good baseline for some workloads. Even > Serial does not cut it, because even if you crank up old and trim down young, > there is no way to disable reference write barrier store that maintains card tables. > >> - the JEP specifies this as an experimental feature, meaning that you >> intend non-JVM developers to be able to run this. Have you considered >> the cost of supporting this option? You say "New jtreg tests under >> hotspot/gc/epsilon would be enough to assert correctness". For which >> platforms? How often should these tests be run, every night? > I think for all platforms, somewhere in hs-tier3? IMO, current test set in > hotspot/gc/epsilon is fairly complete, and it takes less than a minute on my > 4-core i7. > >> Whenever we want to do large changes, like updating logging, tracing, etc, >> will we have to take Epsilon GC into account? Will there be serviceability >> support for Epsilon GC, like jstat, MXBeans, perf counters etc? > I tried to address the maintenance costs in the JEP? It is unlikely to cause > trouble, since it mostly calls into the shared code. And GC interface work would > hopefully make BarrierSet into more shareable chunk of interface, which makes > the whole thing even more self-contained. There is some new code in MemoryPools > that handles the minimal diagnostics. 
MXBeans still work, at least ThreadMXBean > that reports allocation pressure, although I'd need to add a test to assert that. > > To me, if the no-op GC requires much maintenance whenever something in JVM is > changing, that points to the insanity of GC interface. No-op GC is a good canary > in the coalmine for this. This is why one of the motivations is seeing what > exactly a minimal GC should support to be functional. > > >> - You quote "The experience, however, tells that many players in the >> Java ecosystem already did this exercise with expunging GC from their >> custom-built JVMs". So it seems that those users that want something >> like Epsilon GC are fine with building OpenJDK themselves? Having >> -XX:+UseEpsilonGC as a developer flag is much different compared to >> exposing it (and supporting, even if in experimental mode) to users. > There is a fair share of survivorship bias: we know about people who succeeded, > do we know how many failed or given up? I think developers who do day-to-day > Hotspot development grossly underestimate the effort required to even build a > custom JVM. Most power users I know have did this exercise with great pains. I > used to sing the same song to them: just build OpenJDK yourself, but then pesky > details pour in. Like: oh, Windows, oh, Cygwin, oh MacOS, oh XCode, oh FreeType, > oh new compilers that build OpenJDK with warnings and build does treat warnings > as errors, oh actual API mismatches against msvcrt, glibc, whatever, etc. etc. > etc. As much as OpenJDK build improved over the years, I am not audacious enough > to claim it would ever be a completely smooth experience :) Now I am just > willingly hand them binary builds. > > So I think having the experimental feature available in the actual product build > extends the feature exposure. 
For example, suppose you are the academic writing > a paper on GC, would you accept custom-build JVM into your results, or would you > rather pick up the "gold" binary build from a standard distribution and run with it? > > >> I guess most of my question can be summarized as: this seems like it perhaps >> could be useful tool for JVM GC developers, why do you want to expose the flag >> to non-JVM developers (given all the work/support/maintenance that comes with >> that)? > My initial thought was that the discussion about the costs should involve > discussing the actual code. This is why there is a complete implementation in > the Sandbox, and also the webrev posted. > > In the months following my initial (crazy) experiments, I had multiple people > coming to me and asking when Epsilon is going to be in JDK, because they want to > use it. And those were the ultra-power-users who actually know what they are > doing with their garbage-free applications. > > So the short answer about why Epsilon is good to have in product is because the > cost seems low, the benefits are present, and so cost/benefit is still low. > > >> It is _great_ that you are experimenting and trying out new ideas in the VM, >> please continue doing that! Please don't interpret my questions/comments as >> to grumpy, this is just my experience from maintaining 5-6 different GC >> algorithms for more than five years that is speaking. There is _always_ a >> maintenance cost :) > Yeah, I know how that feels. Look at the actual Epsilon changes, do they look > scary to you, given your experience maintaining the related code? 
> > Thanks, > -Aleksey > From erik.osterlund at oracle.com Tue Jul 18 13:20:04 2017 From: erik.osterlund at oracle.com (Erik Österlund) Date: Tue, 18 Jul 2017 15:20:04 +0200 Subject: RFC: Epsilon GC JEP In-Reply-To: <858737aa-b8b1-dfdf-a099-1e0decb706ab@redhat.com> References: <67f6d4a2-d129-1491-4906-473586dc6680@redhat.com> <621d6f35-617c-d603-3159-cd537831e66e@oracle.com> <858737aa-b8b1-dfdf-a099-1e0decb706ab@redhat.com> Message-ID: <596E0B04.8030407@oracle.com> Hi Aleksey, If I understand this correctly, the motivation for EpsilonGC is to be able to measure the overheads due to GC pauses and GC barriers and measure only the application throughput without GC jitter, and then use that as a baseline for measuring performance of an actual GC implementation compared to EpsilonGC. However, automatic memory management is quite complicated when you think about it. Will EpsilonGC allocate all memory up-front, or expand the heap? In the case where it expands on demand until it runs out of memory, what consequences does that potential expansion have on throughput? In the case it is allocated upfront, will pages be pre-touched? If so, what NUMA nodes will the pre-mapped memory map into? Will mutators try to allocate NUMA-local memory? What consequences will the larger heap footprint have on the throughput because of decreased memory locality and as a result increased last level cache misses and suddenly having to spread to more NUMA nodes? Does the larger footprint change the requirements on compressed oops and what encoding/decoding of oop compression is required? In case of an expanding heap - can it even use compressed oops? In case of a not expanding heap allocated up-front, does a comparison of a GC using compressed oops with a baseline that can inherently not use it make sense? Will lack of compaction and resulting possibly worse object locality of memory accesses affect performance? 
I am not convinced that we can just remove GC-induced overheads from the picture and measure the application throughput without the GC by using an EpsilonGC as proposed. At least I do not think I would use it to draw conclusions about GC-induced throughput loss. It seems like an apples to oranges comparison to me. Or perhaps I have missed something? Thanks, /Erik On 2017-07-18 13:23, Aleksey Shipilev wrote: > Hi Erik, > > Thanks for looking into this! > > On 07/18/2017 12:09 PM, Erik Helin wrote: >> first of all, thanks for trying this out and starting a discussion. Regarding >> the JEP, I have a few questions/comments: >> - the JEP specifies "last-drop performance improvements" as a >> motivation. However, I think you also know that taking a pause and >> compacting a heap that is mostly filled with garbage most likely >> results in higher throughput*. So are you thinking in terms of pauses >> here when you say performance? > This cuts both ways: while it is true that moving GC improves locality [1], it > is also true that the runtime overhead from barriers can be quite high [2, 3, > 4]. So, "performance" in that section is tied to both throughput (no barriers) > and pauses (no pauses). > > [1] https://shipilev.net/jvm-anatomy-park/11-moving-gc-locality > [2] https://shipilev.net/jvm-anatomy-park/13-intergenerational-barriers > [3] Also, remember the reason for UseCondCardMark > [4] Also, remember the whole thing about G1 barriers > >> - why do you think Epsilon GC is a good baseline? IMHO, no barriers is >> not the perfect baseline, since it is just a theoretical exercise. >> Just cranking up the heap and using Serial is more realistic >> baseline, but even using that as a baseline is questionable. > It sometimes is. Non-generational GC is a good baseline for some workloads. Even > Serial does not cut it, because even if you crank up old and trim down young, > there is no way to disable reference write barrier store that maintains card tables. 
> >> - the JEP specifies this as an experimental feature, meaning that you >> intend non-JVM developers to be able to run this. Have you considered >> the cost of supporting this option? You say "New jtreg tests under >> hotspot/gc/epsilon would be enough to assert correctness". For which >> platforms? How often should these tests be run, every night? > I think for all platforms, somewhere in hs-tier3? IMO, current test set in > hotspot/gc/epsilon is fairly complete, and it takes less than a minute on my > 4-core i7. > >> Whenever we want to do large changes, like updating logging, tracing, etc, >> will we have to take Epsilon GC into account? Will there be serviceability >> support for Epsilon GC, like jstat, MXBeans, perf counters etc? > I tried to address the maintenance costs in the JEP? It is unlikely to cause > trouble, since it mostly calls into the shared code. And GC interface work would > hopefully make BarrierSet into more shareable chunk of interface, which makes > the whole thing even more self-contained. There is some new code in MemoryPools > that handles the minimal diagnostics. MXBeans still work, at least ThreadMXBean > that reports allocation pressure, although I'd need to add a test to assert that. > > To me, if the no-op GC requires much maintenance whenever something in JVM is > changing, that points to the insanity of GC interface. No-op GC is a good canary > in the coalmine for this. This is why one of the motivations is seeing what > exactly a minimal GC should support to be functional. > > >> - You quote "The experience, however, tells that many players in the >> Java ecosystem already did this exercise with expunging GC from their >> custom-built JVMs". So it seems that those users that want something >> like Epsilon GC are fine with building OpenJDK themselves? Having >> -XX:+UseEpsilonGC as a developer flag is much different compared to >> exposing it (and supporting, even if in experimental mode) to users. 
> There is a fair share of survivorship bias: we know about people who succeeded, > do we know how many failed or given up? I think developers who do day-to-day > Hotspot development grossly underestimate the effort required to even build a > custom JVM. Most power users I know have did this exercise with great pains. I > used to sing the same song to them: just build OpenJDK yourself, but then pesky > details pour in. Like: oh, Windows, oh, Cygwin, oh MacOS, oh XCode, oh FreeType, > oh new compilers that build OpenJDK with warnings and build does treat warnings > as errors, oh actual API mismatches against msvcrt, glibc, whatever, etc. etc. > etc. As much as OpenJDK build improved over the years, I am not audacious enough > to claim it would ever be a completely smooth experience :) Now I am just > willingly hand them binary builds. > > So I think having the experimental feature available in the actual product build > extends the feature exposure. For example, suppose you are the academic writing > a paper on GC, would you accept custom-build JVM into your results, or would you > rather pick up the "gold" binary build from a standard distribution and run with it? > > >> I guess most of my question can be summarized as: this seems like it perhaps >> could be useful tool for JVM GC developers, why do you want to expose the flag >> to non-JVM developers (given all the work/support/maintenance that comes with >> that)? > My initial thought was that the discussion about the costs should involve > discussing the actual code. This is why there is a complete implementation in > the Sandbox, and also the webrev posted. > > In the months following my initial (crazy) experiments, I had multiple people > coming to me and asking when Epsilon is going to be in JDK, because they want to > use it. And those were the ultra-power-users who actually know what they are > doing with their garbage-free applications. 
> > So the short answer about why Epsilon is good to have in product is because the > cost seems low, the benefits are present, and so cost/benefit is still low. > > >> It is _great_ that you are experimenting and trying out new ideas in the VM, >> please continue doing that! Please don't interpret my questions/comments as >> to grumpy, this is just my experience from maintaining 5-6 different GC >> algorithms for more than five years that is speaking. There is _always_ a >> maintenance cost :) > Yeah, I know how that feels. Look at the actual Epsilon changes, do they look > scary to you, given your experience maintaining the related code? > > Thanks, > -Aleksey > From shade at redhat.com Tue Jul 18 13:26:03 2017 From: shade at redhat.com (Aleksey Shipilev) Date: Tue, 18 Jul 2017 15:26:03 +0200 Subject: RFC: Epsilon GC JEP In-Reply-To: <8f9b4995-f687-47c8-30e0-5cae513b8947@oracle.com> References: <67f6d4a2-d129-1491-4906-473586dc6680@redhat.com> <621d6f35-617c-d603-3159-cd537831e66e@oracle.com> <858737aa-b8b1-dfdf-a099-1e0decb706ab@redhat.com> <8f9b4995-f687-47c8-30e0-5cae513b8947@oracle.com> Message-ID: On 07/18/2017 02:37 PM, Erik Helin wrote: >> [1] https://shipilev.net/jvm-anatomy-park/11-moving-gc-locality >> [2] https://shipilev.net/jvm-anatomy-park/13-intergenerational-barriers >> [3] Also, remember the reason for UseCondCardMark >> [4] Also, remember the whole thing about G1 barriers > > Absolutely, barriers can come with an overhead. But a barrier that consists of > dirtying a card does not come with a quite high overhead. In fact, it comes with > a very low overhead :) Mhm! "Low" is in the eye of the beholder. You can't beat zero overhead. And there are people who literally count instructions on their hot paths, while still developing in Java. Let me ask you a trick question: how do you *know* the card mark overhead is small, if you don't have a no-barrier GC to compare against? >>> - why do you think Epsilon GC is a good baseline? 
IMHO, no barriers is >>> not the perfect baseline, since it is just a theoretical exercise. >>> Just cranking up the heap and using Serial is more realistic >>> baseline, but even using that as a baseline is questionable. >> >> It sometimes is. Non-generational GC is a good baseline for some workloads. Even >> Serial does not cut it, because even if you crank up old and trim down young, >> there is no way to disable reference write barrier store that maintains card >> tables. > > I will still point out though that a GC without a barrier is still just a > theoretical baseline. One could imagine a single-gen mark-compact GC for OpenJDK > (that would require no barriers), but AFAIK almost all users prefer the slight > overhead of dirtying a card (and in return get a generational GC) for the use > cases where a single-gen mark-compact algorithm would be applicable. Mark-compact, maybe. But there are plenty of single-gen mark-sweep algorithms; see e.g. the Go runtime. I have a hard time seeing how that is theoretical. > However, again, this might be useful for someone who wants try to do some > changes to the JVM GC code. But that, to me, is not enough to expose it to > non-JVM developers. It could be useful to have in the source code though, maybe > like a --with-jvm-feature kind of thing? That would go against the maintainability argument, no? Because you will still have to maintain the code, *and* it will require building a special JVM flavor. So it is a lose-lose: neither users get it, nor maintainers have simpler lives. > [snip] Such users will still be able to get binary builds if someone is willing to > produce them with Epsilon GC. There are plenty of OpenJDK binary builds > available from various organizations/companies. Well, yes. I actually happen to know the company which can distribute this in the downstream OpenJDK builds, and reap the ultra-power-users' loyalty. 
But, I am maintaining that having the code upstream is beneficial, even if that company is going to do maintenance work either way. >> So the short answer about why Epsilon is good to have in product is because the >> cost seems low, the benefits are present, and so cost/benefit is still low. > > And it is here that our opinions differ :) For you the maintenance cost is low, > whereas for me, having yet another command-line flag, yet another code path, > gets in the way. You have to respect that we have different background and > experiences here. I am not trying to challenge your background or experience here, I am challenging the cost estimates though. Because ad absurdum, we can shoot down any feature change coming into the JVM, just because it introduces yet another flag, yet another code path, etc. I cannot see where the Epsilon maintenance would be a burden: it comes with automated tests that run fast, its implementation seems trivial, its exposure to VM code seems trivial too (apart from the BarrierSet thing that would be trimmed down with GC interface work). >> Yeah, I know how that feels. Look at the actual Epsilon changes, do they look >> scary to you, given your experience maintaining the related code? > > I don't like taking the role of the grumpy open source maintainer :) No, the > code is not scary, code is rarely scary IMO, it is just code. Running tests, > fixing that a test -Xmx1g isn't run on a RPi, having additional code paths, more > cases to take into consideration when refactoring, is burdensome. And to me, the > benefits of benchmarking against Epsilon vs benchmarking against Serial/Parallel > isn't that high to me. > > But, I can understand that it is useful when trying to evaluate for example the > cost of stores into a HashMap. Which is why I'm not against the code, but I'm > not keen on exposing this to non-JVM developers. I hear you, but the thing is, Epsilon does not seem like a coding exercise anymore. 
Epsilon is useful for GC performance work, especially when readily available, and there are users willing to adopt it. Just as we respect the maintainers' burden in the product, we also have to see what benefits users, especially the ones who are championing our project performance even by cutting corners with e.g. no-op GCs. Thanks, -Aleksey -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: OpenPGP digital signature URL: From rkennke at redhat.com Tue Jul 18 13:28:26 2017 From: rkennke at redhat.com (Roman Kennke) Date: Tue, 18 Jul 2017 15:28:26 +0200 Subject: RFC: Epsilon GC JEP In-Reply-To: <596E0B04.8030407@oracle.com> References: <67f6d4a2-d129-1491-4906-473586dc6680@redhat.com> <621d6f35-617c-d603-3159-cd537831e66e@oracle.com> <858737aa-b8b1-dfdf-a099-1e0decb706ab@redhat.com> <596E0B04.8030407@oracle.com> Message-ID: At the very least, Epsilon's a great tool for measuring the cost of barriers. How many times have we heard the question: 'but what is the overhead of the additional barriers of Shenandoah?' And we couldn't really answer it. Compared to what? G1? Serial? Parallel? CMS? Each of which has their own peculiarities when it comes to barriers. With Epsilon it is possible to construct a benchmark that does certain heap accesses (primitive/object reads/writes, special stuff like CASes, etc.) and does no more allocations (thus locality spread doesn't really matter) and give an answer to those questions and say: no-barriers throughput is this, and with that GC's barriers, we have this, etc. I realize that such results are a bit theoretical, but they give a much better idea than having no way to measure this in an isolated way at all. 
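For illustration, the kind of no-allocation kernel described here might look like the following minimal sketch. The class name, array size, and round count are invented; the Epsilon flag mentioned in the comments is the one proposed by the JEP and is assumed to be experimental:

```java
// Hypothetical barrier-cost kernel: allocate everything up front, then
// perform only reference stores, so allocation rate and object locality
// are identical across collectors and the measured difference is
// (mostly) the reference write barrier.
public class RefStoreKernel {

    // Writes the same reference into every slot, `rounds` times over,
    // and returns the total number of stores performed.
    public static long churn(Object[] slots, Object value, int rounds) {
        long stores = 0;
        for (int r = 0; r < rounds; r++) {
            for (int i = 0; i < slots.length; i++) {
                slots[i] = value; // aastore: card mark / G1 barrier would fire here
                stores++;
            }
        }
        return stores;
    }

    public static void main(String[] args) {
        Object[] slots = new Object[1 << 16]; // allocated once, up front
        Object value = new Object();
        long start = System.nanoTime();
        long stores = churn(slots, value, 1_000);
        double nsPerStore = (double) (System.nanoTime() - start) / stores;
        // Run with -XX:+UnlockExperimentalVMOptions -XX:+UseEpsilonGC for the
        // no-barrier baseline, then with G1/Serial/CMS to see the delta.
        System.out.println(stores + " stores, ~" + nsPerStore + " ns/store");
    }
}
```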
Roman On 18.07.2017 at 15:20, Erik Österlund wrote: > Hi Aleksey, > > If I understand this correctly, the motivation for EpsilonGC is to be > able to measure the overheads due to GC pauses and GC barriers and > measure only the application throughput without GC jitter, and then > use that as a baseline for measuring performance of an actual GC > implementation compared to EpsilonGC. > > Howerver, automatic memory management is quite complicated when you > think about it. Will EpsilonGC allocate all memory up-front, or expand > the heap? In the case where it expanded on-demand until it runs out of > memory, what consequences does that potential expansion have on > throughput? In the case it is allocated upfront, will pages be > pre-touched? If so, what NUMA nodes will the pre-mapped memory map in > to? Will mutators try to allocate NUMA-local memory? What consequences > will the larger heap footprint have on the throughput because of > decreased memory locality and as a result increased last level cache > misses and suddenly having to spread to more NUMA nodes? Does the > larger footprint change the requirements on compressed oops and what > encoding/decoding of oop compression is required? In case of an > expanding heap - can it even use compressed oops? In case of a not > expanding heap allocated up-front, does a comparison of a GC using > compressed oops with a baseline that can inherently not use it make > sense? Will lack of compaction and resulting possibly worse object > locality of memory accesses affect performance? 
> > Thanks, > /Erik > > On 2017-07-18 13:23, Aleksey Shipilev wrote: >> Hi Erik, >> >> Thanks for looking into this! >> >> On 07/18/2017 12:09 PM, Erik Helin wrote: >>> first of all, thanks for trying this out and starting a discussion. >>> Regarding >>> the JEP, I have a few questions/comments: >>> - the JEP specifies "last-drop performance improvements" as a >>> motivation. However, I think you also know that taking a pause and >>> compacting a heap that is mostly filled with garbage most likely >>> results in higher throughput*. So are you thinking in terms of >>> pauses >>> here when you say performance? >> This cuts both ways: while it is true that moving GC improves >> locality [1], it >> is also true that the runtime overhead from barriers can be quite >> high [2, 3, >> 4]. So, "performance" in that section is tied to both throughput (no >> barriers) >> and pauses (no pauses). >> >> [1] https://shipilev.net/jvm-anatomy-park/11-moving-gc-locality >> [2] https://shipilev.net/jvm-anatomy-park/13-intergenerational-barriers >> [3] Also, remember the reason for UseCondCardMark >> [4] Also, remember the whole thing about G1 barriers >> >>> - why do you think Epsilon GC is a good baseline? IMHO, no barriers is >>> not the perfect baseline, since it is just a theoretical exercise. >>> Just cranking up the heap and using Serial is more realistic >>> baseline, but even using that as a baseline is questionable. >> It sometimes is. Non-generational GC is a good baseline for some >> workloads. Even >> Serial does not cut it, because even if you crank up old and trim >> down young, >> there is no way to disable reference write barrier store that >> maintains card tables. >> >>> - the JEP specifies this as an experimental feature, meaning that you >>> intend non-JVM developers to be able to run this. Have you >>> considered >>> the cost of supporting this option? You say "New jtreg tests under >>> hotspot/gc/epsilon would be enough to assert correctness". 
For which >>> platforms? How often should these tests be run, every night? >> I think for all platforms, somewhere in hs-tier3? IMO, current test >> set in >> hotspot/gc/epsilon is fairly complete, and it takes less than a >> minute on my >> 4-core i7. >> >>> Whenever we want to do large changes, like updating logging, >>> tracing, etc, >>> will we have to take Epsilon GC into account? Will there be >>> serviceability >>> support for Epsilon GC, like jstat, MXBeans, perf counters etc? >> I tried to address the maintenance costs in the JEP? It is unlikely >> to cause >> trouble, since it mostly calls into the shared code. And GC interface >> work would >> hopefully make BarrierSet into more shareable chunk of interface, >> which makes >> the whole thing even more self-contained. There is some new code in >> MemoryPools >> that handles the minimal diagnostics. MXBeans still work, at least >> ThreadMXBean >> that reports allocation pressure, although I'd need to add a test to >> assert that. >> >> To me, if the no-op GC requires much maintenance whenever something >> in JVM is >> changing, that points to the insanity of GC interface. No-op GC is a >> good canary >> in the coalmine for this. This is why one of the motivations is >> seeing what >> exactly a minimal GC should support to be functional. >> >> >>> - You quote "The experience, however, tells that many players in the >>> Java ecosystem already did this exercise with expunging GC from >>> their >>> custom-built JVMs". So it seems that those users that want something >>> like Epsilon GC are fine with building OpenJDK themselves? Having >>> -XX:+UseEpsilonGC as a developer flag is much different compared to >>> exposing it (and supporting, even if in experimental mode) to users. >> There is a fair share of survivorship bias: we know about people who >> succeeded, >> do we know how many failed or given up? 
I think developers who do >> day-to-day >> Hotspot development grossly underestimate the effort required to even >> build a >> custom JVM. Most power users I know did this exercise with great >> pains. I >> used to sing the same song to them: just build OpenJDK yourself, but >> then pesky >> details pour in. Like: oh, Windows, oh, Cygwin, oh MacOS, oh XCode, >> oh FreeType, >> oh new compilers that build OpenJDK with warnings and the build does >> treat warnings >> as errors, oh actual API mismatches against msvcrt, glibc, whatever, >> etc. etc. >> etc. As much as the OpenJDK build improved over the years, I am not >> audacious enough >> to claim it would ever be a completely smooth experience :) Now I just >> willingly hand them binary builds. >> >> So I think having the experimental feature available in the actual >> product build >> extends the feature exposure. For example, suppose you are an >> academic writing >> a paper on GC, would you accept a custom-built JVM into your results, >> or would you >> rather pick up the "gold" binary build from a standard distribution >> and run with it? >> >> >>> I guess most of my question can be summarized as: this seems like it >>> perhaps >>> could be a useful tool for JVM GC developers, why do you want to >>> expose the flag >>> to non-JVM developers (given all the work/support/maintenance that >>> comes with >>> that)? >> My initial thought was that the discussion about the costs should >> involve >> discussing the actual code. This is why there is a complete >> implementation in >> the Sandbox, and also the webrev posted. >> >> In the months following my initial (crazy) experiments, I had >> multiple people >> coming to me and asking when Epsilon is going to be in the JDK, because >> they want to >> use it. And those were the ultra-power-users who actually know what >> they are >> doing with their garbage-free applications. 
>> >> So the short answer about why Epsilon is good to have in product is >> because the >> cost seems low, the benefits are present, and so cost/benefit is >> still low. >> >> >>> It is _great_ that you are experimenting and trying out new ideas in >>> the VM, >>> please continue doing that! Please don't interpret my >>> questions/comments as >>> too grumpy, this is just my experience from maintaining 5-6 different GC >>> algorithms for more than five years that is speaking. There is >>> _always_ a >>> maintenance cost :) >> Yeah, I know how that feels. Look at the actual Epsilon changes, do >> they look >> scary to you, given your experience maintaining the related code? >> >> Thanks, >> -Aleksey >> > From thomas.schatzl at oracle.com Tue Jul 18 13:34:41 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Tue, 18 Jul 2017 15:34:41 +0200 Subject: RFC: Epsilon GC JEP In-Reply-To: <858737aa-b8b1-dfdf-a099-1e0decb706ab@redhat.com> References: <67f6d4a2-d129-1491-4906-473586dc6680@redhat.com> <621d6f35-617c-d603-3159-cd537831e66e@oracle.com> <858737aa-b8b1-dfdf-a099-1e0decb706ab@redhat.com> Message-ID: <1500384881.2815.79.camel@oracle.com> Hi Aleksey, I would like to expand this cost/benefit analysis a bit; I think the most contentious point brought up by Erik has been the develop vs. experimental flag issue. For that, let me present my understanding of the size and costs of making this an experimental (actually product) vs. develop flag for the intended target group as presented here. On Tue, 2017-07-18 at 13:23 +0200, Aleksey Shipilev wrote: > Hi Erik, > > Thanks for looking into this! > > On 07/18/2017 12:09 PM, Erik Helin wrote: > > > > first of all, thanks for trying this out and starting a discussion. > > Regarding the JEP, I have a few questions/comments: [...] > > > - why do you think Epsilon GC is a good baseline? IMHO, no barriers > > is not the perfect baseline, since it is just a theoretical > > exercise. 
Just cranking up the heap and using Serial is more > > realistic baseline, but even using that as a baseline is > > questionable. > It sometimes is. Non-generational GC is a good baseline for some > workloads. Even Serial does not cut it, because even if you crank up > old and trim down young, there is no way to disable reference write > barrier store that maintains card tables. Not prevented by making it a develop option. > > - the JEP specifies this as an experimental feature, meaning that > > you intend non-JVM developers to be able to run this. Have you > > considered the cost of supporting this option? You say "New jtreg > > tests under hotspot/gc/epsilon would be enough to assert > > correctness". For which platforms? How often should these tests be > > run, every night? > I think for all platforms, somewhere in hs-tier3? IMO, current test > set in hotspot/gc/epsilon is fairly complete, and it takes less than > a minute on my 4-core i7. Running it daily, on X platforms on Y OSes for Z releases adds up quickly. Could run something else instead. And there is always something else to run on these machines, trust me. :) > > > > Whenever we want to do large changes, like updating logging, > > tracing, etc, will we have to take Epsilon GC into account? Will > > there be serviceability support for Epsilon GC, like jstat, > > MXBeans, perf counters etc? > I tried to address the maintenance costs in the JEP? It is unlikely > to cause trouble, since it mostly calls into the shared code. And GC > interface work would hopefully make BarrierSet into more shareable > chunk of interface, which makes the whole thing even more self- > contained. There is some new code in MemoryPools that handles the > minimal diagnostics. MXBeans still work, at least ThreadMXBean > that reports allocation pressure, although I'd need to add a test to > assert that. 
> > To me, if the no-op GC requires much maintenance whenever something > in JVM is changing, that points to the insanity of GC interface. No- > op GC is a good canary in the coalmine for this. This is why one of > the motivations is seeing what exactly a minimal GC should support to > be functional. Sanity checking of the interfaces is not prevented by a develop option. > > > > - You quote "The experience, however, tells that many players in > > the Java ecosystem already did this exercise with expunging GC from > > their custom-built JVMs". So it seems that those users that want > > something like Epsilon GC are fine with building OpenJDK > > themselves? Having -XX:+UseEpsilonGC as a developer flag is much > > different compared to exposing it (and supporting, even if in > > experimental mode) to users. > > There is a fair share of survivorship bias: we know about people who > succeeded, do we know how many failed or given up? I think developers > who do day-to-day Hotspot development grossly underestimate the > effort required to even build a custom JVM. Most power users I know > did this exercise with great pains. I used to sing the same song > to them: just build OpenJDK yourself, but then pesky details pour in. > Like: oh, Windows, oh, Cygwin, oh MacOS, oh XCode, oh FreeType, oh > new compilers that build OpenJDK with warnings and the build does treat > warnings as errors, oh actual API mismatches against msvcrt, glibc, > whatever, etc. etc. etc. As much as the OpenJDK build improved over the > years, I am not audacious enough to claim it would ever be a > completely smooth experience :) Now I just willingly hand them > binary builds. > > So I think having the experimental feature available in the actual > product build extends the feature exposure. I agree here. The question is, by how much. 
So academics (and I am not trying to hit on academics here, you brought them up ;)) that write a paper on GC but never need to rebuild the VM (including the JDK here) because they don't do any changes would be inconvenienced. Let me ask, how many do you expect these to be? From my understanding there seems to be a very manageable yearly total GC paper output at the usual conferences. Not sure how putting Epsilon GC in product would improve that. So, even after all these target group concerns, how much time do you think these persons writing that paper (that do not need to recompile the VM and need to show their numbers in Epsilon GC) are going to spend on getting numbers compared to the hypothetical time for compiling the VM? [My personal experience is that when developing any changes by far most of the time is spent on waiting for the machine(s) to complete testing, not writing any actual changes or building. When writing a paper, my experience is that a very large part of the time is spent on running and re-running tests over and over again to be able to understand and explain results, or tweaking changes, or simply fixing bugs for some results] > For example, suppose you are an academic writing a paper on GC, > would you accept a custom-built JVM into your results, or would you > rather pick up the "gold" binary build from a standard distribution > and run with it? Not sure what you meant with this latter argument, if it is actually an argument. If I wanted to effect a change in the VM and measure it, I would already need to change and recompile the VM. So it is not a big stretch to imagine that baselines could come from something recompiled. I have seen quite a few papers using modified baselines for one or the other reason (like adding necessary instrumentation, maybe fixing obvious bugs). 
From experience I know that for many reasons it is already often extremely hard for somebody else to reproduce particular results (without extreme effort), if not impossible. Even understanding some baseline results may require some imagination as to how they were obtained. Not even talking about reproducing them. There seems to be a very small step from trusting results from a "gold" official binary to trusting a slightly modified one. As for the amount of inconvenience, I think the users that already need to recompile for their changes are not very much inconvenienced. I.e. changing a single "develop" to "product" seems to be a very small effort. > > I guess most of my question can be summarized as: this seems like > > it perhaps could be a useful tool for JVM GC developers, why do you > > want to expose the flag to non-JVM developers (given all the > > work/support/maintenance that comes with that)? > My initial thought was that the discussion about the costs should > involve discussing the actual code. This is why there is a complete > implementation in the Sandbox, and also the webrev posted. > > In the months following my initial (crazy) experiments, I had > multiple people coming to me and asking when Epsilon is going to be > in the JDK, because they want to use it. And those were the > ultra-power-users who actually know what they are doing with their garbage-free > applications. Aren't ultra-power-users able to rebuild the VM? What is their cost vs. the effort spent on making their applications garbage-free or implementing the necessary workarounds to be able to use that GC (mentioned load-balancer trickery etc)? > So the short answer about why Epsilon is good to have in product is > because the cost seems low, the benefits are present, and so > cost/benefit is still low. The number of people benefitting from having this available in a product build seems to be extremely small. And so, it seems, are their relative costs to fix that. 
Increased exposure seems to be a real recurring cost for maintenance in the product, although it seems relatively small compared to other features. Still somebody has to do it. > > It is _great_ that you are experimenting and trying out new ideas > > in the VM, please continue doing that! Please don't interpret my > > questions/comments as too grumpy, this is just my experience from > > maintaining 5-6 different GC algorithms for more than five years > > that is speaking. There is _always_ a maintenance cost :) > Yeah, I know how that feels. Look at the actual Epsilon changes, do > they look scary to you, given your experience maintaining the related > code? Well, 1500 LOC (well, ~800 without the tests) of changes do look scary to me, whatever they do :) Overall, on the question of develop vs. experimental option, I would tend to prefer a develop option. In this area there simply seem to be too many downsides compared to the upsides for an extremely limited user group. Thanks, Thomas From shade at redhat.com Tue Jul 18 13:44:24 2017 From: shade at redhat.com (Aleksey Shipilev) Date: Tue, 18 Jul 2017 15:44:24 +0200 Subject: RFC: Epsilon GC JEP In-Reply-To: <596E0B04.8030407@oracle.com> References: <67f6d4a2-d129-1491-4906-473586dc6680@redhat.com> <621d6f35-617c-d603-3159-cd537831e66e@oracle.com> <858737aa-b8b1-dfdf-a099-1e0decb706ab@redhat.com> <596E0B04.8030407@oracle.com> Message-ID: On 07/18/2017 03:20 PM, Erik Österlund wrote: > If I understand this correctly, the motivation for EpsilonGC is to be able to > measure the overheads due to GC pauses and GC barriers and measure only the > application throughput without GC jitter, and then use that as a baseline for > measuring performance of an actual GC implementation compared to EpsilonGC. > However, automatic memory management is quite complicated when you think about > it. 
Yes, and lots of those are handled by the shared code that Epsilon calls into, just like any other GC. > Will EpsilonGC allocate all memory up-front, or expand the heap? In the case > where it expanded on-demand until it runs out of memory, what consequences does > that potential expansion have on throughput? It does have consequences, the same kind of consequences it has with allocating TLABs. You can trim them down with larger TLABs, larger pages, pre-touching, all of which are handled outside of Epsilon, by shared code. > In the case it is allocated upfront, will pages be pre-touched? Oh yes, there are two lines of code that also handle AlwaysPreTouch. But otherwise it is handled by shared heap space allocation code. I would like to see AlwaysPreTouch handled more consistently across GCs though. This is my point from another mail: if Epsilon has to do something on its own, it is a good sign shared GC utilities are not much of use. > If so, what NUMA nodes will the pre-mapped memory map in to? Will mutators > try to allocate NUMA-local memory? I think this is handled by shared code, at least for NUMA interleaving. I would hope that NUMA-aware allocation could be granular to TLABs, in which case it goes into shared code too, instead of pushing to reimplement this for every GC. If not, then Epsilon is not fully NUMA-aware. > What consequences will the larger heap footprint have on the throughput > because of decreased memory locality and as a result increased last level > cache misses and suddenly having to spread to more NUMA nodes? Yes, it would. See two paragraphs below: > Does the larger footprint change the requirements on compressed oops and > what encoding/decoding of oop compression is required? In case of an > expanding heap - can it even use compressed oops? In case of a not expanding > heap allocated up-front, does a comparison of a GC using compressed oops with > a baseline that can inherently not use it make sense? 
I guess the only relevant point here is, what happens if you need more heap than 32 GB, and then you have to disable compressed oops? In which case, of course, you will lose. But, you have to keep in mind that the target applications that are supposed to benefit from Epsilon are low-heap, quite probably zero-garbage. In this case, the question about heap size is moot: you allocate enough heap to hold your live data, whether with Epsilon or not. > Will lack of compaction and resulting possibly worse object locality of > memory accesses affect performance? Yes, it would. But it cuts both ways: having more throughput *if* you code with locality in mind. I am not against GCs that compact, but I do understand there are cases where I don't want them either. > I am not convinced that we can just remove GC-induced overheads from the picture > and measure the application throughput without the GC by using an EpsilonGC as > proposed. At least I do not think I would use it to draw conclusions about > GC-induced throughput loss. It seems like an apples to oranges comparison to me. > Or perhaps I have missed something? I think this uses a strawman pointing out all other things that could go wrong, to claim that the only thing the actual no-op GC implementation has to do (e.g. empty BarrierSet, allocation, and responding to heap exhaustion) is not needed either :) Thanks, -Aleksey -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: OpenPGP digital signature URL: From shade at redhat.com Tue Jul 18 13:46:40 2017 From: shade at redhat.com (Aleksey Shipilev) Date: Tue, 18 Jul 2017 15:46:40 +0200 Subject: RFC: Epsilon GC JEP In-Reply-To: <1500384881.2815.79.camel@oracle.com> References: <67f6d4a2-d129-1491-4906-473586dc6680@redhat.com> <621d6f35-617c-d603-3159-cd537831e66e@oracle.com> <858737aa-b8b1-dfdf-a099-1e0decb706ab@redhat.com> <1500384881.2815.79.camel@oracle.com> Message-ID: <0cf664ca-1532-7fd3-6644-ef6b910663dd@redhat.com> Hi Thomas, (reading the rest a bit later) On 07/18/2017 03:34 PM, Thomas Schatzl wrote: > I would like to expand this cost/benefit analysis a bit; I think the > most contentious point brought up by Erik has been the develop vs. > experimental flag issue. > For that, let me present you my understanding of the size and costs of > making this an experimental (actually product) vs. develop flag for the > intended target group as presented here. > Overall, on the question of develop vs. experimental option, I would tend to > prefer a develop option. In this area there simply seem to be too many > downsides compared to the upsides for an extremely limited user group. Ok, suppose we want to hide it from most users. Now we need an option that is available in release builds (because you want to test native GC performance), but not openly available in release builds. Which option type is that? I thought "experimental" is closest to that. Thanks, -Aleksey -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: OpenPGP digital signature URL: From shade at redhat.com Tue Jul 18 14:04:56 2017 From: shade at redhat.com (Aleksey Shipilev) Date: Tue, 18 Jul 2017 16:04:56 +0200 Subject: RFC: Epsilon GC JEP In-Reply-To: <1500384881.2815.79.camel@oracle.com> References: <67f6d4a2-d129-1491-4906-473586dc6680@redhat.com> <621d6f35-617c-d603-3159-cd537831e66e@oracle.com> <858737aa-b8b1-dfdf-a099-1e0decb706ab@redhat.com> <1500384881.2815.79.camel@oracle.com> Message-ID: <7dc40654-6045-19a4-5610-51c460b38bdb@redhat.com> (I have read the rest) Okay, you have convinced me, maintainers do not want to have it exposed as experimental option. Would you be willing to accept it as develop then? Other random ramblings: On 07/18/2017 03:34 PM, Thomas Schatzl wrote: > Running it daily, on X platforms on Y OSes for Z releases adds up > quickly. Could run something else instead. And there is always > something else to run on these machines, trust me. :) Right. Well, I have recently authored a few changes [1,2] that made Shenandoah GC tests run around 20% faster in fastdebug. I suppose some of that improvement is applicable to other GCs too. My question is, can I please have 1 minute of that machine time per build back as payment? :D [1] http://hg.openjdk.java.net/jdk10/hs/hotspot/rev/f922d99ce776 [2] http://hg.openjdk.java.net/jdk10/hs/hotspot/rev/9fe3d41b0e51 > The question is, by how much. So academics (and I am not trying to hit > on academics here, you brought them up ;)) that write a paper on GC but > never need to rebuild the VM (including the JDK here) because they > don't do any changes would be inconvenienced. > > Let me ask, how many do you expect these to be? From my understanding there > seems to be a very manageable yearly total GC paper output at the usual > conferences. Not sure how putting Epsilon GC in product would improve that. "Build it and they will come" works here. 
"develop" is seen as unstable and untouchable by most. > As for the amount of inconvenience, I think the users that already need > to recompile for their changes are not very much inconvenienced. I.e. > changing a single "develop" to "product" seems to be a very small > effort. Okay, we can do this downstream. > Aren't ultra-power-users able to rebuild the VM? What is their cost vs. > the effort spent on making their applications garbage-free or > implementing the necessary workarounds to be able to use that GC > (mentioned load-balancer trickery etc)? I am pretty sure they would be much, much, much happier to download an Oracle/RedHat/Azul binary build and run with it in production, thus capitalizing on all the testing those companies did for their JDK binaries. Native compilers and native toolchains are bottomless sources of bugs too, right? Thanks, -Aleksey From erik.helin at oracle.com Tue Jul 18 15:22:50 2017 From: erik.helin at oracle.com (Erik Helin) Date: Tue, 18 Jul 2017 17:22:50 +0200 Subject: RFC: Epsilon GC JEP In-Reply-To: References: <67f6d4a2-d129-1491-4906-473586dc6680@redhat.com> <621d6f35-617c-d603-3159-cd537831e66e@oracle.com> <858737aa-b8b1-dfdf-a099-1e0decb706ab@redhat.com> <8f9b4995-f687-47c8-30e0-5cae513b8947@oracle.com> Message-ID: On 07/18/2017 03:26 PM, Aleksey Shipilev wrote: > On 07/18/2017 02:37 PM, Erik Helin wrote: >>> [1] https://shipilev.net/jvm-anatomy-park/11-moving-gc-locality >>> [2] https://shipilev.net/jvm-anatomy-park/13-intergenerational-barriers >>> [3] Also, remember the reason for UseCondCardMark >>> [4] Also, remember the whole thing about G1 barriers >> >> Absolutely, barriers can come with an overhead. But a barrier that consists of >> dirtying a card does not come with a particularly high overhead. 
In fact, it comes with >> a very low overhead :) > Mhm! "Low" is in the eye of the beholder. You can't beat zero overhead. And there > are people who literally count instructions on their hot paths, while still > developing in Java. > > Let me ask you a trick question: how do you *know* the card mark overhead is > small, if you don't have a no-barrier GC to compare against? There is no need for trick questions. Aleksey, we are working towards the same goal: making OpenJDK's GCs better. That doesn't mean we can't have different opinions on a few topics. You of course know the cost of a GC barrier by measuring it. You measure it by constructing a build where you do not emit the barriers and compare it to a build where you do. Again, I have already said that I can see your work being useful for other JVM developers. >>>> - why do you think Epsilon GC is a good baseline? IMHO, no barriers is >>>> not the perfect baseline, since it is just a theoretical exercise. >>>> Just cranking up the heap and using Serial is more realistic >>>> baseline, but even using that as a baseline is questionable. >>> >>> It sometimes is. Non-generational GC is a good baseline for some workloads. Even >>> Serial does not cut it, because even if you crank up old and trim down young, >>> there is no way to disable reference write barrier store that maintains card >>> tables. >> >> I will still point out though that a GC without a barrier is still just a >> theoretical baseline. One could imagine a single-gen mark-compact GC for OpenJDK >> (that would require no barriers), but AFAIK almost all users prefer the slight >> overhead of dirtying a card (and in return get a generational GC) for the use >> cases where a single-gen mark-compact algorithm would be applicable. > Mark-compact, maybe. But single-gen mark-sweep algorithms are plenty, see e.g. > Go runtime. I have a hard time seeing how that is theoretical. That is not what I said. 
As I wrote above: > >> but AFAIK almost all users prefer the slight >> overhead of dirtying a card (and in return get a generational GC) for >> the use cases where a single-gen mark-compact algorithm would be >> applicable. > > There are of course use cases for single-gen mark-sweep algorithms, and as I > write above, for single-gen mark-compact algorithms as well. But for Java, and > OpenJDK, at least it is my understanding that most users prefer a generational > algorithm like Serial compared to a single-gen mark-compact algorithm (at least > I have not seen a lot of users asking for that). But maybe I'm missing something > here? This is why I wrote, and still think, that a GC without a barrier for Java seems more like a theoretical baseline. There are of course single generational GC algorithms that use a barrier that it would be very interesting to see implemented in OpenJDK (including the great work that you and others are doing with Shenandoah). >> However, again, this might be useful for someone who wants to try to do some >> changes to the JVM GC code. But that, to me, is not enough to expose it to >> non-JVM developers. It could be useful to have in the source code though, maybe >> like a --with-jvm-feature kind of thing? > That would go against the maintainability argument, no? Because you will still > have to maintain the code, *and* it will require building a special JVM flavor. > So it is a lose-lose: neither users get it, nor maintainers have simpler lives. No, I don't view it that way. Having the code in the upstream repository and having it exposed in binary builds are two very different things to me, and come with very different requirements in terms of maintenance. If the code is in the upstream repository, then it is a tool for developers working in OpenJDK and for integrators building OpenJDK. We have a much easier time changing such code compared to code that users have come to rely on (and expect certain behavior from). 
>> [snip] Such users will still be able to get binary builds if someone is willing to >> produce them with Epsilon GC. There are plenty of OpenJDK binary builds >> available from various organizations/companies. > Well, yes. I actually happen to know the company which can distribute this in > the downstream OpenJDK builds, and reap the ultra-power-users' loyalty. But, I am > maintaining that having the code upstream is beneficial, even if that company is > going to do maintenance work either way. > > >>> So the short answer about why Epsilon is good to have in product is because the >>> cost seems low, the benefits are present, and so cost/benefit is still low. >> >> And it is here that our opinions differ :) For you the maintenance cost is low, >> whereas for me, having yet another command-line flag, yet another code path, >> gets in the way. You have to respect that we have different backgrounds and >> experiences here. > I am not trying to challenge your background or experience here, I am > challenging the cost estimates though. Because ad absurdum, we can shoot down > any feature change coming into the JVM, just because it introduces yet another flag, > yet another code path, etc. Do you see me doing that? I at least hope I am welcoming to everyone that wants to contribute a patch to OpenJDK, big or small (please let me know otherwise). > I cannot see where the Epsilon maintenance would be a burden: it comes with > automated tests that run fast, its implementation seems trivial, its exposure > to VM code seems trivial too (apart from the BarrierSet thing that would be > trimmed down with GC interface work). And from my experience there is always maintenance work (documentation, support, testing matrix increase, etc) with supporting a new kind of collector. You and I just do a different cost/benefit analysis on exposing this behavior to non-JVM developers. >>> Yeah, I know how that feels. 
Look at the actual Epsilon changes, do they look >>> scary to you, given your experience maintaining the related code? >> I don't like taking the role of the grumpy open source maintainer :) No, the >> code is not scary, code is rarely scary IMO, it is just code. Running tests, >> fixing that a test -Xmx1g isn't run on a RPi, having additional code paths, more >> cases to take into consideration when refactoring, is burdensome. And to me, the >> benefits of benchmarking against Epsilon vs benchmarking against Serial/Parallel >> isn't that high to me. >> >> But, I can understand that it is useful when trying to evaluate for example the >> cost of stores into a HashMap. Which is why I'm not against the code, but I'm >> not keen on exposing this to non-JVM developers. > I hear you, but thing is, Epsilon does not seem a coding exercise anymore. > Epsilon is useful for GC performance work especially when readily available, and > there are willing users to adopt it. Similarly how we respect maintainers' > burden in the product, we have to also see what benefits users, especially the > ones who are championing our project performance even by cutting corners with > e.g. no-op GCs. Yes, you always have to weigh the benefits against the costs, and in this case, exposing Epsilon GC to non-JVM developers seems, at least for now and to me, that the benefits do not outweigh the costs. Who knows, maybe this will change and we redo the cost/benefit analysis? It is very easy to go from developer flag to experimental flag, it is way, way harder to go from experimental flag to developer flag.
Thanks, Erik > Thanks, > -Aleksey > From shade at redhat.com Tue Jul 18 15:41:21 2017 From: shade at redhat.com (Aleksey Shipilev) Date: Tue, 18 Jul 2017 17:41:21 +0200 Subject: RFC: Epsilon GC JEP In-Reply-To: References: <67f6d4a2-d129-1491-4906-473586dc6680@redhat.com> <621d6f35-617c-d603-3159-cd537831e66e@oracle.com> <858737aa-b8b1-dfdf-a099-1e0decb706ab@redhat.com> <8f9b4995-f687-47c8-30e0-5cae513b8947@oracle.com> Message-ID: <3fd29a77-7f0e-070b-8abd-a4f7ea29ecc5@redhat.com> Hi Erik, I think we are coming to a consensus here. Piece-wise: On 07/18/2017 05:22 PM, Erik Helin wrote: > That is not what I said. As I wrote above: > >> but AFAIK almost all users prefer the slight >> overhead of dirtying a card (and in return get a generational GC) for >> the use cases where a single-gen mark-compact algorithm would be >> applicable. > > There are of course use cases for single-gen mark-sweep algorithms, and as I > write above, for single-gen mark-compact algorithms as well. But for Java, and > OpenJDK, at least it is my understanding that most users prefer a generational > algorithm like Serial compared to a single-gen mark-compact algorithm (at least > I have not seen a lot of users asking for that). But maybe I'm missing something > here? Mmm, "prefer" is not the same as "have no other option than trust JVM developers that generational is better for their workloads, and having no energy to try to build the collector proving otherwise". Because there is no collector in OpenJDK that avoids generational barriers. Saying "prefer" here is very, very odd. > No, I don't view it that way. Having the code in the upstream repository and > having it exposed in binary builds are two very different things to me, and > comes with very different requirements in terms of maintenance. If the code is > in the upstream repository, then it is a tool for developers working in OpenJDK > and for integrators building OpenJDK.
We have a much easier time changing such > code compared to code that users have come to rely on (and expect certain > behavior from). Okay. I am still quite a bit puzzled why "experimental" comes with any notion of supportability, compatibility, testing coverage, etc. I don't think most of current experimental options declared in globals.hpp come with that in mind. In fact, many are even marked with "(Unsafe) (Unstable)"... >> I hear you, but thing is, Epsilon does not seem a coding exercise anymore. >> Epsilon is useful for GC performance work especially when readily available, and >> there are willing users to adopt it. Similarly how we respect maintainers' >> burden in the product, we have to also see what benefits users, especially the >> ones who are championing our project performance even by cutting corners with >> e.g. no-op GCs. > > Yes, you always have to weigh the benefits against the costs, and in this case, > exposing Epsilon GC to non-JVM developers seems, at least for now and to me, > that the benefits do not outweigh the costs. Who knows, maybe this will change > and we redo the cost/benefit analysis? It is very easy to go from developer flag > to experimental flag, it is way, way harder to go from experimental flag to > developer flag. Okay, that sounds like a compromise to me: push Epsilon under "develop" flag, and then ask users or downstreams to switch it to "product" if they want. This is not ideal, but it works. Does that resolve your concerns? Thanks, -Aleksey -------------- next part -------------- A non-text attachment was scrubbed...
Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: OpenPGP digital signature URL: From erik.helin at oracle.com Wed Jul 19 09:17:41 2017 From: erik.helin at oracle.com (Erik Helin) Date: Wed, 19 Jul 2017 11:17:41 +0200 Subject: RFC: Epsilon GC JEP In-Reply-To: <3fd29a77-7f0e-070b-8abd-a4f7ea29ecc5@redhat.com> References: <67f6d4a2-d129-1491-4906-473586dc6680@redhat.com> <621d6f35-617c-d603-3159-cd537831e66e@oracle.com> <858737aa-b8b1-dfdf-a099-1e0decb706ab@redhat.com> <8f9b4995-f687-47c8-30e0-5cae513b8947@oracle.com> <3fd29a77-7f0e-070b-8abd-a4f7ea29ecc5@redhat.com> Message-ID: <34ae4e22-c1aa-ff99-1a0d-9fec183280b9@oracle.com> On 07/18/2017 05:41 PM, Aleksey Shipilev wrote: >> Yes, you always have to weigh the benefits against the costs, and in this case, >> exposing Epsilon GC to non-JVM developers seems, at least for now and to me, >> that the benefits do not outweigh the costs. Who knows, maybe this will change >> and we redo the cost/benefit analysis? It is very easy to go from developer flag >> to experimental flag, it is way, way harder to go from experimental flag to >> developer flag. > > Okay, that sounds like a compromise to me: push Epsilon under "develop" flag, > and then ask users or downstreams to switch it to "product" if they want. This > is not ideal, but it works. Does that resolve your concerns? Yep, I would prefer it to be a develop flag. Will you update the JEP to reflect this?
Thanks, Erik > Thanks, > -Aleksey > From thomas.schatzl at oracle.com Wed Jul 19 09:27:05 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Wed, 19 Jul 2017 11:27:05 +0200 Subject: RFC: Epsilon GC JEP In-Reply-To: <3fd29a77-7f0e-070b-8abd-a4f7ea29ecc5@redhat.com> References: <67f6d4a2-d129-1491-4906-473586dc6680@redhat.com> <621d6f35-617c-d603-3159-cd537831e66e@oracle.com> <858737aa-b8b1-dfdf-a099-1e0decb706ab@redhat.com> <8f9b4995-f687-47c8-30e0-5cae513b8947@oracle.com> <3fd29a77-7f0e-070b-8abd-a4f7ea29ecc5@redhat.com> Message-ID: <1500456425.2870.36.camel@oracle.com> Hi Aleksey, On Tue, 2017-07-18 at 17:41 +0200, Aleksey Shipilev wrote: > Hi Erik, > > I think we are coming to a consensus here. > > Piece-wise: > > On 07/18/2017 05:22 PM, Erik Helin wrote: > > > > No, I don't view it that way. Having the code in the upstream > > repository and having it exposed in binary builds are two very > > different things to me, and comes with very different requirements > > in terms of maintenance. If the code is in the upstream repository, > > then it is a tool for developers working in OpenJDK and for > > integrators building OpenJDK. We have a much easier time changing > > such code compared to code that users have come to rely on (and > > expect certain behavior from). > > Okay. I am still quite a bit puzzled why "experimental" comes with > any notion of supportability, compatibility, testing coverage, etc. Every option that is exposed to the user in the product build is part of the public API, and so must be supported similarly to other options. An experimental option is just another "official" interface to the user as described by the CSR wiki page [1]. Just consider this: a security issue in an experimental option is just as much a security issue in the product as any other. Since we do not want to wait for that to happen, it needs the same support and testing as any other.
Experimental options are (at least in the GC group) more obscure options that help you shoot yourself in the foot, performance-wise, if you fiddle too much with them :) So the use of -XX:+UseExperimentalVMOptions is more an acknowledgment that you are really sure you want to do that. They may still be required for some users for application (what we think are) corner cases that are not (yet?) handled well automatically by the VM. Or as alternatives for other product options that only apply to e.g. a single collector. Or just mislabelled as such. > I don't think most of current experimental options declared in > globals.hpp come with that in mind. In fact, many are even marked > with "(Unsafe) (Unstable)"... The VM is a very old project, from before when terms like "unit testing", "code coverage" and related were a thing. Around 28 of those remaining out of 1729 in globals.hpp does not sound too bad. Could be better of course (also the actual number of switches ;)). Also I am not sure whether they are actually unsafe and unstable any more. Thanks, Thomas [1] https://wiki.openjdk.java.net/display/csr/ ; there is a more detailed, likely provisional guide [2] covering options a bit more. [2] http://cr.openjdk.java.net/~darcy/OpenJdkDevGuide/OpenJdkDevelopersGuide.v0.777.html#kinds_of_interfaces From shade at redhat.com Wed Jul 19 10:32:08 2017 From: shade at redhat.com (Aleksey Shipilev) Date: Wed, 19 Jul 2017 12:32:08 +0200 Subject: RFC: Epsilon GC JEP In-Reply-To: <1500456425.2870.36.camel@oracle.com> References: <67f6d4a2-d129-1491-4906-473586dc6680@redhat.com> <621d6f35-617c-d603-3159-cd537831e66e@oracle.com> <858737aa-b8b1-dfdf-a099-1e0decb706ab@redhat.com> <8f9b4995-f687-47c8-30e0-5cae513b8947@oracle.com> <3fd29a77-7f0e-070b-8abd-a4f7ea29ecc5@redhat.com> <1500456425.2870.36.camel@oracle.com> Message-ID: <384f94c6-96c3-20f0-2ea2-a9fafd29d99c@redhat.com> On 07/19/2017 11:27 AM, Thomas Schatzl wrote: >> Okay.
I am still quite a bit puzzled why "experimental" comes with >> any notion of supportability, compatibility, testing coverage, etc. > > Every option that is exposed to the user in the product build is part > of the public API, and so must be supported similar to other options. > An experimental option is just another "official" interface to the user > as described by the CSR wiki page [1]. > > Just consider this: a security issue in an experimental option is just > as much a security issue in the product as any other. Since we do not > want to wait that to happen, it needs the same support and testing as > any other. But, but... the definition in globals.hpp: // experimental flags are in support of features that ***are not // part of the officially supported product***, but are available // for experimenting with. They could, for example, be performance // features that ***may not have undergone full or rigorous QA***, but which may // help performance in some cases and released for experimentation // by the community of users and developers. This flag also allows one to // be able to build a fully supported product that nonetheless also // ships with some ***unsupported, lightly tested***, experimental features. // Like the UnlockDiagnosticVMOptions flag above, there is a corresponding // UnlockExperimentalVMOptions flag, which allows the control and // modification of the experimental flags. (emphasis mine) Are you saying that GC group makes that definition stronger by saying experimental flags are like product functional-stability-wise, but not performance-wise? So, that means GC group runs the functional testing with every combination of experimental options? Thanks, -Aleksey -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: OpenPGP digital signature URL: From shade at redhat.com Wed Jul 19 12:12:28 2017 From: shade at redhat.com (Aleksey Shipilev) Date: Wed, 19 Jul 2017 14:12:28 +0200 Subject: RFC: Epsilon GC JEP In-Reply-To: <34ae4e22-c1aa-ff99-1a0d-9fec183280b9@oracle.com> References: <67f6d4a2-d129-1491-4906-473586dc6680@redhat.com> <621d6f35-617c-d603-3159-cd537831e66e@oracle.com> <858737aa-b8b1-dfdf-a099-1e0decb706ab@redhat.com> <8f9b4995-f687-47c8-30e0-5cae513b8947@oracle.com> <3fd29a77-7f0e-070b-8abd-a4f7ea29ecc5@redhat.com> <34ae4e22-c1aa-ff99-1a0d-9fec183280b9@oracle.com> Message-ID: <3438f311-80e4-8e12-3e58-a8a0f7750858@redhat.com> On 07/19/2017 11:17 AM, Erik Helin wrote: > On 07/18/2017 05:41 PM, Aleksey Shipilev wrote: >>> Yes, you always have to weigh the benefits against the costs, and in this case, >>> exposing Epsilon GC to non-JVM developers seems, at least for now and to me, >>> that the benefits do not outweigh the costs. Who knows, maybe this will change >>> and we redo the cost/benefit analysis? It is very easy to go from developer flag >>> to experimental flag, it is way, way harder to go from experimental flag to >>> developer flag. >> >> Okay, that sounds like a compromise to me: push Epsilon under "develop" flag, >> and then ask users or downstreams to switch it to "product" if they want. This >> is not ideal, but it works. Does that resolve your concerns? > > Yep, I would prefer it to be a develop flag. Will you update the JEP to reflect > this? Updated. Better yet, the implementation is updated to make Epsilon 'develop'. Which required some trickery to make the tests pass with release builds, and survive changing the flag back to 'product' or 'experimental' without omitting the tests. Also, my build servers now patch Epsilon builds back to 'experimental'. Cheers, -Aleksey -------------- next part -------------- A non-text attachment was scrubbed...
Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: OpenPGP digital signature URL: From mikael.gerdin at oracle.com Wed Jul 19 14:52:51 2017 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Wed, 19 Jul 2017 16:52:51 +0200 Subject: RFR (S/M): 8184348: Merge G1ConcurrentMark::par_mark() and G1ConcurrentMark::grayRoot() In-Reply-To: <1500042870.3458.84.camel@oracle.com> References: <1500031158.3458.41.camel@oracle.com> <1500034800.3458.75.camel@oracle.com> <1500042870.3458.84.camel@oracle.com> Message-ID: <18f703d2-fb15-6ae8-affa-9d4ea11c85e1@oracle.com> Hi Thomas, On 2017-07-14 16:34, Thomas Schatzl wrote: > Hi again, > > On Fri, 2017-07-14 at 15:18 +0200, Aleksey Shipilev wrote: >> On 07/14/2017 02:20 PM, Thomas Schatzl wrote: >>> >>> Not completely sure what you are referring to, but I split some >>> very >>> long asserts across lines. >> Yes, I meant that, sorry for not being clear. Any webrev that >> requires me to scroll horizontally on 2560-pixel wide screen triggers >> me! > > I noticed that too :) > >>>> >>>> *) So, mark_reference_grey used to be called from >>>> G1CMSATBBufferClosure on >>>> objects below TAMS, but now it would get called on objects past >>>> TAMS >>>> too? >>> CMTask::make_reference_grey() now calls >>> G1ConcurrentMark::mark_in_next_bitmap(), not >>> ConcurrentMark::par_mark() >>> which does not exist any more: >>> G1ConcurrentMark::mark_in_next_bitmap() >>> in the first check filters out marking attempts above nTAMS >>> (g1ConcurrentMark.inline.hpp:47 now), returning false, which makes >>> make_reference_grey() exit immediately in that case. This seems to >>> achieve the same effect. >> Ah, I missed that part! I agree this part is fine then. 
>> >>> If you are worried whether there is a performance difference >>> because maybe now we do more work in some cases, all paths >>> previously leading to the former G1ConcurrentMark::par_mark() did >>> the nTAMS check in one way or another already (of course in >>> inconsistent fashion) so there should be no change here. >> No, I am not worried. SATB-heavy workloads have problems way beyond >> bitmap marking :) >> >>> >>> New webrevs: >>> http://cr.openjdk.java.net/~tschatzl/8184348/webrev.1/ (full) >> Looks good to me. > > Thanks. Unfortunately, after re-applying and fixing other changes based > on this one I noticed that I missed one opportunity to refactor in > G1CMTask::deal_with_reference(). I would like to add this to this > changeset still... sorry. > > There is some note about some perf optimization that mentions that it > is advantageous to do the nTAMS check before determining the heap > region; however I do not think this is an issue. > > Quickly comparing runs of a fairly large and reference-intensive > workload (BigRAMTester with 20g heap e.g. attached to JDK-8152438), > marking cycles with the latest webrev.2 are at least as fast as without > any of this RFR's changes. > > New webrevs: > http://cr.openjdk.java.net/~tschatzl/8184348/webrev.1_to_2 (diff) > http://cr.openjdk.java.net/~tschatzl/8184348/webrev.2 (full) Looks good to me.
/Mikael > > Thanks, > Thomas > From thomas.schatzl at oracle.com Wed Jul 19 14:58:13 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Wed, 19 Jul 2017 16:58:13 +0200 Subject: RFR (S/M): 8184348: Merge G1ConcurrentMark::par_mark() and G1ConcurrentMark::grayRoot() In-Reply-To: <18f703d2-fb15-6ae8-affa-9d4ea11c85e1@oracle.com> References: <1500031158.3458.41.camel@oracle.com> <1500034800.3458.75.camel@oracle.com> <1500042870.3458.84.camel@oracle.com> <18f703d2-fb15-6ae8-affa-9d4ea11c85e1@oracle.com> Message-ID: <1500476293.2568.0.camel@oracle.com> Hi Mikael, On Wed, 2017-07-19 at 16:52 +0200, Mikael Gerdin wrote: > Hi Thomas, > > On 2017-07-14 16:34, Thomas Schatzl wrote: > > > > New webrevs: > > http://cr.openjdk.java.net/~tschatzl/8184348/webrev.1_to_2 (diff) > > http://cr.openjdk.java.net/~tschatzl/8184348/webrev.2 (full) > Looks good to me. > /Mikael thanks for your review. Thomas From milan.mimica at gmail.com Wed Jul 19 17:05:28 2017 From: milan.mimica at gmail.com (Milan Mimica) Date: Wed, 19 Jul 2017 17:05:28 +0000 Subject: [PATCH] JDK-8176571: Fine bitmaps should be allocated as belonging to mtGC Message-ID: Hello I'm resending the two patches (JDK-8176571, JDK-8182169) from my new email address which I will be using from now on in this ML. I was notified my OCA has been approved. The patches have previously been discussed and generally approved. I recreated them against the recent tip, and also removed overloaded constructors from CHeapBitmap by using default parameters, as discussed. -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: heapBitMap_nmt.diff Type: text/x-patch Size: 16567 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed...
Name: refactor_array_allocator.diff Type: text/x-patch Size: 11799 bytes Desc: not available URL: From email.sundarms at gmail.com Wed Jul 19 20:24:50 2017 From: email.sundarms at gmail.com (Sundara Mohan M) Date: Wed, 19 Jul 2017 13:24:50 -0700 Subject: G1MonitoringSupport unused generation counter Message-ID: Hi, Was trying to understand why old generation mx bean was notified in case of G1GC and saw following code G1MonitoringSupport.hpp // young collection set counters. The _eden_counters, // _from_counters, and _to_counters are associated with // this "generational" counter. GenerationCounters* _young_collection_counters; // old collection set counters. The _old_space_counters // below are associated with this "generational" counter. GenerationCounters* _old_collection_counters; I don't see these counters updated anywhere. What is the use of these counters in G1GC. only following is updated in g1CollectedHeap.cpp // incremental collections both young and mixed CollectorCounters* _incremental_collection_counters; // full stop-the-world collections CollectorCounters* _full_collection_counters; Is there any mail thread/doc which explains more about this. Thanks, Sundar -------------- next part -------------- An HTML attachment was scrubbed... URL: From thomas.schatzl at oracle.com Thu Jul 20 07:37:14 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Thu, 20 Jul 2017 09:37:14 +0200 Subject: [PATCH] JDK-8176571: Fine bitmaps should be allocated as belonging to mtGC In-Reply-To: References: Message-ID: <1500536234.2924.0.camel@oracle.com> Hi Milan, On Wed, 2017-07-19 at 17:05 +0000, Milan Mimica wrote: > Hello > > I'm resending the two patches (JDK-8176571, JDK-8182169) from my new > email address which I will be using from now on in this ML. I was > notified my OCA has been approved. > > The patches have previously been discussed and generally approved. 
I > recreated them against the recent tip, and also removed overloaded > constructors from CHeapBitmap by using default parameters, as > discussed. Great! Looks good. I can sponsor as soon as Kim or anybody else gives his okay. Thanks, Thomas From erik.helin at oracle.com Thu Jul 20 08:44:38 2017 From: erik.helin at oracle.com (Erik Helin) Date: Thu, 20 Jul 2017 10:44:38 +0200 Subject: RFR: 8179387: Factor out CMS specific code from GenCollectedHeap into its own subclass In-Reply-To: <088d467c-8038-60bc-1eab-b34061ad20d9@redhat.com> References: <3521009f-6fab-4f8e-2375-b9d665a4c70b@redhat.com> <3d8b55a2-a787-3051-b351-ab9b0a24f5e0@redhat.com> <47e22e86-7d7c-606f-1936-346229f39ca2@oracle.com> <9a846161-c8ac-dedf-5952-f457d546fd9a@redhat.com> <4d5e6af8-d975-7803-64c5-7295e0d56154@redhat.com> <13358626-e399-e352-1711-587416621aac@redhat.com> <27af0ad2-fe78-3536-2143-996dd42583ab@oracle.com> <4bc53aaa-b98a-8a61-73bf-d30ac3f402b8@redhat.com> <666af7f2-27e9-48c6-91e4-eaefa5289e18@redhat.com> <3ec8a6a3-5a4b-a910-f6ec-ed1c0dad4cad@oracle.com> <5417889c-5289-37cd-eb31-a2b55f70e85e@redhat.com> <088d467c-8038-60bc-1eab-b34061ad20d9@redhat.com> Message-ID: <30452c37-794f-33f8-b9e5-1aba185c1a3d@oracle.com> On 07/17/2017 02:07 PM, Roman Kennke wrote: >>> Ok, added those and some more that I found. Not sure why we'd need >>> #include "gc/cms/concurrentMarkSweepGeneration.hpp" ? Left that out >>> for now. >> >> Because you are accessing CMSCollector in: >> >> 99 NOT_PRODUCT( >> 100 virtual size_t skip_header_HeapWords() { return >> CMSCollector::skip_header_HeapWords(); } >> 101 ) >> >> and CMSCollector is declared in concurrentMarkSweepGeneration.hpp. An >> alternative would of course be to just declare skip_header_HeapWords() >> in cmsHeap.hpp and define skip_header_HeapWords in cmsHeap.cpp, then >> you only need to include concurrentMarkSweepGeneration.hpp in >> cmsHeap.cpp. > Ah ok, I've missed that one. Where did you add it?
I don't see any include of "gc/cms/concurrentMarkSweepGeneration.hpp" in cmsHeap.hpp? Thanks, Erik From mikael.gerdin at oracle.com Thu Jul 20 09:55:35 2017 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Thu, 20 Jul 2017 11:55:35 +0200 Subject: RFR (S): 8183121: Add information about scanned and skipped cards during UpdateRS In-Reply-To: <1499861583.6693.3.camel@oracle.com> References: <1499861583.6693.3.camel@oracle.com> Message-ID: <0f0689fe-b532-9708-dca9-70d4ed01415b@oracle.com> Hi Thomas, On 2017-07-12 14:13, Thomas Schatzl wrote: > Hi all, > > can I have reviews for this small change that adds some information > about how many cards were scanned/skipped during Update RS. > > This information is much better than just the number of processed > buffers, although I kept them for now. > > This change is based on Erik's changes for JDK-8183539. > > CR: > https://bugs.openjdk.java.net/browse/JDK-8183121 > Webrev: > http://cr.openjdk.java.net/~tschatzl/8183121/webrev Looks fine to me. /Mikael > Testing: > jprt, test case > > Thanks, > Thomas > From thomas.schatzl at oracle.com Thu Jul 20 10:13:59 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Thu, 20 Jul 2017 12:13:59 +0200 Subject: RFR (S): 8183121: Add information about scanned and skipped cards during UpdateRS In-Reply-To: <0f0689fe-b532-9708-dca9-70d4ed01415b@oracle.com> References: <1499861583.6693.3.camel@oracle.com> <0f0689fe-b532-9708-dca9-70d4ed01415b@oracle.com> Message-ID: <1500545639.2924.2.camel@oracle.com> Hi, On Thu, 2017-07-20 at 11:55 +0200, Mikael Gerdin wrote: > Hi Thomas, > > On 2017-07-12 14:13, Thomas Schatzl wrote: > > > > Hi all, > > > > can I have reviews for this small change that adds some > > information > > about how many cards were scanned/skipped during Update RS. > > > > This information is much better than just the number of processed > > buffers, although I kept them for now. > > > > This change is based on Erik's changes for JDK-8183539.
> > CR: > > https://bugs.openjdk.java.net/browse/JDK-8183121 > > Webrev: > > http://cr.openjdk.java.net/~tschatzl/8183121/webrev > Looks fine to me. > /Mikael thanks for your review. Thomas From rkennke at redhat.com Thu Jul 20 10:46:34 2017 From: rkennke at redhat.com (Roman Kennke) Date: Thu, 20 Jul 2017 12:46:34 +0200 Subject: RFR: 8179387: Factor out CMS specific code from GenCollectedHeap into its own subclass In-Reply-To: <30452c37-794f-33f8-b9e5-1aba185c1a3d@oracle.com> References: <3521009f-6fab-4f8e-2375-b9d665a4c70b@redhat.com> <3d8b55a2-a787-3051-b351-ab9b0a24f5e0@redhat.com> <47e22e86-7d7c-606f-1936-346229f39ca2@oracle.com> <9a846161-c8ac-dedf-5952-f457d546fd9a@redhat.com> <4d5e6af8-d975-7803-64c5-7295e0d56154@redhat.com> <13358626-e399-e352-1711-587416621aac@redhat.com> <27af0ad2-fe78-3536-2143-996dd42583ab@oracle.com> <4bc53aaa-b98a-8a61-73bf-d30ac3f402b8@redhat.com> <666af7f2-27e9-48c6-91e4-eaefa5289e18@redhat.com> <3ec8a6a3-5a4b-a910-f6ec-ed1c0dad4cad@oracle.com> <5417889c-5289-37cd-eb31-a2b55f70e85e@redhat.com> <088d467c-8038-60bc-1eab-b34061ad20d9@redhat.com> <30452c37-794f-33f8-b9e5-1aba185c1a3d@oracle.com> Message-ID: <5bed5268-d690-9cdd-c2aa-e9b822687378@redhat.com> Am 20.07.2017 um 10:44 schrieb Erik Helin: > On 07/17/2017 02:07 PM, Roman Kennke wrote: >>>> Ok, added those and some more that I found. Not sure why we'd need >>>> #include "gc/cms/concurrentMarkSweepGeneration.hpp" ? Left that out >>>> for now. >>> >>> Because you are accessing CMSCollector in: >>> >>> 99 NOT_PRODUCT( >>> 100 virtual size_t skip_header_HeapWords() { return >>> CMSCollector::skip_header_HeapWords(); } >>> 101 ) >>> >>> and CMSCollector is declared in concurrentMarkSweepGeneration.hpp. An >>> alternative would of course be to just declare skip_header_HeapWords() >>> in cmsHeap.hpp and define skip_header_HeapWords in cmsHeap.cpp, then >>> you only need to include concurrentMarkSweepGeneration.hpp in >>> cmsHeap.cpp. >> Ah ok, I've missed that one.
Added it now. > > Where did you add it? I don't see any include of > "gc/cms/concurrentMarkSweepGeneration.hpp" in cmsHeap.hpp? Hmm. I honestly don't know how that disappeared :-) Differential: http://cr.openjdk.java.net/~rkennke/8179387/webrev.09.diff/ Full: http://cr.openjdk.java.net/~rkennke/8179387/webrev.09/ I hope it's ok now. Cheers, Roman From rkennke at redhat.com Thu Jul 20 10:53:18 2017 From: rkennke at redhat.com (Roman Kennke) Date: Thu, 20 Jul 2017 12:53:18 +0200 Subject: RFR: Parallelize safepoint cleanup In-Reply-To: <07a5bf0a-02fa-7a8c-35be-813f5207cb0c@oracle.com> References: <6f2c6de7-298b-bf14-ab1f-430c4acd43c9@redhat.com> <5cd676de-872d-6d4a-691b-da561173f7d0@oracle.com> <61d80e98-275f-b2b8-4ac7-6d5d03b047de@redhat.com> <0e44cc90-b384-0820-93d6-a70d22c501c3@oracle.com> <20E06CEC-38CA-41AE-99DB-17EF22A3C5CC@oracle.com> <58f2278e-b95c-4ec2-4f7d-9fefa3a281e4@redhat.com> <623c0dbf-9210-7c63-3546-4314c7d47f85@redhat.com> <29521e46-a5e8-5ff0-23a2-22eeee145389@oracle.com> <4445a727-060b-70f9-c8db-e9f70faae3d5@redhat.com> <37755fec-05b9-8d2c-7eb9-8849393c7485@oracle.com> <57cddde0-60e6-366e-489e-f6f9534e3ed9@redhat.com> <6fa761bc-8feb-74e6-9a54-8a65ab81203b@oracle.com> <5af9855a-652e-64f0-af83-e8f5962247ca@oracle.com> <07a5bf0a-02fa-7a8c-35be-813f5207cb0c@oracle.com> Message-ID: Hi all, Robbin found some more missing includes in jprt testing (thanks!!) Differential: http://cr.openjdk.java.net/~rkennke/8180932/webrev.18.diff/ Full: http://cr.openjdk.java.net/~rkennke/8180932/webrev.18/ Am I breaking the record for most webrev revisions? :-P According to Robbin, builds are now all clean. Can I get final reviews and then a sponsor? Thanks, Roman Am 16.07.2017 um 10:25 schrieb Robbin Ehn: > Hi Roman, > > On 2017-07-12 15:32, Roman Kennke wrote: >> Hi Robbin and all, >> >> I fixed the 32bit failures by using jlong in all relevant places: >> >> http://cr.openjdk.java.net/~rkennke/8180932/webrev.14.diff/ >> >> >> then Robbin found another problem.
SafepointCleanupTest started to fail, >> because "mark nmethods" is no longer printed. This made me think that >> we're not measuring the conflated (and possibly parallelized) >> deflate-idle-monitors+mark-nmethods pass. I added a TraceTime with >> "safepoint cleanup tasks" which measures the total duration of safepoint >> cleanup. We can't reasonably measure a possibly parallel and conflated >> pass standalone, but we can measure all and by subtrating all the other >> subphases, get an idea how long deflation and nmethod marking take up. >> >> http://cr.openjdk.java.net/~rkennke/8180932/webrev.15.diff/ >> >> >> The full webrev is now: >> >> http://cr.openjdk.java.net/~rkennke/8180932/webrev.15/ >> >> >> Hope that's all ;-) > > With this changeset something always pop-ups. > > Failure reason: Targets failed. Target macosx_x64_10.9-fastdebug FAILED. > > /opt/jprt/jib-data/install/jpg/infra/builddeps/devkit-macosx_x64/Xcode6.3-MacOSX10.9+1.0/devkit-macosx_x64-Xcode6.3-MacOSX10.9+1.0.tar.gz/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/clang++ > -m64 -fPIC -D_GNU_SOURCE -flimit-debug-info -D__STDC_FORMAT_MACROS > -D__STDC_LIMIT_MACROS -D__STDC_CONSTANT_MACROS -D_ALLBSD_SOURCE > -D_DARWIN_C_SOURCE -D_XOPEN_SOURCE -fno-rtti -fno-exceptions > -fvisibility=hidden -mno-omit-leaf-frame-pointer -mstack-alignment=16 > -pipe -fno-strict-aliasing -DMAC_OS_X_VERSION_MAX_ALLOWED=1070 > -mmacosx-version-min=10.7.0 -fno-omit-frame-pointer -DVM_LITTLE_ENDIAN > -D_LP64=1 -Wno-deprecated -Wpointer-arith -Wsign-compare -Wundef > -Wunused-function -Wformat=2 -DASSERT -DCHECK_UNHANDLED_OOPS > -DTARGET_ARCH_x86 -DINCLUDE_SUFFIX_OS=_bsd -DINCLUDE_SUFFIX_CPU=_x86 > -DINCLUDE_SUFFIX_COMPILER=_gcc -DTARGET_COMPILER_gcc -DAMD64 > -DHOTSPOT_LIB_ARCH='"amd64"' -DCOMPILER1 -DCOMPILER2 -DDTRACE_ENABLED > -DINCLUDE_AOT > -I/opt/jprt/T/P1/193338.rehn/s/hotspot/src/closed/share/vm > -I/opt/j/opt/jprt/T/P1/193338.rehn/s/hotspot/src/share/vm/runtime/safepoint.cpp:654:22: > 
error: variable has incomplete type 'StrongRootsScope' > StrongRootsScope srs(num_cleanup_workers); > ^ > /opt/jprt/T/P1/193338.rehn/s/hotspot/src/share/vm/gc/shared/genCollectedHeap.hpp:33:7: > note: forward declaration of 'StrongRootsScope' > class StrongRootsScope; > ^ > /opt/jprt/T/P1/193338.rehn/s/hotspot/src/share/vm/runtime/safepoint.cpp:659:22: > error: variable has incomplete type 'StrongRootsScope' > StrongRootsScope srs(1); > ^ > /opt/jprt/T/P1/193338.rehn/s/hotspot/src/share/vm/gc/shared/genCollectedHeap.hpp:33:7: > note: forward declaration of 'StrongRootsScope' > class StrongRootsScope; > ^ > 2 errors generated. > make[3]: *** > [/opt/jprt/T/P1/193338.rehn/s/build/macosx-x64-debug/hotspot/variant-server/libjvm/objs/safepoint.o] > Error 1 > make[3]: *** Waiting for unfinished jobs.... > make[2]: *** [hotspot-server-libs] Error 2 > > Send me the new webrev and I'll test it before the 16th round of > review :) > > /Robbin > >> >> Roman >> >> Am 10.07.2017 um 21:22 schrieb Robbin Ehn: >>> Hi, unfortunately the push failed on 32-bit. >>> >>> (looks like _stack_traversal_mark should be jlong, I feel a bit guilty) >>> >>> I do not have anytime to look at this, so here is the error. 
>>> >>> /Robbin >>> >>> make[3]: Leaving directory '/opt/jprt/T/P1/185117.rehn/s/hotspot/make' >>> make/Main.gmk:263: recipe for target 'hotspot-client-libs' failed >>> In file included from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29:0, >>> >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33, >>> >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28: >>> >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp: In >>> member function 'long int nmethod::stack_traversal_mark()': >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:399:108: >>> >>> error: call of overloaded 'load_acquire(volatile long int*)' is >>> ambiguous >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:399:108: >>> >>> note: candidates are: >>> In file included from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/typeArrayOop.hpp:30:0, >>> >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/constantPool.hpp:32, >>> >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/method.hpp:34, >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/frame.hpp:28, >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/codeBlob.hpp:31, >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/compiledMethod.hpp:28, >>> >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:28, >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29, >>> >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33, >>> >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28: >>> >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:57:17: >>> >>> note: static jint OrderAccess::load_acquire(const volatile jint*) >>> >>> 
/opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:57:17: >>> >>> note: no known conversion for argument 1 from 'volatile long int*' >>> to 'const volatile jint* {aka const volatile int*}' >>> In file included from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/typeArrayOop.hpp:30:0, >>> >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/constantPool.hpp:32, >>> >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/method.hpp:34, >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/frame.hpp:28, >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/codeBlob.hpp:31, >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/compiledMethod.hpp:28, >>> >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:28, >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29, >>> >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33, >>> >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28: >>> >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:63:17: >>> >>> note: static juint OrderAccess::load_acquire(const volatile juint*) >>> >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:63:17: >>> >>> note: no known conversion for argument 1 from 'volatile long int*' >>> to 'const volatile juint* {aka const volatile unsigned int*}' >>> In file included from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29:0, >>> >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33, >>> >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28: >>> >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp: In >>> member function 'void 
nmethod::set_stack_traversal_mark(long int)': >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:400:105: >>> >>> error: call of overloaded 'release_store(volatile long int*, long >>> int&)' is ambiguous >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:400:105: >>> >>> note: candidates are: >>> In file included from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/typeArrayOop.hpp:30:0, >>> >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/constantPool.hpp:32, >>> >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/method.hpp:34, >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/frame.hpp:28, >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/codeBlob.hpp:31, >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/compiledMethod.hpp:28, >>> >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:28, >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29, >>> >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33, >>> >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28: >>> >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:71:17: >>> >>> note: static void OrderAccess::release_store(volatile jint*, jint) >>> >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:71:17: >>> >>> note: no known conversion for argument 1 from 'volatile long int*' >>> to 'volatile jint* {aka volatile int*}' >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:77:17: >>> >>> note: static void OrderAccess::release_store(volatile juint*, juint) >>> >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:77:17: >>> >>> note: no known conversion for argument 1 from 'volatile long int*' >>> to 
'volatile juint* {aka volatile unsigned int*}' >>> >>> On 2017-07-10 20:50, Robbin Ehn wrote: >>>> I'll start a push now. >>>> >>>> /Robbin >>>> >>>> On 2017-07-10 12:38, Roman Kennke wrote: >>>>> Ok, so I guess I need a sponsor for this now: >>>>> >>>>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.12/ >>>>> >>>>> >>>>> Roman >>>>> >>>>> Am 07.07.2017 um 20:09 schrieb Igor Veresov: >>>>>> >>>>>>> On Jul 7, 2017, at 4:23 AM, Robbin Ehn >>>>>> > wrote: >>>>>>> >>>>>>> Hi Roman, >>>>>>> >>>>>>> On 07/07/2017 12:51 PM, Roman Kennke wrote: >>>>>>>> Hi Robbin, >>>>>>>>> >>>>>>>>> Far down -> >>>>>>>>> >>>>>>>>> On 07/06/2017 08:05 PM, Roman Kennke wrote: >>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> I'm not happy about this change: >>>>>>>>>>> >>>>>>>>>>> + ~ParallelSPCleanupThreadClosure() { >>>>>>>>>>> + // This is here to be consistent with sweeper.cpp >>>>>>>>>>> NMethodSweeper::mark_active_nmethods(). >>>>>>>>>>> + // TODO: Is this really needed? >>>>>>>>>>> + OrderAccess::storestore(); >>>>>>>>>>> + } >>>>>>>>>>> >>>>>>>>>>> because we're adding an OrderAccess::storestore() to be >>>>>>>>>>> consistent >>>>>>>>>>> with an OrderAccess::storestore() that's not properly >>>>>>>>>>> documented >>>>>>>>>>> which is only increasing the technical debt. >>>>>>>>>>> >>>>>>>>>>> So a couple of things above don't make sense to me: >>>>>>>>>>> >>>>>>>>>>>> - sweeper thread runs outside safepoint >>>>>>>>>>>> - VMThread (which is doing the nmethod marking in the case >>>>>>>>>>>> that >>>>>>>>>>>> I'm looking at) runs while all other threads (incl. the >>>>>>>>>>>> sweeper) >>>>>>>>>>>> is holding still. >>>>>>>>>>> >>>>>>>>>>> and: >>>>>>>>>>> >>>>>>>>>>>> There should be no need for a storestore() (at least in >>>>>>>>>>>> sweeper.cpp... >>>>>>>>>> >>>>>>>>>> Either one or the other are running. Either the VMThread is >>>>>>>>>> marking >>>>>>>>>> nmethods (during safepoint) or the sweeper threads are running >>>>>>>>>> (outside >>>>>>>>>> safepoint). 
Between the two phases, there is a guaranteed >>>>>>>>>> OrderAccess::fence() (see safepoint.cpp). Therefore, no >>>>>>>>>> storestore() >>>>>>>>>> should be necessary. >>>>>>>>>> >>>>>>>>>> From Igor's comment I can see how it happened though: >>>>>>>>>> Apparently >>>>>>>>>> there >>>>>>>>>> *is* a race in sweeper's own concurrent processing (concurrent >>>>>>>>>> with >>>>>>>>>> compiler threads, as far as I understand). And there's a call to >>>>>>>>>> nmethod::mark_as_seen_on_stack() after which a storestore() is >>>>>>>>>> required >>>>>>>>>> (as per Igor's explanation). So the logic probably was: we have >>>>>>>>>> mark_as_seen_on_stack() followed by storestore() here, so let's >>>>>>>>>> also put >>>>>>>>>> a storestore() in the other places that call >>>>>>>>>> mark_as_seen_on_stack(), >>>>>>>>>> one of which happens to be the safepoint cleanup code that we're >>>>>>>>>> discussing. (why the storestore() hasn't been put right into >>>>>>>>>> mark_as_seen_on_stack() I don't understand). In short, one >>>>>>>>>> storestore() >>>>>>>>>> really was necessary, the other looks like it has been put there >>>>>>>>>> 'for >>>>>>>>>> consistency' or just conservatively. But it shouldn't be >>>>>>>>>> necessary in >>>>>>>>>> the safepoint cleanup code that we're discussing. >>>>>>>>>> >>>>>>>>>> So what should we do? Remove the storestore() for good? >>>>>>>>>> Refactor the >>>>>>>>>> code so that both paths at least call the storestore() in the >>>>>>>>>> same >>>>>>>>>> place? (E.g. make mark_active_nmethods() use the closure and >>>>>>>>>> call >>>>>>>>>> storestore() in the dtor as proposed?) >>>>>>>>> >>>>>>>>> I took a quick look, maybe I'm missing some stuff but: >>>>>>>>> >>>>>>>>> So there is a slight optimization when not running sweeper to >>>>>>>>> skip >>>>>>>>> compiler barrier/fence in stw. 
>>>>>>>>> >>>>>>>>> Don't think that matter, so I propose something like: >>>>>>>>> - long stack_traversal_mark() { return >>>>>>>>> _stack_traversal_mark; } >>>>>>>>> - void set_stack_traversal_mark(long l) { >>>>>>>>> _stack_traversal_mark = l; } >>>>>>>>> + long stack_traversal_mark() { return >>>>>>>>> OrderAccess::load_acquire(&_stack_traversal_mark); } >>>>>>>>> + void set_stack_traversal_mark(long l) { >>>>>>>>> OrderAccess::release_store(&_stack_traversal_mark, l); } >>>>>>>>> >>>>>>>>> Maybe make _stack_traversal_mark volatile also, just as a marking >>>>>>>>> that >>>>>>>>> it is concurrent accessed. >>>>>>>>> And remove both storestore. >>>>>>>>> >>>>>>>>> "Also neither of these state variables are volatile in >>>>>>>>> nmethod, so >>>>>>>>> even the compiler may reorder the stores" >>>>>>>>> Fortunately at least _state is volatile now. >>>>>>>>> >>>>>>>>> I think _state also should use la/rs semantics instead, but >>>>>>>>> that's >>>>>>>>> another story. >>>>>>>> Like this? >>>>>>>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.12/ >>>>>>>> >>>>>>>> >>>>>>> >>>>>>> Yes, exactly, I like this! >>>>>>> Dan? Igor ? Tobias? >>>>>>> >>>>>> >>>>>> That seems correct. >>>>>> >>>>>> igor >>>>>> >>>>>>> Thanks Roman! >>>>>>> >>>>>>> BTW I'm going on vacation (5w) in a few hours, but I will follow >>>>>>> this >>>>>>> thread/changeset to the end! >>>>>>> >>>>>>> /Robbin >>>>>>> >>>>>>>> Roman >>>>>> >>>>> >> From thomas.schatzl at oracle.com Thu Jul 20 11:06:31 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Thu, 20 Jul 2017 13:06:31 +0200 Subject: RFR (M): 8184346: Clean up G1CMBitmap In-Reply-To: <5bdc1a11-1159-7e81-ca31-fd96719f89c0@redhat.com> References: <1500024904.3458.8.camel@oracle.com> <6c0d1cca-6c08-0bda-f980-d3fe20e663ff@redhat.com> <1500029912.3458.26.camel@oracle.com> <5bdc1a11-1159-7e81-ca31-fd96719f89c0@redhat.com> Message-ID: <1500548791.2924.6.camel@oracle.com> Hi all, ? 
Erik and Mikael had a look at it and suggested several further cleanups,
removing about 40 LOC. These included:

- instead of G1CMBitMapRO use properly const'ified G1CMBitmaps
- change the _start and _word_size members into an equivalent MemRegion
- minor cleanups, removing obsolete asserts.

Webrevs:
http://cr.openjdk.java.net/~tschatzl/8184346/webrev.1_to_2/ (diff)
http://cr.openjdk.java.net/~tschatzl/8184346/webrev.2/ (full)

Testing: jprt

Thanks,
  Thomas

From mikael.gerdin at oracle.com  Thu Jul 20 13:04:11 2017
From: mikael.gerdin at oracle.com (Mikael Gerdin)
Date: Thu, 20 Jul 2017 15:04:11 +0200
Subject: Request for Comments: 8184734: Rework G1 root scanning to avoid
	multiple CLD passes
Message-ID: 

Hi all,

Please review this preliminary change to clean up G1 root processing a
bit. I've not run this through a lot of testing but this will give you a
general idea about where I think we should be going.

The basic idea is explained in the bug text but I'll reproduce it here
as well:

> After JDK-8154580 we no longer need the multi-pass CLD scanning in G1.
> The reason for this is that classes which are strongly reachable from interpreter frames are kept alive by marking the mirror in the initial mark pause.
>
> The current solution to this was to first ensure that in an initial step all CLDs which were strongly reachable had to be scanned and claimed before any weakly reachable CLDs could be scanned and claimed. This code can now be simplified and we can walk all the CLDs in one go, only doing strong marking on the ones which are strong as per always_strong_cld_do.
> This cleanup also allows us to remove the claimed marks clearing since CLD scanning is now completely single threaded.
>
> Waiting for strong classes to be discovered is still needed for the case where an nmethod on the stack is the single root to a class.
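The single-pass scheme described above can be sketched in miniature. This is an illustrative model only, not the actual HotSpot code: `ClassLoaderData` here is a simplified stand-in, the effect of `always_strong_cld_do` is reduced to a boolean flag, and `single_pass_cld_do` is a made-up name.

```cpp
#include <cassert>
#include <vector>

// Stand-in for HotSpot's CLD: "strong" models what always_strong_cld_do
// would select, "claimed" models the per-CLD claim mark.
struct ClassLoaderData {
    bool strong;
    bool claimed;
};

struct ScanCounts {
    int strong_scans;
    int weak_scans;
};

// One walk over all CLDs: every CLD is scanned exactly once, and only the
// strong ones additionally get the strong treatment (marking). This stands
// in for the old two-pass scheme (claim and scan all strong CLDs first,
// then the weakly reachable ones), which also required clearing the
// claimed marks between passes.
ScanCounts single_pass_cld_do(std::vector<ClassLoaderData>& clds) {
    ScanCounts counts = {0, 0};
    for (ClassLoaderData& cld : clds) {
        if (cld.claimed) {
            continue;              // already visited in this cycle
        }
        cld.claimed = true;
        if (cld.strong) {
            counts.strong_scans++; // scan + mark (strong closure)
        } else {
            counts.weak_scans++;   // scan only (weak closure)
        }
    }
    return counts;
}
```

Because the strong/weak decision is made per CLD inside a single traversal, there is no ordering between passes to enforce, which is why the claimed-marks clearing step can go away once the scanning is single threaded.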
Webrev: http://cr.openjdk.java.net/~mgerdin/8184734/webrev.0/ Bug: https://bugs.openjdk.java.net/browse/JDK-8184734 Testing: jprt, some local tonga tests, kitchensink and runThese Suggestions on further testing would be much appreciated! Thanks /Mikael From rkennke at redhat.com Thu Jul 20 14:58:51 2017 From: rkennke at redhat.com (Roman Kennke) Date: Thu, 20 Jul 2017 16:58:51 +0200 Subject: RFR (M): 8184346: Clean up G1CMBitmap In-Reply-To: <1500034180.3458.67.camel@oracle.com> References: <1500024904.3458.8.camel@oracle.com> <6c0d1cca-6c08-0bda-f980-d3fe20e663ff@redhat.com> <1500029912.3458.26.camel@oracle.com> <990f6578-14d4-322d-7f51-9b93d92f8b20@redhat.com> <1500034180.3458.67.camel@oracle.com> Message-ID: <1e6c2b24-63fe-cf5e-1635-990852c63a65@redhat.com> Am 14.07.2017 um 14:09 schrieb Thomas Schatzl: > Hi Roman, > > On Fri, 2017-07-14 at 13:24 +0200, Roman Kennke wrote: >> Am 14.07.2017 um 13:12 schrieb Aleksey Shipilev: >>> Hi Thomas, >>> >>> On 07/14/2017 12:58 PM, Thomas Schatzl wrote: >>>>>> The next CR JDK-8184347 will deal with moving G1CMBitmap* >>>>>> into separate files. >>>>> And while you're at it, you may want to move it to gc/shared >>>>> and renamed it to something like MarkBitmap? >>>>> https://bugs.openjdk.java.net/browse/JDK-8180193 >>>>> >>>> Not particularly against this change, but I think we should do >>>> the move and renaming separately when the change is actually >>>> required, i.e. just before there is another dependency on it. >>> I think this would be inconvenient, because when "another >>> dependency" would come in a large webrev, it would have to include >>> the CMBitmap move too, complicating reviews. >> I understood it such that we would post the moving around of gc/g1 >> files to gc/shared right before we'd post Shenandoah (in the not-so- >> distant future, hopefully). That would work for me. I wouldn't like >> to include everything in a giant webrev :-P >> > that is exactly what I meant - thanks for your understanding. 
> > Thomas > I just found out that CMS has its own bitmap class too, and it looks mostly like a copy of the G1 bitmap class :-) So that would be another user of a gc/shared bitmap class in the future. Roman From daniel.daugherty at oracle.com Thu Jul 20 16:43:34 2017 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Thu, 20 Jul 2017 10:43:34 -0600 Subject: RFR: Parallelize safepoint cleanup In-Reply-To: References: <5cd676de-872d-6d4a-691b-da561173f7d0@oracle.com> <61d80e98-275f-b2b8-4ac7-6d5d03b047de@redhat.com> <0e44cc90-b384-0820-93d6-a70d22c501c3@oracle.com> <20E06CEC-38CA-41AE-99DB-17EF22A3C5CC@oracle.com> <58f2278e-b95c-4ec2-4f7d-9fefa3a281e4@redhat.com> <623c0dbf-9210-7c63-3546-4314c7d47f85@redhat.com> <29521e46-a5e8-5ff0-23a2-22eeee145389@oracle.com> <4445a727-060b-70f9-c8db-e9f70faae3d5@redhat.com> <37755fec-05b9-8d2c-7eb9-8849393c7485@oracle.com> <57cddde0-60e6-366e-489e-f6f9534e3ed9@redhat.com> <6fa761bc-8feb-74e6-9a54-8a65ab81203b@oracle.com> <5af9855a-652e-64f0-af83-e8f5962247ca@oracle.com> <07a5bf0a-02fa-7a8c-35be-813f5207cb0c@oracle.com> Message-ID: <0c200752-6e45-6132-d937-4e9429ed9f95@oracle.com> On 7/20/17 4:53 AM, Roman Kennke wrote: > Hi all, > > Robbin found some more missing includes in jprt testing (thanks!!) > > Differential: > http://cr.openjdk.java.net/~rkennke/8180932/webrev.18.diff/ > > Full: > http://cr.openjdk.java.net/~rkennke/8180932/webrev.18/ > > > Am I breaking the record for most webrev revisions? :-P > > According the Robbin, builds are now all clean. > > Can I get final reviews and then a sponsor? src/share/vm/runtime/safepoint.cpp No comments. Only reviewed the one file that changed since webrev.15. Thumbs up! 
Dan

>
> Thanks,
> Roman
>
> Am 16.07.2017 um 10:25 schrieb Robbin Ehn:
>> Hi Roman,
>>
>> On 2017-07-12 15:32, Roman Kennke wrote:
>>> Hi Robbin and all,
>>>
>>> I fixed the 32bit failures by using jlong in all relevant places:
>>>
>>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.14.diff/
>>>
>>>
>>> then Robbin found another problem. SafepointCleanupTest started to fail,
>>> because "mark nmethods" is no longer printed. This made me think that
>>> we're not measuring the conflated (and possibly parallelized)
>>> deflate-idle-monitors+mark-nmethods pass. I added a TraceTime with
>>> "safepoint cleanup tasks" which measures the total duration of safepoint
>>> cleanup. We can't reasonably measure a possibly parallel and conflated
>>> pass standalone, but we can measure all and by subtracting all the other
>>> subphases, get an idea how long deflation and nmethod marking take up.
>>>
>>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.15.diff/
>>>
>>>
>>> The full webrev is now:
>>>
>>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.15/
>>>
>>>
>>> Hope that's all ;-)
>> With this changeset something always pops up.
>>
>> Failure reason: Targets failed. Target macosx_x64_10.9-fastdebug FAILED.
>> >> /opt/jprt/jib-data/install/jpg/infra/builddeps/devkit-macosx_x64/Xcode6.3-MacOSX10.9+1.0/devkit-macosx_x64-Xcode6.3-MacOSX10.9+1.0.tar.gz/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/clang++ >> -m64 -fPIC -D_GNU_SOURCE -flimit-debug-info -D__STDC_FORMAT_MACROS >> -D__STDC_LIMIT_MACROS -D__STDC_CONSTANT_MACROS -D_ALLBSD_SOURCE >> -D_DARWIN_C_SOURCE -D_XOPEN_SOURCE -fno-rtti -fno-exceptions >> -fvisibility=hidden -mno-omit-leaf-frame-pointer -mstack-alignment=16 >> -pipe -fno-strict-aliasing -DMAC_OS_X_VERSION_MAX_ALLOWED=1070 >> -mmacosx-version-min=10.7.0 -fno-omit-frame-pointer -DVM_LITTLE_ENDIAN >> -D_LP64=1 -Wno-deprecated -Wpointer-arith -Wsign-compare -Wundef >> -Wunused-function -Wformat=2 -DASSERT -DCHECK_UNHANDLED_OOPS >> -DTARGET_ARCH_x86 -DINCLUDE_SUFFIX_OS=_bsd -DINCLUDE_SUFFIX_CPU=_x86 >> -DINCLUDE_SUFFIX_COMPILER=_gcc -DTARGET_COMPILER_gcc -DAMD64 >> -DHOTSPOT_LIB_ARCH='"amd64"' -DCOMPILER1 -DCOMPILER2 -DDTRACE_ENABLED >> -DINCLUDE_AOT >> -I/opt/jprt/T/P1/193338.rehn/s/hotspot/src/closed/share/vm >> -I/opt/j/opt/jprt/T/P1/193338.rehn/s/hotspot/src/share/vm/runtime/safepoint.cpp:654:22: >> error: variable has incomplete type 'StrongRootsScope' >> StrongRootsScope srs(num_cleanup_workers); >> ^ >> /opt/jprt/T/P1/193338.rehn/s/hotspot/src/share/vm/gc/shared/genCollectedHeap.hpp:33:7: >> note: forward declaration of 'StrongRootsScope' >> class StrongRootsScope; >> ^ >> /opt/jprt/T/P1/193338.rehn/s/hotspot/src/share/vm/runtime/safepoint.cpp:659:22: >> error: variable has incomplete type 'StrongRootsScope' >> StrongRootsScope srs(1); >> ^ >> /opt/jprt/T/P1/193338.rehn/s/hotspot/src/share/vm/gc/shared/genCollectedHeap.hpp:33:7: >> note: forward declaration of 'StrongRootsScope' >> class StrongRootsScope; >> ^ >> 2 errors generated. >> make[3]: *** >> [/opt/jprt/T/P1/193338.rehn/s/build/macosx-x64-debug/hotspot/variant-server/libjvm/objs/safepoint.o] >> Error 1 >> make[3]: *** Waiting for unfinished jobs.... 
>> make[2]: *** [hotspot-server-libs] Error 2 >> >> Send me the new webrev and I'll test it before the 16th round of >> review :) >> >> /Robbin >> >>> Roman >>> >>> Am 10.07.2017 um 21:22 schrieb Robbin Ehn: >>>> Hi, unfortunately the push failed on 32-bit. >>>> >>>> (looks like _stack_traversal_mark should be jlong, I feel a bit guilty) >>>> >>>> I do not have anytime to look at this, so here is the error. >>>> >>>> /Robbin >>>> >>>> make[3]: Leaving directory '/opt/jprt/T/P1/185117.rehn/s/hotspot/make' >>>> make/Main.gmk:263: recipe for target 'hotspot-client-libs' failed >>>> In file included from >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29:0, >>>> >>>> from >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33, >>>> >>>> from >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28: >>>> >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp: In >>>> member function 'long int nmethod::stack_traversal_mark()': >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:399:108: >>>> >>>> error: call of overloaded 'load_acquire(volatile long int*)' is >>>> ambiguous >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:399:108: >>>> >>>> note: candidates are: >>>> In file included from >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/typeArrayOop.hpp:30:0, >>>> >>>> from >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/constantPool.hpp:32, >>>> >>>> from >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/method.hpp:34, >>>> from >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/frame.hpp:28, >>>> from >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/codeBlob.hpp:31, >>>> from >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/compiledMethod.hpp:28, >>>> >>>> from >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:28, >>>> from >>>> 
/opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29, >>>> >>>> from >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33, >>>> >>>> from >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28: >>>> >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:57:17: >>>> >>>> note: static jint OrderAccess::load_acquire(const volatile jint*) >>>> >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:57:17: >>>> >>>> note: no known conversion for argument 1 from 'volatile long int*' >>>> to 'const volatile jint* {aka const volatile int*}' >>>> In file included from >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/typeArrayOop.hpp:30:0, >>>> >>>> from >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/constantPool.hpp:32, >>>> >>>> from >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/method.hpp:34, >>>> from >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/frame.hpp:28, >>>> from >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/codeBlob.hpp:31, >>>> from >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/compiledMethod.hpp:28, >>>> >>>> from >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:28, >>>> from >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29, >>>> >>>> from >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33, >>>> >>>> from >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28: >>>> >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:63:17: >>>> >>>> note: static juint OrderAccess::load_acquire(const volatile juint*) >>>> >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:63:17: >>>> >>>> note: no known conversion for argument 1 from 'volatile long int*' >>>> to 
'const volatile juint* {aka const volatile unsigned int*}' >>>> In file included from >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29:0, >>>> >>>> from >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33, >>>> >>>> from >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28: >>>> >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp: In >>>> member function 'void nmethod::set_stack_traversal_mark(long int)': >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:400:105: >>>> >>>> error: call of overloaded 'release_store(volatile long int*, long >>>> int&)' is ambiguous >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:400:105: >>>> >>>> note: candidates are: >>>> In file included from >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/typeArrayOop.hpp:30:0, >>>> >>>> from >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/constantPool.hpp:32, >>>> >>>> from >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/method.hpp:34, >>>> from >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/frame.hpp:28, >>>> from >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/codeBlob.hpp:31, >>>> from >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/compiledMethod.hpp:28, >>>> >>>> from >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:28, >>>> from >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29, >>>> >>>> from >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33, >>>> >>>> from >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28: >>>> >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:71:17: >>>> >>>> note: static void OrderAccess::release_store(volatile jint*, jint) >>>> >>>> 
/opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:71:17: >>>> >>>> note: no known conversion for argument 1 from 'volatile long int*' >>>> to 'volatile jint* {aka volatile int*}' >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:77:17: >>>> >>>> note: static void OrderAccess::release_store(volatile juint*, juint) >>>> >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:77:17: >>>> >>>> note: no known conversion for argument 1 from 'volatile long int*' >>>> to 'volatile juint* {aka volatile unsigned int*}' >>>> >>>> On 2017-07-10 20:50, Robbin Ehn wrote: >>>>> I'll start a push now. >>>>> >>>>> /Robbin >>>>> >>>>> On 2017-07-10 12:38, Roman Kennke wrote: >>>>>> Ok, so I guess I need a sponsor for this now: >>>>>> >>>>>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.12/ >>>>>> >>>>>> >>>>>> Roman >>>>>> >>>>>> Am 07.07.2017 um 20:09 schrieb Igor Veresov: >>>>>>>> On Jul 7, 2017, at 4:23 AM, Robbin Ehn >>>>>>> > wrote: >>>>>>>> >>>>>>>> Hi Roman, >>>>>>>> >>>>>>>> On 07/07/2017 12:51 PM, Roman Kennke wrote: >>>>>>>>> Hi Robbin, >>>>>>>>>> Far down -> >>>>>>>>>> >>>>>>>>>> On 07/06/2017 08:05 PM, Roman Kennke wrote: >>>>>>>>>>>> I'm not happy about this change: >>>>>>>>>>>> >>>>>>>>>>>> + ~ParallelSPCleanupThreadClosure() { >>>>>>>>>>>> + // This is here to be consistent with sweeper.cpp >>>>>>>>>>>> NMethodSweeper::mark_active_nmethods(). >>>>>>>>>>>> + // TODO: Is this really needed? >>>>>>>>>>>> + OrderAccess::storestore(); >>>>>>>>>>>> + } >>>>>>>>>>>> >>>>>>>>>>>> because we're adding an OrderAccess::storestore() to be >>>>>>>>>>>> consistent >>>>>>>>>>>> with an OrderAccess::storestore() that's not properly >>>>>>>>>>>> documented >>>>>>>>>>>> which is only increasing the technical debt. 
>>>>>>>>>>>> >>>>>>>>>>>> So a couple of things above don't make sense to me: >>>>>>>>>>>> >>>>>>>>>>>>> - sweeper thread runs outside safepoint >>>>>>>>>>>>> - VMThread (which is doing the nmethod marking in the case >>>>>>>>>>>>> that >>>>>>>>>>>>> I'm looking at) runs while all other threads (incl. the >>>>>>>>>>>>> sweeper) >>>>>>>>>>>>> is holding still. >>>>>>>>>>>> and: >>>>>>>>>>>> >>>>>>>>>>>>> There should be no need for a storestore() (at least in >>>>>>>>>>>>> sweeper.cpp... >>>>>>>>>>> Either one or the other are running. Either the VMThread is >>>>>>>>>>> marking >>>>>>>>>>> nmethods (during safepoint) or the sweeper threads are running >>>>>>>>>>> (outside >>>>>>>>>>> safepoint). Between the two phases, there is a guaranteed >>>>>>>>>>> OrderAccess::fence() (see safepoint.cpp). Therefore, no >>>>>>>>>>> storestore() >>>>>>>>>>> should be necessary. >>>>>>>>>>> >>>>>>>>>>> From Igor's comment I can see how it happened though: >>>>>>>>>>> Apparently >>>>>>>>>>> there >>>>>>>>>>> *is* a race in sweeper's own concurrent processing (concurrent >>>>>>>>>>> with >>>>>>>>>>> compiler threads, as far as I understand). And there's a call to >>>>>>>>>>> nmethod::mark_as_seen_on_stack() after which a storestore() is >>>>>>>>>>> required >>>>>>>>>>> (as per Igor's explanation). So the logic probably was: we have >>>>>>>>>>> mark_as_seen_on_stack() followed by storestore() here, so let's >>>>>>>>>>> also put >>>>>>>>>>> a storestore() in the other places that call >>>>>>>>>>> mark_as_seen_on_stack(), >>>>>>>>>>> one of which happens to be the safepoint cleanup code that we're >>>>>>>>>>> discussing. (why the storestore() hasn't been put right into >>>>>>>>>>> mark_as_seen_on_stack() I don't understand). In short, one >>>>>>>>>>> storestore() >>>>>>>>>>> really was necessary, the other looks like it has been put there >>>>>>>>>>> 'for >>>>>>>>>>> consistency' or just conservatively. 
But it shouldn't be >>>>>>>>>>> necessary in >>>>>>>>>>> the safepoint cleanup code that we're discussing. >>>>>>>>>>> >>>>>>>>>>> So what should we do? Remove the storestore() for good? >>>>>>>>>>> Refactor the >>>>>>>>>>> code so that both paths at least call the storestore() in the >>>>>>>>>>> same >>>>>>>>>>> place? (E.g. make mark_active_nmethods() use the closure and >>>>>>>>>>> call >>>>>>>>>>> storestore() in the dtor as proposed?) >>>>>>>>>> I took a quick look, maybe I'm missing some stuff but: >>>>>>>>>> >>>>>>>>>> So there is a slight optimization when not running sweeper to >>>>>>>>>> skip >>>>>>>>>> compiler barrier/fence in stw. >>>>>>>>>> >>>>>>>>>> Don't think that matter, so I propose something like: >>>>>>>>>> - long stack_traversal_mark() { return >>>>>>>>>> _stack_traversal_mark; } >>>>>>>>>> - void set_stack_traversal_mark(long l) { >>>>>>>>>> _stack_traversal_mark = l; } >>>>>>>>>> + long stack_traversal_mark() { return >>>>>>>>>> OrderAccess::load_acquire(&_stack_traversal_mark); } >>>>>>>>>> + void set_stack_traversal_mark(long l) { >>>>>>>>>> OrderAccess::release_store(&_stack_traversal_mark, l); } >>>>>>>>>> >>>>>>>>>> Maybe make _stack_traversal_mark volatile also, just as a marking >>>>>>>>>> that >>>>>>>>>> it is concurrent accessed. >>>>>>>>>> And remove both storestore. >>>>>>>>>> >>>>>>>>>> "Also neither of these state variables are volatile in >>>>>>>>>> nmethod, so >>>>>>>>>> even the compiler may reorder the stores" >>>>>>>>>> Fortunately at least _state is volatile now. >>>>>>>>>> >>>>>>>>>> I think _state also should use la/rs semantics instead, but >>>>>>>>>> that's >>>>>>>>>> another story. >>>>>>>>> Like this? >>>>>>>>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.12/ >>>>>>>>> >>>>>>>>> >>>>>>>> Yes, exactly, I like this! >>>>>>>> Dan? Igor ? Tobias? >>>>>>>> >>>>>>> That seems correct. >>>>>>> >>>>>>> igor >>>>>>> >>>>>>>> Thanks Roman! 
>>>>>>>> >>>>>>>> BTW I'm going on vacation (5w) in a few hours, but I will follow >>>>>>>> this >>>>>>>> thread/changeset to the end! >>>>>>>> >>>>>>>> /Robbin >>>>>>>> >>>>>>>>> Roman From rkennke at redhat.com Thu Jul 20 16:50:58 2017 From: rkennke at redhat.com (Roman Kennke) Date: Thu, 20 Jul 2017 18:50:58 +0200 Subject: RFR: 8179387: Factor out CMS specific code from GenCollectedHeap into its own subclass In-Reply-To: <30452c37-794f-33f8-b9e5-1aba185c1a3d@oracle.com> References: <3521009f-6fab-4f8e-2375-b9d665a4c70b@redhat.com> <3d8b55a2-a787-3051-b351-ab9b0a24f5e0@redhat.com> <47e22e86-7d7c-606f-1936-346229f39ca2@oracle.com> <9a846161-c8ac-dedf-5952-f457d546fd9a@redhat.com> <4d5e6af8-d975-7803-64c5-7295e0d56154@redhat.com> <13358626-e399-e352-1711-587416621aac@redhat.com> <27af0ad2-fe78-3536-2143-996dd42583ab@oracle.com> <4bc53aaa-b98a-8a61-73bf-d30ac3f402b8@redhat.com> <666af7f2-27e9-48c6-91e4-eaefa5289e18@redhat.com> <3ec8a6a3-5a4b-a910-f6ec-ed1c0dad4cad@oracle.com> <5417889c-5289-37cd-eb31-a2b55f70e85e@redhat.com> <088d467c-8038-60bc-1eab-b34061ad20d9@redhat.com> <30452c37-794f-33f8-b9e5-1aba185c1a3d@oracle.com> Message-ID: Hi Erik, as discussed on IRC, I also changed references to GenCollectedHeap inside gc/cms to use CMSHeap instead, where applicable. Differential: http://cr.openjdk.java.net/~rkennke/8179387/webrev.10.diff/ Full: http://cr.openjdk.java.net/~rkennke/8179387/webrev.10/ I also need a 2nd reviewer. Roman Am 20.07.2017 um 10:44 schrieb Erik Helin: > On 07/17/2017 02:07 PM, Roman Kennke wrote: >>>> Ok, added those and some more that I found. Not sure why we'd need >>>> #include "gc/cms/concurrentMarkSweepGeneration.hpp" ? Left that out >>>> for now. >>> >>> Because you are accessing CMSCollcetor in: >>> >>> 99 NOT_PRODUCT( >>> 100 virtual size_t skip_header_HeapWords() { return >>> CMSCollector::skip_header_HeapWords(); } >>> 101 ) >>> >>> and CMSCollector is declared in concurrentMarkSweepGeneration.hpp. 
An >>> alternative would of course be to just declare skip_header_HeapWords() >>> in cmsHeap.hpp and define skip_header_HeapWords in cmsHeap.cpp, then >>> you only need to include concurrentMarkSweeoGeneration.hpp in >>> cmsHeap.cpp. >> Ah ok, I've missed that one. Added it now. > > Where did you add it? I don't see any include of > "gc/cms/concurrentMarkSweepGeneration.hpp" in cmsHeap.hpp? > > Thanks, > Erik From kim.barrett at oracle.com Thu Jul 20 17:34:13 2017 From: kim.barrett at oracle.com (Kim Barrett) Date: Thu, 20 Jul 2017 13:34:13 -0400 Subject: [PATCH] JDK-8176571: Fine bitmaps should be allocated as belonging to mtGC In-Reply-To: <1500536234.2924.0.camel@oracle.com> References: <1500536234.2924.0.camel@oracle.com> Message-ID: > On Jul 20, 2017, at 3:37 AM, Thomas Schatzl wrote: > > Hi Milan, > > On Wed, 2017-07-19 at 17:05 +0000, Milan Mimica wrote: >> Hello >> >> I'm resending the two patches (JDK-8176571, JDK-8182169) from my new >> email address which I will be using from now on in this ML. I was >> notified my OCA has been approved. >> >> The patches have previously been discussed and generally approved. I >> recreated them against the recent tip, and also removed overloaded >> constructors from CHeapBitmap by using default parameters, as >> discussed. > > great! > > Looks good. I can sponsor as soon as Kim or anybody else gives his > okay. > > Thanks, > Thomas Looks good. From thomas.schatzl at oracle.com Thu Jul 20 18:40:42 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Thu, 20 Jul 2017 20:40:42 +0200 Subject: RFR (M): 8184346: Clean up G1CMBitmap In-Reply-To: <1500548791.2924.6.camel@oracle.com> References: <1500024904.3458.8.camel@oracle.com> <6c0d1cca-6c08-0bda-f980-d3fe20e663ff@redhat.com> <1500029912.3458.26.camel@oracle.com> <5bdc1a11-1159-7e81-ca31-fd96719f89c0@redhat.com> <1500548791.2924.6.camel@oracle.com> Message-ID: <1500576042.2688.10.camel@oracle.com> Hi again, ? 
a few more cleanups could be found that were worth picking up here. On Thu, 2017-07-20 at 13:06 +0200, Thomas Schatzl wrote: > Hi all, > > Erik and Mikael had a look at it and suggested several further > cleanups, removing about 40 LOC more. These included: > > - instead of G1CMBitMapR0 use properly const'ified G1CMBitmaps > - change the _start and _word_size members into an equivalent > MemRegion > - minor cleanups, removing obsolete asserts, simplify code. > > Webrevs: > http://cr.openjdk.java.net/~tschatzl/8184346/webrev.1_to_2/ (diff) > http://cr.openjdk.java.net/~tschatzl/8184346/webrev.2/ (full) > Testing: > jprt Webrevs: http://cr.openjdk.java.net/~tschatzl/8184346/webrev.2_to_3/ (diff) http://cr.openjdk.java.net/~tschatzl/8184346/webrev.3/ (full) Testing: jprt Thanks, Thomas From thomas.schatzl at oracle.com Thu Jul 20 18:41:00 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Thu, 20 Jul 2017 20:41:00 +0200 Subject: RFR (S): 8184347: Move G1CMBitMap and support classes into their own files Message-ID: <1500576060.2688.11.camel@oracle.com> Hi all, can I have reviews for this wrap-up of the G1CMBitmap cleanup? It simply moves all G1CMBitmap related code into their own files. Although it's a large change, it's really only moving code. Depends on JDK-8184346, based on webrev.3. CR: https://bugs.openjdk.java.net/browse/JDK-8184347 Webrev: http://cr.openjdk.java.net/~tschatzl/8184347/webrev/ Testing: jprt Thomas From shade at redhat.com Thu Jul 20 18:46:38 2017 From: shade at redhat.com (Aleksey Shipilev) Date: Thu, 20 Jul 2017 20:46:38 +0200 Subject: RFR (S): 8184347: Move G1CMBitMap and support classes into their own files In-Reply-To: <1500576060.2688.11.camel@oracle.com> References: <1500576060.2688.11.camel@oracle.com> Message-ID: <248b03ed-07eb-ae48-8cf3-b215e687fc35@redhat.com> On 07/20/2017 08:41 PM, Thomas Schatzl wrote: > Webrev: > http://cr.openjdk.java.net/~tschatzl/8184347/webrev/ Looks good to me.
Would you like us to RFE moving this to gc/shared some time later? This would quite probably need to decouple listeners from the otherwise GC agnostic code. Thanks, -Aleksey -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: OpenPGP digital signature URL: From shade at redhat.com Thu Jul 20 18:50:50 2017 From: shade at redhat.com (Aleksey Shipilev) Date: Thu, 20 Jul 2017 20:50:50 +0200 Subject: RFR (M): 8184346: Clean up G1CMBitmap In-Reply-To: <1500576042.2688.10.camel@oracle.com> References: <1500024904.3458.8.camel@oracle.com> <6c0d1cca-6c08-0bda-f980-d3fe20e663ff@redhat.com> <1500029912.3458.26.camel@oracle.com> <5bdc1a11-1159-7e81-ca31-fd96719f89c0@redhat.com> <1500548791.2924.6.camel@oracle.com> <1500576042.2688.10.camel@oracle.com> Message-ID: On 07/20/2017 08:40 PM, Thomas Schatzl wrote: > Webrevs: > http://cr.openjdk.java.net/~tschatzl/8184346/webrev.2_to_3/ (diff) > http://cr.openjdk.java.net/~tschatzl/8184346/webrev.3/ (full) Generally good, comments: *) Long log_debug, log_warning, assert lines in g1CollectedHeap.cpp, g1ConcurrentMark.cpp, g1ConcurrentMark.hpp *) It seems the field and method names are camel-cased and thus style-inconsistent with the rest of the code? 625 const G1CMBitMap* const prevMarkBitMap() const { return _prevMarkBitMap; } 626 G1CMBitMap* nextMarkBitMap() const { return _nextMarkBitMap; } Thanks, -Aleksey -------------- next part -------------- A non-text attachment was scrubbed...
Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: OpenPGP digital signature URL: From robbin.ehn at oracle.com Thu Jul 20 21:52:28 2017 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Thu, 20 Jul 2017 23:52:28 +0200 Subject: RFR: Parallelize safepoint cleanup In-Reply-To: <0c200752-6e45-6132-d937-4e9429ed9f95@oracle.com> References: <61d80e98-275f-b2b8-4ac7-6d5d03b047de@redhat.com> <0e44cc90-b384-0820-93d6-a70d22c501c3@oracle.com> <20E06CEC-38CA-41AE-99DB-17EF22A3C5CC@oracle.com> <58f2278e-b95c-4ec2-4f7d-9fefa3a281e4@redhat.com> <623c0dbf-9210-7c63-3546-4314c7d47f85@redhat.com> <29521e46-a5e8-5ff0-23a2-22eeee145389@oracle.com> <4445a727-060b-70f9-c8db-e9f70faae3d5@redhat.com> <37755fec-05b9-8d2c-7eb9-8849393c7485@oracle.com> <57cddde0-60e6-366e-489e-f6f9534e3ed9@redhat.com> <6fa761bc-8feb-74e6-9a54-8a65ab81203b@oracle.com> <5af9855a-652e-64f0-af83-e8f5962247ca@oracle.com> <07a5bf0a-02fa-7a8c-35be-813f5207cb0c@oracle.com> <0c200752-6e45-6132-d937-4e9429ed9f95@oracle.com> Message-ID: <4715671c-82bd-914c-edf0-0ad616723a16@oracle.com> On 07/20/2017 06:43 PM, Daniel D. Daugherty wrote: > On 7/20/17 4:53 AM, Roman Kennke wrote: >> Hi all, >> >> Robbin found some more missing includes in jprt testing (thanks!!) >> >> Differential: >> http://cr.openjdk.java.net/~rkennke/8180932/webrev.18.diff/ >> >> Full: >> http://cr.openjdk.java.net/~rkennke/8180932/webrev.18/ >> >> >> Am I breaking the record for most webrev revisions? :-P >> >> According the Robbin, builds are now all clean. >> >> Can I get final reviews and then a sponsor? > > src/share/vm/runtime/safepoint.cpp > No comments. > > Only reviewed the one file that changed since webrev.15. > > Thumbs up! +1, since the incremental changes are trivial I'll sponsor the push now. We seem to have an issue with: gc/arguments/TestAggressiveHeap.java (8183910) So push might need a couple of reruns. 
/Robbin > > Dan > > > >> >> Thanks, >> Roman >> >> Am 16.07.2017 um 10:25 schrieb Robbin Ehn: >>> Hi Roman, >>> >>> On 2017-07-12 15:32, Roman Kennke wrote: >>>> Hi Robbin and all, >>>> >>>> I fixed the 32bit failures by using jlong in all relevant places: >>>> >>>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.14.diff/ >>>> >>>> >>>> then Robbin found another problem. SafepointCleanupTest started to fail, >>>> because "mark nmethods" is no longer printed. This made me think that >>>> we're not measuring the conflated (and possibly parallelized) >>>> deflate-idle-monitors+mark-nmethods pass. I added a TraceTime with >>>> "safepoint cleanup tasks" which measures the total duration of safepoint >>>> cleanup. We can't reasonably measure a possibly parallel and conflated >>>> pass standalone, but we can measure all and by subtrating all the other >>>> subphases, get an idea how long deflation and nmethod marking take up. >>>> >>>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.15.diff/ >>>> >>>> >>>> The full webrev is now: >>>> >>>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.15/ >>>> >>>> >>>> Hope that's all ;-) >>> With this changeset something always pop-ups. >>> >>> Failure reason: Targets failed. Target macosx_x64_10.9-fastdebug FAILED. 
>>> >>> /opt/jprt/jib-data/install/jpg/infra/builddeps/devkit-macosx_x64/Xcode6.3-MacOSX10.9+1.0/devkit-macosx_x64-Xcode6.3-MacOSX10.9+1.0.tar.gz/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/clang++ >>> -m64 -fPIC -D_GNU_SOURCE -flimit-debug-info -D__STDC_FORMAT_MACROS >>> -D__STDC_LIMIT_MACROS -D__STDC_CONSTANT_MACROS -D_ALLBSD_SOURCE >>> -D_DARWIN_C_SOURCE -D_XOPEN_SOURCE -fno-rtti -fno-exceptions >>> -fvisibility=hidden -mno-omit-leaf-frame-pointer -mstack-alignment=16 >>> -pipe -fno-strict-aliasing -DMAC_OS_X_VERSION_MAX_ALLOWED=1070 >>> -mmacosx-version-min=10.7.0 -fno-omit-frame-pointer -DVM_LITTLE_ENDIAN >>> -D_LP64=1 -Wno-deprecated -Wpointer-arith -Wsign-compare -Wundef >>> -Wunused-function -Wformat=2 -DASSERT -DCHECK_UNHANDLED_OOPS >>> -DTARGET_ARCH_x86 -DINCLUDE_SUFFIX_OS=_bsd -DINCLUDE_SUFFIX_CPU=_x86 >>> -DINCLUDE_SUFFIX_COMPILER=_gcc -DTARGET_COMPILER_gcc -DAMD64 >>> -DHOTSPOT_LIB_ARCH='"amd64"' -DCOMPILER1 -DCOMPILER2 -DDTRACE_ENABLED >>> -DINCLUDE_AOT >>> -I/opt/jprt/T/P1/193338.rehn/s/hotspot/src/closed/share/vm >>> -I/opt/j/opt/jprt/T/P1/193338.rehn/s/hotspot/src/share/vm/runtime/safepoint.cpp:654:22: >>> error: variable has incomplete type 'StrongRootsScope' >>> StrongRootsScope srs(num_cleanup_workers); >>> ^ >>> /opt/jprt/T/P1/193338.rehn/s/hotspot/src/share/vm/gc/shared/genCollectedHeap.hpp:33:7: >>> note: forward declaration of 'StrongRootsScope' >>> class StrongRootsScope; >>> ^ >>> /opt/jprt/T/P1/193338.rehn/s/hotspot/src/share/vm/runtime/safepoint.cpp:659:22: >>> error: variable has incomplete type 'StrongRootsScope' >>> StrongRootsScope srs(1); >>> ^ >>> /opt/jprt/T/P1/193338.rehn/s/hotspot/src/share/vm/gc/shared/genCollectedHeap.hpp:33:7: >>> note: forward declaration of 'StrongRootsScope' >>> class StrongRootsScope; >>> ^ >>> 2 errors generated. 
>>> make[3]: *** >>> [/opt/jprt/T/P1/193338.rehn/s/build/macosx-x64-debug/hotspot/variant-server/libjvm/objs/safepoint.o] >>> Error 1 >>> make[3]: *** Waiting for unfinished jobs.... >>> make[2]: *** [hotspot-server-libs] Error 2 >>> >>> Send me the new webrev and I'll test it before the 16th round of >>> review :) >>> >>> /Robbin >>> >>>> Roman >>>> >>>> Am 10.07.2017 um 21:22 schrieb Robbin Ehn: >>>>> Hi, unfortunately the push failed on 32-bit. >>>>> >>>>> (looks like _stack_traversal_mark should be jlong, I feel a bit guilty) >>>>> >>>>> I do not have anytime to look at this, so here is the error. >>>>> >>>>> /Robbin >>>>> >>>>> make[3]: Leaving directory '/opt/jprt/T/P1/185117.rehn/s/hotspot/make' >>>>> make/Main.gmk:263: recipe for target 'hotspot-client-libs' failed >>>>> In file included from >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29:0, >>>>> >>>>> from >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33, >>>>> >>>>> from >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28: >>>>> >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp: In >>>>> member function 'long int nmethod::stack_traversal_mark()': >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:399:108: >>>>> >>>>> error: call of overloaded 'load_acquire(volatile long int*)' is >>>>> ambiguous >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:399:108: >>>>> >>>>> note: candidates are: >>>>> In file included from >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/typeArrayOop.hpp:30:0, >>>>> >>>>> from >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/constantPool.hpp:32, >>>>> >>>>> from >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/method.hpp:34, >>>>> from >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/frame.hpp:28, >>>>> from >>>>> 
/opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/codeBlob.hpp:31, >>>>> from >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/compiledMethod.hpp:28, >>>>> >>>>> from >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:28, >>>>> from >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29, >>>>> >>>>> from >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33, >>>>> >>>>> from >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28: >>>>> >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:57:17: >>>>> >>>>> note: static jint OrderAccess::load_acquire(const volatile jint*) >>>>> >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:57:17: >>>>> >>>>> note: no known conversion for argument 1 from 'volatile long int*' >>>>> to 'const volatile jint* {aka const volatile int*}' >>>>> In file included from >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/typeArrayOop.hpp:30:0, >>>>> >>>>> from >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/constantPool.hpp:32, >>>>> >>>>> from >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/method.hpp:34, >>>>> from >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/frame.hpp:28, >>>>> from >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/codeBlob.hpp:31, >>>>> from >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/compiledMethod.hpp:28, >>>>> >>>>> from >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:28, >>>>> from >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29, >>>>> >>>>> from >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33, >>>>> >>>>> from >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28: >>>>> >>>>> 
/opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:63:17: >>>>> >>>>> note: static juint OrderAccess::load_acquire(const volatile juint*) >>>>> >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:63:17: >>>>> >>>>> note: no known conversion for argument 1 from 'volatile long int*' >>>>> to 'const volatile juint* {aka const volatile unsigned int*}' >>>>> In file included from >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29:0, >>>>> >>>>> from >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33, >>>>> >>>>> from >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28: >>>>> >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp: In >>>>> member function 'void nmethod::set_stack_traversal_mark(long int)': >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:400:105: >>>>> >>>>> error: call of overloaded 'release_store(volatile long int*, long >>>>> int&)' is ambiguous >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:400:105: >>>>> >>>>> note: candidates are: >>>>> In file included from >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/typeArrayOop.hpp:30:0, >>>>> >>>>> from >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/constantPool.hpp:32, >>>>> >>>>> from >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/method.hpp:34, >>>>> from >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/frame.hpp:28, >>>>> from >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/codeBlob.hpp:31, >>>>> from >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/compiledMethod.hpp:28, >>>>> >>>>> from >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:28, >>>>> from >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29, >>>>> >>>>> from >>>>> 
/opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33, >>>>> >>>>> from >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28: >>>>> >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:71:17: >>>>> >>>>> note: static void OrderAccess::release_store(volatile jint*, jint) >>>>> >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:71:17: >>>>> >>>>> note: no known conversion for argument 1 from 'volatile long int*' >>>>> to 'volatile jint* {aka volatile int*}' >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:77:17: >>>>> >>>>> note: static void OrderAccess::release_store(volatile juint*, juint) >>>>> >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:77:17: >>>>> >>>>> note: no known conversion for argument 1 from 'volatile long int*' >>>>> to 'volatile juint* {aka volatile unsigned int*}' >>>>> >>>>> On 2017-07-10 20:50, Robbin Ehn wrote: >>>>>> I'll start a push now. >>>>>> >>>>>> /Robbin >>>>>> >>>>>> On 2017-07-10 12:38, Roman Kennke wrote: >>>>>>> Ok, so I guess I need a sponsor for this now: >>>>>>> >>>>>>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.12/ >>>>>>> >>>>>>> >>>>>>> Roman >>>>>>> >>>>>>> Am 07.07.2017 um 20:09 schrieb Igor Veresov: >>>>>>>>> On Jul 7, 2017, at 4:23 AM, Robbin Ehn >>>>>>>> > wrote: >>>>>>>>> >>>>>>>>> Hi Roman, >>>>>>>>> >>>>>>>>> On 07/07/2017 12:51 PM, Roman Kennke wrote: >>>>>>>>>> Hi Robbin, >>>>>>>>>>> Far down -> >>>>>>>>>>> >>>>>>>>>>> On 07/06/2017 08:05 PM, Roman Kennke wrote: >>>>>>>>>>>>> I'm not happy about this change: >>>>>>>>>>>>> >>>>>>>>>>>>> + ~ParallelSPCleanupThreadClosure() { >>>>>>>>>>>>> + // This is here to be consistent with sweeper.cpp >>>>>>>>>>>>> NMethodSweeper::mark_active_nmethods(). >>>>>>>>>>>>> + // TODO: Is this really needed? 
>>>>>>>>>>>>> + OrderAccess::storestore(); >>>>>>>>>>>>> + } >>>>>>>>>>>>> >>>>>>>>>>>>> because we're adding an OrderAccess::storestore() to be >>>>>>>>>>>>> consistent >>>>>>>>>>>>> with an OrderAccess::storestore() that's not properly >>>>>>>>>>>>> documented >>>>>>>>>>>>> which is only increasing the technical debt. >>>>>>>>>>>>> >>>>>>>>>>>>> So a couple of things above don't make sense to me: >>>>>>>>>>>>> >>>>>>>>>>>>>> - sweeper thread runs outside safepoint >>>>>>>>>>>>>> - VMThread (which is doing the nmethod marking in the case >>>>>>>>>>>>>> that >>>>>>>>>>>>>> I'm looking at) runs while all other threads (incl. the >>>>>>>>>>>>>> sweeper) >>>>>>>>>>>>>> is holding still. >>>>>>>>>>>>> and: >>>>>>>>>>>>> >>>>>>>>>>>>>> There should be no need for a storestore() (at least in >>>>>>>>>>>>>> sweeper.cpp... >>>>>>>>>>>> Either one or the other are running. Either the VMThread is >>>>>>>>>>>> marking >>>>>>>>>>>> nmethods (during safepoint) or the sweeper threads are running >>>>>>>>>>>> (outside >>>>>>>>>>>> safepoint). Between the two phases, there is a guaranteed >>>>>>>>>>>> OrderAccess::fence() (see safepoint.cpp). Therefore, no >>>>>>>>>>>> storestore() >>>>>>>>>>>> should be necessary. >>>>>>>>>>>> >>>>>>>>>>>> From Igor's comment I can see how it happened though: >>>>>>>>>>>> Apparently >>>>>>>>>>>> there >>>>>>>>>>>> *is* a race in sweeper's own concurrent processing (concurrent >>>>>>>>>>>> with >>>>>>>>>>>> compiler threads, as far as I understand). And there's a call to >>>>>>>>>>>> nmethod::mark_as_seen_on_stack() after which a storestore() is >>>>>>>>>>>> required >>>>>>>>>>>> (as per Igor's explanation). So the logic probably was: we have >>>>>>>>>>>> mark_as_seen_on_stack() followed by storestore() here, so let's >>>>>>>>>>>> also put >>>>>>>>>>>> a storestore() in the other places that call >>>>>>>>>>>> mark_as_seen_on_stack(), >>>>>>>>>>>> one of which happens to be the safepoint cleanup code that we're >>>>>>>>>>>> discussing. 
(why the storestore() hasn't been put right into >>>>>>>>>>>> mark_as_seen_on_stack() I don't understand). In short, one >>>>>>>>>>>> storestore() >>>>>>>>>>>> really was necessary, the other looks like it has been put there >>>>>>>>>>>> 'for >>>>>>>>>>>> consistency' or just conservatively. But it shouldn't be >>>>>>>>>>>> necessary in >>>>>>>>>>>> the safepoint cleanup code that we're discussing. >>>>>>>>>>>> >>>>>>>>>>>> So what should we do? Remove the storestore() for good? >>>>>>>>>>>> Refactor the >>>>>>>>>>>> code so that both paths at least call the storestore() in the >>>>>>>>>>>> same >>>>>>>>>>>> place? (E.g. make mark_active_nmethods() use the closure and >>>>>>>>>>>> call >>>>>>>>>>>> storestore() in the dtor as proposed?) >>>>>>>>>>> I took a quick look, maybe I'm missing some stuff but: >>>>>>>>>>> >>>>>>>>>>> So there is a slight optimization when not running sweeper to >>>>>>>>>>> skip >>>>>>>>>>> compiler barrier/fence in stw. >>>>>>>>>>> >>>>>>>>>>> Don't think that matter, so I propose something like: >>>>>>>>>>> - long stack_traversal_mark() { return >>>>>>>>>>> _stack_traversal_mark; } >>>>>>>>>>> - void set_stack_traversal_mark(long l) { >>>>>>>>>>> _stack_traversal_mark = l; } >>>>>>>>>>> + long stack_traversal_mark() { return >>>>>>>>>>> OrderAccess::load_acquire(&_stack_traversal_mark); } >>>>>>>>>>> + void set_stack_traversal_mark(long l) { >>>>>>>>>>> OrderAccess::release_store(&_stack_traversal_mark, l); } >>>>>>>>>>> >>>>>>>>>>> Maybe make _stack_traversal_mark volatile also, just as a marking >>>>>>>>>>> that >>>>>>>>>>> it is concurrent accessed. >>>>>>>>>>> And remove both storestore. >>>>>>>>>>> >>>>>>>>>>> "Also neither of these state variables are volatile in >>>>>>>>>>> nmethod, so >>>>>>>>>>> even the compiler may reorder the stores" >>>>>>>>>>> Fortunately at least _state is volatile now. >>>>>>>>>>> >>>>>>>>>>> I think _state also should use la/rs semantics instead, but >>>>>>>>>>> that's >>>>>>>>>>> another story. 
>>>>>>>>>> Like this? >>>>>>>>>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.12/ >>>>>>>>>> >>>>>>>>>> >>>>>>>>> Yes, exactly, I like this! >>>>>>>>> Dan? Igor ? Tobias? >>>>>>>>> >>>>>>>> That seems correct. >>>>>>>> >>>>>>>> igor >>>>>>>> >>>>>>>>> Thanks Roman! >>>>>>>>> >>>>>>>>> BTW I'm going on vacation (5w) in a few hours, but I will follow >>>>>>>>> this >>>>>>>>> thread/changeset to the end! >>>>>>>>> >>>>>>>>> /Robbin >>>>>>>>> >>>>>>>>>> Roman > From kishor.kharbas at intel.com Fri Jul 21 01:34:44 2017 From: kishor.kharbas at intel.com (Kharbas, Kishor) Date: Fri, 21 Jul 2017 01:34:44 +0000 Subject: RFR(M): 8171181: Supporting heap allocation on alternative memory devices In-Reply-To: References: Message-ID: I have a new version of this patch at http://cr.openjdk.java.net/~kkharbas/8171181/webrev.06/ This version has been tested on Windows, Linux, Solaris and Mac OS. I could not get access to AIX for testing. I used tmpfs to test the functionality. Cases that were tested were. 1. Allocation of heap using file mapping when -XX:HeapDir= option is used. 2. Creation of nameless temporary file for Heap allocation which prevents access to file using its name. 3. Correct deletion and freeing up of space allocated for file under different exit conditions. 4. Error handling when path specified is not present, heap size is more than size of file system, etc. - Kishor From: Kharbas, Kishor Sent: Tuesday, July 11, 2017 6:40 PM To: 'hotspot-gc-dev at openjdk.java.net' Cc: Kharbas, Kishor Subject: RFR(M): 8171181: Supporting heap allocation on alternative memory devices Greetings, I have an updated patch for JEP https://bugs.openjdk.java.net/browse/JDK-8171181 at http://cr.openjdk.java.net/~kkharbas/8171181/webrev.05 This patch fixes the bugs pointed earlier and other suggestions to make the code less intrusive. I have also sent this to 'hotspot-runtime-dev' mailing list (included below). I would appreciate comments and feedback. 
Thanks Kishor From: Kharbas, Kishor Sent: Monday, July 10, 2017 1:53 PM To: hotspot-runtime-dev at openjdk.java.net Cc: Kharbas, Kishor > Subject: RFR(M): 8171181: Supporting heap allocation on alternative memory devices Hello all! I have an updated patch for https://bugs.openjdk.java.net/browse/JDK-8171181 at http://cr.openjdk.java.net/~kkharbas/8171181/webrev.05 I have lost the old email chain so had to start a fresh one. The archived conversation can be found at - http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2017-March/022733.html 1. I have worked on all the comments and fixed the bugs. Mainly bugs fixed are related to sigprocmask() and changed the implementation such that 'fd' is not passed all the way down the call stack. Thus minimizing function signature changes. 2. Patch supports all OS'es. Consolidated all Posix compliant OS's implementation in os_posix.cpp. 3. The patch is tested on Windows and Linux. Working on testing it on other OS'es. Let me know if this version looks clean and correct. Thanks Kishor -------------- next part -------------- An HTML attachment was scrubbed... URL: From mikael.gerdin at oracle.com Fri Jul 21 07:42:36 2017 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Fri, 21 Jul 2017 09:42:36 +0200 Subject: RFR (S): 8184347: Move G1CMBitMap and support classes into their own files In-Reply-To: <248b03ed-07eb-ae48-8cf3-b215e687fc35@redhat.com> References: <1500576060.2688.11.camel@oracle.com> <248b03ed-07eb-ae48-8cf3-b215e687fc35@redhat.com> Message-ID: Hi, On 2017-07-20 20:46, Aleksey Shipilev wrote: > On 07/20/2017 08:41 PM, Thomas Schatzl wrote: >> Webrev: >> http://cr.openjdk.java.net/~tschatzl/8184347/webrev/ > > Looks good to me. +1 /Mikael > > Would you like us to RFE moving this to gc/shared some time later? This would > quire probably need to decouple listeners from the otherwise GC agnostic code. 
> > Thanks, > -Aleksey > From rkennke at redhat.com Fri Jul 21 08:02:58 2017 From: rkennke at redhat.com (Roman Kennke) Date: Fri, 21 Jul 2017 10:02:58 +0200 Subject: RFR (S): 8184347: Move G1CMBitMap and support classes into their own files In-Reply-To: <1500576060.2688.11.camel@oracle.com> References: <1500576060.2688.11.camel@oracle.com> Message-ID: <66187932-2b94-edfc-4910-18acaf5a61a2@redhat.com> Hi Thomas, this change looks good to me. Roman (not an official reviewer) > Hi all, > > can I have reviews for this wrap-up of the G1CMBitmap cleanup? It > simply moves all G1CMBitmap related code into their own files. > > Although it's a large change, it's really only moving code. > > Depends on JDK-8184346, based on webrev.3. > > CR: > https://bugs.openjdk.java.net/browse/JDK-8184347 > Webrev: > http://cr.openjdk.java.net/~tschatzl/8184347/webrev/ > Testing: > jprt > > Thomas > From rkennke at redhat.com Fri Jul 21 08:06:17 2017 From: rkennke at redhat.com (Roman Kennke) Date: Fri, 21 Jul 2017 10:06:17 +0200 Subject: RFR (M): 8184346: Clean up G1CMBitmap In-Reply-To: <1500576042.2688.10.camel@oracle.com> References: <1500024904.3458.8.camel@oracle.com> <6c0d1cca-6c08-0bda-f980-d3fe20e663ff@redhat.com> <1500029912.3458.26.camel@oracle.com> <5bdc1a11-1159-7e81-ca31-fd96719f89c0@redhat.com> <1500548791.2924.6.camel@oracle.com> <1500576042.2688.10.camel@oracle.com> Message-ID: <0ac30a4c-752d-d27f-ce57-748265ac8eb6@redhat.com> Looks good to me. Roman > Hi again, > > a few more cleanups could be found that were worth picking up here. > > On Thu, 2017-07-20 at 13:06 +0200, Thomas Schatzl wrote: >> Hi all, >> >> Erik and Mikael had a look at it and suggested several further >> cleanups, removing about 40 LOC more. These included: >> >> - instead of G1CMBitMapR0 use properly const'ified G1CMBitmaps >> - change the _start and _word_size members into an equivalent >> MemRegion >> - minor cleanups, removing obsolete asserts, simplify code. 
>> >> Webrevs: >> http://cr.openjdk.java.net/~tschatzl/8184346/webrev.1_to_2/ (diff) >> http://cr.openjdk.java.net/~tschatzl/8184346/webrev.2/ (full) >> Testing: >> jprt > Webrevs: > http://cr.openjdk.java.net/~tschatzl/8184346/webrev.2_to_3/ (diff) > http://cr.openjdk.java.net/~tschatzl/8184346/webrev.3/ (full) > Testing: > jprt > > Thanks, > Thomas From rkennke at redhat.com Fri Jul 21 08:07:14 2017 From: rkennke at redhat.com (Roman Kennke) Date: Fri, 21 Jul 2017 10:07:14 +0200 Subject: RFR (S): 8184347: Move G1CMBitMap and support classes into their own files In-Reply-To: <248b03ed-07eb-ae48-8cf3-b215e687fc35@redhat.com> References: <1500576060.2688.11.camel@oracle.com> <248b03ed-07eb-ae48-8cf3-b215e687fc35@redhat.com> Message-ID: <6e8ff36c-8501-9b7c-9a86-efa19524c728@redhat.com> Am 20.07.2017 um 20:46 schrieb Aleksey Shipilev: > On 07/20/2017 08:41 PM, Thomas Schatzl wrote: >> Webrev: >> http://cr.openjdk.java.net/~tschatzl/8184347/webrev/ > Looks good to me. > > Would you like us to RFE moving this to gc/shared some time later? I think we already discussed this, and I believe the answer was yes? ;-) https://bugs.openjdk.java.net/browse/JDK-8180193 Roman From shade at redhat.com Fri Jul 21 08:08:55 2017 From: shade at redhat.com (Aleksey Shipilev) Date: Fri, 21 Jul 2017 10:08:55 +0200 Subject: RFR (S): 8184347: Move G1CMBitMap and support classes into their own files In-Reply-To: <6e8ff36c-8501-9b7c-9a86-efa19524c728@redhat.com> References: <1500576060.2688.11.camel@oracle.com> <248b03ed-07eb-ae48-8cf3-b215e687fc35@redhat.com> <6e8ff36c-8501-9b7c-9a86-efa19524c728@redhat.com> Message-ID: <123819a2-f2ca-e9ad-6ad4-e84bd8ba1231@redhat.com> On 07/21/2017 10:07 AM, Roman Kennke wrote: > Am 20.07.2017 um 20:46 schrieb Aleksey Shipilev: >> On 07/20/2017 08:41 PM, Thomas Schatzl wrote: >>> Webrev: >>> http://cr.openjdk.java.net/~tschatzl/8184347/webrev/ >> Looks good to me. >> >> Would you like us to RFE moving this to gc/shared some time later? 
> > I think we already discussed this, and I believe the answer was yes? ;-) > > https://bugs.openjdk.java.net/browse/JDK-8180193 Missed that! :) -Aleksey From mikael.gerdin at oracle.com Fri Jul 21 08:15:48 2017 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Fri, 21 Jul 2017 10:15:48 +0200 Subject: RFR (M): 8184346: Clean up G1CMBitmap In-Reply-To: References: <1500024904.3458.8.camel@oracle.com> <6c0d1cca-6c08-0bda-f980-d3fe20e663ff@redhat.com> <1500029912.3458.26.camel@oracle.com> <5bdc1a11-1159-7e81-ca31-fd96719f89c0@redhat.com> <1500548791.2924.6.camel@oracle.com> <1500576042.2688.10.camel@oracle.com> Message-ID: <9d2dbdbb-354a-bd0c-3bb1-dad8902c13b6@oracle.com> Hi Aleksey, On 2017-07-20 20:50, Aleksey Shipilev wrote: > On 07/20/2017 08:40 PM, Thomas Schatzl wrote: >> Webrevs: >> http://cr.openjdk.java.net/~tschatzl/8184346/webrev.2_to_3/ (diff) >> http://cr.openjdk.java.net/~tschatzl/8184346/webrev.3/ (full) > Looks fine to me too. > Generally good, comments: > > *) Long log_debug, log_warning, assert lines in g1CollectedHeap.cpp, > g1ConcurrentMark.cpp, g1ConcurrentMark.hpp > > *) It seems the field and method names are camel-cased and thus > style-inconsistent with the rest of the code? > 625 const G1CMBitMap* const prevMarkBitMap() const { return _prevMarkBitMap; } > 626 G1CMBitMap* nextMarkBitMap() const { return _nextMarkBitMap; } I think the idea is to perform that renaming in G1ConcurrentMark in a later change since this one tries to only concern G1CMBitMap.
/Mikael > > Thanks, > -Aleksey > From shade at redhat.com Fri Jul 21 08:17:22 2017 From: shade at redhat.com (Aleksey Shipilev) Date: Fri, 21 Jul 2017 10:17:22 +0200 Subject: RFR (M): 8184346: Clean up G1CMBitmap In-Reply-To: <9d2dbdbb-354a-bd0c-3bb1-dad8902c13b6@oracle.com> References: <1500024904.3458.8.camel@oracle.com> <6c0d1cca-6c08-0bda-f980-d3fe20e663ff@redhat.com> <1500029912.3458.26.camel@oracle.com> <5bdc1a11-1159-7e81-ca31-fd96719f89c0@redhat.com> <1500548791.2924.6.camel@oracle.com> <1500576042.2688.10.camel@oracle.com> <9d2dbdbb-354a-bd0c-3bb1-dad8902c13b6@oracle.com> Message-ID: <90f10619-2265-a6a9-7a6c-dd4c5e2a6082@redhat.com> On 07/21/2017 10:15 AM, Mikael Gerdin wrote: > On 2017-07-20 20:50, Aleksey Shipilev wrote: >> On 07/20/2017 08:40 PM, Thomas Schatzl wrote: >>> Webrevs: >>> http://cr.openjdk.java.net/~tschatzl/8184346/webrev.2_to_3/ (diff) >>> http://cr.openjdk.java.net/~tschatzl/8184346/webrev.3/ (full) >> > > Looks fine to me too. > >> Generally good, comments: >> >> *) Long log_debug, log_warning, assert lines in g1CollectedHeap.cpp, >> g1ConcurrentMark.cpp, g1ConcurrentMark.hpp >> >> *) It seems the field and method names are camel-cased and thus >> style-inconsistent with the rest of the code? >> 625 const G1CMBitMap* const prevMarkBitMap() const { return _prevMarkBitMap; } >> 626 G1CMBitMap* nextMarkBitMap() const { return _nextMarkBitMap; } > > I think the idea is to perform that renaming in G1ConcurrentMark in a later > change since this one tries to only concern G1CMBitMap. No problem! Fix the long asserts, and I am happy with the patch. -Aleksey From kirk at kodewerk.com Fri Jul 21 07:34:02 2017 From: kirk at kodewerk.com (Kirk Pepperdine) Date: Fri, 21 Jul 2017 10:34:02 +0300 Subject: Bug in G1 In-Reply-To: <1e6c2b24-63fe-cf5e-1635-990852c63a65@redhat.com> References: <1500024904.3458.8.camel@oracle.com> <6c0d1cca-6c08-0bda-f980-d3fe20e663ff@redhat.com> <1500029912.3458.26.camel@oracle.com> <990f6578-14d4-322d-7f51-9b93d92f8b20@redhat.com> <1500034180.3458.67.camel@oracle.com> <1e6c2b24-63fe-cf5e-1635-990852c63a65@redhat.com> Message-ID: <7991D723-8B1D-43A3-A9D4-E7D38B1D10E4@kodewerk.com> An HTML attachment was scrubbed... URL: From rkennke at redhat.com Fri Jul 21 10:13:24 2017 From: rkennke at redhat.com (Roman Kennke) Date: Fri, 21 Jul 2017 12:13:24 +0200 Subject: RFR: 8184751: Provide thread pool for parallel safepoint cleanup Message-ID: <8ec1092c-b01e-80a9-23dd-8447e30c675e@redhat.com> This is a follow-up to 8180932: Parallelize safepoint cleanup, which should land in JDK10 real soon now. In order to actually be able to parallelize safepoint cleanup, we now need the GC to provide some worker threads. In this change, I propose to create one such pool globally (i.e. for all GCs) in CollectedHeap, if ParallelSafepointCleanupThreads>1. The flag defaults to 0, which means cleanup is done by the VMThread (i.e. exactly the current behaviour). We have already discussed this, and came to the conclusion that it does not really make sense to share the GC's worker threads here, because they may not be idle, but only suspended from concurrent work (i.e. by SuspendibleThreadSet::synchronize() or similar). http://cr.openjdk.java.net/~rkennke/8184751/webrev.00/ What do you think?
Roman From thomas.schatzl at oracle.com Fri Jul 21 14:34:27 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Fri, 21 Jul 2017 16:34:27 +0200 Subject: Bug in G1 In-Reply-To: <7991D723-8B1D-43A3-A9D4-E7D38B1D10E4@kodewerk.com> References: <1500024904.3458.8.camel@oracle.com> <6c0d1cca-6c08-0bda-f980-d3fe20e663ff@redhat.com> <1500029912.3458.26.camel@oracle.com> <990f6578-14d4-322d-7f51-9b93d92f8b20@redhat.com> <1500034180.3458.67.camel@oracle.com> <1e6c2b24-63fe-cf5e-1635-990852c63a65@redhat.com> <7991D723-8B1D-43A3-A9D4-E7D38B1D10E4@kodewerk.com> Message-ID: <1500647667.2385.33.camel@oracle.com> Hi Kirk, On Fri, 2017-07-21 at 10:34 +0300, Kirk Pepperdine wrote: > Hi all, > > A while back I mentioned to Erik at JFokus that I was seeing a > puzzling behavior in the G1 where without any obvious failure, heap > occupancy after collections would spike which would trigger a full > which would (unexpectedly) completely recover everything down to the > expected live set. Yesterday while working with Simone Bordet on the > problem we came to the realization that we were seeing a pattern > prior to the ramp up to the Full: Survivor space would be > ergonomically resized to 0 -> 0. The only way to reset the situation > was to run a full collection. In our minds it doesn't make any > sense to reset survivor space to 0. So far this is an observation > from a single GC log but I recall seeing the pattern in many other > logs. Before I go through the exercise of building a super grep to > run over my G1 log repo I'd like to ask: under what conditions would > it make sense to have the survivor space resized to 0? And if not, > would this be a bug in G1? We tried reproducing the behavior in some > test applications but I fear we often only see this happening in > production applications that have been running for several days. It's > a behavior that I've seen in 1.7.0 and 1.8.0. No word on 9. Sounds similar to https://bugs.openjdk.java.net/browse/JDK-8037500.
Could you please post the type of collections for a few more gcs before the zero-sized ones? It would be particularly interesting if there is a mixed gc with to-space exhaustion just before this sequence. And if there are log messages with attempts to start marking too. As for why that bug has been closed as "won't fix": we do not have a reproducer (any more) to test any changes, in addition to the stated reason that the performance impact seemed minor at that time. There have been some changes in how the next gc is calculated in 9 too, so I do not know either if 9 is also affected (particularly one of these young-only gc's would not be issued any more). I can think of at least one more reason other than those stated in the CR why this occurs at least for 8u60+ builds. There is the possibility, particularly in conjunction with humongous object allocation, that after starting the mutator, immediately afterwards a young gc that reclaims zero space is issued, e.g.:

young-gc, has X regions left at the end, starts mutators
mutator 1 allocates exactly X regions as humongous objects
mutator 2 allocates, finds that there are no regions left, issues young-gc request; in this young-gc eden and survivor are obviously of zero size
[...and so on...]

Note that this pattern could repeat multiple times, as young gc may reclaim space from humongous objects (eager reclaim!), until at some point it runs into a full gc. The logging that shows humongous object allocation (something about reaching threshold and starting marking) could confirm this situation. No guarantees about that being the actual issue though. Thanks, Thomas From milan.mimica at gmail.com Sun Jul 23 08:31:37 2017 From: milan.mimica at gmail.com (Milan Mimica) Date: Sun, 23 Jul 2017 08:31:37 +0000 Subject: [PATCH] JDK-8176571: Fine bitmaps should be allocated as belonging to mtGC In-Reply-To: <1500536234.2924.0.camel@oracle.com> References: <1500536234.2924.0.camel@oracle.com> Message-ID: On Thu, 20 Jul 2017
at 09:37, Thomas Schatzl wrote: > > great! > > Looks good. I can sponsor as soon as Kim or anybody else gives his > okay. > Hi, I just noticed my heapBitMap_nmt.diff includes the other one. Find the corrected one in attachment. -------------- next part -------------- A non-text attachment was scrubbed... Name: heapBitMap_nmt.diff Type: text/x-patch Size: 5361 bytes Desc: not available URL: From kirk at kodewerk.com Sun Jul 23 10:51:39 2017 From: kirk at kodewerk.com (Kirk Pepperdine) Date: Sun, 23 Jul 2017 13:51:39 +0300 Subject: Bug in G1 In-Reply-To: <1500647667.2385.33.camel@oracle.com> References: <1500024904.3458.8.camel@oracle.com> <6c0d1cca-6c08-0bda-f980-d3fe20e663ff@redhat.com> <1500029912.3458.26.camel@oracle.com> <990f6578-14d4-322d-7f51-9b93d92f8b20@redhat.com> <1500034180.3458.67.camel@oracle.com> <1e6c2b24-63fe-cf5e-1635-990852c63a65@redhat.com> <7991D723-8B1D-43A3-A9D4-E7D38B1D10E4@kodewerk.com> <1500647667.2385.33.camel@oracle.com> Message-ID: Thanks for the information. I've shared the entire log with you on dropbox. Feel free to distribute it as you see fit. I see the to-space exhausted but there doesn't appear to be a mixed collection involved. Below is a single sequence up to and including the Full.
Kind regards, Kirk 2017-05-23T20:42:55.303-0400: 40580.396: Application time: 0.8539675 seconds 2017-05-23T20:42:55.304-0400: 40580.398: [GC pause (G1 Evacuation Pause) (young) Desired survivor size 169869312 bytes, new threshold 15 (max 15) - age 1: 3278808 bytes, 3278808 total - age 2: 71278552 bytes, 74557360 total - age 3: 533720 bytes, 75091080 total - age 4: 12897544 bytes, 87988624 total - age 5: 796672 bytes, 88785296 total - age 6: 503288 bytes, 89288584 total 2017-05-23T20:42:55.363-0400: 40580.457: [SoftReference, 0 refs, 0.0010011 secs]2017-05-23T20:42:55.364-0400: 40580.458: [WeakReference, 367 refs, 0.0006136 secs]2017-05-23T20:42:55.365-0400: 40580.458: [FinalReference, 7659 refs, 0.0014460 secs]2017-05-23T20:42:55.366-0400: 40580.460: [PhantomReference, 0 refs, 0 refs, 0.0011060 secs]2017-05-23T20:42:55.367-0400: 40580.461: [JNI Weak Reference, 0.0000647 secs], 0.0669684 secs] [Parallel Time: 57.9 ms, GC Workers: 8] [GC Worker Start (ms): Min: 40580398.1, Avg: 40580398.2, Max: 40580398.3, Diff: 0.2] [Ext Root Scanning (ms): Min: 1.6, Avg: 1.9, Max: 2.7, Diff: 1.0, Sum: 15.2] [Update RS (ms): Min: 15.2, Avg: 15.7, Max: 15.8, Diff: 0.6, Sum: 125.4] [Processed Buffers: Min: 44, Avg: 50.1, Max: 62, Diff: 18, Sum: 401] [Scan RS (ms): Min: 1.9, Avg: 2.0, Max: 2.1, Diff: 0.2, Sum: 15.9] [Code Root Scanning (ms): Min: 1.6, Avg: 1.7, Max: 1.7, Diff: 0.1, Sum: 13.5] [Object Copy (ms): Min: 36.0, Avg: 36.2, Max: 36.2, Diff: 0.2, Sum: 289.2] [Termination (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.2] [Termination Attempts: Min: 1, Avg: 1.0, Max: 1, Diff: 0, Sum: 8] [GC Worker Other (ms): Min: 0.1, Avg: 0.1, Max: 0.2, Diff: 0.1, Sum: 1.0] [GC Worker Total (ms): Min: 57.4, Avg: 57.5, Max: 57.6, Diff: 0.2, Sum: 460.3] [GC Worker End (ms): Min: 40580455.7, Avg: 40580455.7, Max: 40580455.8, Diff: 0.1] [Code Root Fixup: 0.2 ms] [Code Root Purge: 0.0 ms] [Clear CT: 0.8 ms] [Other: 8.1 ms] [Choose CSet: 0.0 ms] [Ref Proc: 4.7 ms] [Ref Enq: 0.3 ms] [Redirty 
Cards: 0.3 ms] [Humongous Register: 0.2 ms] [Humongous Reclaim: 0.1 ms] [Free CSet: 1.9 ms] [Eden: 2484.0M(2484.0M)->0.0B(3544.0M) Survivors: 98.0M->100.0M Heap: 5189.0M(7168.0M)->2708.0M(7168.0M)] [Times: user=0.45 sys=0.03, real=0.07 secs] 2017-05-23T20:42:55.372-0400: 40580.465: Total time for which application threads were stopped: 0.0685303 seconds, Stopping threads took: 0.0001346 seconds 2017-05-23T20:42:59.372-0400: 40584.465: Application time: 4.0004774 seconds 2017-05-23T20:42:59.376-0400: 40584.469: Total time for which application threads were stopped: 0.0036324 seconds, Stopping threads took: 0.0023017 seconds 2017-05-23T20:43:05.891-0400: 40590.984: Application time: 6.5149722 seconds 2017-05-23T20:43:05.892-0400: 40590.985: [GC pause (G1 Evacuation Pause) (young) Desired survivor size 239075328 bytes, new threshold 15 (max 15) - age 1: 4934368 bytes, 4934368 total - age 2: 2633808 bytes, 7568176 total - age 3: 71264464 bytes, 78832640 total - age 4: 527368 bytes, 79360008 total - age 5: 12893400 bytes, 92253408 total - age 6: 750128 bytes, 93003536 total - age 7: 432784 bytes, 93436320 total 2017-05-23T20:43:09.493-0400: 40594.586: [SoftReference, 0 refs, 0.0067938 secs]2017-05-23T20:43:09.500-0400: 40594.593: [WeakReference, 0 refs, 0.0033881 secs]2017-05-23T20:43:09.503-0400: 40594.597: [FinalReference, 0 refs, 0.0005787 secs]2017-05-23T20:43:09.504-0400: 40594.597: [PhantomReference, 0 refs, 0 refs, 0.0011377 secs]2017-05-23T20:43:09.505-0400: 40594.598: [JNI Weak Reference, 0.0000618 secs] (to-space exhausted), 4.8672247 secs] [Parallel Time: 3599.9 ms, GC Workers: 8] [GC Worker Start (ms): Min: 40590985.9, Avg: 40590986.0, Max: 40590986.1, Diff: 0.2] [Ext Root Scanning (ms): Min: 1.7, Avg: 1.9, Max: 2.3, Diff: 0.6, Sum: 15.2] [Update RS (ms): Min: 68.3, Avg: 68.4, Max: 68.5, Diff: 0.2, Sum: 547.6] [Processed Buffers: Min: 32, Avg: 49.0, Max: 72, Diff: 40, Sum: 392] [Scan RS (ms): Min: 2.8, Avg: 2.9, Max: 3.0, Diff: 0.1, Sum: 23.2] [Code Root 
Scanning (ms): Min: 2.4, Avg: 2.5, Max: 2.5, Diff: 0.1, Sum: 19.7] [Object Copy (ms): Min: 3523.7, Avg: 3523.8, Max: 3523.9, Diff: 0.2, Sum: 28190.6] [Termination (ms): Min: 0.0, Avg: 0.1, Max: 0.2, Diff: 0.2, Sum: 0.7] [Termination Attempts: Min: 1, Avg: 1.0, Max: 1, Diff: 0, Sum: 8] [GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.0, Sum: 0.5] [GC Worker Total (ms): Min: 3599.6, Avg: 3599.7, Max: 3599.8, Diff: 0.2, Sum: 28797.6] [GC Worker End (ms): Min: 40594585.6, Avg: 40594585.7, Max: 40594585.7, Diff: 0.1] [Code Root Fixup: 0.3 ms] [Code Root Purge: 0.0 ms] [Clear CT: 1.2 ms] [Other: 1265.8 ms] [Evacuation Failure: 1248.2 ms] [Choose CSet: 0.0 ms] [Ref Proc: 12.4 ms] [Ref Enq: 0.5 ms] [Redirty Cards: 2.1 ms] [Humongous Register: 0.2 ms] [Humongous Reclaim: 0.1 ms] [Free CSet: 1.5 ms] [Eden: 3544.0M(3544.0M)->0.0B(976.0M) Survivors: 100.0M->456.0M Heap: 6274.3M(7168.0M)->5978.2M(7168.0M)] [Times: user=13.58 sys=0.11, real=4.86 secs] 2017-05-23T20:43:10.760-0400: 40595.853: Total time for which application threads were stopped: 4.8690628 seconds, Stopping threads took: 0.0002136 seconds 2017-05-23T20:43:11.762-0400: 40596.855: Application time: 1.0019247 seconds 2017-05-23T20:43:11.763-0400: 40596.856: Total time for which application threads were stopped: 0.0015356 seconds, Stopping threads took: 0.0003107 seconds 2017-05-23T20:43:11.880-0400: 40596.973: Application time: 0.1164884 seconds 2017-05-23T20:43:11.881-0400: 40596.974: [GC pause (G1 Humongous Allocation) (young) (initial-mark) Desired survivor size 94371840 bytes, new threshold 1 (max 15) - age 1: 477501112 bytes, 477501112 total - age 2: 182296 bytes, 477683408 total - age 3: 78880 bytes, 477762288 total - age 4: 45376 bytes, 477807664 total - age 5: 92304 bytes, 477899968 total - age 6: 75448 bytes, 477975416 total - age 7: 86752 bytes, 478062168 total - age 8: 71408 bytes, 478133576 total 2017-05-23T20:43:17.335-0400: 40602.428: [SoftReference, 0 refs, 0.0071133 
secs]2017-05-23T20:43:17.342-0400: 40602.435: [WeakReference, 3 refs, 0.0007987 secs]2017-05-23T20:43:17.343-0400: 40602.436: [FinalReference, 182 refs, 0.0017603 secs]2017-05-23T20:43:17.345-0400: 40602.438: [PhantomReference, 0 refs, 0 refs, 0.0015961 secs]2017-05-23T20:43:17.346-0400: 40602.440: [JNI Weak Reference, 0.0000730 secs] (to-space exhausted), 6.1987667 secs] [Parallel Time: 5446.3 ms, GC Workers: 8] [GC Worker Start (ms): Min: 40596975.6, Avg: 40596975.7, Max: 40596975.8, Diff: 0.2] [Ext Root Scanning (ms): Min: 2.9, Avg: 3.1, Max: 3.2, Diff: 0.3, Sum: 24.4] [Update RS (ms): Min: 10.1, Avg: 10.3, Max: 10.5, Diff: 0.4, Sum: 82.6] [Processed Buffers: Min: 33, Avg: 40.2, Max: 51, Diff: 18, Sum: 322] [Scan RS (ms): Min: 30.7, Avg: 31.1, Max: 32.4, Diff: 1.8, Sum: 249.0] [Code Root Scanning (ms): Min: 0.1, Avg: 0.3, Max: 0.6, Diff: 0.5, Sum: 2.8] [Object Copy (ms): Min: 5399.2, Avg: 5400.6, Max: 5400.9, Diff: 1.7, Sum: 43204.5] [Termination (ms): Min: 0.0, Avg: 0.2, Max: 0.3, Diff: 0.3, Sum: 1.3] [Termination Attempts: Min: 1, Avg: 1.0, Max: 1, Diff: 0, Sum: 8] [GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.1, Sum: 0.5] [GC Worker Total (ms): Min: 5445.5, Avg: 5445.6, Max: 5445.7, Diff: 0.2, Sum: 43565.0] [GC Worker End (ms): Min: 40602421.3, Avg: 40602421.4, Max: 40602421.4, Diff: 0.1] [Code Root Fixup: 0.3 ms] [Code Root Purge: 0.0 ms] [Clear CT: 0.8 ms] [Other: 751.4 ms] [Evacuation Failure: 728.5 ms] [Choose CSet: 0.0 ms] [Ref Proc: 17.8 ms] [Ref Enq: 0.5 ms] [Redirty Cards: 2.1 ms] [Humongous Register: 0.1 ms] [Humongous Reclaim: 0.2 ms] [Free CSet: 0.8 ms] [Eden: 878.0M(976.0M)->0.0B(1424.0M) Survivors: 456.0M->8192.0K Heap: 6856.2M(7168.0M)->6908.2M(7168.0M)] [Times: user=11.66 sys=1.15, real=6.19 secs] 2017-05-23T20:43:18.080-0400: 40603.173: [GC concurrent-root-region-scan-start] 2017-05-23T20:43:18.080-0400: 40603.173: Total time for which application threads were stopped: 6.2005443 seconds, Stopping threads took: 0.0002322 seconds 
2017-05-23T20:43:18.080-0400: 40603.174: Application time: 0.0002882 seconds 2017-05-23T20:43:18.114-0400: 40603.207: [GC concurrent-root-region-scan-end, 0.0339339 secs] 2017-05-23T20:43:18.114-0400: 40603.207: [GC concurrent-mark-start] 2017-05-23T20:43:18.142-0400: 40603.235: Total time for which application threads were stopped: 0.0613820 seconds, Stopping threads took: 0.0001677 seconds 2017-05-23T20:43:18.142-0400: 40603.236: Application time: 0.0005017 seconds 2017-05-23T20:43:18.144-0400: 40603.237: Total time for which application threads were stopped: 0.0013197 seconds, Stopping threads took: 0.0001188 seconds 2017-05-23T20:43:18.144-0400: 40603.237: Application time: 0.0001781 seconds 2017-05-23T20:43:18.144-0400: 40603.238: Total time for which application threads were stopped: 0.0005735 seconds, Stopping threads took: 0.0000568 seconds 2017-05-23T20:43:18.728-0400: 40603.821: Application time: 0.5835349 seconds 2017-05-23T20:43:18.730-0400: 40603.823: [GC pause (G1 Evacuation Pause) (young) Desired survivor size 94371840 bytes, new threshold 15 (max 15) - age 1: 8388248 bytes, 8388248 total 2017-05-23T20:43:19.821-0400: 40604.914: [SoftReference, 0 refs, 0.0009673 secs]2017-05-23T20:43:19.822-0400: 40604.915: [WeakReference, 0 refs, 0.0006733 secs]2017-05-23T20:43:19.823-0400: 40604.916: [FinalReference, 0 refs, 0.0006260 secs]2017-05-23T20:43:19.823-0400: 40604.917: [PhantomReference, 0 refs, 0 refs, 0.0013002 secs]2017-05-23T20:43:19.825-0400: 40604.918: [JNI Weak Reference, 0.0000642 secs] (to-space exhausted), 1.2567408 secs] [Parallel Time: 1084.5 ms, GC Workers: 8] [GC Worker Start (ms): Min: 40603823.4, Avg: 40603823.5, Max: 40603823.6, Diff: 0.2] [Ext Root Scanning (ms): Min: 1.7, Avg: 1.9, Max: 2.4, Diff: 0.7, Sum: 15.3] [Update RS (ms): Min: 23.8, Avg: 24.0, Max: 24.2, Diff: 0.3, Sum: 191.7] [Processed Buffers: Min: 49, Avg: 53.5, Max: 60, Diff: 11, Sum: 428] [Scan RS (ms): Min: 1.0, Avg: 1.1, Max: 1.2, Diff: 0.2, Sum: 8.6] [Code Root 
Scanning (ms): Min: 0.1, Avg: 0.1, Max: 0.1, Diff: 0.1, Sum: 0.8] [Object Copy (ms): Min: 1056.4, Avg: 1056.8, Max: 1057.2, Diff: 0.8, Sum: 8454.7] [Termination (ms): Min: 0.0, Avg: 0.1, Max: 0.2, Diff: 0.2, Sum: 1.0] [Termination Attempts: Min: 1, Avg: 3.8, Max: 7, Diff: 6, Sum: 30] [GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.0, Sum: 0.5] [GC Worker Total (ms): Min: 1084.0, Avg: 1084.1, Max: 1084.2, Diff: 0.2, Sum: 8673.2] [GC Worker End (ms): Min: 40604907.6, Avg: 40604907.7, Max: 40604907.7, Diff: 0.0] [Code Root Fixup: 0.3 ms] [Code Root Purge: 0.0 ms] [Clear CT: 0.3 ms] [Other: 171.7 ms] [Evacuation Failure: 159.4 ms] [Choose CSet: 0.0 ms] [Ref Proc: 9.9 ms] [Ref Enq: 0.6 ms] [Redirty Cards: 0.6 ms] [Humongous Register: 0.2 ms] [Humongous Reclaim: 0.3 ms] [Free CSet: 0.2 ms] [Eden: 230.0M(1424.0M)->0.0B(1432.0M) Survivors: 8192.0K->0.0B Heap: 7139.5M(7168.0M)->7139.5M(7168.0M)] [Times: user=2.33 sys=0.34, real=1.26 secs] 2017-05-23T20:43:19.987-0400: 40605.080: Total time for which application threads were stopped: 1.2587489 seconds, Stopping threads took: 0.0002182 seconds 2017-05-23T20:43:19.987-0400: 40605.080: Application time: 0.0003101 seconds 2017-05-23T20:43:19.988-0400: 40605.082: [GC pause (G1 Evacuation Pause) (young) Desired survivor size 94371840 bytes, new threshold 15 (max 15) 2017-05-23T20:43:20.020-0400: 40605.113: [SoftReference, 0 refs, 0.0008856 secs]2017-05-23T20:43:20.020-0400: 40605.114: [WeakReference, 0 refs, 0.0005588 secs]2017-05-23T20:43:20.021-0400: 40605.114: [FinalReference, 0 refs, 0.0006006 secs]2017-05-23T20:43:20.022-0400: 40605.115: [PhantomReference, 0 refs, 0 refs, 0.0010837 secs]2017-05-23T20:43:20.023-0400: 40605.116: [JNI Weak Reference, 0.0000610 secs], 0.0356212 secs] [Parallel Time: 30.0 ms, GC Workers: 8] [GC Worker Start (ms): Min: 40605081.9, Avg: 40605082.0, Max: 40605082.1, Diff: 0.2] [Ext Root Scanning (ms): Min: 1.9, Avg: 2.0, Max: 2.5, Diff: 0.6, Sum: 16.1] [Update RS (ms): Min: 27.3, Avg: 
27.4, Max: 27.5, Diff: 0.2, Sum: 219.3] [Processed Buffers: Min: 82, Avg: 87.4, Max: 92, Diff: 10, Sum: 699] [Scan RS (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0] [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0] [Object Copy (ms): Min: 0.0, Avg: 0.2, Max: 0.3, Diff: 0.2, Sum: 1.4] [Termination (ms): Min: 0.0, Avg: 0.1, Max: 0.2, Diff: 0.2, Sum: 1.1] [Termination Attempts: Min: 1, Avg: 1.0, Max: 1, Diff: 0, Sum: 8] [GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.0, Sum: 0.4] [GC Worker Total (ms): Min: 29.7, Avg: 29.8, Max: 29.9, Diff: 0.2, Sum: 238.5] [GC Worker End (ms): Min: 40605111.8, Avg: 40605111.8, Max: 40605111.8, Diff: 0.0] [Code Root Fixup: 0.3 ms] [Code Root Purge: 0.0 ms] [Clear CT: 0.2 ms] [Other: 5.1 ms] [Choose CSet: 0.0 ms] [Ref Proc: 4.0 ms] [Ref Enq: 0.2 ms] [Redirty Cards: 0.2 ms] [Humongous Register: 0.1 ms] [Humongous Reclaim: 0.2 ms] [Free CSet: 0.1 ms] [Eden: 0.0B(1432.0M)->0.0B(1432.0M) Survivors: 0.0B->0.0B Heap: 7139.5M(7168.0M)->7139.5M(7168.0M)] [Times: user=0.25 sys=0.00, real=0.04 secs] 2017-05-23T20:43:20.024-0400: 40605.118: Total time for which application threads were stopped: 0.0372043 seconds, Stopping threads took: 0.0001640 seconds 2017-05-23T20:43:20.025-0400: 40605.118: Application time: 0.0002435 seconds 2017-05-23T20:43:20.026-0400: 40605.119: [GC pause (G1 Evacuation Pause) (young) Desired survivor size 94371840 bytes, new threshold 15 (max 15) 2017-05-23T20:43:20.030-0400: 40605.123: [SoftReference, 0 refs, 0.0009405 secs]2017-05-23T20:43:20.031-0400: 40605.124: [WeakReference, 0 refs, 0.0005771 secs]2017-05-23T20:43:20.032-0400: 40605.125: [FinalReference, 0 refs, 0.0005766 secs]2017-05-23T20:43:20.032-0400: 40605.125: [PhantomReference, 0 refs, 0 refs, 0.0011847 secs]2017-05-23T20:43:20.033-0400: 40605.127: [JNI Weak Reference, 0.0000549 secs], 0.0087717 secs] [Parallel Time: 3.0 ms, GC Workers: 8] [GC Worker Start (ms): Min: 40605119.3, Avg: 40605119.4, Max: 
40605119.5, Diff: 0.2] [Ext Root Scanning (ms): Min: 1.6, Avg: 1.8, Max: 2.6, Diff: 1.0, Sum: 14.8] [Update RS (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.2] [Processed Buffers: Min: 0, Avg: 0.5, Max: 2, Diff: 2, Sum: 4] [Scan RS (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1] [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0] [Object Copy (ms): Min: 0.0, Avg: 0.1, Max: 0.2, Diff: 0.2, Sum: 1.1] [Termination (ms): Min: 0.0, Avg: 0.6, Max: 0.6, Diff: 0.6, Sum: 4.4] [Termination Attempts: Min: 1, Avg: 1.0, Max: 1, Diff: 0, Sum: 8] [GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.1, Sum: 0.5] [GC Worker Total (ms): Min: 2.6, Avg: 2.6, Max: 2.7, Diff: 0.1, Sum: 21.1] [GC Worker End (ms): Min: 40605122.0, Avg: 40605122.1, Max: 40605122.1, Diff: 0.1] [Code Root Fixup: 0.3 ms] [Code Root Purge: 0.0 ms] [Clear CT: 0.3 ms] [Other: 5.2 ms] [Choose CSet: 0.0 ms] [Ref Proc: 4.1 ms] [Ref Enq: 0.3 ms] [Redirty Cards: 0.3 ms] [Humongous Register: 0.1 ms] [Humongous Reclaim: 0.1 ms] [Free CSet: 0.1 ms] [Eden: 0.0B(1432.0M)->0.0B(1432.0M) Survivors: 0.0B->0.0B Heap: 7139.5M(7168.0M)->7139.5M(7168.0M)] [Times: user=0.03 sys=0.00, real=0.01 secs] 2017-05-23T20:43:20.035-0400: 40605.128: Total time for which application threads were stopped: 0.0102350 seconds, Stopping threads took: 0.0000635 seconds 2017-05-23T20:43:20.035-0400: 40605.128: Application time: 0.0002150 seconds 2017-05-23T20:43:20.036-0400: 40605.129: [GC pause (G1 Evacuation Pause) (young) Desired survivor size 94371840 bytes, new threshold 15 (max 15) 2017-05-23T20:43:20.040-0400: 40605.133: [SoftReference, 0 refs, 0.0010156 secs]2017-05-23T20:43:20.041-0400: 40605.134: [WeakReference, 0 refs, 0.0006580 secs]2017-05-23T20:43:20.042-0400: 40605.135: [FinalReference, 0 refs, 0.0006435 secs]2017-05-23T20:43:20.042-0400: 40605.136: [PhantomReference, 0 refs, 0 refs, 0.0012604 secs]2017-05-23T20:43:20.044-0400: 40605.137: [JNI Weak Reference, 0.0000513 secs], 
0.0087896 secs] [Parallel Time: 2.7 ms, GC Workers: 8] [GC Worker Start (ms): Min: 40605129.6, Avg: 40605129.7, Max: 40605129.8, Diff: 0.2] [Ext Root Scanning (ms): Min: 1.7, Avg: 1.9, Max: 2.5, Diff: 0.8, Sum: 14.9] [Update RS (ms): Min: 0.0, Avg: 0.0, Max: 0.1, Diff: 0.1, Sum: 0.2] [Processed Buffers: Min: 0, Avg: 0.6, Max: 1, Diff: 1, Sum: 5] [Scan RS (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0] [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0] [Object Copy (ms): Min: 0.0, Avg: 0.2, Max: 0.2, Diff: 0.2, Sum: 1.3] [Termination (ms): Min: 0.0, Avg: 0.3, Max: 0.4, Diff: 0.4, Sum: 2.5] [Termination Attempts: Min: 1, Avg: 1.0, Max: 1, Diff: 0, Sum: 8] [GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.1, Sum: 0.5] [GC Worker Total (ms): Min: 2.3, Avg: 2.4, Max: 2.5, Diff: 0.2, Sum: 19.5] [GC Worker End (ms): Min: 40605132.1, Avg: 40605132.2, Max: 40605132.2, Diff: 0.0] [Code Root Fixup: 0.3 ms] [Code Root Purge: 0.0 ms] [Clear CT: 0.3 ms] [Other: 5.5 ms] [Choose CSet: 0.0 ms] [Ref Proc: 4.4 ms] [Ref Enq: 0.3 ms] [Redirty Cards: 0.3 ms] [Humongous Register: 0.1 ms] [Humongous Reclaim: 0.1 ms] [Free CSet: 0.1 ms] [Eden: 0.0B(1432.0M)->0.0B(1432.0M) Survivors: 0.0B->0.0B Heap: 7139.5M(7168.0M)->7139.5M(7168.0M)] [Times: user=0.04 sys=0.00, real=0.01 secs] 2017-05-23T20:43:20.045-0400: 40605.138: Total time for which application threads were stopped: 0.0101403 seconds, Stopping threads took: 0.0000614 seconds 2017-05-23T20:43:20.045-0400: 40605.139: Application time: 0.0001681 seconds 2017-05-23T20:43:20.046-0400: 40605.140: [GC pause (G1 Evacuation Pause) (young) Desired survivor size 94371840 bytes, new threshold 15 (max 15) 2017-05-23T20:43:20.050-0400: 40605.144: [SoftReference, 0 refs, 0.0008321 secs]2017-05-23T20:43:20.051-0400: 40605.145: [WeakReference, 0 refs, 0.0006103 secs]2017-05-23T20:43:20.052-0400: 40605.145: [FinalReference, 0 refs, 0.0007194 secs]2017-05-23T20:43:20.053-0400: 40605.146: [PhantomReference, 
0 refs, 0 refs, 0.0010705 secs]2017-05-23T20:43:20.054-0400: 40605.147: [JNI Weak Reference, 0.0000508 secs], 0.0084107 secs] [Parallel Time: 2.7 ms, GC Workers: 8] [GC Worker Start (ms): Min: 40605139.9, Avg: 40605140.0, Max: 40605140.1, Diff: 0.2] [Ext Root Scanning (ms): Min: 1.7, Avg: 1.9, Max: 2.4, Diff: 0.8, Sum: 15.1] [Update RS (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1] [Processed Buffers: Min: 0, Avg: 0.4, Max: 1, Diff: 1, Sum: 3] [Scan RS (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0] [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0] [Object Copy (ms): Min: 0.0, Avg: 0.1, Max: 0.2, Diff: 0.2, Sum: 1.1] [Termination (ms): Min: 0.0, Avg: 0.3, Max: 0.4, Diff: 0.4, Sum: 2.2] [Termination Attempts: Min: 1, Avg: 1.0, Max: 1, Diff: 0, Sum: 8] [GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.1, Sum: 0.5] [GC Worker Total (ms): Min: 2.3, Avg: 2.4, Max: 2.5, Diff: 0.2, Sum: 19.2] [GC Worker End (ms): Min: 40605142.4, Avg: 40605142.4, Max: 40605142.5, Diff: 0.1] [Code Root Fixup: 0.3 ms] [Code Root Purge: 0.0 ms] [Clear CT: 0.2 ms] [Other: 5.1 ms] [Choose CSet: 0.0 ms] [Ref Proc: 4.1 ms] [Ref Enq: 0.3 ms] [Redirty Cards: 0.2 ms] [Humongous Register: 0.1 ms] [Humongous Reclaim: 0.1 ms] [Free CSet: 0.1 ms] [Eden: 0.0B(1432.0M)->0.0B(1432.0M) Survivors: 0.0B->0.0B Heap: 7139.5M(7168.0M)->7139.5M(7168.0M)] [Times: user=0.03 sys=0.01, real=0.01 secs] 2017-05-23T20:43:20.055-0400: 40605.148: Total time for which application threads were stopped: 0.0097185 seconds, Stopping threads took: 0.0001029 seconds 2017-05-23T20:43:20.055-0400: 40605.148: Application time: 0.0001505 seconds 2017-05-23T20:43:20.056-0400: 40605.149: [Full GC (Allocation Failure) 2017-05-23T20:43:22.446-0400: 40607.540: [SoftReference, 1667 refs, 0.0003772 secs]2017-05-23T20:43:22.447-0400: 40607.541: [WeakReference, 5626 refs, 0.0008068 secs]2017-05-23T20:43:22.448-0400: 40607.541: [FinalReference, 4015 refs, 0.0015169 
secs]2017-05-23T20:43:22.450-0400: 40607.543: [PhantomReference, 1 refs, 372 refs, 0.0001585 secs]2017-05-23T20:43:22.450-0400: 40607.543: [JNI Weak Reference, 0.0000963 secs] 7139M->2327M(7168M), 9.7036499 secs] [Eden: 0.0B(1432.0M)->0.0B(1432.0M) Survivors: 0.0B->0.0B Heap: 7139.5M(7168.0M)->2327.6M(7168.0M)], [Metaspace: 108907K->108428K(1150976K)] [Times: user=13.22 sys=0.00, real=9.70 secs] 2017-05-23T20:43:29.760-0400: 40614.853: Total time for which application threads were stopped: 9.7047785 seconds, Stopping threads took: 0.0000566 seconds 2017-05-23T20:43:29.760-0400: 40614.854: [GC concurrent-mark-abort] 2017-05-23T20:43:29.763-0400: 40614.856: Application time: 0.0029444 seconds > On Jul 21, 2017, at 5:34 PM, Thomas Schatzl wrote: > > Hi Kirk, > > On Fri, 2017-07-21 at 10:34 +0300, Kirk Pepperdine wrote: >> Hi all, >> >> A while back I mentioned to Erik at JFokus that I was seeing a >> puzzling behavior in the G1 where without any obvious failure, heap >> occupancy after collections would spike which would trigger a full >> which would (unexpectedly) completely recover everything down to the >> expected live set. Yesterday while working with Simone Bordet on the >> problem we came to the realization that we were seeing a pattern >> prior to the ramp up to the Full, Survivor space would be >> ergonomically resized to 0 -> 0. The only way to reset the situation >> was to run a full collection. In our minds this doesn?t make any >> sense to reset survivor space to 0. So far this is an observation >> from a single GC log but I recall seeing the pattern in many other >> logs. Before I go through the exercise of building a super grep to >> run over my G1 log repo I?d like to ask; under what conditions would >> it make sense to have the survivor space resized to 0? And if not, >> would this be bug in G1? 
We tried reproducing the behavior in some >> test applications but I fear we often only see this happening in >> production applications that have been running for several days. It's >> a behavior that I've seen in 1.7.0 and 1.8.0. No word on 9. > > sounds similar to https://bugs.openjdk.java.net/browse/JDK-8037500. > Could you please post the type of collections for a few more gcs before > the zero-sized ones? It would be particularly interesting if there is a > mixed gc with to-space exhaustion just before this sequence. And if > there are log messages with attempts to start marking too. > > That bug has been closed as "won't fix" because we do not > have a reproducer (any more) to test any changes, in addition to the > stated reasons that the performance impact seemed minor at that time. > > There have been some changes in how the next gc is calculated in 9 too, > so I do not know either if 9 is also affected (particularly one of > these young-only gc's would not be issued any more). > > I can think of at least one more reason other than those stated in the CR > why this occurs, at least for 8u60+ builds. There is the possibility, > particularly in conjunction with humongous object allocation, that after > starting the mutator, immediately afterwards a young gc that reclaims > zero space is issued, e.g.: > > young-gc, has X regions left at the end, starts mutators > mutator 1 allocates exactly X regions as humongous objects > mutator 2 allocates, finds that there are no regions left, issues > young-gc request; in this young-gc eden and survivor are obviously > of zero size > [...and so on...] > > Note that this pattern could repeat multiple times, as young gc may > reclaim space from humongous objects (eager reclaim!), until at some > point it runs into a full gc. > > The logging that shows humongous object allocation (something about > reaching threshold and starting marking) could confirm this situation. 
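
[Editorial note: Kirk's "super grep" and Thomas's request for the collection types preceding the zero-sized pauses can be combined into a small log scan. The sketch below is an editorial illustration, not part of the original thread; it assumes the 8u-era log phrases shown in this thread ("[GC pause (...)", "[Full GC (...)", "Survivors: 0.0B->0.0B") and a plain-text log as input.]

```python
import re

# Pause headers as they appear in the 8u-era logs quoted in this thread,
# e.g. "[GC pause (G1 Evacuation Pause) (young)" or "[Full GC (Allocation Failure)".
PAUSE = re.compile(r"\[(GC pause \([^)]*\)[^,\]]*|Full GC \([^)]*\))")
ZERO_SURVIVORS = "Survivors: 0.0B->0.0B"

def zero_survivor_events(lines, context=3):
    """Return (line_no, preceding pause headers) for every line where the
    survivor space was resized to 0.0B->0.0B."""
    recent = []  # pause headers seen so far, oldest first
    hits = []
    for no, line in enumerate(lines, 1):
        m = PAUSE.search(line)
        if m:
            recent.append(m.group(1))
        if ZERO_SURVIVORS in line:
            hits.append((no, recent[-context:]))
    return hits

# Sample lines taken from the log quoted later in this thread:
sample = [
    "2017-05-23T20:43:20.026-0400: 40605.119: [GC pause (G1 Evacuation Pause) (young)",
    "   [Eden: 0.0B(1432.0M)->0.0B(1432.0M) Survivors: 0.0B->0.0B Heap: 7139.5M(7168.0M)->7139.5M(7168.0M)]",
    "2017-05-23T20:43:20.056-0400: 40605.149: [Full GC (Allocation Failure) 7139M->2327M(7168M), 9.7036499 secs]",
]
for line_no, headers in zero_survivor_events(sample):
    print(line_no, headers)
```

Run over a real log, each reported event carries the last few pause headers before it, so a mixed gc with to-space exhaustion just ahead of the zero-sized sequence would show up directly.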
> > No guarantees about that being the actual issue though. > > Thanks, > Thomas > From vitalyd at gmail.com Sun Jul 23 16:43:50 2017 From: vitalyd at gmail.com (Vitaly Davidovich) Date: Sun, 23 Jul 2017 16:43:50 +0000 Subject: Bug in G1 In-Reply-To: References: <1500024904.3458.8.camel@oracle.com> <6c0d1cca-6c08-0bda-f980-d3fe20e663ff@redhat.com> <1500029912.3458.26.camel@oracle.com> <990f6578-14d4-322d-7f51-9b93d92f8b20@redhat.com> <1500034180.3458.67.camel@oracle.com> <1e6c2b24-63fe-cf5e-1635-990852c63a65@redhat.com> <7991D723-8B1D-43A3-A9D4-E7D38B1D10E4@kodewerk.com> <1500647667.2385.33.camel@oracle.com> Message-ID: I've seen G1 get into a similar loop. Do you see any concurrent mark initiation? It's possible conc marking is still running and therefore mixed GCs aren't possible yet. There are some ways to tune G1 to initiate concurrent marking sooner (or more "aggressively" with more conc GC threads), but it would be good to first know if you're seeing that. On Sun, Jul 23, 2017 at 6:52 AM Kirk Pepperdine wrote: > Thanks for the information. I've shared the entire log with you on > dropbox. Feel free to distribute it as you see fit. > > I see the to-space exhausted but there doesn't appear to be a mixed > collection involved. Below is a single sequence up to and including the > Full. 
> > Kind regards, > Kirk > > > 2017-05-23T20:42:55.303-0400: 40580.396: Application time: 0.8539675 > seconds > 2017-05-23T20:42:55.304-0400: 40580.398: [GC pause (G1 Evacuation Pause) > (young) > Desired survivor size 169869312 bytes, new threshold 15 (max 15) > - age 1: 3278808 bytes, 3278808 total > - age 2: 71278552 bytes, 74557360 total > - age 3: 533720 bytes, 75091080 total > - age 4: 12897544 bytes, 87988624 total > - age 5: 796672 bytes, 88785296 total > - age 6: 503288 bytes, 89288584 total > 2017-05-23T20:42:55.363-0400: 40580.457: [SoftReference, 0 refs, 0.0010011 > secs]2017-05-23T20:42:55.364-0400: 40580.458: [WeakReference, 367 refs, > 0.0006136 secs]2017-05-23T20:42:55.365-0400: 40580.458: [FinalReference, > 7659 refs, 0.0014460 secs]2017-05-23T20:42:55.366-0400: 40580.460: > [PhantomReference, 0 refs, 0 refs, 0.0011060 > secs]2017-05-23T20:42:55.367-0400: 40580.461: [JNI Weak Reference, > 0.0000647 secs], 0.0669684 secs] > [Parallel Time: 57.9 ms, GC Workers: 8] > [GC Worker Start (ms): Min: 40580398.1, Avg: 40580398.2, Max: > 40580398.3, Diff: 0.2] > [Ext Root Scanning (ms): Min: 1.6, Avg: 1.9, Max: 2.7, Diff: 1.0, > Sum: 15.2] > [Update RS (ms): Min: 15.2, Avg: 15.7, Max: 15.8, Diff: 0.6, Sum: > 125.4] > [Processed Buffers: Min: 44, Avg: 50.1, Max: 62, Diff: 18, Sum: > 401] > [Scan RS (ms): Min: 1.9, Avg: 2.0, Max: 2.1, Diff: 0.2, Sum: 15.9] > [Code Root Scanning (ms): Min: 1.6, Avg: 1.7, Max: 1.7, Diff: 0.1, > Sum: 13.5] > [Object Copy (ms): Min: 36.0, Avg: 36.2, Max: 36.2, Diff: 0.2, Sum: > 289.2] > [Termination (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.2] > [Termination Attempts: Min: 1, Avg: 1.0, Max: 1, Diff: 0, Sum: 8] > [GC Worker Other (ms): Min: 0.1, Avg: 0.1, Max: 0.2, Diff: 0.1, Sum: > 1.0] > [GC Worker Total (ms): Min: 57.4, Avg: 57.5, Max: 57.6, Diff: 0.2, > Sum: 460.3] > [GC Worker End (ms): Min: 40580455.7, Avg: 40580455.7, Max: > 40580455.8, Diff: 0.1] > [Code Root Fixup: 0.2 ms] > [Code Root Purge: 0.0 ms] > [Clear 
CT: 0.8 ms] > [Other: 8.1 ms] > [Choose CSet: 0.0 ms] > [Ref Proc: 4.7 ms] > [Ref Enq: 0.3 ms] > [Redirty Cards: 0.3 ms] > [Humongous Register: 0.2 ms] > [Humongous Reclaim: 0.1 ms] > [Free CSet: 1.9 ms] > [Eden: 2484.0M(2484.0M)->0.0B(3544.0M) Survivors: 98.0M->100.0M Heap: > 5189.0M(7168.0M)->2708.0M(7168.0M)] > [Times: user=0.45 sys=0.03, real=0.07 secs] > 2017-05-23T20:42:55.372-0400: 40580.465: Total time for which application > threads were stopped: 0.0685303 seconds, Stopping threads took: 0.0001346 > seconds > 2017-05-23T20:42:59.372-0400: 40584.465: Application time: 4.0004774 > seconds > 2017-05-23T20:42:59.376-0400: 40584.469: Total time for which application > threads were stopped: 0.0036324 seconds, Stopping threads took: 0.0023017 > seconds > 2017-05-23T20:43:05.891-0400: 40590.984: Application time: 6.5149722 > seconds > 2017-05-23T20:43:05.892-0400: 40590.985: [GC pause (G1 Evacuation Pause) > (young) > Desired survivor size 239075328 bytes, new threshold 15 (max 15) > - age 1: 4934368 bytes, 4934368 total > - age 2: 2633808 bytes, 7568176 total > - age 3: 71264464 bytes, 78832640 total > - age 4: 527368 bytes, 79360008 total > - age 5: 12893400 bytes, 92253408 total > - age 6: 750128 bytes, 93003536 total > - age 7: 432784 bytes, 93436320 total > 2017-05-23T20:43:09.493-0400: 40594.586: [SoftReference, 0 refs, 0.0067938 > secs]2017-05-23T20:43:09.500-0400: 40594.593: [WeakReference, 0 refs, > 0.0033881 secs]2017-05-23T20:43:09.503-0400: 40594.597: [FinalReference, 0 > refs, 0.0005787 secs]2017-05-23T20:43:09.504-0400: 40594.597: > [PhantomReference, 0 refs, 0 refs, 0.0011377 > secs]2017-05-23T20:43:09.505-0400: 40594.598: [JNI Weak Reference, > 0.0000618 secs] (to-space exhausted), 4.8672247 secs] > [Parallel Time: 3599.9 ms, GC Workers: 8] > [GC Worker Start (ms): Min: 40590985.9, Avg: 40590986.0, Max: > 40590986.1, Diff: 0.2] > [Ext Root Scanning (ms): Min: 1.7, Avg: 1.9, Max: 2.3, Diff: 0.6, > Sum: 15.2] > [Update RS (ms): Min: 68.3, Avg: 68.4, 
Max: 68.5, Diff: 0.2, Sum: > 547.6] > [Processed Buffers: Min: 32, Avg: 49.0, Max: 72, Diff: 40, Sum: > 392] > [Scan RS (ms): Min: 2.8, Avg: 2.9, Max: 3.0, Diff: 0.1, Sum: 23.2] > [Code Root Scanning (ms): Min: 2.4, Avg: 2.5, Max: 2.5, Diff: 0.1, > Sum: 19.7] > [Object Copy (ms): Min: 3523.7, Avg: 3523.8, Max: 3523.9, Diff: 0.2, > Sum: 28190.6] > [Termination (ms): Min: 0.0, Avg: 0.1, Max: 0.2, Diff: 0.2, Sum: 0.7] > [Termination Attempts: Min: 1, Avg: 1.0, Max: 1, Diff: 0, Sum: 8] > [GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.0, Sum: > 0.5] > [GC Worker Total (ms): Min: 3599.6, Avg: 3599.7, Max: 3599.8, Diff: > 0.2, Sum: 28797.6] > [GC Worker End (ms): Min: 40594585.6, Avg: 40594585.7, Max: > 40594585.7, Diff: 0.1] > [Code Root Fixup: 0.3 ms] > [Code Root Purge: 0.0 ms] > [Clear CT: 1.2 ms] > [Other: 1265.8 ms] > [Evacuation Failure: 1248.2 ms] > [Choose CSet: 0.0 ms] > [Ref Proc: 12.4 ms] > [Ref Enq: 0.5 ms] > [Redirty Cards: 2.1 ms] > [Humongous Register: 0.2 ms] > [Humongous Reclaim: 0.1 ms] > [Free CSet: 1.5 ms] > [Eden: 3544.0M(3544.0M)->0.0B(976.0M) Survivors: 100.0M->456.0M Heap: > 6274.3M(7168.0M)->5978.2M(7168.0M)] > [Times: user=13.58 sys=0.11, real=4.86 secs] > 2017-05-23T20:43:10.760-0400: 40595.853: Total time for which application > threads were stopped: 4.8690628 seconds, Stopping threads took: 0.0002136 > seconds > 2017-05-23T20:43:11.762-0400: 40596.855: Application time: 1.0019247 > seconds > 2017-05-23T20:43:11.763-0400: 40596.856: Total time for which application > threads were stopped: 0.0015356 seconds, Stopping threads took: 0.0003107 > seconds > 2017-05-23T20:43:11.880-0400: 40596.973: Application time: 0.1164884 > seconds > 2017-05-23T20:43:11.881-0400: 40596.974: [GC pause (G1 Humongous > Allocation) (young) (initial-mark) > Desired survivor size 94371840 bytes, new threshold 1 (max 15) > - age 1: 477501112 bytes, 477501112 total > - age 2: 182296 bytes, 477683408 total > - age 3: 78880 bytes, 477762288 total > - age 4: 
45376 bytes, 477807664 total > - age 5: 92304 bytes, 477899968 total > - age 6: 75448 bytes, 477975416 total > - age 7: 86752 bytes, 478062168 total > - age 8: 71408 bytes, 478133576 total > 2017-05-23T20:43:17.335-0400: 40602.428: [SoftReference, 0 refs, 0.0071133 > secs]2017-05-23T20:43:17.342-0400: 40602.435: [WeakReference, 3 refs, > 0.0007987 secs]2017-05-23T20:43:17.343-0400: 40602.436: [FinalReference, > 182 refs, 0.0017603 secs]2017-05-23T20:43:17.345-0400: 40602.438: > [PhantomReference, 0 refs, 0 refs, 0.0015961 > secs]2017-05-23T20:43:17.346-0400: 40602.440: [JNI Weak Reference, > 0.0000730 secs] (to-space exhausted), 6.1987667 secs] > [Parallel Time: 5446.3 ms, GC Workers: 8] > [GC Worker Start (ms): Min: 40596975.6, Avg: 40596975.7, Max: > 40596975.8, Diff: 0.2] > [Ext Root Scanning (ms): Min: 2.9, Avg: 3.1, Max: 3.2, Diff: 0.3, > Sum: 24.4] > [Update RS (ms): Min: 10.1, Avg: 10.3, Max: 10.5, Diff: 0.4, Sum: > 82.6] > [Processed Buffers: Min: 33, Avg: 40.2, Max: 51, Diff: 18, Sum: > 322] > [Scan RS (ms): Min: 30.7, Avg: 31.1, Max: 32.4, Diff: 1.8, Sum: > 249.0] > [Code Root Scanning (ms): Min: 0.1, Avg: 0.3, Max: 0.6, Diff: 0.5, > Sum: 2.8] > [Object Copy (ms): Min: 5399.2, Avg: 5400.6, Max: 5400.9, Diff: 1.7, > Sum: 43204.5] > [Termination (ms): Min: 0.0, Avg: 0.2, Max: 0.3, Diff: 0.3, Sum: 1.3] > [Termination Attempts: Min: 1, Avg: 1.0, Max: 1, Diff: 0, Sum: 8] > [GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.1, Sum: > 0.5] > [GC Worker Total (ms): Min: 5445.5, Avg: 5445.6, Max: 5445.7, Diff: > 0.2, Sum: 43565.0] > [GC Worker End (ms): Min: 40602421.3, Avg: 40602421.4, Max: > 40602421.4, Diff: 0.1] > [Code Root Fixup: 0.3 ms] > [Code Root Purge: 0.0 ms] > [Clear CT: 0.8 ms] > [Other: 751.4 ms] > [Evacuation Failure: 728.5 ms] > [Choose CSet: 0.0 ms] > [Ref Proc: 17.8 ms] > [Ref Enq: 0.5 ms] > [Redirty Cards: 2.1 ms] > [Humongous Register: 0.1 ms] > [Humongous Reclaim: 0.2 ms] > [Free CSet: 0.8 ms] > [Eden: 878.0M(976.0M)->0.0B(1424.0M) 
Survivors: 456.0M->8192.0K Heap: > 6856.2M(7168.0M)->6908.2M(7168.0M)] > [Times: user=11.66 sys=1.15, real=6.19 secs] > 2017-05-23T20:43:18.080-0400: 40603.173: [GC > concurrent-root-region-scan-start] > 2017-05-23T20:43:18.080-0400: 40603.173: Total time for which application > threads were stopped: 6.2005443 seconds, Stopping threads took: 0.0002322 > seconds > 2017-05-23T20:43:18.080-0400: 40603.174: Application time: 0.0002882 > seconds > 2017-05-23T20:43:18.114-0400: 40603.207: [GC > concurrent-root-region-scan-end, 0.0339339 secs] > 2017-05-23T20:43:18.114-0400: 40603.207: [GC concurrent-mark-start] > 2017-05-23T20:43:18.142-0400: 40603.235: Total time for which application > threads were stopped: 0.0613820 seconds, Stopping threads took: 0.0001677 > seconds > 2017-05-23T20:43:18.142-0400: 40603.236: Application time: 0.0005017 > seconds > 2017-05-23T20:43:18.144-0400: 40603.237: Total time for which application > threads were stopped: 0.0013197 seconds, Stopping threads took: 0.0001188 > seconds > 2017-05-23T20:43:18.144-0400: 40603.237: Application time: 0.0001781 > seconds > 2017-05-23T20:43:18.144-0400: 40603.238: Total time for which application > threads were stopped: 0.0005735 seconds, Stopping threads took: 0.0000568 > seconds > 2017-05-23T20:43:18.728-0400: 40603.821: Application time: 0.5835349 > seconds > 2017-05-23T20:43:18.730-0400: 40603.823: [GC pause (G1 Evacuation Pause) > (young) > Desired survivor size 94371840 bytes, new threshold 15 (max 15) > - age 1: 8388248 bytes, 8388248 total > 2017-05-23T20:43:19.821-0400: 40604.914: [SoftReference, 0 refs, 0.0009673 > secs]2017-05-23T20:43:19.822-0400: 40604.915: [WeakReference, 0 refs, > 0.0006733 secs]2017-05-23T20:43:19.823-0400: 40604.916: [FinalReference, 0 > refs, 0.0006260 secs]2017-05-23T20:43:19.823-0400: 40604.917: > [PhantomReference, 0 refs, 0 refs, 0.0013002 > secs]2017-05-23T20:43:19.825-0400: 40604.918: [JNI Weak Reference, > 0.0000642 secs] (to-space exhausted), 1.2567408 secs] > 
[Parallel Time: 1084.5 ms, GC Workers: 8] > [GC Worker Start (ms): Min: 40603823.4, Avg: 40603823.5, Max: > 40603823.6, Diff: 0.2] > [Ext Root Scanning (ms): Min: 1.7, Avg: 1.9, Max: 2.4, Diff: 0.7, > Sum: 15.3] > [Update RS (ms): Min: 23.8, Avg: 24.0, Max: 24.2, Diff: 0.3, Sum: > 191.7] > [Processed Buffers: Min: 49, Avg: 53.5, Max: 60, Diff: 11, Sum: > 428] > [Scan RS (ms): Min: 1.0, Avg: 1.1, Max: 1.2, Diff: 0.2, Sum: 8.6] > [Code Root Scanning (ms): Min: 0.1, Avg: 0.1, Max: 0.1, Diff: 0.1, > Sum: 0.8] > [Object Copy (ms): Min: 1056.4, Avg: 1056.8, Max: 1057.2, Diff: 0.8, > Sum: 8454.7] > [Termination (ms): Min: 0.0, Avg: 0.1, Max: 0.2, Diff: 0.2, Sum: 1.0] > [Termination Attempts: Min: 1, Avg: 3.8, Max: 7, Diff: 6, Sum: 30] > [GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.0, Sum: > 0.5] > [GC Worker Total (ms): Min: 1084.0, Avg: 1084.1, Max: 1084.2, Diff: > 0.2, Sum: 8673.2] > [GC Worker End (ms): Min: 40604907.6, Avg: 40604907.7, Max: > 40604907.7, Diff: 0.0] > [Code Root Fixup: 0.3 ms] > [Code Root Purge: 0.0 ms] > [Clear CT: 0.3 ms] > [Other: 171.7 ms] > [Evacuation Failure: 159.4 ms] > [Choose CSet: 0.0 ms] > [Ref Proc: 9.9 ms] > [Ref Enq: 0.6 ms] > [Redirty Cards: 0.6 ms] > [Humongous Register: 0.2 ms] > [Humongous Reclaim: 0.3 ms] > [Free CSet: 0.2 ms] > [Eden: 230.0M(1424.0M)->0.0B(1432.0M) Survivors: 8192.0K->0.0B Heap: > 7139.5M(7168.0M)->7139.5M(7168.0M)] > [Times: user=2.33 sys=0.34, real=1.26 secs] > 2017-05-23T20:43:19.987-0400: 40605.080: Total time for which application > threads were stopped: 1.2587489 seconds, Stopping threads took: 0.0002182 > seconds > 2017-05-23T20:43:19.987-0400: 40605.080: Application time: 0.0003101 > seconds > 2017-05-23T20:43:19.988-0400: 40605.082: [GC pause (G1 Evacuation Pause) > (young) > Desired survivor size 94371840 bytes, new threshold 15 (max 15) > 2017-05-23T20:43:20.020-0400: 40605.113: [SoftReference, 0 refs, 0.0008856 > secs]2017-05-23T20:43:20.020-0400: 40605.114: [WeakReference, 0 refs, > 
0.0005588 secs]2017-05-23T20:43:20.021-0400: 40605.114: [FinalReference, 0 > refs, 0.0006006 secs]2017-05-23T20:43:20.022-0400: 40605.115: > [PhantomReference, 0 refs, 0 refs, 0.0010837 > secs]2017-05-23T20:43:20.023-0400: 40605.116: [JNI Weak Reference, > 0.0000610 secs], 0.0356212 secs] > [Parallel Time: 30.0 ms, GC Workers: 8] > [GC Worker Start (ms): Min: 40605081.9, Avg: 40605082.0, Max: > 40605082.1, Diff: 0.2] > [Ext Root Scanning (ms): Min: 1.9, Avg: 2.0, Max: 2.5, Diff: 0.6, > Sum: 16.1] > [Update RS (ms): Min: 27.3, Avg: 27.4, Max: 27.5, Diff: 0.2, Sum: > 219.3] > [Processed Buffers: Min: 82, Avg: 87.4, Max: 92, Diff: 10, Sum: > 699] > [Scan RS (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0] > [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, > Sum: 0.0] > [Object Copy (ms): Min: 0.0, Avg: 0.2, Max: 0.3, Diff: 0.2, Sum: 1.4] > [Termination (ms): Min: 0.0, Avg: 0.1, Max: 0.2, Diff: 0.2, Sum: 1.1] > [Termination Attempts: Min: 1, Avg: 1.0, Max: 1, Diff: 0, Sum: 8] > [GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.0, Sum: > 0.4] > [GC Worker Total (ms): Min: 29.7, Avg: 29.8, Max: 29.9, Diff: 0.2, > Sum: 238.5] > [GC Worker End (ms): Min: 40605111.8, Avg: 40605111.8, Max: > 40605111.8, Diff: 0.0] > [Code Root Fixup: 0.3 ms] > [Code Root Purge: 0.0 ms] > [Clear CT: 0.2 ms] > [Other: 5.1 ms] > [Choose CSet: 0.0 ms] > [Ref Proc: 4.0 ms] > [Ref Enq: 0.2 ms] > [Redirty Cards: 0.2 ms] > [Humongous Register: 0.1 ms] > [Humongous Reclaim: 0.2 ms] > [Free CSet: 0.1 ms] > [Eden: 0.0B(1432.0M)->0.0B(1432.0M) Survivors: 0.0B->0.0B Heap: > 7139.5M(7168.0M)->7139.5M(7168.0M)] > [Times: user=0.25 sys=0.00, real=0.04 secs] > 2017-05-23T20:43:20.024-0400: 40605.118: Total time for which application > threads were stopped: 0.0372043 seconds, Stopping threads took: 0.0001640 > seconds > 2017-05-23T20:43:20.025-0400: 40605.118: Application time: 0.0002435 > seconds > 2017-05-23T20:43:20.026-0400: 40605.119: [GC pause (G1 Evacuation Pause) > 
(young) > Desired survivor size 94371840 bytes, new threshold 15 (max 15) > 2017-05-23T20:43:20.030-0400: 40605.123: [SoftReference, 0 refs, 0.0009405 > secs]2017-05-23T20:43:20.031-0400: 40605.124: [WeakReference, 0 refs, > 0.0005771 secs]2017-05-23T20:43:20.032-0400: 40605.125: [FinalReference, 0 > refs, 0.0005766 secs]2017-05-23T20:43:20.032-0400: 40605.125: > [PhantomReference, 0 refs, 0 refs, 0.0011847 > secs]2017-05-23T20:43:20.033-0400: 40605.127: [JNI Weak Reference, > 0.0000549 secs], 0.0087717 secs] > [Parallel Time: 3.0 ms, GC Workers: 8] > [GC Worker Start (ms): Min: 40605119.3, Avg: 40605119.4, Max: > 40605119.5, Diff: 0.2] > [Ext Root Scanning (ms): Min: 1.6, Avg: 1.8, Max: 2.6, Diff: 1.0, > Sum: 14.8] > [Update RS (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.2] > [Processed Buffers: Min: 0, Avg: 0.5, Max: 2, Diff: 2, Sum: 4] > [Scan RS (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1] > [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, > Sum: 0.0] > [Object Copy (ms): Min: 0.0, Avg: 0.1, Max: 0.2, Diff: 0.2, Sum: 1.1] > [Termination (ms): Min: 0.0, Avg: 0.6, Max: 0.6, Diff: 0.6, Sum: 4.4] > [Termination Attempts: Min: 1, Avg: 1.0, Max: 1, Diff: 0, Sum: 8] > [GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.1, Sum: > 0.5] > [GC Worker Total (ms): Min: 2.6, Avg: 2.6, Max: 2.7, Diff: 0.1, Sum: > 21.1] > [GC Worker End (ms): Min: 40605122.0, Avg: 40605122.1, Max: > 40605122.1, Diff: 0.1] > [Code Root Fixup: 0.3 ms] > [Code Root Purge: 0.0 ms] > [Clear CT: 0.3 ms] > [Other: 5.2 ms] > [Choose CSet: 0.0 ms] > [Ref Proc: 4.1 ms] > [Ref Enq: 0.3 ms] > [Redirty Cards: 0.3 ms] > [Humongous Register: 0.1 ms] > [Humongous Reclaim: 0.1 ms] > [Free CSet: 0.1 ms] > [Eden: 0.0B(1432.0M)->0.0B(1432.0M) Survivors: 0.0B->0.0B Heap: > 7139.5M(7168.0M)->7139.5M(7168.0M)] > [Times: user=0.03 sys=0.00, real=0.01 secs] > 2017-05-23T20:43:20.035-0400: 40605.128: Total time for which application > threads were stopped: 0.0102350 
seconds, Stopping threads took: 0.0000635 > seconds > 2017-05-23T20:43:20.035-0400: 40605.128: Application time: 0.0002150 > seconds > 2017-05-23T20:43:20.036-0400: 40605.129: [GC pause (G1 Evacuation Pause) > (young) > Desired survivor size 94371840 bytes, new threshold 15 (max 15) > 2017-05-23T20:43:20.040-0400: 40605.133: [SoftReference, 0 refs, 0.0010156 > secs]2017-05-23T20:43:20.041-0400: 40605.134: [WeakReference, 0 refs, > 0.0006580 secs]2017-05-23T20:43:20.042-0400: 40605.135: [FinalReference, 0 > refs, 0.0006435 secs]2017-05-23T20:43:20.042-0400: 40605.136: > [PhantomReference, 0 refs, 0 refs, 0.0012604 > secs]2017-05-23T20:43:20.044-0400: 40605.137: [JNI Weak Reference, > 0.0000513 secs], 0.0087896 secs] > [Parallel Time: 2.7 ms, GC Workers: 8] > [GC Worker Start (ms): Min: 40605129.6, Avg: 40605129.7, Max: > 40605129.8, Diff: 0.2] > [Ext Root Scanning (ms): Min: 1.7, Avg: 1.9, Max: 2.5, Diff: 0.8, > Sum: 14.9] > [Update RS (ms): Min: 0.0, Avg: 0.0, Max: 0.1, Diff: 0.1, Sum: 0.2] > [Processed Buffers: Min: 0, Avg: 0.6, Max: 1, Diff: 1, Sum: 5] > [Scan RS (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0] > [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, > Sum: 0.0] > [Object Copy (ms): Min: 0.0, Avg: 0.2, Max: 0.2, Diff: 0.2, Sum: 1.3] > [Termination (ms): Min: 0.0, Avg: 0.3, Max: 0.4, Diff: 0.4, Sum: 2.5] > [Termination Attempts: Min: 1, Avg: 1.0, Max: 1, Diff: 0, Sum: 8] > [GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.1, Sum: > 0.5] > [GC Worker Total (ms): Min: 2.3, Avg: 2.4, Max: 2.5, Diff: 0.2, Sum: > 19.5] > [GC Worker End (ms): Min: 40605132.1, Avg: 40605132.2, Max: > 40605132.2, Diff: 0.0] > [Code Root Fixup: 0.3 ms] > [Code Root Purge: 0.0 ms] > [Clear CT: 0.3 ms] > [Other: 5.5 ms] > [Choose CSet: 0.0 ms] > [Ref Proc: 4.4 ms] > [Ref Enq: 0.3 ms] > [Redirty Cards: 0.3 ms] > [Humongous Register: 0.1 ms] > [Humongous Reclaim: 0.1 ms] > [Free CSet: 0.1 ms] > [Eden: 0.0B(1432.0M)->0.0B(1432.0M) Survivors: 
0.0B->0.0B Heap: > 7139.5M(7168.0M)->7139.5M(7168.0M)] > [Times: user=0.04 sys=0.00, real=0.01 secs] > 2017-05-23T20:43:20.045-0400: 40605.138: Total time for which application > threads were stopped: 0.0101403 seconds, Stopping threads took: 0.0000614 > seconds > 2017-05-23T20:43:20.045-0400: 40605.139: Application time: 0.0001681 > seconds > 2017-05-23T20:43:20.046-0400: 40605.140: [GC pause (G1 Evacuation Pause) > (young) > Desired survivor size 94371840 bytes, new threshold 15 (max 15) > 2017-05-23T20:43:20.050-0400: 40605.144: [SoftReference, 0 refs, 0.0008321 > secs]2017-05-23T20:43:20.051-0400: 40605.145: [WeakReference, 0 refs, > 0.0006103 secs]2017-05-23T20:43:20.052-0400: 40605.145: [FinalReference, 0 > refs, 0.0007194 secs]2017-05-23T20:43:20.053-0400: 40605.146: > [PhantomReference, 0 refs, 0 refs, 0.0010705 > secs]2017-05-23T20:43:20.054-0400: 40605.147: [JNI Weak Reference, > 0.0000508 secs], 0.0084107 secs] > [Parallel Time: 2.7 ms, GC Workers: 8] > [GC Worker Start (ms): Min: 40605139.9, Avg: 40605140.0, Max: > 40605140.1, Diff: 0.2] > [Ext Root Scanning (ms): Min: 1.7, Avg: 1.9, Max: 2.4, Diff: 0.8, > Sum: 15.1] > [Update RS (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1] > [Processed Buffers: Min: 0, Avg: 0.4, Max: 1, Diff: 1, Sum: 3] > [Scan RS (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0] > [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, > Sum: 0.0] > [Object Copy (ms): Min: 0.0, Avg: 0.1, Max: 0.2, Diff: 0.2, Sum: 1.1] > [Termination (ms): Min: 0.0, Avg: 0.3, Max: 0.4, Diff: 0.4, Sum: 2.2] > [Termination Attempts: Min: 1, Avg: 1.0, Max: 1, Diff: 0, Sum: 8] > [GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.1, Sum: > 0.5] > [GC Worker Total (ms): Min: 2.3, Avg: 2.4, Max: 2.5, Diff: 0.2, Sum: > 19.2] > [GC Worker End (ms): Min: 40605142.4, Avg: 40605142.4, Max: > 40605142.5, Diff: 0.1] > [Code Root Fixup: 0.3 ms] > [Code Root Purge: 0.0 ms] > [Clear CT: 0.2 ms] > [Other: 5.1 ms] > [Choose CSet: 
0.0 ms] > [Ref Proc: 4.1 ms] > [Ref Enq: 0.3 ms] > [Redirty Cards: 0.2 ms] > [Humongous Register: 0.1 ms] > [Humongous Reclaim: 0.1 ms] > [Free CSet: 0.1 ms] > [Eden: 0.0B(1432.0M)->0.0B(1432.0M) Survivors: 0.0B->0.0B Heap: > 7139.5M(7168.0M)->7139.5M(7168.0M)] > [Times: user=0.03 sys=0.01, real=0.01 secs] > 2017-05-23T20:43:20.055-0400: 40605.148: Total time for which application > threads were stopped: 0.0097185 seconds, Stopping threads took: 0.0001029 > seconds > 2017-05-23T20:43:20.055-0400: 40605.148: Application time: 0.0001505 > seconds > 2017-05-23T20:43:20.056-0400: 40605.149: [Full GC (Allocation Failure) > 2017-05-23T20:43:22.446-0400: 40607.540: [SoftReference, 1667 refs, > 0.0003772 secs]2017-05-23T20:43:22.447-0400: 40607.541: [WeakReference, > 5626 refs, 0.0008068 secs]2017-05-23T20:43:22.448-0400: 40607.541: > [FinalReference, 4015 refs, 0.0015169 secs]2017-05-23T20:43:22.450-0400: > 40607.543: [PhantomReference, 1 refs, 372 refs, 0.0001585 > secs]2017-05-23T20:43:22.450-0400: 40607.543: [JNI Weak Reference, > 0.0000963 secs] 7139M->2327M(7168M), 9.7036499 secs] > [Eden: 0.0B(1432.0M)->0.0B(1432.0M) Survivors: 0.0B->0.0B Heap: > 7139.5M(7168.0M)->2327.6M(7168.0M)], [Metaspace: 108907K->108428K(1150976K)] > [Times: user=13.22 sys=0.00, real=9.70 secs] > 2017-05-23T20:43:29.760-0400: 40614.853: Total time for which application > threads were stopped: 9.7047785 seconds, Stopping threads took: 0.0000566 > seconds > 2017-05-23T20:43:29.760-0400: 40614.854: [GC concurrent-mark-abort] > 2017-05-23T20:43:29.763-0400: 40614.856: Application time: 0.0029444 > seconds > > On Jul 21, 2017, at 5:34 PM, Thomas Schatzl > wrote: > > > > Hi Kirk, > > > > On Fri, 2017-07-21 at 10:34 +0300, Kirk Pepperdine wrote: > >> Hi all, > >> > >> A while back I mentioned to Erik at JFokus that I was seeing a > >> puzzling behavior in the G1 where without any obvious failure, heap > >> occupancy after collections would spike which would trigger a full > >> which would 
(unexpectedly) completely recover everything down to the > >> expected live set. Yesterday while working with Simone Bordet on the > >> problem we came to the realization that we were seeing a pattern: > >> prior to the ramp up to the Full, Survivor space would be > >> ergonomically resized to 0 -> 0. The only way to reset the situation > >> was to run a full collection. In our minds it doesn't make any > >> sense to reset survivor space to 0. So far this is an observation > >> from a single GC log but I recall seeing the pattern in many other > >> logs. Before I go through the exercise of building a super grep to > >> run over my G1 log repo I'd like to ask; under what conditions would > >> it make sense to have the survivor space resized to 0? And if not, > >> would this be a bug in G1? We tried reproducing the behavior in some > >> test applications but I fear we often only see this happening in > >> production applications that have been running for several days. It's > >> a behavior that I've seen in 1.7.0 and 1.8.0. No word on 9. > > > > sounds similar to https://bugs.openjdk.java.net/browse/JDK-8037500. > > Could you please post the type of collections for a few more gcs before > > the zero-sized ones? It would be particularly interesting if there is a > > mixed gc with to-space exhaustion just before this sequence. And if > > there are log messages with attempts to start marking too. > > > > That bug has been closed as "won't fix" because we do not > > have a reproducer (any more) to test any changes, in addition to the > > stated reasons that the performance impact seemed minor at that time. > > > > There have been some changes in how the next gc is calculated in 9 too, > > so I do not know either if 9 is also affected (particularly one of > > these young-only gc's would not be issued any more). > > > > I can think of at least one more reason other than those stated in the CR > > why this occurs, at least for 8u60+ builds. 
There is the possibility, > > particularly in conjunction with humongous object allocation, that after > > starting the mutator, immediately afterwards a young gc that reclaims > > zero space is issued, e.g.: > > > > young-gc, has X regions left at the end, starts mutators > > mutator 1 allocates exactly X regions as humongous objects > > mutator 2 allocates, finds that there are no regions left, issues > > young-gc request; in this young-gc eden and survivor are obviously > > of zero size > > [...and so on...] > > > > Note that this pattern could repeat multiple times, as young gc may > > reclaim space from humongous objects (eager reclaim!), until at some > > point it runs into a full gc. > > > > The logging that shows humongous object allocation (something about > > reaching threshold and starting marking) could confirm this situation. > > > > No guarantees about that being the actual issue though. > > > > Thanks, > > Thomas > > > > -- Sent from my phone From monica.beckwith at gmail.com Sun Jul 23 19:09:14 2017 From: monica.beckwith at gmail.com (monica beckwith) Date: Sun, 23 Jul 2017 21:09:14 +0200 Subject: Bug in G1 In-Reply-To: References: <1500024904.3458.8.camel@oracle.com> <6c0d1cca-6c08-0bda-f980-d3fe20e663ff@redhat.com> <1500029912.3458.26.camel@oracle.com> <990f6578-14d4-322d-7f51-9b93d92f8b20@redhat.com> <1500034180.3458.67.camel@oracle.com> <1e6c2b24-63fe-cf5e-1635-990852c63a65@redhat.com> <7991D723-8B1D-43A3-A9D4-E7D38B1D10E4@kodewerk.com> <1500647667.2385.33.camel@oracle.com> Message-ID: Hello Kirk and Thomas, I think the problem is that the heap is not sized to accommodate the humongous objects. I think this log is post 8 update 40, and that's why you see those young collections at the lowest young occupancy, since it's trying to reclaim humongous regions. Kirk, can you please show a log prior to 8u40? 
Thanks, Monica On Jul 23, 2017 5:52 AM, "Kirk Pepperdine" wrote: > Thanks for the information. I've shared the entire log with you on > dropbox. Feel free to distribute it as you see fit. > > I see the to-space exhausted but there doesn't appear to be a mixed > collection involved. Below is a single sequence up to and including the > Full. > > Kind regards, > Kirk > > > 2017-05-23T20:42:55.303-0400: 40580.396: Application time: 0.8539675 > seconds > 2017-05-23T20:42:55.304-0400: 40580.398: [GC pause (G1 Evacuation Pause) > (young) > Desired survivor size 169869312 bytes, new threshold 15 (max 15) > - age 1: 3278808 bytes, 3278808 total > - age 2: 71278552 bytes, 74557360 total > - age 3: 533720 bytes, 75091080 total > - age 4: 12897544 bytes, 87988624 total > - age 5: 796672 bytes, 88785296 total > - age 6: 503288 bytes, 89288584 total > 2017-05-23T20:42:55.363-0400: 40580.457: [SoftReference, 0 refs, 0.0010011 > secs]2017-05-23T20:42:55.364-0400: 40580.458: [WeakReference, 367 refs, > 0.0006136 secs]2017-05-23T20:42:55.365-0400: 40580.458: [FinalReference, > 7659 refs, 0.0014460 secs]2017-05-23T20:42:55.366-0400: 40580.460: > [PhantomReference, 0 refs, 0 refs, 0.0011060 secs]2017-05-23T20:42:55.367-0400: > 40580.461: [JNI Weak Reference, 0.0000647 secs], 0.0669684 secs] > [Parallel Time: 57.9 ms, GC Workers: 8] > [GC Worker Start (ms): Min: 40580398.1, Avg: 40580398.2, Max: > 40580398.3, Diff: 0.2] > [Ext Root Scanning (ms): Min: 1.6, Avg: 1.9, Max: 2.7, Diff: 1.0, > Sum: 15.2] > [Update RS (ms): Min: 15.2, Avg: 15.7, Max: 15.8, Diff: 0.6, Sum: > 125.4] > [Processed Buffers: Min: 44, Avg: 50.1, Max: 62, Diff: 18, Sum: > 401] > [Scan RS (ms): Min: 1.9, Avg: 2.0, Max: 2.1, Diff: 0.2, Sum: 15.9] > [Code Root Scanning (ms): Min: 1.6, Avg: 1.7, Max: 1.7, Diff: 0.1, > Sum: 13.5] > [Object Copy (ms): Min: 36.0, Avg: 36.2, Max: 36.2, Diff: 0.2, Sum: > 289.2] > [Termination (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.2] > [Termination Attempts: Min: 1, Avg: 1.0, 
Max: 1, Diff: 0, Sum: 8] > [GC Worker Other (ms): Min: 0.1, Avg: 0.1, Max: 0.2, Diff: 0.1, Sum: > 1.0] > [GC Worker Total (ms): Min: 57.4, Avg: 57.5, Max: 57.6, Diff: 0.2, > Sum: 460.3] > [GC Worker End (ms): Min: 40580455.7, Avg: 40580455.7, Max: > 40580455.8, Diff: 0.1] > [Code Root Fixup: 0.2 ms] > [Code Root Purge: 0.0 ms] > [Clear CT: 0.8 ms] > [Other: 8.1 ms] > [Choose CSet: 0.0 ms] > [Ref Proc: 4.7 ms] > [Ref Enq: 0.3 ms] > [Redirty Cards: 0.3 ms] > [Humongous Register: 0.2 ms] > [Humongous Reclaim: 0.1 ms] > [Free CSet: 1.9 ms] > [Eden: 2484.0M(2484.0M)->0.0B(3544.0M) Survivors: 98.0M->100.0M Heap: > 5189.0M(7168.0M)->2708.0M(7168.0M)] > [Times: user=0.45 sys=0.03, real=0.07 secs] > 2017-05-23T20:42:55.372-0400: 40580.465: Total time for which application > threads were stopped: 0.0685303 seconds, Stopping threads took: 0.0001346 > seconds > 2017-05-23T20:42:59.372-0400: 40584.465: Application time: 4.0004774 > seconds > 2017-05-23T20:42:59.376-0400: 40584.469: Total time for which application > threads were stopped: 0.0036324 seconds, Stopping threads took: 0.0023017 > seconds > 2017-05-23T20:43:05.891-0400: 40590.984: Application time: 6.5149722 > seconds > 2017-05-23T20:43:05.892-0400: 40590.985: [GC pause (G1 Evacuation Pause) > (young) > Desired survivor size 239075328 bytes, new threshold 15 (max 15) > - age 1: 4934368 bytes, 4934368 total > - age 2: 2633808 bytes, 7568176 total > - age 3: 71264464 bytes, 78832640 total > - age 4: 527368 bytes, 79360008 total > - age 5: 12893400 bytes, 92253408 total > - age 6: 750128 bytes, 93003536 total > - age 7: 432784 bytes, 93436320 total > 2017-05-23T20:43:09.493-0400: 40594.586: [SoftReference, 0 refs, 0.0067938 > secs]2017-05-23T20:43:09.500-0400: 40594.593: [WeakReference, 0 refs, > 0.0033881 secs]2017-05-23T20:43:09.503-0400: 40594.597: [FinalReference, > 0 refs, 0.0005787 secs]2017-05-23T20:43:09.504-0400: 40594.597: > [PhantomReference, 0 refs, 0 refs, 0.0011377 secs]2017-05-23T20:43:09.505-0400: > 
40594.598: [JNI Weak Reference, 0.0000618 secs] (to-space exhausted), > 4.8672247 secs] > [Parallel Time: 3599.9 ms, GC Workers: 8] > [GC Worker Start (ms): Min: 40590985.9, Avg: 40590986.0, Max: > 40590986.1, Diff: 0.2] > [Ext Root Scanning (ms): Min: 1.7, Avg: 1.9, Max: 2.3, Diff: 0.6, > Sum: 15.2] > [Update RS (ms): Min: 68.3, Avg: 68.4, Max: 68.5, Diff: 0.2, Sum: > 547.6] > [Processed Buffers: Min: 32, Avg: 49.0, Max: 72, Diff: 40, Sum: > 392] > [Scan RS (ms): Min: 2.8, Avg: 2.9, Max: 3.0, Diff: 0.1, Sum: 23.2] > [Code Root Scanning (ms): Min: 2.4, Avg: 2.5, Max: 2.5, Diff: 0.1, > Sum: 19.7] > [Object Copy (ms): Min: 3523.7, Avg: 3523.8, Max: 3523.9, Diff: 0.2, > Sum: 28190.6] > [Termination (ms): Min: 0.0, Avg: 0.1, Max: 0.2, Diff: 0.2, Sum: 0.7] > [Termination Attempts: Min: 1, Avg: 1.0, Max: 1, Diff: 0, Sum: 8] > [GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.0, Sum: > 0.5] > [GC Worker Total (ms): Min: 3599.6, Avg: 3599.7, Max: 3599.8, Diff: > 0.2, Sum: 28797.6] > [GC Worker End (ms): Min: 40594585.6, Avg: 40594585.7, Max: > 40594585.7, Diff: 0.1] > [Code Root Fixup: 0.3 ms] > [Code Root Purge: 0.0 ms] > [Clear CT: 1.2 ms] > [Other: 1265.8 ms] > [Evacuation Failure: 1248.2 ms] > [Choose CSet: 0.0 ms] > [Ref Proc: 12.4 ms] > [Ref Enq: 0.5 ms] > [Redirty Cards: 2.1 ms] > [Humongous Register: 0.2 ms] > [Humongous Reclaim: 0.1 ms] > [Free CSet: 1.5 ms] > [Eden: 3544.0M(3544.0M)->0.0B(976.0M) Survivors: 100.0M->456.0M Heap: > 6274.3M(7168.0M)->5978.2M(7168.0M)] > [Times: user=13.58 sys=0.11, real=4.86 secs] > 2017-05-23T20:43:10.760-0400: 40595.853: Total time for which application > threads were stopped: 4.8690628 seconds, Stopping threads took: 0.0002136 > seconds > 2017-05-23T20:43:11.762-0400: 40596.855: Application time: 1.0019247 > seconds > 2017-05-23T20:43:11.763-0400: 40596.856: Total time for which application > threads were stopped: 0.0015356 seconds, Stopping threads took: 0.0003107 > seconds > 2017-05-23T20:43:11.880-0400: 40596.973: 
Application time: 0.1164884 > seconds > 2017-05-23T20:43:11.881-0400: 40596.974: [GC pause (G1 Humongous > Allocation) (young) (initial-mark) > Desired survivor size 94371840 bytes, new threshold 1 (max 15) > - age 1: 477501112 bytes, 477501112 total > - age 2: 182296 bytes, 477683408 total > - age 3: 78880 bytes, 477762288 total > - age 4: 45376 bytes, 477807664 total > - age 5: 92304 bytes, 477899968 total > - age 6: 75448 bytes, 477975416 total > - age 7: 86752 bytes, 478062168 total > - age 8: 71408 bytes, 478133576 total > 2017-05-23T20:43:17.335-0400: 40602.428: [SoftReference, 0 refs, 0.0071133 > secs]2017-05-23T20:43:17.342-0400: 40602.435: [WeakReference, 3 refs, > 0.0007987 secs]2017-05-23T20:43:17.343-0400: 40602.436: [FinalReference, > 182 refs, 0.0017603 secs]2017-05-23T20:43:17.345-0400: 40602.438: > [PhantomReference, 0 refs, 0 refs, 0.0015961 secs]2017-05-23T20:43:17.346-0400: > 40602.440: [JNI Weak Reference, 0.0000730 secs] (to-space exhausted), > 6.1987667 secs] > [Parallel Time: 5446.3 ms, GC Workers: 8] > [GC Worker Start (ms): Min: 40596975.6, Avg: 40596975.7, Max: > 40596975.8, Diff: 0.2] > [Ext Root Scanning (ms): Min: 2.9, Avg: 3.1, Max: 3.2, Diff: 0.3, > Sum: 24.4] > [Update RS (ms): Min: 10.1, Avg: 10.3, Max: 10.5, Diff: 0.4, Sum: > 82.6] > [Processed Buffers: Min: 33, Avg: 40.2, Max: 51, Diff: 18, Sum: > 322] > [Scan RS (ms): Min: 30.7, Avg: 31.1, Max: 32.4, Diff: 1.8, Sum: > 249.0] > [Code Root Scanning (ms): Min: 0.1, Avg: 0.3, Max: 0.6, Diff: 0.5, > Sum: 2.8] > [Object Copy (ms): Min: 5399.2, Avg: 5400.6, Max: 5400.9, Diff: 1.7, > Sum: 43204.5] > [Termination (ms): Min: 0.0, Avg: 0.2, Max: 0.3, Diff: 0.3, Sum: 1.3] > [Termination Attempts: Min: 1, Avg: 1.0, Max: 1, Diff: 0, Sum: 8] > [GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.1, Sum: > 0.5] > [GC Worker Total (ms): Min: 5445.5, Avg: 5445.6, Max: 5445.7, Diff: > 0.2, Sum: 43565.0] > [GC Worker End (ms): Min: 40602421.3, Avg: 40602421.4, Max: > 40602421.4, Diff: 0.1] > 
[Code Root Fixup: 0.3 ms] > [Code Root Purge: 0.0 ms] > [Clear CT: 0.8 ms] > [Other: 751.4 ms] > [Evacuation Failure: 728.5 ms] > [Choose CSet: 0.0 ms] > [Ref Proc: 17.8 ms] > [Ref Enq: 0.5 ms] > [Redirty Cards: 2.1 ms] > [Humongous Register: 0.1 ms] > [Humongous Reclaim: 0.2 ms] > [Free CSet: 0.8 ms] > [Eden: 878.0M(976.0M)->0.0B(1424.0M) Survivors: 456.0M->8192.0K Heap: > 6856.2M(7168.0M)->6908.2M(7168.0M)] > [Times: user=11.66 sys=1.15, real=6.19 secs] > 2017-05-23T20:43:18.080-0400: 40603.173: [GC concurrent-root-region-scan- > start] > 2017-05-23T20:43:18.080-0400: 40603.173: Total time for which application > threads were stopped: 6.2005443 seconds, Stopping threads took: 0.0002322 > seconds > 2017-05-23T20:43:18.080-0400: 40603.174: Application time: 0.0002882 > seconds > 2017-05-23T20:43:18.114-0400: 40603.207: [GC concurrent-root-region-scan-end, > 0.0339339 secs] > 2017-05-23T20:43:18.114-0400: 40603.207: [GC concurrent-mark-start] > 2017-05-23T20:43:18.142-0400: 40603.235: Total time for which application > threads were stopped: 0.0613820 seconds, Stopping threads took: 0.0001677 > seconds > 2017-05-23T20:43:18.142-0400: 40603.236: Application time: 0.0005017 > seconds > 2017-05-23T20:43:18.144-0400: 40603.237: Total time for which application > threads were stopped: 0.0013197 seconds, Stopping threads took: 0.0001188 > seconds > 2017-05-23T20:43:18.144-0400: 40603.237: Application time: 0.0001781 > seconds > 2017-05-23T20:43:18.144-0400: 40603.238: Total time for which application > threads were stopped: 0.0005735 seconds, Stopping threads took: 0.0000568 > seconds > 2017-05-23T20:43:18.728-0400: 40603.821: Application time: 0.5835349 > seconds > 2017-05-23T20:43:18.730-0400: 40603.823: [GC pause (G1 Evacuation Pause) > (young) > Desired survivor size 94371840 bytes, new threshold 15 (max 15) > - age 1: 8388248 bytes, 8388248 total > 2017-05-23T20:43:19.821-0400: 40604.914: [SoftReference, 0 refs, 0.0009673 > secs]2017-05-23T20:43:19.822-0400: 
40604.915: [WeakReference, 0 refs, > 0.0006733 secs]2017-05-23T20:43:19.823-0400: 40604.916: [FinalReference, > 0 refs, 0.0006260 secs]2017-05-23T20:43:19.823-0400: 40604.917: > [PhantomReference, 0 refs, 0 refs, 0.0013002 secs]2017-05-23T20:43:19.825-0400: > 40604.918: [JNI Weak Reference, 0.0000642 secs] (to-space exhausted), > 1.2567408 secs] > [Parallel Time: 1084.5 ms, GC Workers: 8] > [GC Worker Start (ms): Min: 40603823.4, Avg: 40603823.5, Max: > 40603823.6, Diff: 0.2] > [Ext Root Scanning (ms): Min: 1.7, Avg: 1.9, Max: 2.4, Diff: 0.7, > Sum: 15.3] > [Update RS (ms): Min: 23.8, Avg: 24.0, Max: 24.2, Diff: 0.3, Sum: > 191.7] > [Processed Buffers: Min: 49, Avg: 53.5, Max: 60, Diff: 11, Sum: > 428] > [Scan RS (ms): Min: 1.0, Avg: 1.1, Max: 1.2, Diff: 0.2, Sum: 8.6] > [Code Root Scanning (ms): Min: 0.1, Avg: 0.1, Max: 0.1, Diff: 0.1, > Sum: 0.8] > [Object Copy (ms): Min: 1056.4, Avg: 1056.8, Max: 1057.2, Diff: 0.8, > Sum: 8454.7] > [Termination (ms): Min: 0.0, Avg: 0.1, Max: 0.2, Diff: 0.2, Sum: 1.0] > [Termination Attempts: Min: 1, Avg: 3.8, Max: 7, Diff: 6, Sum: 30] > [GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.0, Sum: > 0.5] > [GC Worker Total (ms): Min: 1084.0, Avg: 1084.1, Max: 1084.2, Diff: > 0.2, Sum: 8673.2] > [GC Worker End (ms): Min: 40604907.6, Avg: 40604907.7, Max: > 40604907.7, Diff: 0.0] > [Code Root Fixup: 0.3 ms] > [Code Root Purge: 0.0 ms] > [Clear CT: 0.3 ms] > [Other: 171.7 ms] > [Evacuation Failure: 159.4 ms] > [Choose CSet: 0.0 ms] > [Ref Proc: 9.9 ms] > [Ref Enq: 0.6 ms] > [Redirty Cards: 0.6 ms] > [Humongous Register: 0.2 ms] > [Humongous Reclaim: 0.3 ms] > [Free CSet: 0.2 ms] > [Eden: 230.0M(1424.0M)->0.0B(1432.0M) Survivors: 8192.0K->0.0B Heap: > 7139.5M(7168.0M)->7139.5M(7168.0M)] > [Times: user=2.33 sys=0.34, real=1.26 secs] > 2017-05-23T20:43:19.987-0400: 40605.080: Total time for which application > threads were stopped: 1.2587489 seconds, Stopping threads took: 0.0002182 > seconds > 2017-05-23T20:43:19.987-0400: 
40605.080: Application time: 0.0003101 > seconds > 2017-05-23T20:43:19.988-0400: 40605.082: [GC pause (G1 Evacuation Pause) > (young) > Desired survivor size 94371840 bytes, new threshold 15 (max 15) > 2017-05-23T20:43:20.020-0400: 40605.113: [SoftReference, 0 refs, 0.0008856 > secs]2017-05-23T20:43:20.020-0400: 40605.114: [WeakReference, 0 refs, > 0.0005588 secs]2017-05-23T20:43:20.021-0400: 40605.114: [FinalReference, > 0 refs, 0.0006006 secs]2017-05-23T20:43:20.022-0400: 40605.115: > [PhantomReference, 0 refs, 0 refs, 0.0010837 secs]2017-05-23T20:43:20.023-0400: > 40605.116: [JNI Weak Reference, 0.0000610 secs], 0.0356212 secs] > [Parallel Time: 30.0 ms, GC Workers: 8] > [GC Worker Start (ms): Min: 40605081.9, Avg: 40605082.0, Max: > 40605082.1, Diff: 0.2] > [Ext Root Scanning (ms): Min: 1.9, Avg: 2.0, Max: 2.5, Diff: 0.6, > Sum: 16.1] > [Update RS (ms): Min: 27.3, Avg: 27.4, Max: 27.5, Diff: 0.2, Sum: > 219.3] > [Processed Buffers: Min: 82, Avg: 87.4, Max: 92, Diff: 10, Sum: > 699] > [Scan RS (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0] > [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, > Sum: 0.0] > [Object Copy (ms): Min: 0.0, Avg: 0.2, Max: 0.3, Diff: 0.2, Sum: 1.4] > [Termination (ms): Min: 0.0, Avg: 0.1, Max: 0.2, Diff: 0.2, Sum: 1.1] > [Termination Attempts: Min: 1, Avg: 1.0, Max: 1, Diff: 0, Sum: 8] > [GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.0, Sum: > 0.4] > [GC Worker Total (ms): Min: 29.7, Avg: 29.8, Max: 29.9, Diff: 0.2, > Sum: 238.5] > [GC Worker End (ms): Min: 40605111.8, Avg: 40605111.8, Max: > 40605111.8, Diff: 0.0] > [Code Root Fixup: 0.3 ms] > [Code Root Purge: 0.0 ms] > [Clear CT: 0.2 ms] > [Other: 5.1 ms] > [Choose CSet: 0.0 ms] > [Ref Proc: 4.0 ms] > [Ref Enq: 0.2 ms] > [Redirty Cards: 0.2 ms] > [Humongous Register: 0.1 ms] > [Humongous Reclaim: 0.2 ms] > [Free CSet: 0.1 ms] > [Eden: 0.0B(1432.0M)->0.0B(1432.0M) Survivors: 0.0B->0.0B Heap: > 7139.5M(7168.0M)->7139.5M(7168.0M)] > [Times: 
user=0.25 sys=0.00, real=0.04 secs] > 2017-05-23T20:43:20.024-0400: 40605.118: Total time for which application > threads were stopped: 0.0372043 seconds, Stopping threads took: 0.0001640 > seconds > 2017-05-23T20:43:20.025-0400: 40605.118: Application time: 0.0002435 > seconds > 2017-05-23T20:43:20.026-0400: 40605.119: [GC pause (G1 Evacuation Pause) > (young) > Desired survivor size 94371840 bytes, new threshold 15 (max 15) > 2017-05-23T20:43:20.030-0400: 40605.123: [SoftReference, 0 refs, 0.0009405 > secs]2017-05-23T20:43:20.031-0400: 40605.124: [WeakReference, 0 refs, > 0.0005771 secs]2017-05-23T20:43:20.032-0400: 40605.125: [FinalReference, > 0 refs, 0.0005766 secs]2017-05-23T20:43:20.032-0400: 40605.125: > [PhantomReference, 0 refs, 0 refs, 0.0011847 secs]2017-05-23T20:43:20.033-0400: > 40605.127: [JNI Weak Reference, 0.0000549 secs], 0.0087717 secs] > [Parallel Time: 3.0 ms, GC Workers: 8] > [GC Worker Start (ms): Min: 40605119.3, Avg: 40605119.4, Max: > 40605119.5, Diff: 0.2] > [Ext Root Scanning (ms): Min: 1.6, Avg: 1.8, Max: 2.6, Diff: 1.0, > Sum: 14.8] > [Update RS (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.2] > [Processed Buffers: Min: 0, Avg: 0.5, Max: 2, Diff: 2, Sum: 4] > [Scan RS (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1] > [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, > Sum: 0.0] > [Object Copy (ms): Min: 0.0, Avg: 0.1, Max: 0.2, Diff: 0.2, Sum: 1.1] > [Termination (ms): Min: 0.0, Avg: 0.6, Max: 0.6, Diff: 0.6, Sum: 4.4] > [Termination Attempts: Min: 1, Avg: 1.0, Max: 1, Diff: 0, Sum: 8] > [GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.1, Sum: > 0.5] > [GC Worker Total (ms): Min: 2.6, Avg: 2.6, Max: 2.7, Diff: 0.1, Sum: > 21.1] > [GC Worker End (ms): Min: 40605122.0, Avg: 40605122.1, Max: > 40605122.1, Diff: 0.1] > [Code Root Fixup: 0.3 ms] > [Code Root Purge: 0.0 ms] > [Clear CT: 0.3 ms] > [Other: 5.2 ms] > [Choose CSet: 0.0 ms] > [Ref Proc: 4.1 ms] > [Ref Enq: 0.3 ms] > [Redirty Cards: 
0.3 ms] > [Humongous Register: 0.1 ms] > [Humongous Reclaim: 0.1 ms] > [Free CSet: 0.1 ms] > [Eden: 0.0B(1432.0M)->0.0B(1432.0M) Survivors: 0.0B->0.0B Heap: > 7139.5M(7168.0M)->7139.5M(7168.0M)] > [Times: user=0.03 sys=0.00, real=0.01 secs] > 2017-05-23T20:43:20.035-0400: 40605.128: Total time for which application > threads were stopped: 0.0102350 seconds, Stopping threads took: 0.0000635 > seconds > 2017-05-23T20:43:20.035-0400: 40605.128: Application time: 0.0002150 > seconds > 2017-05-23T20:43:20.036-0400: 40605.129: [GC pause (G1 Evacuation Pause) > (young) > Desired survivor size 94371840 bytes, new threshold 15 (max 15) > 2017-05-23T20:43:20.040-0400: 40605.133: [SoftReference, 0 refs, 0.0010156 > secs]2017-05-23T20:43:20.041-0400: 40605.134: [WeakReference, 0 refs, > 0.0006580 secs]2017-05-23T20:43:20.042-0400: 40605.135: [FinalReference, > 0 refs, 0.0006435 secs]2017-05-23T20:43:20.042-0400: 40605.136: > [PhantomReference, 0 refs, 0 refs, 0.0012604 secs]2017-05-23T20:43:20.044-0400: > 40605.137: [JNI Weak Reference, 0.0000513 secs], 0.0087896 secs] > [Parallel Time: 2.7 ms, GC Workers: 8] > [GC Worker Start (ms): Min: 40605129.6, Avg: 40605129.7, Max: > 40605129.8, Diff: 0.2] > [Ext Root Scanning (ms): Min: 1.7, Avg: 1.9, Max: 2.5, Diff: 0.8, > Sum: 14.9] > [Update RS (ms): Min: 0.0, Avg: 0.0, Max: 0.1, Diff: 0.1, Sum: 0.2] > [Processed Buffers: Min: 0, Avg: 0.6, Max: 1, Diff: 1, Sum: 5] > [Scan RS (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0] > [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, > Sum: 0.0] > [Object Copy (ms): Min: 0.0, Avg: 0.2, Max: 0.2, Diff: 0.2, Sum: 1.3] > [Termination (ms): Min: 0.0, Avg: 0.3, Max: 0.4, Diff: 0.4, Sum: 2.5] > [Termination Attempts: Min: 1, Avg: 1.0, Max: 1, Diff: 0, Sum: 8] > [GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.1, Sum: > 0.5] > [GC Worker Total (ms): Min: 2.3, Avg: 2.4, Max: 2.5, Diff: 0.2, Sum: > 19.5] > [GC Worker End (ms): Min: 40605132.1, Avg: 40605132.2, Max: 
> 40605132.2, Diff: 0.0] > [Code Root Fixup: 0.3 ms] > [Code Root Purge: 0.0 ms] > [Clear CT: 0.3 ms] > [Other: 5.5 ms] > [Choose CSet: 0.0 ms] > [Ref Proc: 4.4 ms] > [Ref Enq: 0.3 ms] > [Redirty Cards: 0.3 ms] > [Humongous Register: 0.1 ms] > [Humongous Reclaim: 0.1 ms] > [Free CSet: 0.1 ms] > [Eden: 0.0B(1432.0M)->0.0B(1432.0M) Survivors: 0.0B->0.0B Heap: > 7139.5M(7168.0M)->7139.5M(7168.0M)] > [Times: user=0.04 sys=0.00, real=0.01 secs] > 2017-05-23T20:43:20.045-0400: 40605.138: Total time for which application > threads were stopped: 0.0101403 seconds, Stopping threads took: 0.0000614 > seconds > 2017-05-23T20:43:20.045-0400: 40605.139: Application time: 0.0001681 > seconds > 2017-05-23T20:43:20.046-0400: 40605.140: [GC pause (G1 Evacuation Pause) > (young) > Desired survivor size 94371840 bytes, new threshold 15 (max 15) > 2017-05-23T20:43:20.050-0400: 40605.144: [SoftReference, 0 refs, 0.0008321 > secs]2017-05-23T20:43:20.051-0400: 40605.145: [WeakReference, 0 refs, > 0.0006103 secs]2017-05-23T20:43:20.052-0400: 40605.145: [FinalReference, > 0 refs, 0.0007194 secs]2017-05-23T20:43:20.053-0400: 40605.146: > [PhantomReference, 0 refs, 0 refs, 0.0010705 secs]2017-05-23T20:43:20.054-0400: > 40605.147: [JNI Weak Reference, 0.0000508 secs], 0.0084107 secs] > [Parallel Time: 2.7 ms, GC Workers: 8] > [GC Worker Start (ms): Min: 40605139.9, Avg: 40605140.0, Max: > 40605140.1, Diff: 0.2] > [Ext Root Scanning (ms): Min: 1.7, Avg: 1.9, Max: 2.4, Diff: 0.8, > Sum: 15.1] > [Update RS (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1] > [Processed Buffers: Min: 0, Avg: 0.4, Max: 1, Diff: 1, Sum: 3] > [Scan RS (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0] > [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, > Sum: 0.0] > [Object Copy (ms): Min: 0.0, Avg: 0.1, Max: 0.2, Diff: 0.2, Sum: 1.1] > [Termination (ms): Min: 0.0, Avg: 0.3, Max: 0.4, Diff: 0.4, Sum: 2.2] > [Termination Attempts: Min: 1, Avg: 1.0, Max: 1, Diff: 0, Sum: 8] > [GC Worker 
Other (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.1, Sum: > 0.5] > [GC Worker Total (ms): Min: 2.3, Avg: 2.4, Max: 2.5, Diff: 0.2, Sum: > 19.2] > [GC Worker End (ms): Min: 40605142.4, Avg: 40605142.4, Max: > 40605142.5, Diff: 0.1] > [Code Root Fixup: 0.3 ms] > [Code Root Purge: 0.0 ms] > [Clear CT: 0.2 ms] > [Other: 5.1 ms] > [Choose CSet: 0.0 ms] > [Ref Proc: 4.1 ms] > [Ref Enq: 0.3 ms] > [Redirty Cards: 0.2 ms] > [Humongous Register: 0.1 ms] > [Humongous Reclaim: 0.1 ms] > [Free CSet: 0.1 ms] > [Eden: 0.0B(1432.0M)->0.0B(1432.0M) Survivors: 0.0B->0.0B Heap: > 7139.5M(7168.0M)->7139.5M(7168.0M)] > [Times: user=0.03 sys=0.01, real=0.01 secs] > 2017-05-23T20:43:20.055-0400: 40605.148: Total time for which application > threads were stopped: 0.0097185 seconds, Stopping threads took: 0.0001029 > seconds > 2017-05-23T20:43:20.055-0400: 40605.148: Application time: 0.0001505 > seconds > 2017-05-23T20:43:20.056-0400: 40605.149: [Full GC (Allocation Failure) > 2017-05-23T20:43:22.446-0400: 40607.540: [SoftReference, 1667 refs, > 0.0003772 secs]2017-05-23T20:43:22.447-0400: 40607.541: [WeakReference, > 5626 refs, 0.0008068 secs]2017-05-23T20:43:22.448-0400: 40607.541: > [FinalReference, 4015 refs, 0.0015169 secs]2017-05-23T20:43:22.450-0400: > 40607.543: [PhantomReference, 1 refs, 372 refs, 0.0001585 > secs]2017-05-23T20:43:22.450-0400: 40607.543: [JNI Weak Reference, > 0.0000963 secs] 7139M->2327M(7168M), 9.7036499 secs] > [Eden: 0.0B(1432.0M)->0.0B(1432.0M) Survivors: 0.0B->0.0B Heap: > 7139.5M(7168.0M)->2327.6M(7168.0M)], [Metaspace: > 108907K->108428K(1150976K)] > [Times: user=13.22 sys=0.00, real=9.70 secs] > 2017-05-23T20:43:29.760-0400: 40614.853: Total time for which application > threads were stopped: 9.7047785 seconds, Stopping threads took: 0.0000566 > seconds > 2017-05-23T20:43:29.760-0400: 40614.854: [GC concurrent-mark-abort] > 2017-05-23T20:43:29.763-0400: 40614.856: Application time: 0.0029444 > seconds > > On Jul 21, 2017, at 5:34 PM, Thomas Schatzl > 
wrote: > > > > Hi Kirk, > > > > On Fri, 2017-07-21 at 10:34 +0300, Kirk Pepperdine wrote: > >> Hi all, > >> > >> A while back I mentioned to Erik at JFokus that I was seeing a > >> puzzling behavior in the G1 where without any obvious failure, heap > >> occupancy after collections would spike which would trigger a full > >> which would (unexpectedly) completely recover everything down to the > >> expected live set. Yesterday while working with Simone Bordet on the > >> problem we came to the realization that we were seeing a pattern: > >> prior to the ramp up to the Full, Survivor space would be > >> ergonomically resized to 0 -> 0. The only way to reset the situation > >> was to run a full collection. In our minds it doesn't make any > >> sense to reset survivor space to 0. So far this is an observation > >> from a single GC log but I recall seeing the pattern in many other > >> logs. Before I go through the exercise of building a super grep to > >> run over my G1 log repo I'd like to ask; under what conditions would > >> it make sense to have the survivor space resized to 0? And if not, > >> would this be a bug in G1? We tried reproducing the behavior in some > >> test applications but I fear we often only see this happening in > >> production applications that have been running for several days. It's > >> a behavior that I've seen in 1.7.0 and 1.8.0. No word on 9. > > > > sounds similar to https://bugs.openjdk.java.net/browse/JDK-8037500. > > Could you please post the type of collections for a few more gcs before > > the zero-sized ones? It would be particularly interesting if there is a > > mixed gc with to-space exhaustion just before this sequence. And if > > there are log messages with attempts to start marking too. > > > > As for why that bug has been closed as "won't fix": we do not > > have a reproducer (any more) to test any changes, in addition to the > > stated reasons that the performance impact seemed minor at that time. 
> > > > There have been some changes in how the next gc is calculated in 9 too, > > so I do not know either if 9 is also affected (particularly one of > > these young-only gc's would not be issued any more). > > > > I can think of at least one more reason other than stated in the CR > > why this occurs at least for 8u60+ builds. There is the possibility, > > particularly in conjunction with humongous object allocation, that after > > starting the mutator, immediately afterwards a young gc that reclaims > > zero space is issued, e.g.: > > > > young-gc, has X regions left at the end, starts mutators > > mutator 1 allocates exactly X regions as humongous objects > > mutator 2 allocates, finds that there are no regions left, issues > > young-gc request; in this young-gc eden and survivor are obviously > > of zero size > > [...and so on...] > > > > Note that this pattern could repeat multiple times as young gc may > > reclaim space from humongous objects (eager reclaim!) until at some > > point it runs into a full gc. > > > > The logging that shows humongous object allocation (something about > > reaching threshold and starting marking) could confirm this situation. > > > > No guarantees about that being the actual issue though. > > > > Thanks, > > Thomas > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From kirk at kodewerk.com Mon Jul 24 19:43:59 2017 From: kirk at kodewerk.com (Kirk Pepperdine) Date: Mon, 24 Jul 2017 21:43:59 +0200 Subject: Bug in G1 In-Reply-To: References: <1500024904.3458.8.camel@oracle.com> <6c0d1cca-6c08-0bda-f980-d3fe20e663ff@redhat.com> <1500029912.3458.26.camel@oracle.com> <990f6578-14d4-322d-7f51-9b93d92f8b20@redhat.com> <1500034180.3458.67.camel@oracle.com> <1e6c2b24-63fe-cf5e-1635-990852c63a65@redhat.com> <7991D723-8B1D-43A3-A9D4-E7D38B1D10E4@kodewerk.com> <1500647667.2385.33.camel@oracle.com> Message-ID: <2479C8EB-F38C-4804-94E5-EC613BC6457E@kodewerk.com> Hi Monica et al., 
I see this bug in all versions of 7 and 8. I can put up more GC logs once I get to a more stable internet connection. Kind regards, Kirk > On Jul 23, 2017, at 9:09 PM, monica beckwith wrote: > > Hello Kirk and Thomas, > > I think the problem is that the heap is not sized to accommodate the humongous objects. I think this log is post 8 update 40, and that's why you see those young collections at the lowest young occupancy since it's trying to reclaim humongous regions. Kirk, can you please show a log prior to 8u40? > > Thanks, > Monica > > On Jul 23, 2017 5:52 AM, "Kirk Pepperdine" > wrote: > Thanks for the information. I've shared the entire log with you on dropbox. Feel free to distribute it as you see fit. > > I see the to-space exhausted but there doesn't appear to be a mixed collection involved. Below is a single sequence up to and including the Full. > > Kind regards, > Kirk > > > 2017-05-23T20:42:55.303-0400: 40580.396: Application time: 0.8539675 seconds > 2017-05-23T20:42:55.304-0400: 40580.398: [GC pause (G1 Evacuation Pause) (young) > Desired survivor size 169869312 bytes, new threshold 15 (max 15) > - age 1: 3278808 bytes, 3278808 total > - age 2: 71278552 bytes, 74557360 total > - age 3: 533720 bytes, 75091080 total > - age 4: 12897544 bytes, 87988624 total > - age 5: 796672 bytes, 88785296 total > - age 6: 503288 bytes, 89288584 total > 2017-05-23T20:42:55.363-0400: 40580.457: [SoftReference, 0 refs, 0.0010011 secs]2017-05-23T20:42:55.364-0400: 40580.458: [WeakReference, 367 refs, 0.0006136 secs]2017-05-23T20:42:55.365-0400: 40580.458: [FinalReference, 7659 refs, 0.0014460 secs]2017-05-23T20:42:55.366-0400: 40580.460: [PhantomReference, 0 refs, 0 refs, 0.0011060 secs]2017-05-23T20:42:55.367-0400: 40580.461: [JNI Weak Reference, 0.0000647 secs], 0.0669684 secs] > [Parallel Time: 57.9 ms, GC Workers: 8] > [GC Worker Start (ms): Min: 40580398.1, Avg: 40580398.2, Max: 40580398.3, Diff: 0.2] > [Ext Root Scanning (ms): Min: 1.6, Avg: 1.9, Max: 2.7, Diff: 
1.0, Sum: 15.2] > [Update RS (ms): Min: 15.2, Avg: 15.7, Max: 15.8, Diff: 0.6, Sum: 125.4] > [Processed Buffers: Min: 44, Avg: 50.1, Max: 62, Diff: 18, Sum: 401] > [Scan RS (ms): Min: 1.9, Avg: 2.0, Max: 2.1, Diff: 0.2, Sum: 15.9] > [Code Root Scanning (ms): Min: 1.6, Avg: 1.7, Max: 1.7, Diff: 0.1, Sum: 13.5] > [Object Copy (ms): Min: 36.0, Avg: 36.2, Max: 36.2, Diff: 0.2, Sum: 289.2] > [Termination (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.2] > [Termination Attempts: Min: 1, Avg: 1.0, Max: 1, Diff: 0, Sum: 8] > [GC Worker Other (ms): Min: 0.1, Avg: 0.1, Max: 0.2, Diff: 0.1, Sum: 1.0] > [GC Worker Total (ms): Min: 57.4, Avg: 57.5, Max: 57.6, Diff: 0.2, Sum: 460.3] > [GC Worker End (ms): Min: 40580455.7, Avg: 40580455.7, Max: 40580455.8, Diff: 0.1] > [Code Root Fixup: 0.2 ms] > [Code Root Purge: 0.0 ms] > [Clear CT: 0.8 ms] > [Other: 8.1 ms] > [Choose CSet: 0.0 ms] > [Ref Proc: 4.7 ms] > [Ref Enq: 0.3 ms] > [Redirty Cards: 0.3 ms] > [Humongous Register: 0.2 ms] > [Humongous Reclaim: 0.1 ms] > [Free CSet: 1.9 ms] > [Eden: 2484.0M(2484.0M)->0.0B(3544.0M) Survivors: 98.0M->100.0M Heap: 5189.0M(7168.0M)->2708.0M(7168.0M)] > [Times: user=0.45 sys=0.03, real=0.07 secs] > 2017-05-23T20:42:55.372-0400: 40580.465: Total time for which application threads were stopped: 0.0685303 seconds, Stopping threads took: 0.0001346 seconds > 2017-05-23T20:42:59.372-0400: 40584.465: Application time: 4.0004774 seconds > 2017-05-23T20:42:59.376-0400: 40584.469: Total time for which application threads were stopped: 0.0036324 seconds, Stopping threads took: 0.0023017 seconds > 2017-05-23T20:43:05.891-0400: 40590.984: Application time: 6.5149722 seconds > 2017-05-23T20:43:05.892-0400: 40590.985: [GC pause (G1 Evacuation Pause) (young) > Desired survivor size 239075328 bytes, new threshold 15 (max 15) > - age 1: 4934368 bytes, 4934368 total > - age 2: 2633808 bytes, 7568176 total > - age 3: 71264464 bytes, 78832640 total > - age 4: 527368 bytes, 79360008 total > - age 5: 12893400 
bytes, 92253408 total > - age 6: 750128 bytes, 93003536 total > - age 7: 432784 bytes, 93436320 total > 2017-05-23T20:43:09.493-0400: 40594.586: [SoftReference, 0 refs, 0.0067938 secs]2017-05-23T20:43:09.500-0400: 40594.593: [WeakReference, 0 refs, 0.0033881 secs]2017-05-23T20:43:09.503-0400: 40594.597: [FinalReference, 0 refs, 0.0005787 secs]2017-05-23T20:43:09.504-0400: 40594.597: [PhantomReference, 0 refs, 0 refs, 0.0011377 secs]2017-05-23T20:43:09.505-0400: 40594.598: [JNI Weak Reference, 0.0000618 secs] (to-space exhausted), 4.8672247 secs] > [Parallel Time: 3599.9 ms, GC Workers: 8] > [GC Worker Start (ms): Min: 40590985.9, Avg: 40590986.0, Max: 40590986.1, Diff: 0.2] > [Ext Root Scanning (ms): Min: 1.7, Avg: 1.9, Max: 2.3, Diff: 0.6, Sum: 15.2] > [Update RS (ms): Min: 68.3, Avg: 68.4, Max: 68.5, Diff: 0.2, Sum: 547.6] > [Processed Buffers: Min: 32, Avg: 49.0, Max: 72, Diff: 40, Sum: 392] > [Scan RS (ms): Min: 2.8, Avg: 2.9, Max: 3.0, Diff: 0.1, Sum: 23.2] > [Code Root Scanning (ms): Min: 2.4, Avg: 2.5, Max: 2.5, Diff: 0.1, Sum: 19.7] > [Object Copy (ms): Min: 3523.7, Avg: 3523.8, Max: 3523.9, Diff: 0.2, Sum: 28190.6] > [Termination (ms): Min: 0.0, Avg: 0.1, Max: 0.2, Diff: 0.2, Sum: 0.7] > [Termination Attempts: Min: 1, Avg: 1.0, Max: 1, Diff: 0, Sum: 8] > [GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.0, Sum: 0.5] > [GC Worker Total (ms): Min: 3599.6, Avg: 3599.7, Max: 3599.8, Diff: 0.2, Sum: 28797.6] > [GC Worker End (ms): Min: 40594585.6, Avg: 40594585.7, Max: 40594585.7, Diff: 0.1] > [Code Root Fixup: 0.3 ms] > [Code Root Purge: 0.0 ms] > [Clear CT: 1.2 ms] > [Other: 1265.8 ms] > [Evacuation Failure: 1248.2 ms] > [Choose CSet: 0.0 ms] > [Ref Proc: 12.4 ms] > [Ref Enq: 0.5 ms] > [Redirty Cards: 2.1 ms] > [Humongous Register: 0.2 ms] > [Humongous Reclaim: 0.1 ms] > [Free CSet: 1.5 ms] > [Eden: 3544.0M(3544.0M)->0.0B(976.0M) Survivors: 100.0M->456.0M Heap: 6274.3M(7168.0M)->5978.2M(7168.0M)] > [Times: user=13.58 sys=0.11, real=4.86 secs] > 
2017-05-23T20:43:10.760-0400: 40595.853: Total time for which application threads were stopped: 4.8690628 seconds, Stopping threads took: 0.0002136 seconds > 2017-05-23T20:43:11.762-0400: 40596.855: Application time: 1.0019247 seconds > 2017-05-23T20:43:11.763-0400: 40596.856: Total time for which application threads were stopped: 0.0015356 seconds, Stopping threads took: 0.0003107 seconds > 2017-05-23T20:43:11.880-0400: 40596.973: Application time: 0.1164884 seconds > 2017-05-23T20:43:11.881-0400: 40596.974: [GC pause (G1 Humongous Allocation) (young) (initial-mark) > Desired survivor size 94371840 bytes, new threshold 1 (max 15) > - age 1: 477501112 bytes, 477501112 total > - age 2: 182296 bytes, 477683408 total > - age 3: 78880 bytes, 477762288 total > - age 4: 45376 bytes, 477807664 total > - age 5: 92304 bytes, 477899968 total > - age 6: 75448 bytes, 477975416 total > - age 7: 86752 bytes, 478062168 total > - age 8: 71408 bytes, 478133576 total > 2017-05-23T20:43:17.335-0400: 40602.428: [SoftReference, 0 refs, 0.0071133 secs]2017-05-23T20:43:17.342-0400: 40602.435: [WeakReference, 3 refs, 0.0007987 secs]2017-05-23T20:43:17.343-0400: 40602.436: [FinalReference, 182 refs, 0.0017603 secs]2017-05-23T20:43:17.345-0400: 40602.438: [PhantomReference, 0 refs, 0 refs, 0.0015961 secs]2017-05-23T20:43:17.346-0400: 40602.440: [JNI Weak Reference, 0.0000730 secs] (to-space exhausted), 6.1987667 secs] > [Parallel Time: 5446.3 ms, GC Workers: 8] > [GC Worker Start (ms): Min: 40596975.6, Avg: 40596975.7, Max: 40596975.8, Diff: 0.2] > [Ext Root Scanning (ms): Min: 2.9, Avg: 3.1, Max: 3.2, Diff: 0.3, Sum: 24.4] > [Update RS (ms): Min: 10.1, Avg: 10.3, Max: 10.5, Diff: 0.4, Sum: 82.6] > [Processed Buffers: Min: 33, Avg: 40.2, Max: 51, Diff: 18, Sum: 322] > [Scan RS (ms): Min: 30.7, Avg: 31.1, Max: 32.4, Diff: 1.8, Sum: 249.0] > [Code Root Scanning (ms): Min: 0.1, Avg: 0.3, Max: 0.6, Diff: 0.5, Sum: 2.8] > [Object Copy (ms): Min: 5399.2, Avg: 5400.6, Max: 5400.9, Diff: 1.7, Sum: 
43204.5] > [Termination (ms): Min: 0.0, Avg: 0.2, Max: 0.3, Diff: 0.3, Sum: 1.3] > [Termination Attempts: Min: 1, Avg: 1.0, Max: 1, Diff: 0, Sum: 8] > [GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.1, Sum: 0.5] > [GC Worker Total (ms): Min: 5445.5, Avg: 5445.6, Max: 5445.7, Diff: 0.2, Sum: 43565.0] > [GC Worker End (ms): Min: 40602421.3, Avg: 40602421.4, Max: 40602421.4, Diff: 0.1] > [Code Root Fixup: 0.3 ms] > [Code Root Purge: 0.0 ms] > [Clear CT: 0.8 ms] > [Other: 751.4 ms] > [Evacuation Failure: 728.5 ms] > [Choose CSet: 0.0 ms] > [Ref Proc: 17.8 ms] > [Ref Enq: 0.5 ms] > [Redirty Cards: 2.1 ms] > [Humongous Register: 0.1 ms] > [Humongous Reclaim: 0.2 ms] > [Free CSet: 0.8 ms] > [Eden: 878.0M(976.0M)->0.0B(1424.0M) Survivors: 456.0M->8192.0K Heap: 6856.2M(7168.0M)->6908.2M(7168.0M)] > [Times: user=11.66 sys=1.15, real=6.19 secs] > 2017-05-23T20:43:18.080-0400: 40603.173: [GC concurrent-root-region-scan-start] > 2017-05-23T20:43:18.080-0400: 40603.173: Total time for which application threads were stopped: 6.2005443 seconds, Stopping threads took: 0.0002322 seconds > 2017-05-23T20:43:18.080-0400: 40603.174: Application time: 0.0002882 seconds > 2017-05-23T20:43:18.114-0400: 40603.207: [GC concurrent-root-region-scan-end, 0.0339339 secs] > 2017-05-23T20:43:18.114-0400: 40603.207: [GC concurrent-mark-start] > 2017-05-23T20:43:18.142-0400: 40603.235: Total time for which application threads were stopped: 0.0613820 seconds, Stopping threads took: 0.0001677 seconds > 2017-05-23T20:43:18.142-0400: 40603.236: Application time: 0.0005017 seconds > 2017-05-23T20:43:18.144-0400: 40603.237: Total time for which application threads were stopped: 0.0013197 seconds, Stopping threads took: 0.0001188 seconds > 2017-05-23T20:43:18.144-0400: 40603.237: Application time: 0.0001781 seconds > 2017-05-23T20:43:18.144-0400: 40603.238: Total time for which application threads were stopped: 0.0005735 seconds, Stopping threads took: 0.0000568 seconds > 
2017-05-23T20:43:18.728-0400: 40603.821: Application time: 0.5835349 seconds
> 2017-05-23T20:43:18.730-0400: 40603.823: [GC pause (G1 Evacuation Pause) (young)
> Desired survivor size 94371840 bytes, new threshold 15 (max 15)
> - age 1: 8388248 bytes, 8388248 total
> 2017-05-23T20:43:19.821-0400: 40604.914: [SoftReference, 0 refs, 0.0009673 secs]2017-05-23T20:43:19.822-0400: 40604.915: [WeakReference, 0 refs, 0.0006733 secs]2017-05-23T20:43:19.823-0400: 40604.916: [FinalReference, 0 refs, 0.0006260 secs]2017-05-23T20:43:19.823-0400: 40604.917: [PhantomReference, 0 refs, 0 refs, 0.0013002 secs]2017-05-23T20:43:19.825-0400: 40604.918: [JNI Weak Reference, 0.0000642 secs] (to-space exhausted), 1.2567408 secs]
> [Parallel Time: 1084.5 ms, GC Workers: 8]
> [GC Worker Start (ms): Min: 40603823.4, Avg: 40603823.5, Max: 40603823.6, Diff: 0.2]
> [Ext Root Scanning (ms): Min: 1.7, Avg: 1.9, Max: 2.4, Diff: 0.7, Sum: 15.3]
> [Update RS (ms): Min: 23.8, Avg: 24.0, Max: 24.2, Diff: 0.3, Sum: 191.7]
> [Processed Buffers: Min: 49, Avg: 53.5, Max: 60, Diff: 11, Sum: 428]
> [Scan RS (ms): Min: 1.0, Avg: 1.1, Max: 1.2, Diff: 0.2, Sum: 8.6]
> [Code Root Scanning (ms): Min: 0.1, Avg: 0.1, Max: 0.1, Diff: 0.1, Sum: 0.8]
> [Object Copy (ms): Min: 1056.4, Avg: 1056.8, Max: 1057.2, Diff: 0.8, Sum: 8454.7]
> [Termination (ms): Min: 0.0, Avg: 0.1, Max: 0.2, Diff: 0.2, Sum: 1.0]
> [Termination Attempts: Min: 1, Avg: 3.8, Max: 7, Diff: 6, Sum: 30]
> [GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.0, Sum: 0.5]
> [GC Worker Total (ms): Min: 1084.0, Avg: 1084.1, Max: 1084.2, Diff: 0.2, Sum: 8673.2]
> [GC Worker End (ms): Min: 40604907.6, Avg: 40604907.7, Max: 40604907.7, Diff: 0.0]
> [Code Root Fixup: 0.3 ms]
> [Code Root Purge: 0.0 ms]
> [Clear CT: 0.3 ms]
> [Other: 171.7 ms]
> [Evacuation Failure: 159.4 ms]
> [Choose CSet: 0.0 ms]
> [Ref Proc: 9.9 ms]
> [Ref Enq: 0.6 ms]
> [Redirty Cards: 0.6 ms]
> [Humongous Register: 0.2 ms]
> [Humongous Reclaim: 0.3 ms]
> [Free CSet: 0.2 ms]
> [Eden: 230.0M(1424.0M)->0.0B(1432.0M) Survivors: 8192.0K->0.0B Heap: 7139.5M(7168.0M)->7139.5M(7168.0M)]
> [Times: user=2.33 sys=0.34, real=1.26 secs]
> 2017-05-23T20:43:19.987-0400: 40605.080: Total time for which application threads were stopped: 1.2587489 seconds, Stopping threads took: 0.0002182 seconds
> 2017-05-23T20:43:19.987-0400: 40605.080: Application time: 0.0003101 seconds
> 2017-05-23T20:43:19.988-0400: 40605.082: [GC pause (G1 Evacuation Pause) (young)
> Desired survivor size 94371840 bytes, new threshold 15 (max 15)
> 2017-05-23T20:43:20.020-0400: 40605.113: [SoftReference, 0 refs, 0.0008856 secs]2017-05-23T20:43:20.020-0400: 40605.114: [WeakReference, 0 refs, 0.0005588 secs]2017-05-23T20:43:20.021-0400: 40605.114: [FinalReference, 0 refs, 0.0006006 secs]2017-05-23T20:43:20.022-0400: 40605.115: [PhantomReference, 0 refs, 0 refs, 0.0010837 secs]2017-05-23T20:43:20.023-0400: 40605.116: [JNI Weak Reference, 0.0000610 secs], 0.0356212 secs]
> [Parallel Time: 30.0 ms, GC Workers: 8]
> [GC Worker Start (ms): Min: 40605081.9, Avg: 40605082.0, Max: 40605082.1, Diff: 0.2]
> [Ext Root Scanning (ms): Min: 1.9, Avg: 2.0, Max: 2.5, Diff: 0.6, Sum: 16.1]
> [Update RS (ms): Min: 27.3, Avg: 27.4, Max: 27.5, Diff: 0.2, Sum: 219.3]
> [Processed Buffers: Min: 82, Avg: 87.4, Max: 92, Diff: 10, Sum: 699]
> [Scan RS (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
> [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
> [Object Copy (ms): Min: 0.0, Avg: 0.2, Max: 0.3, Diff: 0.2, Sum: 1.4]
> [Termination (ms): Min: 0.0, Avg: 0.1, Max: 0.2, Diff: 0.2, Sum: 1.1]
> [Termination Attempts: Min: 1, Avg: 1.0, Max: 1, Diff: 0, Sum: 8]
> [GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.0, Sum: 0.4]
> [GC Worker Total (ms): Min: 29.7, Avg: 29.8, Max: 29.9, Diff: 0.2, Sum: 238.5]
> [GC Worker End (ms): Min: 40605111.8, Avg: 40605111.8, Max: 40605111.8, Diff: 0.0]
> [Code Root Fixup: 0.3 ms]
> [Code Root Purge: 0.0 ms]
> [Clear CT: 0.2 ms]
> [Other: 5.1 ms]
> [Choose CSet: 0.0 ms]
> [Ref Proc: 4.0 ms]
> [Ref Enq: 0.2 ms]
> [Redirty Cards: 0.2 ms]
> [Humongous Register: 0.1 ms]
> [Humongous Reclaim: 0.2 ms]
> [Free CSet: 0.1 ms]
> [Eden: 0.0B(1432.0M)->0.0B(1432.0M) Survivors: 0.0B->0.0B Heap: 7139.5M(7168.0M)->7139.5M(7168.0M)]
> [Times: user=0.25 sys=0.00, real=0.04 secs]
> 2017-05-23T20:43:20.024-0400: 40605.118: Total time for which application threads were stopped: 0.0372043 seconds, Stopping threads took: 0.0001640 seconds
> 2017-05-23T20:43:20.025-0400: 40605.118: Application time: 0.0002435 seconds
> 2017-05-23T20:43:20.026-0400: 40605.119: [GC pause (G1 Evacuation Pause) (young)
> Desired survivor size 94371840 bytes, new threshold 15 (max 15)
> 2017-05-23T20:43:20.030-0400: 40605.123: [SoftReference, 0 refs, 0.0009405 secs]2017-05-23T20:43:20.031-0400: 40605.124: [WeakReference, 0 refs, 0.0005771 secs]2017-05-23T20:43:20.032-0400: 40605.125: [FinalReference, 0 refs, 0.0005766 secs]2017-05-23T20:43:20.032-0400: 40605.125: [PhantomReference, 0 refs, 0 refs, 0.0011847 secs]2017-05-23T20:43:20.033-0400: 40605.127: [JNI Weak Reference, 0.0000549 secs], 0.0087717 secs]
> [Parallel Time: 3.0 ms, GC Workers: 8]
> [GC Worker Start (ms): Min: 40605119.3, Avg: 40605119.4, Max: 40605119.5, Diff: 0.2]
> [Ext Root Scanning (ms): Min: 1.6, Avg: 1.8, Max: 2.6, Diff: 1.0, Sum: 14.8]
> [Update RS (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.2]
> [Processed Buffers: Min: 0, Avg: 0.5, Max: 2, Diff: 2, Sum: 4]
> [Scan RS (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1]
> [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
> [Object Copy (ms): Min: 0.0, Avg: 0.1, Max: 0.2, Diff: 0.2, Sum: 1.1]
> [Termination (ms): Min: 0.0, Avg: 0.6, Max: 0.6, Diff: 0.6, Sum: 4.4]
> [Termination Attempts: Min: 1, Avg: 1.0, Max: 1, Diff: 0, Sum: 8]
> [GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.1, Sum: 0.5]
> [GC Worker Total (ms): Min: 2.6, Avg: 2.6, Max: 2.7, Diff: 0.1, Sum: 21.1]
> [GC Worker End (ms): Min: 40605122.0, Avg: 40605122.1, Max: 40605122.1, Diff: 0.1]
> [Code Root Fixup: 0.3 ms]
> [Code Root Purge: 0.0 ms]
> [Clear CT: 0.3 ms]
> [Other: 5.2 ms]
> [Choose CSet: 0.0 ms]
> [Ref Proc: 4.1 ms]
> [Ref Enq: 0.3 ms]
> [Redirty Cards: 0.3 ms]
> [Humongous Register: 0.1 ms]
> [Humongous Reclaim: 0.1 ms]
> [Free CSet: 0.1 ms]
> [Eden: 0.0B(1432.0M)->0.0B(1432.0M) Survivors: 0.0B->0.0B Heap: 7139.5M(7168.0M)->7139.5M(7168.0M)]
> [Times: user=0.03 sys=0.00, real=0.01 secs]
> 2017-05-23T20:43:20.035-0400: 40605.128: Total time for which application threads were stopped: 0.0102350 seconds, Stopping threads took: 0.0000635 seconds
> 2017-05-23T20:43:20.035-0400: 40605.128: Application time: 0.0002150 seconds
> 2017-05-23T20:43:20.036-0400: 40605.129: [GC pause (G1 Evacuation Pause) (young)
> Desired survivor size 94371840 bytes, new threshold 15 (max 15)
> 2017-05-23T20:43:20.040-0400: 40605.133: [SoftReference, 0 refs, 0.0010156 secs]2017-05-23T20:43:20.041-0400: 40605.134: [WeakReference, 0 refs, 0.0006580 secs]2017-05-23T20:43:20.042-0400: 40605.135: [FinalReference, 0 refs, 0.0006435 secs]2017-05-23T20:43:20.042-0400: 40605.136: [PhantomReference, 0 refs, 0 refs, 0.0012604 secs]2017-05-23T20:43:20.044-0400: 40605.137: [JNI Weak Reference, 0.0000513 secs], 0.0087896 secs]
> [Parallel Time: 2.7 ms, GC Workers: 8]
> [GC Worker Start (ms): Min: 40605129.6, Avg: 40605129.7, Max: 40605129.8, Diff: 0.2]
> [Ext Root Scanning (ms): Min: 1.7, Avg: 1.9, Max: 2.5, Diff: 0.8, Sum: 14.9]
> [Update RS (ms): Min: 0.0, Avg: 0.0, Max: 0.1, Diff: 0.1, Sum: 0.2]
> [Processed Buffers: Min: 0, Avg: 0.6, Max: 1, Diff: 1, Sum: 5]
> [Scan RS (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
> [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
> [Object Copy (ms): Min: 0.0, Avg: 0.2, Max: 0.2, Diff: 0.2, Sum: 1.3]
> [Termination (ms): Min: 0.0, Avg: 0.3, Max: 0.4, Diff: 0.4, Sum: 2.5]
> [Termination Attempts: Min: 1, Avg: 1.0, Max: 1, Diff: 0, Sum: 8]
> [GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.1, Sum: 0.5]
> [GC Worker Total (ms): Min: 2.3, Avg: 2.4, Max: 2.5, Diff: 0.2, Sum: 19.5]
> [GC Worker End (ms): Min: 40605132.1, Avg: 40605132.2, Max: 40605132.2, Diff: 0.0]
> [Code Root Fixup: 0.3 ms]
> [Code Root Purge: 0.0 ms]
> [Clear CT: 0.3 ms]
> [Other: 5.5 ms]
> [Choose CSet: 0.0 ms]
> [Ref Proc: 4.4 ms]
> [Ref Enq: 0.3 ms]
> [Redirty Cards: 0.3 ms]
> [Humongous Register: 0.1 ms]
> [Humongous Reclaim: 0.1 ms]
> [Free CSet: 0.1 ms]
> [Eden: 0.0B(1432.0M)->0.0B(1432.0M) Survivors: 0.0B->0.0B Heap: 7139.5M(7168.0M)->7139.5M(7168.0M)]
> [Times: user=0.04 sys=0.00, real=0.01 secs]
> 2017-05-23T20:43:20.045-0400: 40605.138: Total time for which application threads were stopped: 0.0101403 seconds, Stopping threads took: 0.0000614 seconds
> 2017-05-23T20:43:20.045-0400: 40605.139: Application time: 0.0001681 seconds
> 2017-05-23T20:43:20.046-0400: 40605.140: [GC pause (G1 Evacuation Pause) (young)
> Desired survivor size 94371840 bytes, new threshold 15 (max 15)
> 2017-05-23T20:43:20.050-0400: 40605.144: [SoftReference, 0 refs, 0.0008321 secs]2017-05-23T20:43:20.051-0400: 40605.145: [WeakReference, 0 refs, 0.0006103 secs]2017-05-23T20:43:20.052-0400: 40605.145: [FinalReference, 0 refs, 0.0007194 secs]2017-05-23T20:43:20.053-0400: 40605.146: [PhantomReference, 0 refs, 0 refs, 0.0010705 secs]2017-05-23T20:43:20.054-0400: 40605.147: [JNI Weak Reference, 0.0000508 secs], 0.0084107 secs]
> [Parallel Time: 2.7 ms, GC Workers: 8]
> [GC Worker Start (ms): Min: 40605139.9, Avg: 40605140.0, Max: 40605140.1, Diff: 0.2]
> [Ext Root Scanning (ms): Min: 1.7, Avg: 1.9, Max: 2.4, Diff: 0.8, Sum: 15.1]
> [Update RS (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1]
> [Processed Buffers: Min: 0, Avg: 0.4, Max: 1, Diff: 1, Sum: 3]
> [Scan RS (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
> [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
> [Object Copy (ms): Min: 0.0, Avg: 0.1, Max: 0.2, Diff: 0.2, Sum: 1.1]
> [Termination (ms): Min: 0.0, Avg: 0.3, Max: 0.4, Diff: 0.4, Sum: 2.2]
> [Termination Attempts: Min: 1, Avg: 1.0, Max: 1, Diff: 0, Sum: 8]
> [GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.1, Sum: 0.5]
> [GC Worker Total (ms): Min: 2.3, Avg: 2.4, Max: 2.5, Diff: 0.2, Sum: 19.2]
> [GC Worker End (ms): Min: 40605142.4, Avg: 40605142.4, Max: 40605142.5, Diff: 0.1]
> [Code Root Fixup: 0.3 ms]
> [Code Root Purge: 0.0 ms]
> [Clear CT: 0.2 ms]
> [Other: 5.1 ms]
> [Choose CSet: 0.0 ms]
> [Ref Proc: 4.1 ms]
> [Ref Enq: 0.3 ms]
> [Redirty Cards: 0.2 ms]
> [Humongous Register: 0.1 ms]
> [Humongous Reclaim: 0.1 ms]
> [Free CSet: 0.1 ms]
> [Eden: 0.0B(1432.0M)->0.0B(1432.0M) Survivors: 0.0B->0.0B Heap: 7139.5M(7168.0M)->7139.5M(7168.0M)]
> [Times: user=0.03 sys=0.01, real=0.01 secs]
> 2017-05-23T20:43:20.055-0400: 40605.148: Total time for which application threads were stopped: 0.0097185 seconds, Stopping threads took: 0.0001029 seconds
> 2017-05-23T20:43:20.055-0400: 40605.148: Application time: 0.0001505 seconds
> 2017-05-23T20:43:20.056-0400: 40605.149: [Full GC (Allocation Failure) 2017-05-23T20:43:22.446-0400: 40607.540: [SoftReference, 1667 refs, 0.0003772 secs]2017-05-23T20:43:22.447-0400: 40607.541: [WeakReference, 5626 refs, 0.0008068 secs]2017-05-23T20:43:22.448-0400: 40607.541: [FinalReference, 4015 refs, 0.0015169 secs]2017-05-23T20:43:22.450-0400: 40607.543: [PhantomReference, 1 refs, 372 refs, 0.0001585 secs]2017-05-23T20:43:22.450-0400: 40607.543: [JNI Weak Reference, 0.0000963 secs] 7139M->2327M(7168M), 9.7036499 secs]
> [Eden: 0.0B(1432.0M)->0.0B(1432.0M) Survivors: 0.0B->0.0B Heap: 7139.5M(7168.0M)->2327.6M(7168.0M)], [Metaspace: 108907K->108428K(1150976K)]
> [Times: user=13.22 sys=0.00, real=9.70 secs]
> 2017-05-23T20:43:29.760-0400: 40614.853: Total time for which application threads were stopped: 9.7047785 seconds, Stopping threads took: 0.0000566 seconds
> 2017-05-23T20:43:29.760-0400: 40614.854: [GC concurrent-mark-abort]
> 2017-05-23T20:43:29.763-0400: 40614.856: Application time: 0.0029444 seconds

> On Jul 21, 2017, at 5:34 PM, Thomas Schatzl wrote:
> >
> > Hi Kirk,
> >
> > On Fri, 2017-07-21 at 10:34 +0300, Kirk Pepperdine wrote:
> >> Hi all,
> >>
> >> A while back I mentioned to Erik at JFokus that I was seeing a
> >> puzzling behavior in the G1 where without any obvious failure, heap
> >> occupancy after collections would spike which would trigger a full
> >> which would (unexpectedly) completely recover everything down to the
> >> expected live set. Yesterday while working with Simone Bordet on the
> >> problem we came to the realization that we were seeing a pattern
> >> prior to the ramp up to the Full, Survivor space would be
> >> ergonomically resized to 0 -> 0. The only way to reset the situation
> >> was to run a full collection. In our minds this doesn't make any
> >> sense to reset survivor space to 0. So far this is an observation
> >> from a single GC log but I recall seeing the pattern in many other
> >> logs. Before I go through the exercise of building a super grep to
> >> run over my G1 log repo I'd like to ask; under what conditions would
> >> it make sense to have the survivor space resized to 0? And if not,
> >> would this be a bug in G1? We tried reproducing the behavior in some
> >> test applications but I fear we often only see this happening in
> >> production applications that have been running for several days. It's
> >> a behavior that I've seen in 1.7.0 and 1.8.0. No word on 9.
> >
> > sounds similar to https://bugs.openjdk.java.net/browse/JDK-8037500 .
> > Could you please post the type of collections for a few more gcs before
> > the zero-sized ones? It would be particularly interesting if there is a
> > mixed gc with to-space exhaustion just before this sequence. And if
> > there are log messages with attempts to start marking too.
> >
> > As for why that bug has been closed as "won't fix": because we do not
> > have a reproducer (any more) to test any changes, in addition to the
> > stated reasons that the performance impact seemed minor at that time.
> >
> > There have been some changes in how the next gc is calculated in 9 too,
> > so I do not know either if 9 is also affected (particularly one of
> > these young-only gc's would not be issued any more).
> >
> > I can think of at least one more reason other than stated in the CR
> > why this occurs at least for 8u60+ builds. There is the possibility,
> > particularly in conjunction with humongous object allocation, that after
> > starting the mutator, immediately afterwards a young gc that reclaims
> > zero space is issued, e.g.:
> >
> > young-gc, has X regions left at the end, starts mutators
> > mutator 1 allocates exactly X regions as humongous objects
> > mutator 2 allocates, finds that there are no regions left, issues
> > young-gc request; in this young-gc eden and survivor are obviously
> > of zero size
> > [...and so on...]
> >
> > Note that this pattern could repeat multiple times as young gc may
> > reclaim space from humongous objects (eager reclaim!) until at some
> > point it ran into full gc.
> >
> > The logging that shows humongous object allocation (something about
> > reaching threshold and starting marking) could confirm this situation.
> >
> > No guarantees about that being the actual issue though.
> >
> > Thanks,
> > Thomas
> >

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From kim.barrett at oracle.com  Mon Jul 24 20:41:02 2017
From: kim.barrett at oracle.com (Kim Barrett)
Date: Mon, 24 Jul 2017 16:41:02 -0400
Subject: [PATCH] JDK-8176571: Fine bitmaps should be allocated as belonging to mtGC
In-Reply-To: 
References: <1500536234.2924.0.camel@oracle.com>
Message-ID: 

> On Jul 23, 2017, at 4:31 AM, Milan Mimica wrote:
>
> On Thu, Jul 20, 2017 at 09:37, Thomas Schatzl wrote:
>
> great!
> > Looks good. I can sponsor as soon as Kim or anybody else gives his
> > okay.
>
> Hi
>
> I just noticed my heapBitMap_nmt.diff includes the other one. Find the
> corrected one in attachment.

Thomas passed off the sponsoring to me. I noticed that problem as well,
and had adjusted for it. Unfortunately, I ran into a test failure, which
I haven't had time yet to really investigate. I doubt it's related to
these changes, but won't really know until I get time to dig into it,
which might be a couple of days.

From rkennke at redhat.com  Tue Jul 25 10:15:00 2017
From: rkennke at redhat.com (Roman Kennke)
Date: Tue, 25 Jul 2017 12:15:00 +0200
Subject: RFR: 8184751: Provide thread pool for parallel safepoint cleanup
In-Reply-To: <8ec1092c-b01e-80a9-23dd-8447e30c675e@redhat.com>
References: <8ec1092c-b01e-80a9-23dd-8447e30c675e@redhat.com>
Message-ID: 

I have discussed this with Robbin Ehn offline. There is not much
interest in this change from Oracle engineering to have this upstream.
Unless somebody speaks up, I will close the bug and withdraw the review
by the end of today. I will build this into Shenandoah-only instead in
this case.

Roman

> This is a follow-up to 8180932: Parallelize safepoint cleanup, which
> should land in JDK10 real soon now.
>
> In order to actually be able to parallelize safepoint cleanup, we now
> need the GC to provide some worker threads.
>
> In this change, I propose to create one globally (i.e. for all GCs) in
> CollectedHeap, if ParallelSafepointCleanupThreads>1. The flag defaults
> to 0, which means it's doing cleanup using the VMThread (i.e. exactly
> current behaviour).
>
> We have already discussed this, and came to the conclusion that it does
> not really make sense to share the GC's worker threads here, because
> they may not be idle, but only suspended from concurrent work (i.e. by
> SuspendibleThreadSet::synchronize() or similar).
>
> http://cr.openjdk.java.net/~rkennke/8184751/webrev.00/
>
> What do you think?
>
> Roman
>

From erik.osterlund at oracle.com  Tue Jul 25 11:29:33 2017
From: erik.osterlund at oracle.com (Erik Österlund)
Date: Tue, 25 Jul 2017 13:29:33 +0200
Subject: RFR (S): 8185141: Generalize scavengeable nmethod root handling
Message-ID: <59772B9D.9000100@oracle.com>

Hi,

Bug:
https://bugs.openjdk.java.net/browse/JDK-8185141

Webrev:
http://cr.openjdk.java.net/~eosterlund/8185141/webrev.00/

There seem to be different ways of handling scavengeable nmethod roots
in hotspot.

The primary way of dealing with them is to use the CodeCache scavenge
root nmethod list that maintains a list of all nmethods with
scavengeable nmethods. However, G1 does not use this list as it has its
own mechanism of keeping track of nmethods with scavengeable roots
pointing into the heap. To handle this, the current CodeCache code is
full of special cases for G1. In multiple cases we check if (UseG1GC)
and then return.

We seemingly need a better way of communicating to the GC what
scavengeable nmethod roots there are to be able to get rid of the
if (UseG1GC)... code.

As a solution, I propose to make CollectedHeap::register_nmethod the
primary way of registering to the GC that there might be a new nmethod
to keep track of. It is then up to the specific GC to take appropriate
action. The default appropriate action of CollectedHeap is to add the
nmethod to the shared scavenge root nmethod list if it is not already
on the list and it detected the existence of a scavengeable root oop
in the nmethod. G1 on the other hand, will use its closures to figure
out what remembered set it should be added to.

When using G1, the CodeCache scavenge list will be empty, and so a lot
of G1-centric code for exiting before we walk the list of nmethods on
the list can be removed where the list is processed in a for loop.
Because since the list is empty, it does not matter that G1 runs this
code too - it will just iterate 0 times in the loop since it is empty.
But that's because the list was empty, not because we are using G1 -
it just happens to be that the list is always empty when we use G1.

Testing: JPRT with hotspot testset, RBT hs-tier3.

Thanks,
/Erik

From rkennke at redhat.com  Tue Jul 25 11:36:08 2017
From: rkennke at redhat.com (Roman Kennke)
Date: Tue, 25 Jul 2017 13:36:08 +0200
Subject: RFR (S): 8185141: Generalize scavengeable nmethod root handling
In-Reply-To: <59772B9D.9000100@oracle.com>
References: <59772B9D.9000100@oracle.com>
Message-ID: <88b57631-10cf-40ec-2f71-485df0f4180e@redhat.com>

Hi Erik,

the change looks mostly good to me. This really needed cleanup.

However, I question doing the default impl in CollectedHeap and relying
on G1 to override it. Shenandoah's not using the scavenge roots list
either. It seems odd to have a default impl in the superclass that is
used by only 2 subclasses (GCH and PSH), and 2 other subclasses not
using it. And potential future implementors are required to override it
to not do that stuff. Think Epsilon GC too: it doesn't need it, and must
add code to not do it. It just seems wrong. I'd just add the impl to
both GCH and PSH, and leave the superclass empty.

Roman

On 25.07.2017 13:29, Erik Österlund wrote:
> Hi,
>
> Bug:
> https://bugs.openjdk.java.net/browse/JDK-8185141
>
> Webrev:
> http://cr.openjdk.java.net/~eosterlund/8185141/webrev.00/
>
> There seem to be different ways of handling scavengeable nmethod
> roots in hotspot.
>
> The primary way of dealing with them is to use the CodeCache scavenge
> root nmethod list that maintains a list of all nmethods with
> scavengeable nmethods.
> However, G1 does not use this list as it has its own mechanism of
> keeping track of nmethods with scavengeable roots pointing into the heap.
> To handle this, the current CodeCache code is full of special cases
> for G1. In multiple cases we check if (UseG1GC) and then return.
>
> We seemingly need a better way of communicating to the GC what
> scavengeable nmethod roots there are to be able to get rid of the if
> (UseG1GC)... code.
>
> As a solution, I propose to make CollectedHeap::register_nmethod the
> primary way of registering to the GC that there might be a new nmethod
> to keep track of. It is then up to the specific GC to take appropriate
> action. The default appropriate action of CollectedHeap is to add the
> nmethod to the shared scavenge root nmethod list if it is not already
> on the list and it detected the existence of a scavengeable root oop
> in the nmethod. G1 on the other hand, will use its closures to figure
> out what remembered set it should be added to.
>
> When using G1, the CodeCache scavenge list will be empty, and so a lot
> of G1-centric code for exiting before we walk the list of nmethods on
> the list can be removed where the list is processed in a for loop.
> Because since the list is empty, it does not matter that G1 runs this
> code too - it will just iterate 0 times in the loop since it is empty.
> But that's because the list was empty, not because we are using G1 -
> it just happens to be that the list is always empty when we use G1.
>
> Testing: JPRT with hotspot testset, RBT hs-tier3.
>
> Thanks,
> /Erik

From erik.osterlund at oracle.com  Tue Jul 25 12:34:04 2017
From: erik.osterlund at oracle.com (Erik Österlund)
Date: Tue, 25 Jul 2017 14:34:04 +0200
Subject: RFR (S): 8185141: Generalize scavengeable nmethod root handling
In-Reply-To: <88b57631-10cf-40ec-2f71-485df0f4180e@redhat.com>
References: <59772B9D.9000100@oracle.com> <88b57631-10cf-40ec-2f71-485df0f4180e@redhat.com>
Message-ID: <59773ABC.1020506@oracle.com>

Hi Roman,

I see your point. From my perspective, the default for any GC is to use
the shared CodeCache scavenge root list, and anything else
(G1/Shenandoah) is an exception and can override to do something else
instead.
Having said that, I agree we could easily move that default
implementation to CodeCache from CollectedHeap and call it explicitly
where it is used, so that we do not accidentally mess up when we build
a new GC.

However, then I think we should also move verify_nmethod_roots() into
those GCs, as it is closely related to which list it is on.

New full webrev:
http://cr.openjdk.java.net/~eosterlund/8185141/webrev.01/

New incremental webrev:
http://cr.openjdk.java.net/~eosterlund/8185141/webrev.00_01/

What do you think?

Thanks,
/Erik

On 2017-07-25 13:36, Roman Kennke wrote:
> Hi Erik,
>
> the change looks mostly good to me. This really needed cleanup.
>
> However, I question to do the default impl in CollectedHeap, and rely on
> G1 to override it. Shenandoah's not using the scavenge roots list
> either. It seems odd to have a default impl in the superclass that is
> used by only 2 subclasses (GCH and PSH), and 2 other subclasses not
> using it. And potential future implementors require to override it to
> not do that stuff. Think Epsilon GC too: it doesn't need it, and must
> add code to not do it. It just seems wrong. I'd just add the impl to
> both GCH and PSH, and leave the superclass empty.
>
> Roman
>
> On 25.07.2017 13:29, Erik Österlund wrote:
>> Hi,
>>
>> Bug:
>> https://bugs.openjdk.java.net/browse/JDK-8185141
>>
>> Webrev:
>> http://cr.openjdk.java.net/~eosterlund/8185141/webrev.00/
>>
>> There seem to be different ways of handling scavengeable nmethod
>> roots in hotspot.
>>
>> The primary way of dealing with them is to use the CodeCache scavenge
>> root nmethod list that maintains a list of all nmethods with
>> scavengeable nmethods.
>> However, G1 does not use this list as it has its own mechanism of
>> keeping track of nmethods with scavengeable roots pointing into the heap.
>> To handle this, the current CodeCache code is full of special cases
>> for G1. In multiple cases we check if (UseG1GC) and then return.
>>
>> We seemingly need a better way of communicating to the GC what
>> scavengeable nmethod roots there are to be able to get rid of the if
>> (UseG1GC)... code.
>>
>> As a solution, I propose to make CollectedHeap::register_nmethod the
>> primary way of registering to the GC that there might be a new nmethod
>> to keep track of. It is then up to the specific GC to take appropriate
>> action. The default appropriate action of CollectedHeap is to add the
>> nmethod to the shared scavenge root nmethod list if it is not already
>> on the list and it detected the existence of a scavengeable root oop
>> in the nmethod. G1 on the other hand, will use its closures to figure
>> out what remembered set it should be added to.
>>
>> When using G1, the CodeCache scavenge list will be empty, and so a lot
>> of G1-centric code for exiting before we walk the list of nmethods on
>> the list can be removed where the list is processed in a for loop.
>> Because since the list is empty, it does not matter that G1 runs this
>> code too - it will just iterate 0 times in the loop since it is empty.
>> But that's because the list was empty, not because we are using G1 -
>> it just happens to be that the list is always empty when we use G1.
>>
>> Testing: JPRT with hotspot testset, RBT hs-tier3.
>>
>> Thanks,
>> /Erik
>

From rkennke at redhat.com  Tue Jul 25 13:28:20 2017
From: rkennke at redhat.com (Roman Kennke)
Date: Tue, 25 Jul 2017 15:28:20 +0200
Subject: RFR (S): 8185141: Generalize scavengeable nmethod root handling
In-Reply-To: <59773ABC.1020506@oracle.com>
References: <59772B9D.9000100@oracle.com> <88b57631-10cf-40ec-2f71-485df0f4180e@redhat.com> <59773ABC.1020506@oracle.com>
Message-ID: <8c133671-a0b2-6aac-d474-0aed1a52b931@redhat.com>

Much better! Good to go for me.

Roman

> Hi Roman,
>
> I see your point. From my perspective, the default for any GC is to
> use the shared CodeCache scavenge root list, and anything else
> (G1/Shenandoah) is an exception and can override to do something else
> instead.
>
> Having said that, I agree we could easily move that default
> implementation to CodeCache from CollectedHeap and call it explicitly
> where it is used so that we do not accidentally mess up when we build
> a new GC.
>
> However, then I think we should also move verify_nmethod_roots() into
> those GCs then, as it is closely related to which list it is on.
>
> New full webrev:
> http://cr.openjdk.java.net/~eosterlund/8185141/webrev.01/
>
> New incremental webrev:
> http://cr.openjdk.java.net/~eosterlund/8185141/webrev.00_01/
>
> What do you think?
>
> Thanks,
> /Erik
>
> On 2017-07-25 13:36, Roman Kennke wrote:
>> Hi Erik,
>>
>> the change looks mostly good to me. This really needed cleanup.
>>
>> However, I question to do the default impl in CollectedHeap, and rely on
>> G1 to override it. Shenandoah's not using the scavenge roots list
>> either. It seems odd to have a default impl in the superclass that is
>> used by only 2 subclasses (GCH and PSH), and 2 other subclasses not
>> using it. And potential future implementors require to override it to
>> not do that stuff. Think Epsilon GC too: it doesn't need it, and must
>> add code to not do it. It just seems wrong. I'd just add the impl to
>> both GCH and PSH, and leave the superclass empty.
>>
>> Roman
>>
>> On 25.07.2017 13:29, Erik Österlund wrote:
>>> Hi,
>>>
>>> Bug:
>>> https://bugs.openjdk.java.net/browse/JDK-8185141
>>>
>>> Webrev:
>>> http://cr.openjdk.java.net/~eosterlund/8185141/webrev.00/
>>>
>>> There seem to be different ways of handling scavengeable nmethod
>>> roots in hotspot.
>>>
>>> The primary way of dealing with them is to use the CodeCache scavenge
>>> root nmethod list that maintains a list of all nmethods with
>>> scavengeable nmethods.
>>> However, G1 does not use this list as it has its own mechanism of
>>> keeping track of nmethods with scavengeable roots pointing into the
>>> heap.
>>> To handle this, the current CodeCache code is full of special cases
>>> for G1. In multiple cases we check if (UseG1GC) and then return.
>>>
>>> We seemingly need a better way of communicating to the GC what
>>> scavengeable nmethod roots there are to be able to get rid of the if
>>> (UseG1GC)... code.
>>>
>>> As a solution, I propose to make CollectedHeap::register_nmethod the
>>> primary way of registering to the GC that there might be a new nmethod
>>> to keep track of. It is then up to the specific GC to take appropriate
>>> action. The default appropriate action of CollectedHeap is to add the
>>> nmethod to the shared scavenge root nmethod list if it is not already
>>> on the list and it detected the existence of a scavengeable root oop
>>> in the nmethod. G1 on the other hand, will use its closures to figure
>>> out what remembered set it should be added to.
>>>
>>> Testing: JPRT with hotspot testset, RBT hs-tier3.
>>>
>>> Thanks,
>>> /Erik
>>

From erik.osterlund at oracle.com  Tue Jul 25 13:47:38 2017
From: erik.osterlund at oracle.com (Erik Österlund)
Date: Tue, 25 Jul 2017 15:47:38 +0200
Subject: RFR (S): 8185141: Generalize scavengeable nmethod root handling
In-Reply-To: <8c133671-a0b2-6aac-d474-0aed1a52b931@redhat.com>
References: <59772B9D.9000100@oracle.com> <88b57631-10cf-40ec-2f71-485df0f4180e@redhat.com> <59773ABC.1020506@oracle.com> <8c133671-a0b2-6aac-d474-0aed1a52b931@redhat.com>
Message-ID: <59774BFA.50700@oracle.com>

Hi,

Thanks for the review Roman!

/Erik

On 2017-07-25 15:28, Roman Kennke wrote:
> Much better! Good to go for me.
>
> Roman
>
>> Hi Roman,
>>
>> I see your point. From my perspective, the default for any GC is to
>> use the shared CodeCache scavenge root list, and anything else
>> (G1/Shenandoah) is an exception and can override to do something else
>> instead.
>>
>> Having said that, I agree we could easily move that default
>> implementation to CodeCache from CollectedHeap and call it explicitly
>> where it is used so that we do not accidentally mess up when we build
>> a new GC.
>>
>> However, then I think we should also move verify_nmethod_roots() into
>> those GCs then, as it is closely related to which list it is on.
>>
>> New full webrev:
>> http://cr.openjdk.java.net/~eosterlund/8185141/webrev.01/
>>
>> New incremental webrev:
>> http://cr.openjdk.java.net/~eosterlund/8185141/webrev.00_01/
>>
>> What do you think?
>>
>> Thanks,
>> /Erik
>>
>> On 2017-07-25 13:36, Roman Kennke wrote:
>>> Hi Erik,
>>>
>>> the change looks mostly good to me. This really needed cleanup.
>>>
>>> However, I question to do the default impl in CollectedHeap, and rely on
>>> G1 to override it. Shenandoah's not using the scavenge roots list
>>> either. It seems odd to have a default impl in the superclass that is
>>> used by only 2 subclasses (GCH and PSH), and 2 other subclasses not
>>> using it. And potential future implementors require to override it to
>>> not do that stuff. Think Epsilon GC too: it doesn't need it, and must
>>> add code to not do it. It just seems wrong. I'd just add the impl to
>>> both GCH and PSH, and leave the superclass empty.
>>>
>>> Roman
>>>
>>> On 25.07.2017 13:29, Erik Österlund wrote:
>>>> Hi,
>>>>
>>>> Bug:
>>>> https://bugs.openjdk.java.net/browse/JDK-8185141
>>>>
>>>> Webrev:
>>>> http://cr.openjdk.java.net/~eosterlund/8185141/webrev.00/
>>>>
>>>> There seem to be different ways of handling scavengeable nmethod
>>>> roots in hotspot.
>>>>
>>>> The primary way of dealing with them is to use the CodeCache scavenge
>>>> root nmethod list that maintains a list of all nmethods with
>>>> scavengeable nmethods.
>>>> However, G1 does not use this list as it has its own mechanism of
>>>> keeping track of nmethods with scavengeable roots pointing into the
>>>> heap.
>>>> To handle this, the current CodeCache code is full of special cases
>>>> for G1. In multiple cases we check if (UseG1GC) and then return.
>>>>
>>>> We seemingly need a better way of communicating to the GC what
>>>> scavengeable nmethod roots there are to be able to get rid of the if
>>>> (UseG1GC)... code.
>>>>
>>>> As a solution, I propose to make CollectedHeap::register_nmethod the
>>>> primary way of registering to the GC that there might be a new nmethod
>>>> to keep track of. It is then up to the specific GC to take appropriate
>>>> action. The default appropriate action of CollectedHeap is to add the
>>>> nmethod to the shared scavenge root nmethod list if it is not already
>>>> on the list and it detected the existence of a scavengeable root oop
>>>> in the nmethod. G1 on the other hand, will use its closures to figure
>>>> out what remembered set it should be added to.
When using G1, the CodeCache scavenge list will be empty, and so a lot >>>> of G1-centric code for exiting before we walk the list of nmethods on >>>> the list can be removed where the list is processed in a for loop. >>>> Since the list is empty, it does not matter that G1 runs this >>>> code too - it will just iterate 0 times in the loop since it is empty. >>>> But that's because the list was empty, not because we are using G1 - >>>> it just happens to be that the list is always empty when we use G1. >>>> >>>> Testing: JPRT with hotspot testset, RBT hs-tier3. >>>> >>>> Thanks, >>>> /Erik From alexander.harlap at oracle.com Tue Jul 25 14:24:18 2017 From: alexander.harlap at oracle.com (Alexander Harlap) Date: Tue, 25 Jul 2017 10:24:18 -0400 Subject: Need sponsor to push attached 8184045 into jdk10/hs/hotspot Message-ID: I need a sponsor to push attached 8184045.patch - . Patch should go into jdk10/hs/hotspot Reviewed by Daniel D. Daugherty and Erik Helin. Thank you, Alex -------------- next part -------------- # HG changeset patch # User aharlap # Date 1500992129 14400 # Node ID a780a9bf31f1ded1d008964d5c079892c0a97590 # Parent 0a22e4ef496e290dc1f4d87b87763c551f72cf23 8184045: TestSystemGCWithG1.java times out on Solaris SPARC Summary: Avoid extra round of stressing Reviewed-by: dcubed, ehelin diff -r 0a22e4ef496e -r a780a9bf31f1 test/gc/stress/systemgc/TestSystemGC.java --- a/test/gc/stress/systemgc/TestSystemGC.java Mon Jul 24 22:56:43 2017 +0000 +++ b/test/gc/stress/systemgc/TestSystemGC.java Tue Jul 25 10:15:29 2017 -0400 @@ -182,9 +182,11 @@ } public static void main(String[] args) { - // First allocate the long lived objects and then run all phases twice. + // First allocate the long lived objects and then run all phases.
populateLongLived(); runAllPhases(); - runAllPhases(); + if (args.length > 0 && args[0].equals("long")) { + runAllPhases(); + } } } From alexander.harlap at oracle.com Tue Jul 25 17:37:41 2017 From: alexander.harlap at oracle.com (Alexander Harlap) Date: Tue, 25 Jul 2017 13:37:41 -0400 Subject: Need sponsor to push attached 8183973 into jdk10/hs/hotspot Message-ID: <406c98a0-42a2-729b-1a95-a105c52f5dc5@oracle.com> I need a sponsor to push attached 8183973.patch - gc/TestFullGCALot.java fails in JDK10-hs nightly Patch should go into jdk10/hs/hotspot Reviewed by Mikael Gerdin and Erik Osterlund. Thank you, Alex -------------- next part -------------- # HG changeset patch # User aharlap # Date 1501003694 14400 # Node ID 04c3d66bb13df8553920ec275fb246f96190783a # Parent 0a22e4ef496e290dc1f4d87b87763c551f72cf23 8183973: gc/TestFullGCALot.java fails in JDK10-hs nightly Summary: Provide extra NewSize to avoid failure in running test with UseDeterministicG1GC option. Reviewed-by: mgerdin, eosterlund diff -r 0a22e4ef496e -r 04c3d66bb13d test/gc/TestFullGCALot.java --- a/test/gc/TestFullGCALot.java Mon Jul 24 22:56:43 2017 +0000 +++ b/test/gc/TestFullGCALot.java Tue Jul 25 13:28:14 2017 -0400 @@ -25,9 +25,9 @@ * @test TestFullGCALot * @key gc * @bug 4187687 - * @summary Ensure no acess violation when using FullGCALot + * @summary Ensure no access violation when using FullGCALot * @requires vm.debug - * @run main/othervm -XX:+FullGCALot -XX:FullGCALotInterval=120 TestFullGCALot + * @run main/othervm -XX:NewSize=10m -XX:+FullGCALot -XX:FullGCALotInterval=120 TestFullGCALot */ public class TestFullGCALot { From Derek.White at cavium.com Tue Jul 25 22:08:24 2017 From: Derek.White at cavium.com (White, Derek) Date: Tue, 25 Jul 2017 22:08:24 +0000 Subject: RFR: 8184751: Provide thread pool for parallel safepoint cleanup In-Reply-To: References: <8ec1092c-b01e-80a9-23dd-8447e30c675e@redhat.com> Message-ID: Hi Roman, We might be interested in seeing this in Par GC and/or G1 at some
point, but we can push that when the time comes. Thanks for working this issue though Roman, looking forward to trying it out in Shenandoah. - Derek White, Cavium > -----Original Message----- > From: hotspot-runtime-dev [mailto:hotspot-runtime-dev- > bounces at openjdk.java.net] On Behalf Of Roman Kennke > Sent: Tuesday, July 25, 2017 6:15 AM > To: hotspot-gc-dev openjdk.java.net > Cc: hotspot-runtime-dev at openjdk.java.net > Subject: Re: RFR: 8184751: Provide thread pool for parallel safepoint cleanup > > I have discussed this with Robbin Ehn offline. There is not much interest in > this change from Oracle engineering to have this upstream. > Unless somebody speaks up, I will close the bug and withdraw the review by > the end of today. > > I will build this into Shenandoah-only instead in this case. > > Roman > > > This is a follow-up to 8180932: Parallelize safepoint cleanup, which > > should land in JDK10 real soon now. > > > > In order to actually be able to parallelize safepoint cleanup, we now > > need the GC to provide some worker threads. > > > > In this change, I propose to create one globally (i.e. for all GCs) in > > CollectedHeap, if ParallelSafepointCleanupThreads>1. The flag defaults > > to 0, which means it's doing cleanup using the VMThread (i.e. exactly > > current behaviour). > > > > We have already discussed this, and came to the conclusion that it > > does not really make sense to share the GC's worker threads here, > > because they may not be idle, but only suspended from concurrent work > > (i.e. by > > SuspendibleThreadSet::synchronize() or similar). > > > > http://cr.openjdk.java.net/~rkennke/8184751/webrev.00/ > > > > > > What do you think? 
> > > > Roman > > > > From mark.reinhold at oracle.com Wed Jul 26 21:10:48 2017 From: mark.reinhold at oracle.com (mark.reinhold at oracle.com) Date: Wed, 26 Jul 2017 14:10:48 -0700 (PDT) Subject: JEP 307: Parallel Full GC for G1 Message-ID: <20170726211048.5A5EA983B1@eggemoggin.niobe.net> New JEP Candidate: http://openjdk.java.net/jeps/307 - Mark From mark.reinhold at oracle.com Wed Jul 26 21:11:46 2017 From: mark.reinhold at oracle.com (mark.reinhold at oracle.com) Date: Wed, 26 Jul 2017 14:11:46 -0700 (PDT) Subject: JEP 308: G1 Ergonomics Message-ID: <20170726211146.8E6E4983B7@eggemoggin.niobe.net> New JEP Candidate: http://openjdk.java.net/jeps/308 - Mark From kirk at kodewerk.com Thu Jul 27 07:45:46 2017 From: kirk at kodewerk.com (Kirk Pepperdine) Date: Thu, 27 Jul 2017 09:45:46 +0200 Subject: JEP 308: G1 Ergonomics In-Reply-To: <20170726211146.8E6E4983B7@eggemoggin.niobe.net> References: <20170726211146.8E6E4983B7@eggemoggin.niobe.net> Message-ID: <6E6896B5-7658-4ABB-8B6B-B8C63FC64872@kodewerk.com> Hi, Great to see more work being done on improving G1 heuristics. From the data we've collected this year I can say that when G1 has needed to be tuned, one of the most useful levers has been -XX:G1NewSizePercent. 5% is often too small, which then prematurely pushes data into tenured spaces. The next lever has been increasing reserved size from 5% to something bigger. This seems to help the collector cope with applications that seem to have bursty humongous allocation behavior (aka JSON serialization). Third would be G1MixedGCLiveThresholdPercent as even at 85% that can sometimes be too low a setting. Finally, balancing out mixed collection counts often helps stabilize pause times. Quite frequently the mixed collection count is 1 for just about every collection. Getting that to be mostly 8 is better.
Kind regards, Kirk Pepperdine From milan.mimica at gmail.com Thu Jul 27 08:15:49 2017 From: milan.mimica at gmail.com (Milan Mimica) Date: Thu, 27 Jul 2017 08:15:49 +0000 Subject: JEP 307: Parallel Full GC for G1 In-Reply-To: <20170726211048.5A5EA983B1@eggemoggin.niobe.net> References: <20170726211048.5A5EA983B1@eggemoggin.niobe.net> Message-ID: Hi Can I have just a short explanation why G1 Full GC wasn't implemented as parallel in the first place, given "the assumption that nothing in the fundamental design of G1 prevents a parallel full GC."? On Wed, 26 Jul 2017 at 23:11, wrote: > New JEP Candidate: http://openjdk.java.net/jeps/307 > > - Mark > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rkennke at redhat.com Thu Jul 27 08:30:14 2017 From: rkennke at redhat.com (Roman Kennke) Date: Thu, 27 Jul 2017 10:30:14 +0200 Subject: JEP 307: Parallel Full GC for G1 In-Reply-To: References: <20170726211048.5A5EA983B1@eggemoggin.niobe.net> Message-ID: Hi Milan, I cannot give an authoritative answer to that, but since Shenandoah is very similar in this respect, and from my experience with Shenandoah, I think that full GC is not a very high priority. It is meant as a last-ditch collection, when all else fails to free enough space. In a good world, with perfect GC heuristics and well behaving applications, it should never happen, and thus performance shouldn't matter much. However, this world is not ideal, and full GC performance does matter, especially when you have a large heap, and run into it and lose *seconds* (or even minutes) on it. That being said, we do have a parallel full GC in Shenandoah, and its performance gets close to, and even sometimes exceeds, parallel GC. Maybe it's worth adopting it for G1? It should be relatively straightforward, because both G1 and Shenandoah are region based. It does compact objects towards the bottom of the heap, while mostly retaining their relative order.
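[Editor's note: the region-based sliding compaction described above can be sketched as a toy model. All names and the heap representation below are invented for illustration; this is not HotSpot or Shenandoah code, just the forwarding-address phase of a mark-compact cycle that packs live objects toward the bottom of the heap while retaining their relative order.]

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of sliding full-GC compaction: live objects are assigned
// forwarding addresses packed toward the bottom of the heap, in address
// order, so their relative order is preserved and dead gaps are squeezed out.
public class SlidingCompactionSketch {
    static class Obj {
        final int addr;      // current address
        final int size;
        final boolean live;
        int forwardee = -1;  // new address after compaction
        Obj(int addr, int size, boolean live) {
            this.addr = addr; this.size = size; this.live = live;
        }
    }

    // Compute forwarding addresses bottom-up over an address-ordered heap.
    static void computeForwarding(List<Obj> heap) {
        int compactTop = 0; // bottom of the heap
        for (Obj o : heap) {
            if (o.live) {
                o.forwardee = compactTop;
                compactTop += o.size;
            }
        }
    }

    public static void main(String[] args) {
        List<Obj> heap = new ArrayList<>();
        heap.add(new Obj(0, 2, true));
        heap.add(new Obj(2, 3, false)); // dead object leaves a gap
        heap.add(new Obj(5, 1, true));
        computeForwarding(heap);
        // The second live object slides down over the dead gap.
        System.out.println(heap.get(0).forwardee); // 0
        System.out.println(heap.get(2).forwardee); // 2
    }
}
```

A real collector would follow this with pointer-adjustment and copy phases; the sketch only shows why relative order is retained.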
Roman On 27.07.2017 at 10:15, Milan Mimica wrote: > Hi > > Can I have just a short explanation why G1 Full GC wasn't implemented > as parallel in the first place, given "the assumption that nothing in > the fundamental design of G1 prevents a parallel full GC."? > > > > On Wed, 26 Jul 2017 at 23:11, wrote: > > New JEP Candidate: http://openjdk.java.net/jeps/307 > > - Mark > -------------- next part -------------- An HTML attachment was scrubbed... URL: From hohensee at amazon.com Thu Jul 27 15:45:51 2017 From: hohensee at amazon.com (Hohensee, Paul) Date: Thu, 27 Jul 2017 15:45:51 +0000 Subject: JEP 307: Parallel Full GC for G1 In-Reply-To: References: <20170726211048.5A5EA983B1@eggemoggin.niobe.net> Message-ID: <29745620-D9B1-4CA7-826D-43D09CCE00BE@amazon.com> Imo, all existing collectors can be replaced by variations on G1. The first step was replacing CMS (though admittedly there's still some way to go with that). The second is to replace the parallel collector with a throughput-oriented G1 mode, which requires a parallel STW full GC. Full collections should indeed equal or exceed parallel GC performance, because the old gen is mostly compacted already, so you don't have to do anything with most old gen regions. You just run the equivalent of a mixed collection that includes all not-mostly-full old regions and promote the entire young gen. If you set throughput mode at VM startup, you shouldn't need remembered sets either, just the card table. The third step is concurrent/parallel evacuation and continuous concurrent/parallel collection. Shenandoah is almost there, Azul's C4 is completely there. You can see this progression in Android too, btw. O-dessert (ships next month) includes a concurrent/parallel region-based GC that replaces the previous variation-on-CMS collector.
Paul From: hotspot-gc-dev on behalf of Roman Kennke Date: Thursday, July 27, 2017 at 1:30 AM To: Milan Mimica , "hotspot-gc-dev at openjdk.java.net openjdk.java.net" Subject: Re: JEP 307: Parallel Full GC for G1 Hi Milan, I cannot give an authoritative answer to that, but since Shenandoah is very similar in this respect, and from my experience with Shenandoah, I think that full GC is not a very high priority. It is meant as a last-ditch collection, when all else fails to free enough space. In a good world, with perfect GC heuristics and well behaving applications, it should never happen, and thus performance shouldn't matter much. However, this world is not ideal, and full GC performance does matter, especially when you have a large heap, and run into it and lose *seconds* (or even minutes) on it. That being said, we do have a parallel full GC in Shenandoah, and its performance gets close to, and even sometimes exceeds, parallel GC. Maybe it's worth adopting it for G1? It should be relatively straightforward, because both G1 and Shenandoah are region based. It does compact objects towards the bottom of the heap, while mostly retaining their relative order. Roman On 27.07.2017 at 10:15, Milan Mimica wrote: Hi Can I have just a short explanation why G1 Full GC wasn't implemented as parallel in the first place, given "the assumption that nothing in the fundamental design of G1 prevents a parallel full GC."? On Wed, 26 Jul 2017 at 23:11, wrote: New JEP Candidate: http://openjdk.java.net/jeps/307 - Mark -------------- next part -------------- An HTML attachment was scrubbed... URL: From kirk at kodewerk.com Tue Jul 11 11:28:09 2017 From: kirk at kodewerk.com (Kirk Pepperdine) Date: Tue, 11 Jul 2017 11:28:09 -0000 Subject: Spikes in G1 Message-ID: Hi all, This is the mysterious G1 behavior that I've briefly mentioned to Erik, and that I keep seeing over and over again.
I've just seen it again and this time I managed to get enough context to come up with a hypothesis of why this is happening. For quite some time I've noted that the G1 has a tendency to get into a condition where collections start to fail and occupancy spikes to the point where the condition can only be resolved with a Full GC. The Full GC will typically recover all of the memory consumed by the spike (and then some). This is a bit unexpected: if the data is referenced, which it appears to be since other (mixed) attempts to collect do fail, then the full GC should fail to collect and occupancy should remain high. In this case weak references appear to be involved in the sequence of events that lead up to the Full GC. You can see in this case that the number of weak references processed does spike during the full. I need to go back and review other logs to see if this is the same for past occurrences. I'm curious to understand if there is some unintended interplay between G1GC and WeakReferences that is ultimately responsible for heap occupancy suddenly spiking only to be completely reclaimed by a Full (even though mixed collections are running prior to the full). Kind regards Kirk -------------- next part -------------- A non-text attachment was scrubbed... Name: gc.log.20170709.150502.zip Type: application/zip Size: 4957843 bytes Desc: not available URL: From jiangli.zhou at oracle.com Thu Jul 27 19:00:19 2017 From: jiangli.zhou at oracle.com (Jiangli Zhou) Date: Thu, 27 Jul 2017 12:00:19 -0700 Subject: RFR: 8179302: Pre-resolve constant pool string entries and cache resolved_reference arrays in CDS archive Message-ID: <74D26CA6-E3A8-4ABB-A6E9-D37E3AD2BAD6@oracle.com> Hi, Please help review the changes for JDK-8179302 (Pre-resolve constant pool string entries and cache resolved_reference arrays in CDS archive). Currently, the CDS archive can contain cached class metadata and interned java.lang.String objects. This RFE adds the constant pool 'resolved_references'
arrays (hotspot specific) to the archive for startup/runtime performance enhancement. The 'resolved_references' arrays are used to hold references of resolved constant pool entries including Strings, mirrors, etc. With the 'resolved_references' being cached, string constants in shared classes can now be resolved to existing interned java.lang.Strings at CDS dump time. G1 and 64-bit platforms are required. The GC changes in the RFE were discussed and guided by Thomas Schatzl and the GC team. Part of the changes were contributed by Thomas himself. RFE: JDK-8179302 hotspot: http://cr.openjdk.java.net/~jiangli/8179302/webrev.hotspot.01/ whitebox: http://cr.openjdk.java.net/~jiangli/8179302/webrev.whitebox.01/ Please see below for details of supporting cached 'resolved_references' and pre-resolving string constants. Types of Pinned G1 Heap Regions The pinned region type is a super type of all archive region types, which include the open archive type and the closed archive type. 00100 0 [ 8] Pinned Mask 01000 0 [16] Old Mask 10000 0 [32] Archive Mask 11100 0 [56] Open Archive: ArchiveMask | PinnedMask | OldMask 11100 1 [57] Closed Archive: ArchiveMask | PinnedMask | OldMask + 1 Pinned Regions Objects within the region are 'pinned', which means GC does not move any live objects. GC scans and marks objects in the pinned region as normal, but skips forwarding live objects. Pointers in live objects are updated. Dead objects (unreachable) can be collected and freed. Archive Regions The archive types are sub-types of 'pinned'. There are two types of archive region currently, open archive and closed archive. Both can support caching java heap objects via the CDS archive. An archive region is also an old region by design. Open Archive (GC-RW) Regions Open archive region is GC writable. GC scans & marks objects within the region and adjusts (updates) pointers in live objects the same way as a pinned region. Live objects (reachable) are pinned and not forwarded by GC.
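[Editor's note: the region-type encoding quoted above composes single-bit masks. A tiny self-contained check of that arithmetic, using values straight from the table (the constant names are invented for illustration, not HotSpot identifiers):]

```java
// Sanity check of the G1 heap-region type encoding described above:
// Pinned = 8, Old = 16, Archive = 32; Open Archive ORs all three (56),
// and Closed Archive is Open Archive + 1 (57). Names are illustrative.
public class RegionTypeMasks {
    static final int PINNED_MASK  = 0b001000;  //  8
    static final int OLD_MASK     = 0b010000;  // 16
    static final int ARCHIVE_MASK = 0b100000;  // 32

    static final int OPEN_ARCHIVE   = ARCHIVE_MASK | PINNED_MASK | OLD_MASK; // 56
    static final int CLOSED_ARCHIVE = OPEN_ARCHIVE + 1;                      // 57

    // Because the masks are single bits, region properties are cheap tests.
    static boolean isPinned(int tag)  { return (tag & PINNED_MASK) != 0; }
    static boolean isArchive(int tag) { return (tag & ARCHIVE_MASK) != 0; }

    public static void main(String[] args) {
        System.out.println(OPEN_ARCHIVE);           // 56
        System.out.println(CLOSED_ARCHIVE);         // 57
        System.out.println(isPinned(OPEN_ARCHIVE));    // true
        System.out.println(isArchive(CLOSED_ARCHIVE)); // true
    }
}
```

This mirrors how an archive region is simultaneously pinned, old, and archive: every super-type bit remains set in the subtype value.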
Open archive region does not have 'dead' objects. Unreachable objects are 'dormant' objects. Dormant objects are not collected and freed by GC. Adjustable Outgoing Pointers As GC can adjust pointers within the live objects in open archive heap region, objects can have outgoing pointers to another java heap region, including closed archive region, open archive region, pinned (or humongous) region, and normal generational region. When a referenced object is moved by GC, the pointer within the open archive region is updated accordingly. Closed Archive (GC-RO) Regions The closed archive region is GC read-only region. GC cannot write into the region. Objects are not scanned and marked by GC. Objects are pinned and not forwarded. Pointers are not updated by GC either. Hence, objects within the archive region cannot have any outgoing pointers to another java heap region. Objects however can still have pointers to other objects within the closed archive regions (we might allow pointers to open archive regions in the future). That restricts the type of java objects that can be supported by the archive region. In JDK 9 we support archive Strings with the archive regions. The GC-readonly archive region makes java heap memory sharable among different JVM processes. NOTE: synchronization on the objects within the archive heap region can still cause writes to the memory page. Dormant Objects Dormant objects are unreachable java objects within the open archive heap region. A java object in the open archive heap region is a live object if it can be reached during scanning. Some of the java objects in the region may not be reachable during scanning. Those objects are considered as dormant, but not dead. For example, a constant pool 'resolved_references' array is reachable via the klass root if its container klass (shared) is already loaded at the time during GC scanning. 
If a shared klass is not yet loaded, the klass root is not scanned and its constant pool 'resolved_reference' array (A) in the open archive region is not reachable. Then A is a dormant object. Object State Transition All java objects are initially dormant objects when open archive heap regions are mapped to the runtime java heap. A dormant object becomes a live object when the associated shared class is loaded at runtime. An explicit call to G1SATBCardTableModRefBS::enqueue() needs to be made when a dormant object becomes live. That should be the case for cached objects with strong roots as well, since strong roots are only scanned at the start of GC marking (the initial marking) but not during Remarking/Final marking. If a cached object becomes live during the concurrent marking phase, G1 may not find it and mark it live unless a call to G1SATBCardTableModRefBS::enqueue() is made for the object. Currently, a live object in the open archive heap region cannot become dormant again. This restriction simplifies the GC requirements and guarantees all outgoing pointers are updated by GC correctly. Only objects for shared classes from the builtin class loaders (boot, PlatformClassLoaders, and AppClassLoaders) are supported for caching. Caching Java Objects at Archive Dump Time The closed archive and open archive regions are allocated near the top of the dump time java heap. Archived java objects are copied into the designated archive heap regions. For example, String objects and the underlying 'value' arrays are copied into the closed archive regions. All references to the archived objects (from shared class metadata, string table, etc) are set to the new heap locations. A hash table is used to keep track of all archived java objects during the copying process to make sure a java object is not archived more than once if reached from different roots. It also makes sure references to the same archived object are updated using the same new address location.
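[Editor's note: the "archive each object at most once" bookkeeping described above can be modeled with a map from original object identity to its archived copy. This is a sketch with invented names, not the actual CDS dump-time implementation:]

```java
import java.util.IdentityHashMap;
import java.util.Map;

// Toy model of the dump-time dedup table described above: each heap object
// is assigned an archive slot at most once, and every root that reaches it
// resolves to the same archived location. Names are illustrative only.
public class ArchiveDedupSketch {
    // Identity-based map: two distinct-but-equal objects are archived separately.
    static final Map<Object, Integer> archived = new IdentityHashMap<>();
    static int nextArchiveAddr = 0;

    // Returns the archive "address" of obj, allocating one only on first sight.
    static int archive(Object obj) {
        return archived.computeIfAbsent(obj, o -> nextArchiveAddr++);
    }

    public static void main(String[] args) {
        String s = "interned";
        int a1 = archive(s);
        int a2 = archive(s);           // reached again from a second root
        int b  = archive(new Object());
        System.out.println(a1 == a2);  // true: same object archived once
        System.out.println(a1 != b);   // true: distinct objects, distinct slots
    }
}
```

The identity map matters: the dedup is per object, not per value, so references from different roots collapse onto one archived copy exactly as the mail describes.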
Caching Constant Pool resolved_references Array The 'resolved_references' is an array that holds references of resolved constant pool entries including Strings, mirrors and methodTypes, etc. Each loaded class has one 'resolved_references' array (in ConstantPoolCache). The 'resolved_references' arrays are copied into the open archive regions during the dump process. Prior to copying the 'resolved_references' arrays, the JVM iterates through constant pool entries and resolves all JVM_CONSTANT_String entries to existing interned Strings for all archived classes. When resolving, the JVM only looks up the string table and finds existing interned Strings without inserting new ones. If a string entry cannot be resolved to an existing interned String, the constant pool entry remains unresolved. That prevents memory waste if a constant pool string entry is never used at runtime. All String objects referenced by the string table are copied first into the closed archive regions. The string table entry is updated with the new location when each String object is archived. The JVM updates the resolved constant pool string entries with the new object locations when copying the 'resolved_references' arrays to the open archive regions. References to the 'resolved_references' arrays in the ConstantPoolCache are also updated. At runtime as part of ConstantPool::restore_unshareable_info() work, call G1SATBCardTableModRefBS::enqueue() to let the GC know the 'resolved_references' is becoming live. A handle is created for the cached object and added to the loader_data's handles. Runtime Java Heap With Cached Java Objects The closed archive regions (the string regions) and open archive regions are mapped to the runtime java heap at the same offsets as the dump time offsets from the runtime java heap base.
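[Editor's note: mapping "at the same offsets from the heap base" means a dump-time address can be translated to its runtime location with plain offset arithmetic. A minimal sketch; the concrete base values below are arbitrary examples, not real CDS addresses:]

```java
// Sketch of the "same offset from the heap base" mapping described above:
// what is preserved across dump and runtime is the offset from the heap
// base, so translation is a subtraction and an addition.
public class ArchiveOffsetMapping {
    static long toRuntime(long dumpAddr, long dumpHeapBase, long runtimeHeapBase) {
        long offset = dumpAddr - dumpHeapBase; // offset is the invariant
        return runtimeHeapBase + offset;
    }

    public static void main(String[] args) {
        long dumpBase    = 0x7000_0000L;  // example dump-time heap base
        long runtimeBase = 0x2000_0000L;  // example runtime heap base
        long dumpAddr    = 0x7000_1234L;  // an archived object at offset 0x1234
        long runtimeAddr = toRuntime(dumpAddr, dumpBase, runtimeBase);
        System.out.println(Long.toHexString(runtimeAddr)); // 20001234
    }
}
```

Keeping the offsets identical is what lets references recorded at dump time remain valid after the archive regions are mapped into a differently-based runtime heap.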
Preliminary test execution and status: JPRT: passed Tier2-rt: passed Tier2-gc: passed Tier2-comp: passed Tier3-rt: passed Tier3-gc: passed Tier3-comp: passed Tier4-rt: passed Tier4-gc: passed Tier4-comp: 6 jobs timed out, all other tests passed Tier5-rt: one test failed but passed when running locally, all other tests passed Tier5-gc: passed Tier5-comp: running hotspot_gc: two jobs timed out, all other tests passed hotspot_gc in CDS mode: two jobs timed out, all other tests passed vm.gc: passed vm.gc in CDS mode: passed Kitchensink: passed Kitchensink in CDS mode: passed Thanks, Jiangli -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Heap%20Regions-2.jpeg Type: image/jpeg Size: 14517 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Runtime%20Java%20Heap%20with%20Cached%20Objects.jpeg Type: image/jpeg Size: 20448 bytes Desc: not available URL: From jiangli.zhou at oracle.com Thu Jul 27 20:37:16 2017 From: jiangli.zhou at oracle.com (Jiangli Zhou) Date: Thu, 27 Jul 2017 13:37:16 -0700 Subject: RFR: 8179302: Pre-resolve constant pool string entries and cache resolved_reference arrays in CDS archive In-Reply-To: <74D26CA6-E3A8-4ABB-A6E9-D37E3AD2BAD6@oracle.com> References: <74D26CA6-E3A8-4ABB-A6E9-D37E3AD2BAD6@oracle.com> Message-ID: <32C9BE41-D4C3-4242-A8D7-C1E1A5B2E0F3@oracle.com> Sorry, the mail didn't handle the rich text well. I fixed the format below. Please help review the changes for JDK-8179302 (Pre-resolve constant pool string entries and cache resolved_reference arrays in CDS archive). Currently, the CDS archive can contain cached class metadata and interned java.lang.String objects. This RFE adds the constant pool 'resolved_references' arrays (hotspot specific) to the archive for startup/runtime performance enhancement.
The 'resolved_references' arrays are used to hold references of resolved constant pool entries including Strings, mirrors, etc. With the 'resolved_references' being cached, string constants in shared classes can now be resolved to existing interned java.lang.Strings at CDS dump time. G1 and 64-bit platforms are required. The GC changes in the RFE were discussed and guided by Thomas Schatzl and the GC team. Part of the changes were contributed by Thomas himself. RFE: https://bugs.openjdk.java.net/browse/JDK-8179302 hotspot: http://cr.openjdk.java.net/~jiangli/8179302/webrev.hotspot.01/ whitebox: http://cr.openjdk.java.net/~jiangli/8179302/webrev.whitebox.01/ Please see below for details of supporting cached 'resolved_references' and pre-resolving string constants. Types of Pinned G1 Heap Regions The pinned region type is a super type of all archive region types, which include the open archive type and the closed archive type. 00100 0 [ 8] Pinned Mask 01000 0 [16] Old Mask 10000 0 [32] Archive Mask 11100 0 [56] Open Archive: ArchiveMask | PinnedMask | OldMask 11100 1 [57] Closed Archive: ArchiveMask | PinnedMask | OldMask + 1 Pinned Regions Objects within the region are 'pinned', which means GC does not move any live objects. GC scans and marks objects in the pinned region as normal, but skips forwarding live objects. Pointers in live objects are updated. Dead objects (unreachable) can be collected and freed. Archive Regions The archive types are sub-types of 'pinned'. There are two types of archive region currently, open archive and closed archive. Both can support caching java heap objects via the CDS archive. An archive region is also an old region by design. Open Archive (GC-RW) Regions Open archive region is GC writable. GC scans & marks objects within the region and adjusts (updates) pointers in live objects the same way as a pinned region. Live objects (reachable) are pinned and not forwarded by GC. Open archive region does not have 'dead' objects.
Unreachable objects are 'dormant' objects. Dormant objects are not collected and freed by GC. Adjustable Outgoing Pointers As GC can adjust pointers within the live objects in open archive heap region, objects can have outgoing pointers to another java heap region, including closed archive region, open archive region, pinned (or humongous) region, and normal generational region. When a referenced object is moved by GC, the pointer within the open archive region is updated accordingly. Closed Archive (GC-RO) Regions The closed archive region is GC read-only region. GC cannot write into the region. Objects are not scanned and marked by GC. Objects are pinned and not forwarded. Pointers are not updated by GC either. Hence, objects within the archive region cannot have any outgoing pointers to another java heap region. Objects however can still have pointers to other objects within the closed archive regions (we might allow pointers to open archive regions in the future). That restricts the type of java objects that can be supported by the archive region. In JDK 9 we support archive Strings with the archive regions. The GC-readonly archive region makes java heap memory sharable among different JVM processes. NOTE: synchronization on the objects within the archive heap region can still cause writes to the memory page. Dormant Objects Dormant objects are unreachable java objects within the open archive heap region. A java object in the open archive heap region is a live object if it can be reached during scanning. Some of the java objects in the region may not be reachable during scanning. Those objects are considered as dormant, but not dead. For example, a constant pool 'resolved_references' array is reachable via the klass root if its container klass (shared) is already loaded at the time during GC scanning. 
If a shared klass is not yet loaded, the klass root is not scanned and its constant pool 'resolved_reference' array (A) in the open archive region is not reachable. Then A is a dormant object. Object State Transition All java objects are initially dormant objects when open archive heap regions are mapped to the runtime java heap. A dormant object becomes a live object when the associated shared class is loaded at runtime. An explicit call to G1SATBCardTableModRefBS::enqueue() needs to be made when a dormant object becomes live. That should be the case for cached objects with strong roots as well, since strong roots are only scanned at the start of GC marking (the initial marking) but not during Remarking/Final marking. If a cached object becomes live during the concurrent marking phase, G1 may not find it and mark it live unless a call to G1SATBCardTableModRefBS::enqueue() is made for the object. Currently, a live object in the open archive heap region cannot become dormant again. This restriction simplifies the GC requirements and guarantees all outgoing pointers are updated by GC correctly. Only objects for shared classes from the builtin class loaders (boot, PlatformClassLoaders, and AppClassLoaders) are supported for caching. Caching Java Objects at Archive Dump Time The closed archive and open archive regions are allocated near the top of the dump time java heap. Archived java objects are copied into the designated archive heap regions. For example, String objects and the underlying 'value' arrays are copied into the closed archive regions. All references to the archived objects (from shared class metadata, string table, etc) are set to the new heap locations. A hash table is used to keep track of all archived java objects during the copying process to make sure a java object is not archived more than once if reached from different roots. It also makes sure references to the same archived object are updated using the same new address location.
Caching Constant Pool resolved_references Array The 'resolved_references' is an array that holds references of resolved constant pool entries including Strings, mirrors and methodTypes, etc. Each loaded class has one 'resolved_references' array (in ConstantPoolCache). The 'resolved_references' arrays are copied into the open archive regions during the dump process. Prior to copying the 'resolved_references' arrays, the JVM iterates through constant pool entries and resolves all JVM_CONSTANT_String entries to existing interned Strings for all archived classes. When resolving, the JVM only looks up the string table and finds existing interned Strings without inserting new ones. If a string entry cannot be resolved to an existing interned String, the constant pool entry remains unresolved. That prevents memory waste if a constant pool string entry is never used at runtime. All String objects referenced by the string table are copied first into the closed archive regions. The string table entry is updated with the new location when each String object is archived. The JVM updates the resolved constant pool string entries with the new object locations when copying the 'resolved_references' arrays to the open archive regions. References to the 'resolved_references' arrays in the ConstantPoolCache are also updated. At runtime as part of ConstantPool::restore_unshareable_info() work, call G1SATBCardTableModRefBS::enqueue() to let the GC know the 'resolved_references' is becoming live. A handle is created for the cached object and added to the loader_data's handles. Runtime Java Heap With Cached Java Objects The closed archive regions (the string regions) and open archive regions are mapped to the runtime java heap at the same offsets as the dump time offsets from the runtime java heap base.
Preliminary test execution and status:

JPRT: passed
Tier2-rt: passed
Tier2-gc: passed
Tier2-comp: passed
Tier3-rt: passed
Tier3-gc: passed
Tier3-comp: passed
Tier4-rt: passed
Tier4-gc: passed
Tier4-comp: 6 jobs timed out, all other tests passed
Tier5-rt: one test failed but passed when running locally, all other tests passed
Tier5-gc: passed
Tier5-comp: running
hotspot_gc: two jobs timed out, all other tests passed
hotspot_gc in CDS mode: two jobs timed out, all other tests passed
vm.gc: passed
vm.gc in CDS mode: passed
Kitchensink: passed
Kitchensink in CDS mode: passed

Thanks,
Jiangli
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Heap%20Regions-2.jpeg
Type: image/jpeg
Size: 14517 bytes
Desc: not available
URL: 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Runtime%20Java%20Heap%20with%20Cached%20Objects.jpeg
Type: image/jpeg
Size: 20448 bytes
Desc: not available
URL: 

From mikael.gerdin at oracle.com Fri Jul 28 12:50:46 2017
From: mikael.gerdin at oracle.com (Mikael Gerdin)
Date: Fri, 28 Jul 2017 14:50:46 +0200
Subject: RFR (9) 8185133: Reference pending list root might not get marked
Message-ID: 

Hi all,

Please review this fix to a tricky reference processing / conc marking bug affecting G1 in 9.

The bug occurs when a weak reference WR is promoted to old and discovered during an initial mark pause. The WR is the referent of a soft reference SR. The concurrent reference processor determines that SR should be treated as a weak reference due to shortage of memory, and now WR is reachable only from the reference pending list but not explicitly marked in the bitmap, since objects promoted during the initial mark pause are not marked immediately.
The reason we are not saved by the SATB pre-barrier here is that clearing of the referent field of a reference object does not trigger the pre-barrier (and that would kind of defeat its purpose).

Before JDK-8156500 this worked because the reference pending list was a static field in the Reference class, and the Reference class was scanned during concurrent marking, so we would never lose track of the pending list head.

My suggested fix is to explicitly mark the reference pending list head oop during initial mark, after the reference enqueue phase. This mirrors how other roots are handled in initial mark, see G1Mark::G1MarkPromotedFromRoots.

Webrev: http://cr.openjdk.java.net/~mgerdin/8185133/webrev.0
Bug: https://bugs.openjdk.java.net/browse/JDK-8185133

Testing: JPRT, tier2-5 gc tests, a LOT of runs of the failing test.

Many thanks to Kim and Erik Ö for discussions around this issue!

Thanks
/Mikael

From erik.osterlund at oracle.com Fri Jul 28 13:00:23 2017
From: erik.osterlund at oracle.com (=?UTF-8?Q?Erik_=c3=96sterlund?=)
Date: Fri, 28 Jul 2017 15:00:23 +0200
Subject: RFR (9) 8185133: Reference pending list root might not get marked
In-Reply-To: 
References: 
Message-ID: <597B3567.50307@oracle.com>

Hi Mikael,

Looks good.

/Erik

On 2017-07-28 14:50, Mikael Gerdin wrote:
> Hi all,
>
> Please review this fix to a tricky reference processing / conc marking
> bug affecting G1 in 9.
>
> The bug occurs when a weak reference WR is promoted to old and
> discovered during an initial mark pause. The WR is the referent of a
> soft reference SR. The concurrent reference processor determines that
> SR should be treated as a weak reference due to shortage of memory and
> now WR is reachable only from the reference pending list but not
> explicitly marked in the bitmap since objects promoted during the
> initial mark pause are not marked immediately.
> > The reason we are not saved by the SATB pre-barrier here is that > clearing of the referent field of a reference object does not trigger > the pre-barrier (and that would kind of defeat its purpose). > > Before JDK-8156500 this worked because the reference pending list was > a static field in the Reference class and the reference class was > scanned during concurrent marking, so we would never lose track of the > pending list head. > > My suggested fix is to explicitly mark the reference pending list head > oop during initial mark, after the reference enqueue phase. > This mirrors how other roots are handled in initial mark, see > G1Mark::G1MarkPromotedFromRoots. > > Webrev: http://cr.openjdk.java.net/~mgerdin/8185133/webrev.0 > Bug: https://bugs.openjdk.java.net/browse/JDK-8185133 > > Testing: JPRT, tier2-5 gc tests, a LOT of runs of the failing test. > > Many thanks to Kim and Erik ? for discussions around this issue! > > Thanks > /Mikael From rkennke at redhat.com Fri Jul 28 14:53:57 2017 From: rkennke at redhat.com (Roman Kennke) Date: Fri, 28 Jul 2017 16:53:57 +0200 Subject: RFR (9) 8185133: Reference pending list root might not get marked In-Reply-To: References: Message-ID: <61ef3599-6295-7444-7b3c-e731c52c10fe@redhat.com> Hi Mikael, I don't really understand what the problem is. The WR ends up on the RPL, with its referent cleared, i.e. no longer pointing to the SR? But we want to keep the SR alive? Also, Universe::oops_do() already marks the RPL head, doesn't it? Roman > Hi all, > > Please review this fix to a tricky reference processing / conc marking > bug affecting G1 in 9. > > The bug occurs when a weak reference WR is promoted to old and > discovered during an initial mark pause. The WR is the referent of a > soft reference SR. 
The concurrent reference processor determines that
> SR should be treated as a weak reference due to shortage of memory and
> now WR is reachable only from the reference pending list but not
> explicitly marked in the bitmap since objects promoted during the
> initial mark pause are not marked immediately.
>
> The reason we are not saved by the SATB pre-barrier here is that
> clearing of the referent field of a reference object does not trigger
> the pre-barrier (and that would kind of defeat its purpose).
>
> Before JDK-8156500 this worked because the reference pending list was
> a static field in the Reference class and the reference class was
> scanned during concurrent marking, so we would never lose track of the
> pending list head.
>
> My suggested fix is to explicitly mark the reference pending list head
> oop during initial mark, after the reference enqueue phase.
> This mirrors how other roots are handled in initial mark, see
> G1Mark::G1MarkPromotedFromRoots.
>
> Webrev: http://cr.openjdk.java.net/~mgerdin/8185133/webrev.0
> Bug: https://bugs.openjdk.java.net/browse/JDK-8185133
>
> Testing: JPRT, tier2-5 gc tests, a LOT of runs of the failing test.
>
> Many thanks to Kim and Erik Ö for discussions around this issue!
>
> Thanks
> /Mikael

From sangheon.kim at oracle.com Fri Jul 28 16:10:20 2017
From: sangheon.kim at oracle.com (sangheon)
Date: Fri, 28 Jul 2017 09:10:20 -0700
Subject: Need sponsor to push attached 8184045 into jdk10/hs/hotspot
In-Reply-To: 
References: 
Message-ID: <097a40d0-b3e7-2a05-e59f-0b9b1b5965b7@oracle.com>

Hi Alex,

I can sponsor this patch.

Thanks,
Sangheon

On 07/25/2017 07:24 AM, Alexander Harlap wrote:
> I need a sponsor to push attached 8184045.patch - .
>
> Patch should go into jdk10/hs/hotspot
>
> Reviewed by Daniel D. Daugherty and Erik Helin.
>
> Thank you,
>
> Alex
>

From sangheon.kim at oracle.com Fri Jul 28 16:11:19 2017
From: sangheon.kim at oracle.com (sangheon)
Date: Fri, 28 Jul 2017 09:11:19 -0700
Subject: Need sponsor to push attached 8183973 into jdk10/hs/hotspot
In-Reply-To: <406c98a0-42a2-729b-1a95-a105c52f5dc5@oracle.com>
References: <406c98a0-42a2-729b-1a95-a105c52f5dc5@oracle.com>
Message-ID: <08d27861-e447-4f77-cb57-e0c90ba62b72@oracle.com>

Hi Alex,

I can sponsor this too.

Thanks,
Sangheon

On 07/25/2017 10:37 AM, Alexander Harlap wrote:
> I need a sponsor to push attached 8183973.patch -
> gc/TestFullGCALot.java fails in JDK10-hs nightly
>
> Patch should go into jdk10/hs/hotspot
>
> Reviewed by Mikael Gerdin and Erik Osterlund.
>
> Thank you,
>
> Alex
>

From erik.osterlund at oracle.com Fri Jul 28 17:20:02 2017
From: erik.osterlund at oracle.com (Erik Osterlund)
Date: Fri, 28 Jul 2017 19:20:02 +0200
Subject: RFR (9) 8185133: Reference pending list root might not get marked
In-Reply-To: <61ef3599-6295-7444-7b3c-e731c52c10fe@redhat.com>
References: <61ef3599-6295-7444-7b3c-e731c52c10fe@redhat.com>
Message-ID: <5420167D-79DF-4D66-8BC6-82B6D55A428D@oracle.com>

Hi Roman,

> On 28 Jul 2017, at 16:53, Roman Kennke wrote:
>
> Hi Mikael,
>
> I don't really understand what the problem is. The WR ends up on the
> RPL, with its referent cleared, i.e. no longer pointing to the SR? But
> we want to keep the SR alive?

No. The WR gets promoted to old during the initial mark evacuation as it was strongly reachable by a SR in young. The referent of the WR died, and therefore it gets discovered. The assumption is then that since it was strongly reachable from the SR in young, the WR will be found during concurrent marking due to SATB. This is normally a safe assumption.

However, just before finishing the initial mark pause and letting concurrent marking start tracing through the heap, soft references may change strength to suddenly become weak.
Therefore, the WR in old never gets marked during concurrent marking unless the GC is made aware of the existence of this new strong edge to the pending list head. This is a problem, because the pending list was in this scenario exposed to Java land through the pending list head, without the concurrent marking knowing about it, violating GC completeness. > Also, Universe::oops_do() already marks the RPL head, doesn't it? Reference processing is done after root processing. Therefore the edge to the pending list, created during reference processing, is not yet made available at that time. Hope that made sense. Thanks, /Erik > Roman > >> Hi all, >> >> Please review this fix to a tricky reference processing / conc marking >> bug affecting G1 in 9. >> >> The bug occurs when a weak reference WR is promoted to old and >> discovered during an initial mark pause. The WR is the referent of a >> soft reference SR. The concurrent reference processor determines that >> SR should be treated as a weak reference due to shortage of memory and >> now WR is reachable only from the reference pending list but not >> explicitly marked in the bitmap since objects promoted during the >> initial mark pause are not marked immediately. >> >> The reason we are not saved by the SATB pre-barrier here is that >> clearing of the referent field of a reference object does not trigger >> the pre-barrier (and that would kind of defeat its purpose). >> >> Before JDK-8156500 this worked because the reference pending list was >> a static field in the Reference class and the reference class was >> scanned during concurrent marking, so we would never lose track of the >> pending list head. >> >> My suggested fix is to explicitly mark the reference pending list head >> oop during initial mark, after the reference enqueue phase. >> This mirrors how other roots are handled in initial mark, see >> G1Mark::G1MarkPromotedFromRoots. 
>> >> Webrev: http://cr.openjdk.java.net/~mgerdin/8185133/webrev.0 >> Bug: https://bugs.openjdk.java.net/browse/JDK-8185133 >> >> Testing: JPRT, tier2-5 gc tests, a LOT of runs of the failing test. >> >> Many thanks to Kim and Erik ? for discussions around this issue! >> >> Thanks >> /Mikael > > From kim.barrett at oracle.com Fri Jul 28 18:52:52 2017 From: kim.barrett at oracle.com (Kim Barrett) Date: Fri, 28 Jul 2017 14:52:52 -0400 Subject: RFR (9) 8185133: Reference pending list root might not get marked In-Reply-To: References: Message-ID: <5A6A5D4F-16F4-4843-8539-1164558DF333@oracle.com> > On Jul 28, 2017, at 8:50 AM, Mikael Gerdin wrote: > > Hi all, > > Please review this fix to a tricky reference processing / conc marking bug affecting G1 in 9. > > The bug occurs when a weak reference WR is promoted to old and discovered during an initial mark pause. The WR is the referent of a soft reference SR. The concurrent reference processor determines that SR should be treated as a weak reference due to shortage of memory and now WR is reachable only from the reference pending list but not explicitly marked in the bitmap since objects promoted during the initial mark pause are not marked immediately. > > The reason we are not saved by the SATB pre-barrier here is that clearing of the referent field of a reference object does not trigger the pre-barrier (and that would kind of defeat its purpose). > > Before JDK-8156500 this worked because the reference pending list was a static field in the Reference class and the reference class was scanned during concurrent marking, so we would never lose track of the pending list head. > > My suggested fix is to explicitly mark the reference pending list head oop during initial mark, after the reference enqueue phase. > This mirrors how other roots are handled in initial mark, see G1Mark::G1MarkPromotedFromRoots. 
>
> Webrev: http://cr.openjdk.java.net/~mgerdin/8185133/webrev.0
> Bug: https://bugs.openjdk.java.net/browse/JDK-8185133
>
> Testing: JPRT, tier2-5 gc tests, a LOT of runs of the failing test.
>
> Many thanks to Kim and Erik Ö for discussions around this issue!
>
> Thanks
> /Mikael

------------------------------------------------------------------------------
src/share/vm/memory/universe.cpp
 499 oop Universe::reference_pending_list() {
 500   if (Thread::current()->is_VM_thread()) {
 501     assert_pll_locked(is_locked);
 502   } else {
 503     assert_pll_ownership();
 504   }
 505   return _reference_pending_list;
 506 }

I was afraid that conditionalization might be needed.

I think I'd like distinct readers for the different locking context use cases. However, I'd be fine with such a distinction being deferred to JDK 10.

------------------------------------------------------------------------------

Looks good.

From kim.barrett at oracle.com Fri Jul 28 19:53:43 2017
From: kim.barrett at oracle.com (Kim Barrett)
Date: Fri, 28 Jul 2017 15:53:43 -0400
Subject: RFR (9) 8185133: Reference pending list root might not get marked
In-Reply-To: <5420167D-79DF-4D66-8BC6-82B6D55A428D@oracle.com>
References: <61ef3599-6295-7444-7b3c-e731c52c10fe@redhat.com> <5420167D-79DF-4D66-8BC6-82B6D55A428D@oracle.com>
Message-ID: <2971B73D-8175-4E1B-9C73-BD83454CC024@oracle.com>

> On Jul 28, 2017, at 1:20 PM, Erik Osterlund wrote:
>
> Hi Roman,
>
>> On 28 Jul 2017, at 16:53, Roman Kennke wrote:
>>
>> Hi Mikael,
>>
>> I don't really understand what the problem is. The WR ends up on the
>> RPL, with its referent cleared, i.e. no longer pointing to the SR? But
>> we want to keep the SR alive?
>
> No. The WR gets promoted to old during the initial mark evacuation as it was strongly reachable by a SR in young. The referent of the WR died, and therefore it gets discovered. The assumption is then that since it was strongly reachable from the SR in young, the WR will be found during concurrent marking due to SATB.
This is normally a safe assumption.
>
> However, just before finishing the initial mark pause and letting concurrent marking start tracing through the heap, soft references may change strength to suddenly become weak. Therefore, the WR in old never gets marked during concurrent marking unless the GC is made aware of the existence of this new strong edge to the pending list head.
>
> This is a problem, because the pending list was in this scenario exposed to Java land through the pending list head, without the concurrent marking knowing about it, violating GC completeness.

I think SR also needs to be promoted by the initial-mark pause. If SR is young and not promoted, then it will be a survivor of the initial-mark pause, and so will be scanned by scan_root_regions. scan_root_regions doesn't do reference processing, so the scan of the survivor SR will mark WR.

Here's my understanding of the problem scenario:

(1) initial state

SR => WR => O
WR and O are young.
WR and O are unreachable except through the chain from SR.
SR has not expired.

(2) initial_mark

SR and WR are both promoted to oldgen.
SR is not discovered, because it has not expired.
WR is discovered and enqueued, because O is unreachable.
WR ends up at the head of the pending list. This happens after the initial root scan has examined the head of the pending list.

(3) SR expires

We now have an oldgen WR in the pending list, and no certain path by which concurrent marking will reach it, even though it is accessible. (The Java reference processing thread might process and discard it before any damage is actually done, but that's far from certain.)

So it requires a fairly unlikely sequence of events.

Note: If WR ends up anywhere other than at the head of the pending list, it will eventually be visited, either by scan_root_region or normal concurrent marking, depending on its predecessor in the list. (Assuming its predecessor is not another similar case that *did* end up at the head of the list.)
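For readers following the SATB discussion in this thread, here is a toy model of the pre-write barrier behavior being described: ordinary reference stores log the old value, while clearing a Reference's referent intentionally does not. All types and names below are invented for illustration; real G1 enqueues oops into per-thread SATB buffers via G1SATBCardTableModRefBS::enqueue().

```cpp
#include <vector>

// Toy SATB model: the pre-write barrier logs the *old* value of a reference
// field before it is overwritten, so concurrent marking can still reach every
// object that was live in the snapshot taken at marking start.
struct ToySatbQueue {
    std::vector<int> logged;                 // plain ids stand in for oops
    void enqueue(int old_oop) {
        if (old_oop != 0) logged.push_back(old_oop);
    }
};

// An ordinary reference store runs the pre-barrier first...
void oop_store(ToySatbQueue& q, int& field, int new_oop) {
    q.enqueue(field);                        // old value survives for marking
    field = new_oop;
}

// ...but clearing a Reference's referent deliberately skips the barrier:
// logging the dying referent would keep alive exactly the object the
// collector is trying to let go of.
void clear_referent(int& referent_field) {
    referent_field = 0;                      // no enqueue; marking won't see it
}
```

This is why, as discussed above, an object reachable only through a newly created pending-list edge needs an explicit enqueue (or explicit marking) rather than relying on the barrier.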
From kim.barrett at oracle.com Fri Jul 28 19:56:10 2017 From: kim.barrett at oracle.com (Kim Barrett) Date: Fri, 28 Jul 2017 15:56:10 -0400 Subject: RFR (9) 8185133: Reference pending list root might not get marked In-Reply-To: <5A6A5D4F-16F4-4843-8539-1164558DF333@oracle.com> References: <5A6A5D4F-16F4-4843-8539-1164558DF333@oracle.com> Message-ID: <03C36633-1BC2-4787-A541-725ED87C15BE@oracle.com> > On Jul 28, 2017, at 2:52 PM, Kim Barrett wrote: > Looks good. Remember to update copyrights. From ioi.lam at oracle.com Mon Jul 31 04:07:50 2017 From: ioi.lam at oracle.com (Ioi Lam) Date: Sun, 30 Jul 2017 21:07:50 -0700 Subject: RFR: 8179302: Pre-resolve constant pool string entries and cache resolved_reference arrays in CDS archive In-Reply-To: <32C9BE41-D4C3-4242-A8D7-C1E1A5B2E0F3@oracle.com> References: <74D26CA6-E3A8-4ABB-A6E9-D37E3AD2BAD6@oracle.com> <32C9BE41-D4C3-4242-A8D7-C1E1A5B2E0F3@oracle.com> Message-ID: <4172849a-55e6-7133-90d6-2c75b58b0391@oracle.com> Hi Jiangli, Here are my comments. I've not reviewed the GC code and I'll leave that to the GC experts :-) stringTable.cpp: StringTable::archive_string add assert for DumpSharedSpaces only filemap.cpp 525 void FileMapInfo::write_archive_heap_regions(GrowableArray *regions, 526 int first_region, int num_regions) { When I first read this function, I found it hard to follow, especially this part that coalesces the trailing regions: 537 int len = regions->length(); 538 if (len > 1) { 539 start = (char*)regions->at(1).start(); 540 size = (char*)regions->at(len - 1).end() - start; 541 } 542 } The rest of filemap.cpp always perform identical operations on MemRegion arrays, which are either 1 or 2 in size. However, this function doesn't follow that pattern; it also has a very different notion of "region", and the confusing part is regions->size() is not the same as num_regions. How about we change the API to something like the following? 
Before calling this API, the caller needs to coalesce the trailing G1 regions into a single MemRegion.

FileMapInfo::write_archive_heap_regions(MemRegion *regions, int first, int num_regions) {
  if (first == MetaspaceShared::first_string) {
    assert(num_regions <= MetaspaceShared::max_strings, "...");
  } else {
    assert(first == MetaspaceShared::first_open_archive_heap_region, "...");
    assert(num_regions <= MetaspaceShared::max_open_archive_heap_region, "...");
  }
  ....

756 if (!string_data_mapped) {
757   StringTable::ignore_shared_strings(true);
758   assert(string_ranges == NULL && num_string_ranges == 0, "sanity");
759 }
760
761 if (open_archive_heap_data_mapped) {
762   MetaspaceShared::set_open_archive_heap_region_mapped();
763 } else {
764   assert(open_archive_heap_ranges == NULL && num_open_archive_heap_ranges == 0, "sanity");
765 }

Maybe the two "if" statements should be more consistent? Instead of StringTable::ignore_shared_strings, how about StringTable::set_shared_strings_region_mapped()?

FileMapInfo::map_heap_data() --

818 char* addr = (char*)regions[i].start();
819 char* base = os::map_memory(_fd, _full_path, si->_file_offset,
820                             addr, regions[i].byte_size(), si->_read_only,
821                             si->_allow_exec);

What happens when the first region succeeds in mapping but the second region fails to map? Will both regions be unmapped?

I don't see where you store the return value (base) from os::map_memory(). Does that mean the code assumes that (addr == base)? If so, we need an assert here.

constantPool.cpp

Handle refs_handle;
...
refs_handle = Handle(THREAD, (oop)archived);

This will first create a NULL handle, then construct a temporary handle, and then assign the temp handle back to the null handle. This means two handles will be pushed onto THREAD->metadata_handles(). I think it's more efficient if you merge these into a single statement:

Handle refs_handle(THREAD, (oop)archived);

Is this experimental code? Maybe it should be removed?
664   if (tag_at(index).is_unresolved_klass()) {
665 #if 0
666     CPSlot entry = cp->slot_at(index);
667     Symbol* name = entry.get_symbol();
668     Klass* k = SystemDictionary::find(name, NULL, NULL, THREAD);
669     if (k != NULL) {
670       klass_at_put(index, k);
671     }
672 #endif
673   } else

cpCache.hpp:

u8 _archived_references

Shouldn't this be declared as a narrowOop to avoid the type casts when it's used?

cpCache.cpp: add asserts so that one of these is used only at dump time and the other only at run time?

610 oop ConstantPoolCache::archived_references() {
611   return oopDesc::decode_heap_oop((narrowOop)_archived_references);
612 }
613
614 void ConstantPoolCache::set_archived_references(oop o) {
615   _archived_references = (u8)oopDesc::encode_heap_oop(o);
616 }

Thanks!
- Ioi

On 7/27/17 1:37 PM, Jiangli Zhou wrote:
> Sorry, the mail didn't handle the rich text well. I fixed the format below.
>
> Please help review the changes for JDK-8179302 (Pre-resolve constant pool string entries and cache resolved_reference arrays in CDS archive). Currently, the CDS archive can contain cached class metadata and interned java.lang.String objects. This RFE adds the constant pool 'resolved_references' arrays (hotspot specific) to the archive for startup/runtime performance enhancement. The 'resolved_references' arrays are used to hold references of resolved constant pool entries including Strings, mirrors, etc. With the 'resolved_references' being cached, string constants in shared classes can now be resolved to existing interned java.lang.Strings at CDS dump time. G1 and 64-bit platforms are required.
>
> The GC changes in the RFE were discussed and guided by Thomas Schatzl and the GC team. Part of the changes were contributed by Thomas himself.
> RFE: https://bugs.openjdk.java.net/browse/JDK-8179302
> hotspot: http://cr.openjdk.java.net/~jiangli/8179302/webrev.hotspot.01/
> whitebox: http://cr.openjdk.java.net/~jiangli/8179302/webrev.whitebox.01/
>
> Please see below for details of supporting cached 'resolved_references' and pre-resolving string constants.
>
> Types of Pinned G1 Heap Regions
>
> The pinned region type is a super type of all archive region types, which include the open archive type and the closed archive type.
>
> 00100 0 [ 8] Pinned Mask
> 01000 0 [16] Old Mask
> 10000 0 [32] Archive Mask
> 11100 0 [56] Open Archive: ArchiveMask | PinnedMask | OldMask
> 11100 1 [57] Closed Archive: ArchiveMask | PinnedMask | OldMask + 1
>
>
> Pinned Regions
>
> Objects within the region are 'pinned', which means GC does not move any live objects. GC scans and marks objects in the pinned region as normal, but skips forwarding live objects. Pointers in live objects are updated. Dead objects (unreachable) can be collected and freed.
>
> Archive Regions
>
> The archive types are sub-types of 'pinned'. There are two types of archive region currently, open archive and closed archive. Both can support caching java heap objects via the CDS archive.
>
> An archive region is also an old region by design.
>
> Open Archive (GC-RW) Regions
>
> Open archive region is GC writable. GC scans & marks objects within the region and adjusts (updates) pointers in live objects the same way as a pinned region. Live objects (reachable) are pinned and not forwarded by GC.
> Open archive region does not have 'dead' objects. Unreachable objects are 'dormant' objects. Dormant objects are not collected and freed by GC.
>
> Adjustable Outgoing Pointers
>
> As GC can adjust pointers within the live objects in open archive heap region, objects can have outgoing pointers to another java heap region, including closed archive region, open archive region, pinned (or humongous) region, and normal generational region.
When a referenced object is moved by GC, the pointer within the open archive region is updated accordingly. > > Closed Archive (GC-RO) Regions > > The closed archive region is GC read-only region. GC cannot write into the region. Objects are not scanned and marked by GC. Objects are pinned and not forwarded. Pointers are not updated by GC either. Hence, objects within the archive region cannot have any outgoing pointers to another java heap region. Objects however can still have pointers to other objects within the closed archive regions (we might allow pointers to open archive regions in the future). That restricts the type of java objects that can be supported by the archive region. > In JDK 9 we support archive Strings with the archive regions. > > The GC-readonly archive region makes java heap memory sharable among different JVM processes. NOTE: synchronization on the objects within the archive heap region can still cause writes to the memory page. > > Dormant Objects > > Dormant objects are unreachable java objects within the open archive heap region. > A java object in the open archive heap region is a live object if it can be reached during scanning. Some of the java objects in the region may not be reachable during scanning. Those objects are considered as dormant, but not dead. For example, a constant pool 'resolved_references' array is reachable via the klass root if its container klass (shared) is already loaded at the time during GC scanning. If a shared klass is not yet loaded, the klass root is not scanned and it's constant pool 'resolved_reference' array (A) in the open archive region is not reachable. Then A is a dormant object. > > Object State Transition > > All java objects are initially dormant objects when open archive heap regions are mapped to the runtime java heap. A dormant object becomes live object when the associated shared class is loaded at runtime. 
Explicit call to G1SATBCardTableModRefBS::enqueue() needs to be made when a dormant object becomes live. That should be the case for cached objects with strong roots as well, since strong roots are only scanned at the start of GC marking (the initial marking) but not during Remarking/Final marking. If a cached object becomes live during concurrent marking phase, G1 may not find it and mark it live unless a call to G1SATBCardTableModRefBS::enqueue() is made for the object. > > Currently, a live object in the open archive heap region cannot become dormant again. This restriction simplifies GC requirement and guarantees all outgoing pointers are updated by GC correctly. Only objects for shared classes from the builtin class loaders (boot, PlatformClassLoaders, and AppClassLoaders) are supported for caching. > > Caching Java Objects at Archive Dump Time > > The closed archive and open archive regions are allocated near the top of the dump time java heap. Archived java objects are copied into the designated archive heap regions. For example, String objects and the underlying 'value' arrays are copied into the closed archive regions. All references to the archived objects (from shared class metadata, string table, etc) are set to the new heap locations. A hash table is used to keep track of all archived java objects during the copying process to make sure java object is not archived more than once if reached from different roots. It also makes sure references to the same archived object are updated using the same new address location. > > Caching Constant Pool resolved_references Array > > The 'resolved_references' is an array that holds references of resolved constant pool entries including Strings, mirrors and methodTypes, etc. Each loaded class has one 'resolved_references' array (in ConstantPoolCache). The 'resolved_references' arrays are copied into the open archive regions during dump process. 
Prior to copying the 'resolved_references' arrays, JVM iterates through constant pool entries and resolves all JVM_CONSTANT_String entries to existing interned Strings for all archived classes. When resolving, JVM only looks up the string table and finds existing interned Strings without inserting new ones. If a string entry cannot be resolved to an existing interned String, the constant pool entry remain as unresolved. That prevents memory waste if a constant pool string entry is never used at runtime. > > All String objects referenced by the string table are copied first into the closed archive regions. The string table entry is updated with the new location when each String object is archived. The JVM updates the resolved constant pool string entries with the new object locations when copying the 'resolved_references' arrays to the open archive regions. References to the 'resolved_references' arrays in the ConstantPoolCache are also updated. > At runtime as part of ConstantPool::restore_unshareable_info() work, call G1SATBCardTableModRefBS::enqueue() to let GC know the 'resolved_references' is becoming live. A handle is created for the cached object and added to the loader_data's handles. > > Runtime Java Heap With Cached Java Objects > > > The closed archive regions (the string regions) and open archive regions are mapped to the runtime java heap at the same offsets as the dump time offsets from the runtime java heap base. 
> > Preliminary test execution and status: > > JPRT: passed > Tier2-rt: passed > Tier2-gc: passed > Tier2-comp: passed > Tier3-rt: passed > Tier3-gc: passed > Tier3-comp: passed > Tier4-rt: passed > Tier4-gc: passed > Tier4-comp:6 jobs timed out, all other tests passed > Tier5-rt: one test failed but passed when running locally, all other tests passed > Tier5-gc: passed > Tier5-comp: running > hotspot_gc: two jobs timed out, all other tests passed > hotspot_gc in CDS mode: two jobs timed out, all other tests passed > vm.gc: passed > vm.gc in CDS mode: passed > Kichensink: passed > Kichensink in CDS mode: passed > > Thanks, > Jiangli From mikael.gerdin at oracle.com Mon Jul 31 05:40:19 2017 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Mon, 31 Jul 2017 07:40:19 +0200 Subject: RFR (9) 8185133: Reference pending list root might not get marked In-Reply-To: <2971B73D-8175-4E1B-9C73-BD83454CC024@oracle.com> References: <61ef3599-6295-7444-7b3c-e731c52c10fe@redhat.com> <5420167D-79DF-4D66-8BC6-82B6D55A428D@oracle.com> <2971B73D-8175-4E1B-9C73-BD83454CC024@oracle.com> Message-ID: Hi Kim, On 2017-07-28 21:53, Kim Barrett wrote: >> On Jul 28, 2017, at 1:20 PM, Erik Osterlund wrote: >> >> Hi Roman, >> >>> On 28 Jul 2017, at 16:53, Roman Kennke wrote: >>> >>> Hi Mikael, >>> >>> I don't really understand what the problem is. The WR ends up on the >>> RPL, with its referent cleared, i.e. no longer pointing to the SR? But >>> we want to keep the SR alive? >> >> No. The WR gets promoted to old during the initial mark evacuation as it was strongly reachable by a SR in young. The referent of the WR died, and therefore it gets discovered. The assumption is then that since it was strongly reachable from the SR in young, the WR will be found during concurrent marking due to SATB. This is normally a safe assumption. 
>> >> However, just before finishing the initial mark pause and letting concurrent marking start trace through the heap, soft references may change strength to suddenly become weak. Therefore, the WR in old never gets marked during concurrent marking unless the GC is made aware of the existence of this new strong edge to the pending list head. >> >> This is a problem, because the pending list was in this scenario exposed to Java land through the pending list head, without the concurrent marking knowing about it, violating GC completeness. > > I think SR also needs to be promoted by the initial-mark pause. If SR > is young and not promoted, then it will be a survivor of the > initial-mark pause, and so will be scanned by scan_root_regions. > scan_root_regions doesn't do reference processing, so the scan of the > survivor SR will mark WR. > > Here's my understanding of the problem scenario: > > (1) initial state > > SR => WR => O > WR, and O are young > WR and O are unreachable except through the chain from SR > SR has not expired > > (2) initial_mark > > SR and WR are both promoted to oldgen. > SR is not discovered, because it has not expired. > WR is discovered and enqueued, because O is unreachable. > WR ends up at the head of the pending list. This happens after the > initial root scan has examined the head of the pending list. > > (3) SR expires > > We now have an oldgen WR in the pending list, and no certain path by > which concurrent marking will reach it, even though it is accessible. > (The Java reference processing thread might process and discard it > before any damage is actually done, but that's far from certain.) > > So it requires a fairly unlikely sequence of events. > > Note: If WR ends up anywhere other than at the head of the pending > list, it will eventually be visited, either by scan_root_region or > normal concurrent marking, depending on its predecessor in the list. 
> (Assuming its predecessor is not another similar case that *did* end > up at the head of the list.) Thanks for this detailed explanation. /Mikael > From mikael.gerdin at oracle.com Mon Jul 31 05:40:38 2017 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Mon, 31 Jul 2017 07:40:38 +0200 Subject: RFR (9) 8185133: Reference pending list root might not get marked In-Reply-To: <03C36633-1BC2-4787-A541-725ED87C15BE@oracle.com> References: <5A6A5D4F-16F4-4843-8539-1164558DF333@oracle.com> <03C36633-1BC2-4787-A541-725ED87C15BE@oracle.com> Message-ID: <817a3f9f-c78f-7093-59c4-164f63a495d9@oracle.com> Hi Kim, On 2017-07-28 21:56, Kim Barrett wrote: >> On Jul 28, 2017, at 2:52 PM, Kim Barrett wrote: >> Looks good. > > Remember to update copyrights. > Will do, thanks for the review! /Mikael From thomas.schatzl at oracle.com Mon Jul 31 13:25:23 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Mon, 31 Jul 2017 15:25:23 +0200 Subject: RFR (9) 8185133: Reference pending list root might not get marked In-Reply-To: <5A6A5D4F-16F4-4843-8539-1164558DF333@oracle.com> References: <5A6A5D4F-16F4-4843-8539-1164558DF333@oracle.com> Message-ID: <1501507523.2394.2.camel@oracle.com> Hi, On Fri, 2017-07-28 at 14:52 -0400, Kim Barrett wrote: > > > > On Jul 28, 2017, at 8:50 AM, Mikael Gerdin > m> wrote: > > > > Hi all, > > > > Please review this fix to a tricky reference processing / conc > > marking bug affecting G1 in 9. > > > > The bug occurs when a weak reference WR is promoted to old > > and[...] My suggested fix is to explicitly mark the reference > > pending list > > head oop during initial mark, after the reference enqueue phase. > > This mirrors how other roots are handled in initial mark, see > > G1Mark::G1MarkPromotedFromRoots. > > > > Webrev: http://cr.openjdk.java.net/~mgerdin/8185133/webrev.0 > > Bug: https://bugs.openjdk.java.net/browse/JDK-8185133 > > > > Testing: JPRT, tier2-5 gc tests, a LOT of runs of the failing test. 
> > > > Many thanks to Kim and Erik Ö for discussions around this issue! > > Thanks > > /Mikael > ------------------------------------------------------------------- > ----------- > src/share/vm/memory/universe.cpp > 499 oop Universe::reference_pending_list() { > 500   if (Thread::current()->is_VM_thread()) { > 501     assert_pll_locked(is_locked); > 502   } else { > 503     assert_pll_ownership(); > 504   } > 505   return _reference_pending_list; > 506 } > > I was afraid that conditionalization might be needed. > > I think I'd like distinct readers for the different locking context > use cases. However, I'd be fine with such a distinction being > deferred to JDK 10. > I agree that this code looks ugly, and with Kim that fixing this can wait. Looks good, and great work. Thomas From daniel.daugherty at oracle.com Mon Jul 31 14:24:31 2017 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Mon, 31 Jul 2017 08:24:31 -0600 Subject: URGENT RFR (S): fix for Test8004741.java crashes with SIGSEGV in JDK10-hs nightly (8185273) Message-ID: <6f413b14-e250-cd2f-85ad-e5eb345e9603@oracle.com> Greetings, I have a fix for the following P1 JDK10-hs integration_blocker bug: 8185273 Test8004741.java crashes with SIGSEGV in JDK10-hs nightly https://bugs.openjdk.java.net/browse/JDK-8185273 The fix is 2 lines and the comment describing the fix is 4 lines: src/share/vm/runtime/thread.cpp: L3388: void Threads::parallel_java_threads_do(ThreadClosure* tc) { L3395: // This function is used by ParallelSPCleanupTask in safepoint.cpp L3396: // for cleaning up JavaThreads, but we have to keep the VMThread's L3397: // _oops_do_parity field in sync so we don't miss a parallel GC on L3398: // the VMThread. L3399: VMThread* vmt = VMThread::vm_thread(); L3400: (void)vmt->claim_oops_do(true, cp); I'm also including some new logging for the VMThread (tag == 'vmthread') that came in useful during this bug hunt. 
Lastly, I've fixed a few minor typos that I ran across in the areas where I was hunting. Webrev URL: http://cr.openjdk.java.net/~dcubed/8185273-webrev/0/ There's lots of discussion in the bug. The evaluation comment that I added on Sunday, July 30 is probably the most complete and hopefully the most clear. For context, here's the webrev for 8180932 and another bug fix: http://cr.openjdk.java.net/~rkennke/8180932/webrev.18/ http://cr.openjdk.java.net/~rkennke/8185102/webrev.01/ Testing: - JPRT - Test8004741.java has been running in a forever loop with 'fastdebug' bits (17200+ iterations) and 'slowdebug' bits (13400+ iterations) Comments, questions and feedback are welcome. Dan P.S. Roman and I were also thinking about updating Threads::assert_all_threads_claimed() to verify that the VMThread is also claimed... Obviously that's not part of the current patch... From shade at redhat.com Mon Jul 31 14:35:28 2017 From: shade at redhat.com (Aleksey Shipilev) Date: Mon, 31 Jul 2017 16:35:28 +0200 Subject: URGENT RFR (S): fix for Test8004741.java crashes with SIGSEGV in JDK10-hs nightly (8185273) In-Reply-To: <6f413b14-e250-cd2f-85ad-e5eb345e9603@oracle.com> References: <6f413b14-e250-cd2f-85ad-e5eb345e9603@oracle.com> Message-ID: On 07/31/2017 04:24 PM, Daniel D. Daugherty wrote: > Greetings, > > I have a fix for the following P1 JDK10-hs integration_blocker bug: > > 8185273 Test8004741.java crashes with SIGSEGV in JDK10-hs nightly > https://bugs.openjdk.java.net/browse/JDK-8185273 > > The fix is 2 lines and the comment describing the fix is 4 lines: > > src/share/vm/runtime/thread.cpp: > > L3388: void Threads::parallel_java_threads_do(ThreadClosure* tc) { > > L3395: // This function is used by ParallelSPCleanupTask in safepoint.cpp > L3396: // for cleaning up JavaThreads, but we have to keep the VMThread's > L3397: // _oops_do_parity field in sync so we don't miss a parallel GC on > L3398: // the VMThread. 
> L3399: VMThread* vmt = VMThread::vm_thread(); > L3400: (void)vmt->claim_oops_do(true, cp); > > I'm also including some new logging for the VMThread (tag == 'vmthread') > that came in useful during this bug hunt. Lastly, I've fixed a few minor > typos that I ran across in the areas where I was hunting. > > Webrev URL: http://cr.openjdk.java.net/~dcubed/8185273-webrev/0/ Those changes make sense, thanks. It is probably worth mentioning that Threads::parallel_java_threads_do should be in sync with Threads::possibly_parallel_oops_do? It gets easier to point out the symmetry: possibly_parallel_... claims all Java threads and the VMThread, so this should also claim the VMThread. -Aleksey From rkennke at redhat.com Mon Jul 31 14:42:21 2017 From: rkennke at redhat.com (Roman Kennke) Date: Mon, 31 Jul 2017 16:42:21 +0200 Subject: URGENT RFR (S): fix for Test8004741.java crashes with SIGSEGV in JDK10-hs nightly (8185273) In-Reply-To: <6f413b14-e250-cd2f-85ad-e5eb345e9603@oracle.com> References: <6f413b14-e250-cd2f-85ad-e5eb345e9603@oracle.com> Message-ID: Hi Dan, You could also do_thread() on the VMThread, and let the ThreadClosurer filter it. I believe the ThreadClosure in safepoint.cpp (currently only consumer) already filters it. This would make it consistent with Threads::possibly_parallel_oops_do() (and infact, that latter method could just use the new Threads::parallel_java_threads_do() but this is beyond the scope). I leave that to you to decide though. I'd also include the fix for assert_all_threads_claimed() because it's related (and the cause for me not noticing this slip). But that is up to you too. ;-) In other words, thumbs up, unless you want to add the above points. And sorry for making such a mess! 
Roman > Greetings, > > I have a fix for the following P1 JDK10-hs integration_blocker bug: > > 8185273 Test8004741.java crashes with SIGSEGV in JDK10-hs nightly > https://bugs.openjdk.java.net/browse/JDK-8185273 > > The fix is 2 lines and the comment describing the fix is 4 lines: > > src/share/vm/runtime/thread.cpp: > > L3388: void Threads::parallel_java_threads_do(ThreadClosure* tc) { > > L3395: // This function is used by ParallelSPCleanupTask in > safepoint.cpp > L3396: // for cleaning up JavaThreads, but we have to keep the > VMThread's > L3397: // _oops_do_parity field in sync so we don't miss a parallel > GC on > L3398: // the VMThread. > L3399: VMThread* vmt = VMThread::vm_thread(); > L3400: (void)vmt->claim_oops_do(true, cp); > > I'm also including some new logging for the VMThread (tag == 'vmthread') > that came in useful during this bug hunt. Lastly, I've fixed a few minor > typos that I ran across in the areas where I was hunting. > > Webrev URL: http://cr.openjdk.java.net/~dcubed/8185273-webrev/0/ > > There's lots of discussion in the bug. The evaluation comment that I > added > on Sunday, July 30 is probably the most complete and hopefully the > most clear. > > For context, here's the webrev for 8180932 and another bug fix: > > http://cr.openjdk.java.net/~rkennke/8180932/webrev.18/ > http://cr.openjdk.java.net/~rkennke/8185102/webrev.01/ > > > Testing: > - JPRT > - Test8004741.java has been running in a forever loop with 'fastdebug' > bits (17200+ iterations) and 'slowdebug' bits (13400+ iterations) > > Comments, questions and feedback are welcome. > > Dan > > P.S. > Roman and I were also thinking about updating > Threads::assert_all_threads_claimed() to verify > that the VMThread is also claimed... Obviously > that's not part of the current patch... From daniel.daugherty at oracle.com Mon Jul 31 15:09:38 2017 From: daniel.daugherty at oracle.com (Daniel D. 
Daugherty) Date: Mon, 31 Jul 2017 09:09:38 -0600 Subject: URGENT RFR (S): fix for Test8004741.java crashes with SIGSEGV in JDK10-hs nightly (8185273) In-Reply-To: References: <6f413b14-e250-cd2f-85ad-e5eb345e9603@oracle.com> Message-ID: <46716761-6f2c-aed1-e26e-3541e1fd84dc@oracle.com> On 7/31/17 8:35 AM, Aleksey Shipilev wrote: > On 07/31/2017 04:24 PM, Daniel D. Daugherty wrote: >> Greetings, >> >> I have a fix for the following P1 JDK10-hs integration_blocker bug: >> >> 8185273 Test8004741.java crashes with SIGSEGV in JDK10-hs nightly >> https://bugs.openjdk.java.net/browse/JDK-8185273 >> >> The fix is 2 lines and the comment describing the fix is 4 lines: >> >> src/share/vm/runtime/thread.cpp: >> >> L3388: void Threads::parallel_java_threads_do(ThreadClosure* tc) { >> >> L3395: // This function is used by ParallelSPCleanupTask in safepoint.cpp >> L3396: // for cleaning up JavaThreads, but we have to keep the VMThread's >> L3397: // _oops_do_parity field in sync so we don't miss a parallel GC on >> L3398: // the VMThread. >> L3399: VMThread* vmt = VMThread::vm_thread(); >> L3400: (void)vmt->claim_oops_do(true, cp); >> >> I'm also including some new logging for the VMThread (tag == 'vmthread') >> that came in useful during this bug hunt. Lastly, I've fixed a few minor >> typos that I ran across in the areas where I was hunting. >> >> Webrev URL: http://cr.openjdk.java.net/~dcubed/8185273-webrev/0/ > Those changes make sense, thanks. Thanks for the fast review! > It is probably worth mentioning that Threads::parallel_java_threads_do should be in sync with > Threads::possibly_parallel_oops_do? It gets easier to point out the symmetry: possibly_parallel_... > claims all Java threads and the VMThread, so this should also claim the VMThread. We would have to be careful about how we phrase that. Threads::possibly_parallel_oops_do() claims and applies the closure to all the threads it claims. 
Threads::parallel_java_threads_do() is missing the claim for the VMThread (this bug), but does not apply the closure to the VMThread. I think we'll be in good shape once Threads::assert_all_threads_claimed() is updated to make sure that the VMThread is claimed. Once that happens, anyone that uses StrongRootsScope to manage the "claim" protocol will have a sanity check in place. Dan > > -Aleksey > From daniel.daugherty at oracle.com Mon Jul 31 15:14:45 2017 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Mon, 31 Jul 2017 09:14:45 -0600 Subject: URGENT RFR (S): fix for Test8004741.java crashes with SIGSEGV in JDK10-hs nightly (8185273) In-Reply-To: References: <6f413b14-e250-cd2f-85ad-e5eb345e9603@oracle.com> Message-ID: On 7/31/17 8:42 AM, Roman Kennke wrote: > Hi Dan, > > You could also do_thread() on the VMThread, and let the ThreadClosurer > filter it. I believe the ThreadClosure in safepoint.cpp (currently only > consumer) already filters it. This would make it consistent with > Threads::possibly_parallel_oops_do() (and infact, that latter method > could just use the new Threads::parallel_java_threads_do() but this is > beyond the scope). I leave that to you to decide though. I'm good with just adding the missing part of the "claims" protocol. I'm not comfortable with applying the closure to the VMThread since I'm just visiting the GC sandbox as it were... :-) > I'd also include the fix for assert_all_threads_claimed() because it's > related (and the cause for me not noticing this slip). But that is up to > you too. ;-) Yes, I plan to kick off another JPRT run with the additional fix for assert_all_threads_claimed()... If that goes well, then I'll include it... > > In other words, thumbs up, unless you want to add the above points. Thanks for the review! > And sorry for making such a mess! No worries. We have it covered. Dan P.S. Reminder: you're supposed to be on vacation! (But I do appreciate you taking the time to chime in here...) 
> > Roman > >> Greetings, >> >> I have a fix for the following P1 JDK10-hs integration_blocker bug: >> >> 8185273 Test8004741.java crashes with SIGSEGV in JDK10-hs nightly >> https://bugs.openjdk.java.net/browse/JDK-8185273 >> >> The fix is 2 lines and the comment describing the fix is 4 lines: >> >> src/share/vm/runtime/thread.cpp: >> >> L3388: void Threads::parallel_java_threads_do(ThreadClosure* tc) { >> >> L3395: // This function is used by ParallelSPCleanupTask in >> safepoint.cpp >> L3396: // for cleaning up JavaThreads, but we have to keep the >> VMThread's >> L3397: // _oops_do_parity field in sync so we don't miss a parallel >> GC on >> L3398: // the VMThread. >> L3399: VMThread* vmt = VMThread::vm_thread(); >> L3400: (void)vmt->claim_oops_do(true, cp); >> >> I'm also including some new logging for the VMThread (tag == 'vmthread') >> that came in useful during this bug hunt. Lastly, I've fixed a few minor >> typos that I ran across in the areas where I was hunting. >> >> Webrev URL: http://cr.openjdk.java.net/~dcubed/8185273-webrev/0/ >> >> There's lots of discussion in the bug. The evaluation comment that I >> added >> on Sunday, July 30 is probably the most complete and hopefully the >> most clear. >> >> For context, here's the webrev for 8180932 and another bug fix: >> >> http://cr.openjdk.java.net/~rkennke/8180932/webrev.18/ >> http://cr.openjdk.java.net/~rkennke/8185102/webrev.01/ >> >> >> Testing: >> - JPRT >> - Test8004741.java has been running in a forever loop with 'fastdebug' >> bits (17200+ iterations) and 'slowdebug' bits (13400+ iterations) >> >> Comments, questions and feedback are welcome. >> >> Dan >> >> P.S. >> Roman and I were also thinking about updating >> Threads::assert_all_threads_claimed() to verify >> that the VMThread is also claimed... Obviously >> that's not part of the current patch... 
> From shade at redhat.com Mon Jul 31 15:43:42 2017 From: shade at redhat.com (Aleksey Shipilev) Date: Mon, 31 Jul 2017 17:43:42 +0200 Subject: URGENT RFR (S): fix for Test8004741.java crashes with SIGSEGV in JDK10-hs nightly (8185273) In-Reply-To: <46716761-6f2c-aed1-e26e-3541e1fd84dc@oracle.com> References: <6f413b14-e250-cd2f-85ad-e5eb345e9603@oracle.com> <46716761-6f2c-aed1-e26e-3541e1fd84dc@oracle.com> Message-ID: On 07/31/2017 05:09 PM, Daniel D. Daugherty wrote: > On 7/31/17 8:35 AM, Aleksey Shipilev wrote: >>> Webrev URL: http://cr.openjdk.java.net/~dcubed/8185273-webrev/0/ >> Those changes make sense, thanks. > > Thanks for the fast review! > > >> It is probably worth mentioning that Threads::parallel_java_threads_do should be in sync with >> Threads::possibly_parallel_oops_do? It gets easier to point out the symmetry: possibly_parallel_... >> claims all Java threads and the VMThread, so this should also claim the VMThread. > > We would have to be careful about how we phrase that. > Threads::possibly_parallel_oops_do() claims and applies > the closure to all the threads it claims. > > Threads::parallel_java_threads_do() is missing the claim > for the VMThread (this bug), but does not apply the > closure to the VMThread. Yeah. It's just I had to work upwards from the gory details explained in the comment to the actual setup for the bug to appear. I think details about ParallelSPCleanupTask, safepoint.cpp, parity, etc. are too low-level here, and capture only the current state of affairs. E.g. what if there are more callers to parallel_java_threads_do in future? What if Parallel SP cleanup ceases to call it? Would the comment get outdated? Does Threads::parallel_java_threads_do make sense without Parallel SP cleanup? Yes, it does. Would it make sense to cherry-pick it somewhere else with that comment as stated? Not really. AFAIU, the high-level bug is because we have to claim the same subset of threads on all paths. 
From that, it becomes obvious that if possibly_parallel_java_threads_do claims VMThread, all other paths should claim it too. Something like this: Threads::parallel_java_threads_do(ThreadClosure* tc) { ... // Thread claiming protocol requires us to claim the same interesting threads // on all paths. Notably, Threads::possibly_parallel_threads_do claims all // Java threads *and* the VMThread. To avoid breaking the claiming protocol, // we have to appear to claim VMThread on this path too, even if we would not // process the VMThread oops. VMThread* vmt = VMThread::vm_thread(); (void)vmt->claim_oops_do(true, cp); ...and then the assert fix would seal the deal. Thanks, -Aleksey From daniel.daugherty at oracle.com Mon Jul 31 16:42:53 2017 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Mon, 31 Jul 2017 10:42:53 -0600 Subject: URGENT RFR (S): fix for Test8004741.java crashes with SIGSEGV in JDK10-hs nightly (8185273) In-Reply-To: References: <6f413b14-e250-cd2f-85ad-e5eb345e9603@oracle.com> Message-ID: <3c7259d1-22ab-c407-3259-a59a71893b13@oracle.com> On 7/31/17 9:14 AM, Daniel D. Daugherty wrote: > On 7/31/17 8:42 AM, Roman Kennke wrote: >> Hi Dan, >> >> You could also do_thread() on the VMThread, and let the ThreadClosurer >> filter it. I believe the ThreadClosure in safepoint.cpp (currently only >> consumer) already filters it. This would make it consistent with >> Threads::possibly_parallel_oops_do() (and infact, that latter method >> could just use the new Threads::parallel_java_threads_do() but this is >> beyond the scope). I leave that to you to decide though. > > I'm good with just adding the missing part of the "claims" protocol. > I'm not comfortable with applying the closure to the VMThread since > I'm just visiting the GC sandbox as it were...
:-) > >> I'd also include the fix for assert_all_threads_claimed() because it's >> related (and the cause for me not noticing this slip). But that is up to >> you too. ;-) > > Yes, I plan to kick off another JPRT run with the additional fix for > assert_all_threads_claimed()... If that goes well, then I'll include > it... Here's the addition of the assert: $ diff -C 6 src/share/vm/runtime/thread.cpp.cr0 src/share/vm/runtime/thread.cpp *** src/share/vm/runtime/thread.cpp.cr0 Sun Jul 30 18:49:06 2017 --- src/share/vm/runtime/thread.cpp Mon Jul 31 08:22:47 2017 *************** *** 4360,4371 **** --- 4360,4375 ---- void Threads::assert_all_threads_claimed() { ALL_JAVA_THREADS(p) { const int thread_parity = p->oops_do_parity(); assert((thread_parity == _thread_claim_parity), "Thread " PTR_FORMAT " has incorrect parity %d != %d", p2i(p), thread_parity, _thread_claim_parity); } + VMThread* vmt = VMThread::vm_thread(); + const int thread_parity = vmt->oops_do_parity(); + assert((thread_parity == _thread_claim_parity), + "VMThread " PTR_FORMAT " has incorrect parity %d != %d", p2i(vmt), thread_parity, _thread_claim_parity); } #endif // ASSERT void Threads::possibly_parallel_oops_do(bool is_par, OopClosure* f, CodeBlobClosure* cf) { int cp = Threads::thread_claim_parity(); ALL_JAVA_THREADS(p) { I ran a test JPRT job and there were no problems. Aleksey and Roman, are you two good with the assert? Dan > >> >> In other words, thumbs up, unless you want to add the above points. > > Thanks for the review! > > >> And sorry for making such a mess! > > No worries. We have it covered. > > Dan > > P.S. > Reminder: you're supposed to be on vacation! (But I do appreciate > you taking the time to chime in here...) 
> > >> >> Roman >> >>> Greetings, >>> >>> I have a fix for the following P1 JDK10-hs integration_blocker bug: >>> >>> 8185273 Test8004741.java crashes with SIGSEGV in JDK10-hs nightly >>> https://bugs.openjdk.java.net/browse/JDK-8185273 >>> >>> The fix is 2 lines and the comment describing the fix is 4 lines: >>> >>> src/share/vm/runtime/thread.cpp: >>> >>> L3388: void Threads::parallel_java_threads_do(ThreadClosure* tc) { >>> >>> L3395: // This function is used by ParallelSPCleanupTask in >>> safepoint.cpp >>> L3396: // for cleaning up JavaThreads, but we have to keep the >>> VMThread's >>> L3397: // _oops_do_parity field in sync so we don't miss a parallel >>> GC on >>> L3398: // the VMThread. >>> L3399: VMThread* vmt = VMThread::vm_thread(); >>> L3400: (void)vmt->claim_oops_do(true, cp); >>> >>> I'm also including some new logging for the VMThread (tag == >>> 'vmthread') >>> that came in useful during this bug hunt. Lastly, I've fixed a few >>> minor >>> typos that I ran across in the areas where I was hunting. >>> >>> Webrev URL: http://cr.openjdk.java.net/~dcubed/8185273-webrev/0/ >>> >>> There's lots of discussion in the bug. The evaluation comment that I >>> added >>> on Sunday, July 30 is probably the most complete and hopefully the >>> most clear. >>> >>> For context, here's the webrev for 8180932 and another bug fix: >>> >>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.18/ >>> http://cr.openjdk.java.net/~rkennke/8185102/webrev.01/ >>> >>> >>> Testing: >>> - JPRT >>> - Test8004741.java has been running in a forever loop with >>> 'fastdebug' >>> bits (17200+ iterations) and 'slowdebug' bits (13400+ iterations) >>> >>> Comments, questions and feedback are welcome. >>> >>> Dan >>> >>> P.S. >>> Roman and I were also thinking about updating >>> Threads::assert_all_threads_claimed() to verify >>> that the VMThread is also claimed... Obviously >>> that's not part of the current patch... 
>> > > From rkennke at redhat.com Mon Jul 31 16:46:14 2017 From: rkennke at redhat.com (Roman Kennke) Date: Mon, 31 Jul 2017 18:46:14 +0200 Subject: URGENT RFR (S): fix for Test8004741.java crashes with SIGSEGV in JDK10-hs nightly (8185273) In-Reply-To: References: <6f413b14-e250-cd2f-85ad-e5eb345e9603@oracle.com> Message-ID: <2b846656-a319-a2b3-9c32-554498e5f7ba@redhat.com> Am 31.07.2017 um 17:14 schrieb Daniel D. Daugherty: > On 7/31/17 8:42 AM, Roman Kennke wrote: >> Hi Dan, >> >> You could also do_thread() on the VMThread, and let the ThreadClosurer >> filter it. I believe the ThreadClosure in safepoint.cpp (currently only >> consumer) already filters it. This would make it consistent with >> Threads::possibly_parallel_oops_do() (and infact, that latter method >> could just use the new Threads::parallel_java_threads_do() but this is >> beyond the scope). I leave that to you to decide though. > > I'm good with just adding the missing part of the "claims" protocol. > I'm not comfortable with applying the closure to the VMThread since > I'm just visiting the GC sandbox as it were... :-) Ok. I'll do it in a followup then. IMO it would be best if there is *one* place that does the claiming protocol (i.e. parallel_java_threads_do() which should probably be renamed to parallel_threads_do() ), and have possibly_parallel_oops_do() use that via a private ThreadClosure. Best to do it asap, as long as there's only 1 user it's easy to see that it's correct ;-) >> I'd also include the fix for assert_all_threads_claimed() because it's >> related (and the cause for me not noticing this slip). But that is up to >> you too. ;-) > > Yes, I plan to kick off another JPRT run with the additional fix for > assert_all_threads_claimed()... If that goes well, then I'll include > it... Great! > > P.S. > Reminder: you're supposed to be on vacation! (But I do appreciate > you taking the time to chime in here...) 
Yeah, I should be at the Atlantic already, but my son got sick and we have to delay travel a little bit... And in reply to Aleksey: yes there will be more callers of Threads::parallel_java_threads_do() in the future :-) We've got one in Shenandoah already... Thanks for doing this! Cheers, Roman From daniel.daugherty at oracle.com Mon Jul 31 16:47:02 2017 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Mon, 31 Jul 2017 10:47:02 -0600 Subject: URGENT RFR (S): fix for Test8004741.java crashes with SIGSEGV in JDK10-hs nightly (8185273) In-Reply-To: References: <6f413b14-e250-cd2f-85ad-e5eb345e9603@oracle.com> <46716761-6f2c-aed1-e26e-3541e1fd84dc@oracle.com> Message-ID: <666db742-9fae-768b-fb86-93a086381ec6@oracle.com> On 7/31/17 9:43 AM, Aleksey Shipilev wrote: > On 07/31/2017 05:09 PM, Daniel D. Daugherty wrote: >> On 7/31/17 8:35 AM, Aleksey Shipilev wrote: >>>> Webrev URL: http://cr.openjdk.java.net/~dcubed/8185273-webrev/0/ >>> Those changes make sense, thanks. >> Thanks for the fast review! >> >> >>> It is probably worth mentioning that Threads::parallel_java_threads_do should be in sync with >>> Threads::possibly_parallel_oops_do? It gets easier to point out the symmetry: possibly_parallel_... >>> claims all Java threads and the VMThread, so this should also claim the VMThread. >> We would have to be careful about how we phrase that. >> Threads::possibly_parallel_oops_do() claims and applies >> the closure to all the threads it claims. >> >> Threads::parallel_java_threads_do() is missing the claim >> for the VMThread (this bug), but does not apply the >> closure to the VMThread. > Yeah. It's just I had to work upwards from the gory details explained in the comment to the actual > setup for the bug to appear. I think details about ParallelSPCleanupTask, safepoint.cpp, parity, > etc. are too low-level here, and capture only the current state of affairs. E.g. what if there are > more callers to parallel_java_threads_do in future? 
What if Parallel SP cleanup ceases to call it? > Would the comment get outdated? Does Threads::parallel_java_threads_do make sense without Parallel > SP cleanup? Yes, it does. Would it make sense to cherry-pick it somewhere else with that comment as > stated? Not really. > > AFAIU, the high-level bug is because we have to claim the same subset of threads on all paths. From > that, it becomes obvious that if possibly_parallel_java_threads_do claims VMThread, all other paths > should claim it too. > > Something like this: > > Threads::parallel_java_threads_do(ThreadClosure* tc) { > ... > > // Thread claiming protocol requires us to claim the same interesting threads > // on all paths. Notably, Threads::possibly_parallel_threads_do claims all > // Java threads *and* the VMThread. To avoid breaking the claiming protocol, > // we have to appear to claim VMThread on this path too, even if we would not > // process the VMThread oops. > VMThread* vmt = VMThread::vm_thread(); > (void)vmt->claim_oops_do(true, cp); I like your comment better than mine, with a slight tweak: // Thread claiming protocol requires us to claim the same interesting threads // on all paths. Notably, Threads::possibly_parallel_threads_do claims all // Java threads *and* the VMThread. To avoid breaking the claiming protocol, // we have to claim VMThread on this path too, even if we do not apply the // closure to the VMThread. > > ...and then the assert fix would seal the deal. The assert diffs are applied and tested via JPRT. Please see my other e-mail on this thread... Dan > > Thanks, > -Aleksey > From daniel.daugherty at oracle.com Mon Jul 31 16:50:24 2017 From: daniel.daugherty at oracle.com (Daniel D. 
Daugherty) Date: Mon, 31 Jul 2017 10:50:24 -0600 Subject: URGENT RFR (S): fix for Test8004741.java crashes with SIGSEGV in JDK10-hs nightly (8185273) In-Reply-To: <2b846656-a319-a2b3-9c32-554498e5f7ba@redhat.com> References: <6f413b14-e250-cd2f-85ad-e5eb345e9603@oracle.com> <2b846656-a319-a2b3-9c32-554498e5f7ba@redhat.com> Message-ID: <4eb7bf5f-bf1c-3bf6-41fa-59a3c4e8f69b@oracle.com> On 7/31/17 10:46 AM, Roman Kennke wrote: > Am 31.07.2017 um 17:14 schrieb Daniel D. Daugherty: >> On 7/31/17 8:42 AM, Roman Kennke wrote: >>> Hi Dan, >>> >>> You could also do_thread() on the VMThread, and let the ThreadClosurer >>> filter it. I believe the ThreadClosure in safepoint.cpp (currently only >>> consumer) already filters it. This would make it consistent with >>> Threads::possibly_parallel_oops_do() (and infact, that latter method >>> could just use the new Threads::parallel_java_threads_do() but this is >>> beyond the scope). I leave that to you to decide though. >> I'm good with just adding the missing part of the "claims" protocol. >> I'm not comfortable with applying the closure to the VMThread since >> I'm just visiting the GC sandbox as it were... :-) > Ok. I'll do it in a followup then. IMO it would be best if there is > *one* place that does the claiming protocol (i.e. > parallel_java_threads_do() which should probably be renamed to > parallel_threads_do() ), and have possibly_parallel_oops_do() use that > via a private ThreadClosure. Best to do it asap, as long as there's only > 1 user it's easy to see that it's correct ;-) Thanks. I can file a follow up bug: cleanup parallel_java_threads_do() and possibly_parallel_oops_do() and assign it to you if you like... >>> I'd also include the fix for assert_all_threads_claimed() because it's >>> related (and the cause for me not noticing this slip). But that is up to >>> you too. ;-) >> Yes, I plan to kick off another JPRT run with the additional fix for >> assert_all_threads_claimed()... 
If that goes well, then I'll include >> it... > Great! Done. And I sent out the diffs... >> P.S. >> Reminder: you're supposed to be on vacation! (But I do appreciate >> you taking the time to chime in here...) > Yeah, I should be at the Atlantic already, but my son got sick and we > have to delay travel a little bit... Hope your son gets well soon... > And in reply to Aleksey: yes there will be more callers of > Threads::parallel_java_threads_do() in the future :-) We've got one in > Shenandoah already... > > Thanks for doing this! > Cheers, > Roman > Dan From daniel.daugherty at oracle.com Mon Jul 31 17:07:14 2017 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Mon, 31 Jul 2017 11:07:14 -0600 Subject: URGENT RFR (S): fix for Test8004741.java crashes with SIGSEGV in JDK10-hs nightly (8185273) In-Reply-To: <666db742-9fae-768b-fb86-93a086381ec6@oracle.com> References: <6f413b14-e250-cd2f-85ad-e5eb345e9603@oracle.com> <46716761-6f2c-aed1-e26e-3541e1fd84dc@oracle.com> <666db742-9fae-768b-fb86-93a086381ec6@oracle.com> Message-ID: <797922fb-4893-bfbd-8edd-b47b22f64d36@oracle.com> Latest webrev: http://cr.openjdk.java.net/~dcubed/8185273-webrev/1/ Only src/share/vm/runtime/thread.cpp is changed relative to round 0: - Revised the comment in Threads::parallel_java_threads_do. - Added the assert to Threads::assert_all_threads_claimed(). Comments, questions and feedback are welcome. Dan On 7/31/17 10:47 AM, Daniel D. Daugherty wrote: > On 7/31/17 9:43 AM, Aleksey Shipilev wrote: >> On 07/31/2017 05:09 PM, Daniel D. Daugherty wrote: >>> On 7/31/17 8:35 AM, Aleksey Shipilev wrote: >>>>> Webrev URL: http://cr.openjdk.java.net/~dcubed/8185273-webrev/0/ >>>> Those changes make sense, thanks. >>> Thanks for the fast review! >>> >>> >>>> It is probably worth mentioning that >>>> Threads::parallel_java_threads_do should be in sync with >>>> Threads::possibly_parallel_oops_do? It gets easier to point out the >>>> symmetry: possibly_parallel_... 
>>>> claims all Java threads and the VMThread, so this should also claim >>>> the VMThread. >>> We would have to be careful about how we phrase that. >>> Threads::possibly_parallel_oops_do() claims and applies >>> the closure to all the threads it claims. >>> >>> Threads::parallel_java_threads_do() is missing the claim >>> for the VMThread (this bug), but does not apply the >>> closure to the VMThread. >> Yeah. It's just I had to work upwards from the gory details explained >> in the comment to the actual >> setup for the bug to appear. I think details about >> ParallelSPCleanupTask, safepoint.cpp, parity, >> etc. are too low-level here, and capture only the current state of >> affairs. E.g. what if there are >> more callers to parallel_java_threads_do in future? What if Parallel >> SP cleanup ceases to call it? >> Would the comment get outdated? Does >> Threads::parallel_java_threads_do make sense without Parallel >> SP cleanup? Yes, it does. Would it make sense to cherry-pick it >> somewhere else with that comment as >> stated? Not really. >> >> AFAIU, the high-level bug is because we have to claim the same subset >> of threads on all paths. From >> that, it becomes obvious that if possibly_parallel_java_threads_do >> claims VMThread, all other paths >> should claim it too. >> >> Something like this: >> >> Threads::parallel_java_threads_do(ThreadClosure* tc) { >> ... >> >> // Thread claiming protocol requires us to claim the same >> interesting threads >> // on all paths. Notably, Threads::possibly_parallel_threads_do >> claims all >> // Java threads *and* the VMThread. To avoid breaking the >> claiming protocol, >> // we have to appear to claim VMThread on this path too, even if >> we would not >> // process the VMThread oops. 
>> VMThread* vmt = VMThread::vm_thread(); >> (void)vmt->claim_oops_do(true, cp); > > I like your comment better than mine, with a slight tweak: > > // Thread claiming protocol requires us to claim the same > interesting threads > // on all paths. Notably, Threads::possibly_parallel_threads_do > claims all > // Java threads *and* the VMThread. To avoid breaking the claiming > protocol, > // we have to claim VMThread on this path too, even if we do not > apply the > // closure to the VMThread. > >> >> ...and then the assert fix would seal the deal. > > The assert diffs are applied and tested via JPRT. Please see my > other e-mail on this thread... > > Dan > > >> >> Thanks, >> -Aleksey >> > > From rkennke at redhat.com Mon Jul 31 17:10:54 2017 From: rkennke at redhat.com (Roman Kennke) Date: Mon, 31 Jul 2017 19:10:54 +0200 Subject: URGENT RFR (S): fix for Test8004741.java crashes with SIGSEGV in JDK10-hs nightly (8185273) In-Reply-To: <797922fb-4893-bfbd-8edd-b47b22f64d36@oracle.com> References: <6f413b14-e250-cd2f-85ad-e5eb345e9603@oracle.com> <46716761-6f2c-aed1-e26e-3541e1fd84dc@oracle.com> <666db742-9fae-768b-fb86-93a086381ec6@oracle.com> <797922fb-4893-bfbd-8edd-b47b22f64d36@oracle.com> Message-ID: <436d8741-a387-8442-1133-cb0a3e78f8c3@redhat.com> Looks good! Roman (not an official reviewer) PS: I've filed JDK-8185580: Refactor Threads::possibly_parallel_oops_do() to use Threads::parallel_java_threads_do() to take care of the rest for when I get back from vacation. > Latest webrev: http://cr.openjdk.java.net/~dcubed/8185273-webrev/1/ > > Only src/share/vm/runtime/thread.cpp is changed relative to round 0: > > - Revised the comment in Threads::parallel_java_threads_do. > - Added the assert to Threads::assert_all_threads_claimed(). > > Comments, questions and feedback are welcome. > > Dan > > > On 7/31/17 10:47 AM, Daniel D. Daugherty wrote: >> On 7/31/17 9:43 AM, Aleksey Shipilev wrote: >>> On 07/31/2017 05:09 PM, Daniel D. 
Daugherty wrote: >>>> On 7/31/17 8:35 AM, Aleksey Shipilev wrote: >>>>>> Webrev URL: http://cr.openjdk.java.net/~dcubed/8185273-webrev/0/ >>>>> Those changes make sense, thanks. >>>> Thanks for the fast review! >>>> >>>> >>>>> It is probably worth mentioning that >>>>> Threads::parallel_java_threads_do should be in sync with >>>>> Threads::possibly_parallel_oops_do? It gets easier to point out >>>>> the symmetry: possibly_parallel_... >>>>> claims all Java threads and the VMThread, so this should also >>>>> claim the VMThread. >>>> We would have to be careful about how we phrase that. >>>> Threads::possibly_parallel_oops_do() claims and applies >>>> the closure to all the threads it claims. >>>> >>>> Threads::parallel_java_threads_do() is missing the claim >>>> for the VMThread (this bug), but does not apply the >>>> closure to the VMThread. >>> Yeah. It's just I had to work upwards from the gory details >>> explained in the comment to the actual >>> setup for the bug to appear. I think details about >>> ParallelSPCleanupTask, safepoint.cpp, parity, >>> etc. are too low-level here, and capture only the current state of >>> affairs. E.g. what if there are >>> more callers to parallel_java_threads_do in future? What if Parallel >>> SP cleanup ceases to call it? >>> Would the comment get outdated? Does >>> Threads::parallel_java_threads_do make sense without Parallel >>> SP cleanup? Yes, it does. Would it make sense to cherry-pick it >>> somewhere else with that comment as >>> stated? Not really. >>> >>> AFAIU, the high-level bug is because we have to claim the same >>> subset of threads on all paths. From >>> that, it becomes obvious that if possibly_parallel_java_threads_do >>> claims VMThread, all other paths >>> should claim it too. >>> >>> Something like this: >>> >>> Threads::parallel_java_threads_do(ThreadClosure* tc) { >>> ... >>> >>> // Thread claiming protocol requires us to claim the same >>> interesting threads >>> // on all paths. 
Notably, Threads::possibly_parallel_threads_do >>> claims all >>> // Java threads *and* the VMThread. To avoid breaking the >>> claiming protocol, >>> // we have to appear to claim VMThread on this path too, even if >>> we would not >>> // process the VMThread oops. >>> VMThread* vmt = VMThread::vm_thread(); >>> (void)vmt->claim_oops_do(true, cp); >> >> I like your comment better than mine, with a slight tweak: >> >> // Thread claiming protocol requires us to claim the same >> interesting threads >> // on all paths. Notably, Threads::possibly_parallel_threads_do >> claims all >> // Java threads *and* the VMThread. To avoid breaking the claiming >> protocol, >> // we have to claim VMThread on this path too, even if we do not >> apply the >> // closure to the VMThread. >> >>> >>> ...and then the assert fix would seal the deal. >> >> The assert diffs are applied and tested via JPRT. Please see my >> other e-mail on this thread... >> >> Dan >> >> >>> >>> Thanks, >>> -Aleksey >>> >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From daniel.daugherty at oracle.com Mon Jul 31 17:13:25 2017 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Mon, 31 Jul 2017 11:13:25 -0600 Subject: URGENT RFR (S): fix for Test8004741.java crashes with SIGSEGV in JDK10-hs nightly (8185273) In-Reply-To: <436d8741-a387-8442-1133-cb0a3e78f8c3@redhat.com> References: <6f413b14-e250-cd2f-85ad-e5eb345e9603@oracle.com> <46716761-6f2c-aed1-e26e-3541e1fd84dc@oracle.com> <666db742-9fae-768b-fb86-93a086381ec6@oracle.com> <797922fb-4893-bfbd-8edd-b47b22f64d36@oracle.com> <436d8741-a387-8442-1133-cb0a3e78f8c3@redhat.com> Message-ID: Thanks Roman! Dan On 7/31/17 11:10 AM, Roman Kennke wrote: > Looks good! > > Roman (not an official reviewer) > > PS: I've filed JDK-8185580: Refactor > Threads::possibly_parallel_oops_do() to use > Threads::parallel_java_threads_do() > to take care of the > rest for when I get back from vacation. 
> > >> Latest webrev: http://cr.openjdk.java.net/~dcubed/8185273-webrev/1/ >> >> Only src/share/vm/runtime/thread.cpp is changed relative to round 0: >> >> - Revised the comment in Threads::parallel_java_threads_do. >> - Added the assert to Threads::assert_all_threads_claimed(). >> >> Comments, questions and feedback are welcome. >> >> Dan >> >> >> On 7/31/17 10:47 AM, Daniel D. Daugherty wrote: >>> On 7/31/17 9:43 AM, Aleksey Shipilev wrote: >>>> On 07/31/2017 05:09 PM, Daniel D. Daugherty wrote: >>>>> On 7/31/17 8:35 AM, Aleksey Shipilev wrote: >>>>>>> Webrev URL: http://cr.openjdk.java.net/~dcubed/8185273-webrev/0/ >>>>>> Those changes make sense, thanks. >>>>> Thanks for the fast review! >>>>> >>>>> >>>>>> It is probably worth mentioning that >>>>>> Threads::parallel_java_threads_do should be in sync with >>>>>> Threads::possibly_parallel_oops_do? It gets easier to point out >>>>>> the symmetry: possibly_parallel_... >>>>>> claims all Java threads and the VMThread, so this should also >>>>>> claim the VMThread. >>>>> We would have to be careful about how we phrase that. >>>>> Threads::possibly_parallel_oops_do() claims and applies >>>>> the closure to all the threads it claims. >>>>> >>>>> Threads::parallel_java_threads_do() is missing the claim >>>>> for the VMThread (this bug), but does not apply the >>>>> closure to the VMThread. >>>> Yeah. It's just I had to work upwards from the gory details >>>> explained in the comment to the actual >>>> setup for the bug to appear. I think details about >>>> ParallelSPCleanupTask, safepoint.cpp, parity, >>>> etc. are too low-level here, and capture only the current state of >>>> affairs. E.g. what if there are >>>> more callers to parallel_java_threads_do in future? What if >>>> Parallel SP cleanup ceases to call it? >>>> Would the comment get outdated? Does >>>> Threads::parallel_java_threads_do make sense without Parallel >>>> SP cleanup? Yes, it does. 
Would it make sense to cherry-pick it >>>> somewhere else with that comment as >>>> stated? Not really. >>>> >>>> AFAIU, the high-level bug is because we have to claim the same >>>> subset of threads on all paths. From >>>> that, it becomes obvious that if possibly_parallel_java_threads_do >>>> claims VMThread, all other paths >>>> should claim it too. >>>> >>>> Something like this: >>>> >>>> Threads::parallel_java_threads_do(ThreadClosure* tc) { >>>> ... >>>> >>>> // Thread claiming protocol requires us to claim the same >>>> interesting threads >>>> // on all paths. Notably, Threads::possibly_parallel_threads_do >>>> claims all >>>> // Java threads *and* the VMThread. To avoid breaking the >>>> claiming protocol, >>>> // we have to appear to claim VMThread on this path too, even >>>> if we would not >>>> // process the VMThread oops. >>>> VMThread* vmt = VMThread::vm_thread(); >>>> (void)vmt->claim_oops_do(true, cp); >>> >>> I like your comment better than mine, with a slight tweak: >>> >>> // Thread claiming protocol requires us to claim the same >>> interesting threads >>> // on all paths. Notably, Threads::possibly_parallel_threads_do >>> claims all >>> // Java threads *and* the VMThread. To avoid breaking the >>> claiming protocol, >>> // we have to claim VMThread on this path too, even if we do not >>> apply the >>> // closure to the VMThread. >>> >>>> >>>> ...and then the assert fix would seal the deal. >>> >>> The assert diffs are applied and tested via JPRT. Please see my >>> other e-mail on this thread... >>> >>> Dan >>> >>> >>>> >>>> Thanks, >>>> -Aleksey >>>> >>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... 
From shade at redhat.com Mon Jul 31 17:24:11 2017 From: shade at redhat.com (Aleksey Shipilev) Date: Mon, 31 Jul 2017 19:24:11 +0200 Subject: URGENT RFR (S): fix for Test8004741.java crashes with SIGSEGV in JDK10-hs nightly (8185273) In-Reply-To: <797922fb-4893-bfbd-8edd-b47b22f64d36@oracle.com> References: <6f413b14-e250-cd2f-85ad-e5eb345e9603@oracle.com> <46716761-6f2c-aed1-e26e-3541e1fd84dc@oracle.com> <666db742-9fae-768b-fb86-93a086381ec6@oracle.com> <797922fb-4893-bfbd-8edd-b47b22f64d36@oracle.com> Message-ID: <1509ee63-b50f-84d1-9510-cad10b6a4411@redhat.com> On 07/31/2017 07:07 PM, Daniel D. Daugherty wrote: > Latest webrev: http://cr.openjdk.java.net/~dcubed/8185273-webrev/1/ > > Only src/share/vm/runtime/thread.cpp is changed relative to round 0: > > - Revised the comment in Threads::parallel_java_threads_do. > - Added the assert to Threads::assert_all_threads_claimed(). > > Comments, questions and feedback are welcome. Looks good! -Aleksey P.S. Roman: I'm going to cherry-pick that to Shenandoah after this lands to jdk10/hs. From daniel.daugherty at oracle.com Mon Jul 31 17:26:11 2017 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Mon, 31 Jul 2017 11:26:11 -0600 Subject: URGENT RFR (S): fix for Test8004741.java crashes with SIGSEGV in JDK10-hs nightly (8185273) In-Reply-To: <1509ee63-b50f-84d1-9510-cad10b6a4411@redhat.com> References: <6f413b14-e250-cd2f-85ad-e5eb345e9603@oracle.com> <46716761-6f2c-aed1-e26e-3541e1fd84dc@oracle.com> <666db742-9fae-768b-fb86-93a086381ec6@oracle.com> <797922fb-4893-bfbd-8edd-b47b22f64d36@oracle.com> <1509ee63-b50f-84d1-9510-cad10b6a4411@redhat.com> Message-ID: <56d71fea-a6f2-9103-73be-95cd76cd17e7@oracle.com> Thanks for the re-review! (and for the reworded comment...) 
Dan On 7/31/17 11:24 AM, Aleksey Shipilev wrote: > On 07/31/2017 07:07 PM, Daniel D. Daugherty wrote: >> Latest webrev: http://cr.openjdk.java.net/~dcubed/8185273-webrev/1/ >> >> Only src/share/vm/runtime/thread.cpp is changed relative to round 0: >> >> - Revised the comment in Threads::parallel_java_threads_do. >> - Added the assert to Threads::assert_all_threads_claimed(). >> >> Comments, questions and feedback are welcome. > Looks good! > > -Aleksey > > P.S. Roman: I'm going to cherry-pick that to Shenandoah after this lands to jdk10/hs. > From vladimir.kozlov at oracle.com Mon Jul 31 18:39:35 2017 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Mon, 31 Jul 2017 11:39:35 -0700 Subject: URGENT RFR (S): fix for Test8004741.java crashes with SIGSEGV in JDK10-hs nightly (8185273) In-Reply-To: <56d71fea-a6f2-9103-73be-95cd76cd17e7@oracle.com> References: <6f413b14-e250-cd2f-85ad-e5eb345e9603@oracle.com> <46716761-6f2c-aed1-e26e-3541e1fd84dc@oracle.com> <666db742-9fae-768b-fb86-93a086381ec6@oracle.com> <797922fb-4893-bfbd-8edd-b47b22f64d36@oracle.com> <1509ee63-b50f-84d1-9510-cad10b6a4411@redhat.com> <56d71fea-a6f2-9103-73be-95cd76cd17e7@oracle.com> Message-ID: Dan Can you put new code which used for assert check under #ifdef ASSERT to avoid side effects in product code? Thanks Vladimir > On Jul 31, 2017, at 10:26 AM, Daniel D. Daugherty wrote: > > Thanks for the re-review! (and for the reworded comment...) > > Dan > >> On 7/31/17 11:24 AM, Aleksey Shipilev wrote: >>> On 07/31/2017 07:07 PM, Daniel D. Daugherty wrote: >>> Latest webrev: http://cr.openjdk.java.net/~dcubed/8185273-webrev/1/ >>> >>> Only src/share/vm/runtime/thread.cpp is changed relative to round 0: >>> >>> - Revised the comment in Threads::parallel_java_threads_do. >>> - Added the assert to Threads::assert_all_threads_claimed(). >>> >>> Comments, questions and feedback are welcome. >> Looks good! >> >> -Aleksey >> >> P.S. 
Roman: I'm going to cherry-pick that to Shenandoah after this lands to jdk10/hs. >> > From daniel.daugherty at oracle.com Mon Jul 31 18:49:36 2017 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Mon, 31 Jul 2017 12:49:36 -0600 Subject: URGENT RFR (S): fix for Test8004741.java crashes with SIGSEGV in JDK10-hs nightly (8185273) In-Reply-To: References: <6f413b14-e250-cd2f-85ad-e5eb345e9603@oracle.com> <46716761-6f2c-aed1-e26e-3541e1fd84dc@oracle.com> <666db742-9fae-768b-fb86-93a086381ec6@oracle.com> <797922fb-4893-bfbd-8edd-b47b22f64d36@oracle.com> <1509ee63-b50f-84d1-9510-cad10b6a4411@redhat.com> <56d71fea-a6f2-9103-73be-95cd76cd17e7@oracle.com> Message-ID: <18aa8b5e-d128-8279-c5cc-b6effc79714b@oracle.com> On 7/31/17 12:39 PM, Vladimir Kozlov wrote: > Dan > > Can you put new code which used for assert check under #ifdef ASSERT to avoid side effects in product code? That entire function is in a #ifdef ASSERT: 4360 #ifdef ASSERT 4361 void Threads::assert_all_threads_claimed() { 4362 ALL_JAVA_THREADS(p) { 4363 const int thread_parity = p->oops_do_parity(); 4364 assert((thread_parity == _thread_claim_parity), 4365 "Thread " PTR_FORMAT " has incorrect parity %d != %d", p2i(p), thread_parity, _thread_claim_parity); 4366 } 4367 VMThread* vmt = VMThread::vm_thread(); 4368 const int thread_parity = vmt->oops_do_parity(); 4369 assert((thread_parity == _thread_claim_parity), 4370 "VMThread " PTR_FORMAT " has incorrect parity %d != %d", p2i(vmt), thread_parity, _thread_claim_parity); 4371 } 4372 #endif // ASSERT Thanks for the review! Dan > > Thanks > Vladimir > >> On Jul 31, 2017, at 10:26 AM, Daniel D. Daugherty wrote: >> >> Thanks for the re-review! (and for the reworded comment...) >> >> Dan >> >>> On 7/31/17 11:24 AM, Aleksey Shipilev wrote: >>>> On 07/31/2017 07:07 PM, Daniel D. 
Daugherty wrote: >>>> Latest webrev: http://cr.openjdk.java.net/~dcubed/8185273-webrev/1/ >>>> >>>> Only src/share/vm/runtime/thread.cpp is changed relative to round 0: >>>> >>>> - Revised the comment in Threads::parallel_java_threads_do. >>>> - Added the assert to Threads::assert_all_threads_claimed(). >>>> >>>> Comments, questions and feedback are welcome. >>> Looks good! >>> >>> -Aleksey >>> >>> P.S. Roman: I'm going to cherry-pick that to Shenandoah after this lands to jdk10/hs. >>> From vladimir.kozlov at oracle.com Mon Jul 31 20:14:46 2017 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Mon, 31 Jul 2017 13:14:46 -0700 Subject: URGENT RFR (S): fix for Test8004741.java crashes with SIGSEGV in JDK10-hs nightly (8185273) In-Reply-To: <18aa8b5e-d128-8279-c5cc-b6effc79714b@oracle.com> References: <6f413b14-e250-cd2f-85ad-e5eb345e9603@oracle.com> <46716761-6f2c-aed1-e26e-3541e1fd84dc@oracle.com> <666db742-9fae-768b-fb86-93a086381ec6@oracle.com> <797922fb-4893-bfbd-8edd-b47b22f64d36@oracle.com> <1509ee63-b50f-84d1-9510-cad10b6a4411@redhat.com> <56d71fea-a6f2-9103-73be-95cd76cd17e7@oracle.com> <18aa8b5e-d128-8279-c5cc-b6effc79714b@oracle.com> Message-ID: <07323FAA-AACA-4CA4-ADFF-73816102C23C@oracle.com> This is what happens when you do review on phone ;) Sorry for noise. Looks good. Vladimir Sent from my iPhone > On Jul 31, 2017, at 11:49 AM, Daniel D. Daugherty wrote: > >> On 7/31/17 12:39 PM, Vladimir Kozlov wrote: >> Dan >> >> Can you put new code which used for assert check under #ifdef ASSERT to avoid side effects in product code? 
> > That entire function is in a #ifdef ASSERT: > > 4360 #ifdef ASSERT > 4361 void Threads::assert_all_threads_claimed() { > 4362 ALL_JAVA_THREADS(p) { > 4363 const int thread_parity = p->oops_do_parity(); > 4364 assert((thread_parity == _thread_claim_parity), > 4365 "Thread " PTR_FORMAT " has incorrect parity %d != %d", p2i(p), thread_parity, _thread_claim_parity); > 4366 } > 4367 VMThread* vmt = VMThread::vm_thread(); > 4368 const int thread_parity = vmt->oops_do_parity(); > 4369 assert((thread_parity == _thread_claim_parity), > 4370 "VMThread " PTR_FORMAT " has incorrect parity %d != %d", p2i(vmt), thread_parity, _thread_claim_parity); > 4371 } > 4372 #endif // ASSERT > > Thanks for the review! > > Dan > > >> >> Thanks >> Vladimir >> >>> On Jul 31, 2017, at 10:26 AM, Daniel D. Daugherty wrote: >>> >>> Thanks for the re-review! (and for the reworded comment...) >>> >>> Dan >>> >>>>> On 7/31/17 11:24 AM, Aleksey Shipilev wrote: >>>>> On 07/31/2017 07:07 PM, Daniel D. Daugherty wrote: >>>>> Latest webrev: http://cr.openjdk.java.net/~dcubed/8185273-webrev/1/ >>>>> >>>>> Only src/share/vm/runtime/thread.cpp is changed relative to round 0: >>>>> >>>>> - Revised the comment in Threads::parallel_java_threads_do. >>>>> - Added the assert to Threads::assert_all_threads_claimed(). >>>>> >>>>> Comments, questions and feedback are welcome. >>>> Looks good! >>>> >>>> -Aleksey >>>> >>>> P.S. Roman: I'm going to cherry-pick that to Shenandoah after this lands to jdk10/hs. >>>> > From daniel.daugherty at oracle.com Mon Jul 31 20:56:20 2017 From: daniel.daugherty at oracle.com (Daniel D. 
Daugherty) Date: Mon, 31 Jul 2017 14:56:20 -0600 Subject: URGENT RFR (S): fix for Test8004741.java crashes with SIGSEGV in JDK10-hs nightly (8185273) In-Reply-To: <07323FAA-AACA-4CA4-ADFF-73816102C23C@oracle.com> References: <6f413b14-e250-cd2f-85ad-e5eb345e9603@oracle.com> <46716761-6f2c-aed1-e26e-3541e1fd84dc@oracle.com> <666db742-9fae-768b-fb86-93a086381ec6@oracle.com> <797922fb-4893-bfbd-8edd-b47b22f64d36@oracle.com> <1509ee63-b50f-84d1-9510-cad10b6a4411@redhat.com> <56d71fea-a6f2-9103-73be-95cd76cd17e7@oracle.com> <18aa8b5e-d128-8279-c5cc-b6effc79714b@oracle.com> <07323FAA-AACA-4CA4-ADFF-73816102C23C@oracle.com> Message-ID: <3871c04c-a7f2-8a61-cdd0-67aff6ad0bf0@oracle.com> Thanks! Dan On 7/31/17 2:14 PM, Vladimir Kozlov wrote: > This is what happens when you do review on phone ;) > Sorry for noise. Looks good. > > Vladimir > > Sent from my iPhone > >> On Jul 31, 2017, at 11:49 AM, Daniel D. Daugherty wrote: >> >>> On 7/31/17 12:39 PM, Vladimir Kozlov wrote: >>> Dan >>> >>> Can you put new code which used for assert check under #ifdef ASSERT to avoid side effects in product code? >> That entire function is in a #ifdef ASSERT: >> >> 4360 #ifdef ASSERT >> 4361 void Threads::assert_all_threads_claimed() { >> 4362 ALL_JAVA_THREADS(p) { >> 4363 const int thread_parity = p->oops_do_parity(); >> 4364 assert((thread_parity == _thread_claim_parity), >> 4365 "Thread " PTR_FORMAT " has incorrect parity %d != %d", p2i(p), thread_parity, _thread_claim_parity); >> 4366 } >> 4367 VMThread* vmt = VMThread::vm_thread(); >> 4368 const int thread_parity = vmt->oops_do_parity(); >> 4369 assert((thread_parity == _thread_claim_parity), >> 4370 "VMThread " PTR_FORMAT " has incorrect parity %d != %d", p2i(vmt), thread_parity, _thread_claim_parity); >> 4371 } >> 4372 #endif // ASSERT >> >> Thanks for the review! >> >> Dan >> >> >>> Thanks >>> Vladimir >>> >>>> On Jul 31, 2017, at 10:26 AM, Daniel D. Daugherty wrote: >>>> >>>> Thanks for the re-review! (and for the reworded comment...) 
>>>> >>>> Dan >>>> >>>>>> On 7/31/17 11:24 AM, Aleksey Shipilev wrote: >>>>>> On 07/31/2017 07:07 PM, Daniel D. Daugherty wrote: >>>>>> Latest webrev: http://cr.openjdk.java.net/~dcubed/8185273-webrev/1/ >>>>>> >>>>>> Only src/share/vm/runtime/thread.cpp is changed relative to round 0: >>>>>> >>>>>> - Revised the comment in Threads::parallel_java_threads_do. >>>>>> - Added the assert to Threads::assert_all_threads_claimed(). >>>>>> >>>>>> Comments, questions and feedback are welcome. >>>>> Looks good! >>>>> >>>>> -Aleksey >>>>> >>>>> P.S. Roman: I'm going to cherry-pick that to Shenandoah after this lands to jdk10/hs. >>>>> From daniel.daugherty at oracle.com Mon Jul 31 22:57:21 2017 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Mon, 31 Jul 2017 16:57:21 -0600 Subject: URGENT RFR (S): fix for Test8004741.java crashes with SIGSEGV in JDK10-hs nightly (8185273) In-Reply-To: <3871c04c-a7f2-8a61-cdd0-67aff6ad0bf0@oracle.com> References: <6f413b14-e250-cd2f-85ad-e5eb345e9603@oracle.com> <46716761-6f2c-aed1-e26e-3541e1fd84dc@oracle.com> <666db742-9fae-768b-fb86-93a086381ec6@oracle.com> <797922fb-4893-bfbd-8edd-b47b22f64d36@oracle.com> <1509ee63-b50f-84d1-9510-cad10b6a4411@redhat.com> <56d71fea-a6f2-9103-73be-95cd76cd17e7@oracle.com> <18aa8b5e-d128-8279-c5cc-b6effc79714b@oracle.com> <07323FAA-AACA-4CA4-ADFF-73816102C23C@oracle.com> <3871c04c-a7f2-8a61-cdd0-67aff6ad0bf0@oracle.com> Message-ID: <8df591e7-60fe-c0f9-bcf2-a974a6d810d1@oracle.com> Final local testing numbers for this fix: 20062 runs on slowdebug bits; of those, 802 were in the right sequence of VM-ops for the crash 28523 runs on fastdebug bits; of those, 1915 were in the right sequence of VM-ops for the crash Dan On 7/31/17 2:56 PM, Daniel D. Daugherty wrote: > Thanks! > > Dan > > > On 7/31/17 2:14 PM, Vladimir Kozlov wrote: >> This is what happens when you do review on phone ;) >> Sorry for noise. Looks good. 
>> >> Vladimir >> >> Sent from my iPhone >> >>> On Jul 31, 2017, at 11:49 AM, Daniel D. Daugherty >>> wrote: >>> >>>> On 7/31/17 12:39 PM, Vladimir Kozlov wrote: >>>> Dan >>>> >>>> Can you put new code which used for assert check under #ifdef >>>> ASSERT to avoid side effects in product code? >>> That entire function is in a #ifdef ASSERT: >>> >>> 4360 #ifdef ASSERT >>> 4361 void Threads::assert_all_threads_claimed() { >>> 4362 ALL_JAVA_THREADS(p) { >>> 4363 const int thread_parity = p->oops_do_parity(); >>> 4364 assert((thread_parity == _thread_claim_parity), >>> 4365 "Thread " PTR_FORMAT " has incorrect parity %d != >>> %d", p2i(p), thread_parity, _thread_claim_parity); >>> 4366 } >>> 4367 VMThread* vmt = VMThread::vm_thread(); >>> 4368 const int thread_parity = vmt->oops_do_parity(); >>> 4369 assert((thread_parity == _thread_claim_parity), >>> 4370 "VMThread " PTR_FORMAT " has incorrect parity %d != >>> %d", p2i(vmt), thread_parity, _thread_claim_parity); >>> 4371 } >>> 4372 #endif // ASSERT >>> >>> Thanks for the review! >>> >>> Dan >>> >>> >>>> Thanks >>>> Vladimir >>>> >>>>> On Jul 31, 2017, at 10:26 AM, Daniel D. Daugherty >>>>> wrote: >>>>> >>>>> Thanks for the re-review! (and for the reworded comment...) >>>>> >>>>> Dan >>>>> >>>>>>> On 7/31/17 11:24 AM, Aleksey Shipilev wrote: >>>>>>> On 07/31/2017 07:07 PM, Daniel D. Daugherty wrote: >>>>>>> Latest webrev: http://cr.openjdk.java.net/~dcubed/8185273-webrev/1/ >>>>>>> >>>>>>> Only src/share/vm/runtime/thread.cpp is changed relative to >>>>>>> round 0: >>>>>>> >>>>>>> - Revised the comment in Threads::parallel_java_threads_do. >>>>>>> - Added the assert to Threads::assert_all_threads_claimed(). >>>>>>> >>>>>>> Comments, questions and feedback are welcome. >>>>>> Looks good! >>>>>> >>>>>> -Aleksey >>>>>> >>>>>> P.S. Roman: I'm going to cherry-pick that to Shenandoah after >>>>>> this lands to jdk10/hs. >>>>>> > >