From yasuenag at gmail.com Sat Jul 1 14:44:23 2017 From: yasuenag at gmail.com (Yasumasa Suenaga) Date: Sat, 1 Jul 2017 23:44:23 +0900 Subject: JDK-8153333: [REDO] STW phases at Concurrent GC should count in PerfCounter In-Reply-To: References: Message-ID: PING: Have you checked this issue? Yasumasa On 2017/06/14 13:22, Yasumasa Suenaga wrote: > Hi all, > > I changed the PerfCounters to show CGC STW phases in jstat in JDK-8151674. > However, it caused several jtreg test failures, so it was backed out. > > I want to resume work on this issue. > > http://cr.openjdk.java.net/~ysuenaga/JDK-8153333/webrev.03/hotspot/ > http://cr.openjdk.java.net/~ysuenaga/JDK-8153333/webrev.03/jdk/ > > These changes pass the following jtreg tests: > > hotspot/test/serviceability/tmtools/jstat > jdk/test/sun/tools > > > Since JDK 9, the default GC algorithm is G1. > So I think this change is useful for watching GC behavior through jstat. > > I cannot access JPRT. Could you help? > > > Thanks, > > Yasumasa > From mikael.gerdin at oracle.com Mon Jul 3 07:35:26 2017 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Mon, 3 Jul 2017 09:35:26 +0200 Subject: RFR: 8179387: Factor out CMS specific code from GenCollectedHeap into its own subclass In-Reply-To: <3d8b55a2-a787-3051-b351-ab9b0a24f5e0@redhat.com> References: <3521009f-6fab-4f8e-2375-b9d665a4c70b@redhat.com> <3d8b55a2-a787-3051-b351-ab9b0a24f5e0@redhat.com> Message-ID: <47e22e86-7d7c-606f-1936-346229f39ca2@oracle.com> Hi Roman, On 2017-06-30 18:32, Roman Kennke wrote: > I came across one problem using this approach: We will have 2 instances > of CollectedHeap around, where there's usually only 1, and some code > expects only 1. For example, in the CollectedHeap constructor, we create new > PerfData variables, and we now create them twice, which leads to an assert > being thrown. I suspect there is more code like that.
> > I will attempt to refactor this a little more, maybe it's not that bad, > but it's probably not worth spending too much time on it. I think refactoring the code to not expect a singleton CollectedHeap instance is a bit too much. Perhaps there is another way to share common code between Serial and CMS but that might require a bit more thought. /Mikael > > Roman >> Hi Roman, >> >> thanks for putting this patch together, it is a great step forward! One >> thung that (in my mind) would improve it even further is if we embed a >> GenCollectedHeap in CMSHeap and then make CMSHeap inherit directly from >> CollectedHeap. >> >> With this solution, the definition of CMSHeap would look like something >> along the lines of: >> >> class CMSHeap : public CollectedHeap { >> WorkGang* _wg; >> GenCollectedHeap _gch; >> >> public: >> CMSHeap(GenCollectorPolicy* policy) : >> _wg(new WorkGang("GC Thread", ParallelGCThreads, true, true), >> _gch(policy) { >> _wg->initialize_workers(); >> } >> >> // a bunch of "facade" methods >> virtual bool supports_tlab_allocation() const { >> return _gch->supports_tlab_allocation(); >> } >> >> virtual size_t tlab_capacity(Thread* t) const { >> return _gch->tlab_capacity(t); >> } >> }; >> >> With this approach, you would have to implement a bunch of "facade" >> methods that just delegates to _gch, such as the methods >> supports_tlab_allocation and tlab_capacity above. There are two reasons >> why I prefer this approach: >> 1. In the end we want CMSHeap to inherit from CollectedHeap anyway :) >> 2. It makes it very clear which methods we gradually have to >> re-implement in CMSHeap to eventually get rid of the _gch field (the >> end goal). This is much harder to see if CMSHeap inherits from >> GenCollectedHeap (see more below). >> >> The second point will most likely cause some initial problems with >> `protected` code in GenCollectedHeap. 
For example, as you noticed when >> creating this patch, CMSHeap make use of a few `protected` fields and >> methods from GenCollectedHeap, most notably: >> - _process_strong_tasks >> - process_roots() >> - process_string_table_roots() >> >> It would be much better (IMO) to share this code via composition rather >> than inheritance. In this particular case, I would prefer to create a >> class StrongRootsProcessor that encapsulates the root processing logic. >> Then GenCollectedHeap and CMSHeap can both contain an instance of >> StrongRootsProcessor. >> >> What do you think of this approach? Do you have some spare cycles to try >> this approach out? >> >> Thanks, >> Erik >> >> On 06/02/2017 10:55 AM, Roman Kennke wrote: >>> Take this patch. It #ifdef ASSERT's a call to check_gen_kinds() that is >>> only present in debug builds. >>> >>> >>> http://cr.openjdk.java.net/~rkennke/8179387/webrev.01/ >>> >>> >>> Roman >>> >>> Am 01.06.2017 um 22:50 schrieb Roman Kennke: >>>> What $SUBJECT says. >>>> >>>> I went over genCollectedHeap.[hpp|cpp] and moved everything that I could >>>> find that is CMS-only into a new CMSHeap class. >>>> >>>> http://cr.openjdk.java.net/~rkennke/8179387/webrev.00/ >>>> >>>> >>>> It is possible that I overlooked something there. There may be code in >>>> there that doesn't shout "CMS" at me, but is still intrinsically CMS stuff. >>>> >>>> Also not that I have not removed that little part: >>>> >>>> always_do_update_barrier = UseConcMarkSweepGC; >>>> >>>> because I expect it to go away with Erik ?'s big refactoring. >>>> >>>> What do you think? 
>>>> >>>> Testing: hotspot_gc, specjvm, some little apps with -XX:+UseConcMarkSweepGC >>>> >>>> Roman >>>> > From stefan.johansson at oracle.com Mon Jul 3 08:38:47 2017 From: stefan.johansson at oracle.com (Stefan Johansson) Date: Mon, 3 Jul 2017 10:38:47 +0200 Subject: RFR: 8183281: Remove unnecessary call to increment_gc_time_stamp In-Reply-To: References: <646a5d9a-b6d3-82c9-3937-027c3193d4c0@oracle.com> <5b2dff36-0a55-feb8-7e80-52e4562a5651@oracle.com> Message-ID: On 2017-06-30 17:34, Erik Helin wrote: > On 06/30/2017 01:53 PM, Stefan Johansson wrote: >> Hi Erik, >> >> On 2017-06-30 11:37, Erik Helin wrote: >>> Hi all, >>> >>> the following small patch removes an unnecessary call to >>> increment_gc_time_stamp from >>> G1CollectedHeap::do_collection_pause_at_safepoint (and the long, >>> wrong, comment above the call). >>> >>> We already do a call increment_gc_time_stamp much earlier in >>> do_collection_pause_at_safepoint, which is enough. The reasons >>> outlined in the comment motivating a second call is no longer true, >>> the code has changed (but the comment has not). >>> >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8183281 >>> Patch: see below >>> Testing: make hotspot >>> >> Patch looks good, but I would like to see some more testing than just >> building hotspot. Running the gc jtreg tests for example. > > Thanks for reviewing! All pass for both fastdebug and product when > running `make test TEST=hotspot_gc` on my Linux workstation. > Thanks for running the tests, ship it! 
StefanJ > Thanks, > Erik > >> Thanks for cleaning up the code, >> Stefan >>> Thanks, >>> Erik >>> >>> # HG changeset patch >>> # User ehelin >>> # Date 1498814642 -7200 >>> # Fri Jun 30 11:24:02 2017 +0200 >>> # Node ID 62400b3cbec4e0d06e0d6c21c9486070d8c906a4 >>> # Parent 10ccf0a5f63fdca04d9eda2c774ccdd0e12bc1a1 >>> 8183281: Remove unnecessary call to increment_gc_time_stamp >>> >>> diff -r 10ccf0a5f63f -r 62400b3cbec4 >>> src/share/vm/gc/g1/g1CollectedHeap.cpp >>> --- a/src/share/vm/gc/g1/g1CollectedHeap.cpp Thu Jun 29 19:09:04 >>> 2017 +0000 >>> +++ b/src/share/vm/gc/g1/g1CollectedHeap.cpp Fri Jun 30 11:24:02 >>> 2017 +0200 >>> @@ -3266,29 +3266,6 @@ >>> >>> MemoryService::track_memory_usage(); >>> >>> - // In prepare_for_verify() below we'll need to scan the >>> deferred >>> - // update buffers to bring the RSets up-to-date if >>> - // G1HRRSFlushLogBuffersOnVerify has been set. While scanning >>> - // the update buffers we'll probably need to scan cards on the >>> - // regions we just allocated to (i.e., the GC alloc >>> - // regions). However, during the last GC we called >>> - // set_saved_mark() on all the GC alloc regions, so card >>> - // scanning might skip the [saved_mark_word()...top()] area of >>> - // those regions (i.e., the area we allocated objects into >>> - // during the last GC). But it shouldn't. Given that >>> - // saved_mark_word() is conditional on whether the GC time >>> stamp >>> - // on the region is current or not, by incrementing the GC >>> time >>> - // stamp here we invalidate all the GC time stamps on all the >>> - // regions and saved_mark_word() will simply return top() for >>> - // all the regions. This is a nicer way of ensuring this >>> rather >>> - // than iterating over the regions and fixing them. In >>> fact, the >>> - // GC time stamp increment here also ensures that >>> - // saved_mark_word() will return top() between pauses, i.e., >>> - // during concurrent refinement. 
So we don't need the >>> - // is_gc_active() check to decided which top to use when >>> - // scanning cards (see CR 7039627). >>> - increment_gc_time_stamp(); >>> - >>> if (VerifyRememberedSets) { >>> log_info(gc, verify)("[Verifying RemSets after GC]"); >>> VerifyRegionRemSetClosure v_cl; >> From rkennke at redhat.com Mon Jul 3 09:13:43 2017 From: rkennke at redhat.com (Roman Kennke) Date: Mon, 3 Jul 2017 11:13:43 +0200 Subject: RFR: 8179387: Factor out CMS specific code from GenCollectedHeap into its own subclass In-Reply-To: <47e22e86-7d7c-606f-1936-346229f39ca2@oracle.com> References: <3521009f-6fab-4f8e-2375-b9d665a4c70b@redhat.com> <3d8b55a2-a787-3051-b351-ab9b0a24f5e0@redhat.com> <47e22e86-7d7c-606f-1936-346229f39ca2@oracle.com> Message-ID: <9a846161-c8ac-dedf-5952-f457d546fd9a@redhat.com> Am 03.07.2017 um 09:35 schrieb Mikael Gerdin: > Hi Roman, > > On 2017-06-30 18:32, Roman Kennke wrote: >> I came across one problem using this approach: We will have 2 instances >> of CollectedHeap around, where there's usually only 1, and some code >> expects only 1. For example, in CollectedHeap constructor, we create new >> PerfData variables, and we now create them 2x, which leads to an assert >> being thrown. I suspect there is more code like that. >> >> I will attempt to refactor this a little more, maybe it's not that bad, >> but it's probably not worth spending too much time on it. > > I think refactoring the code to not expect a singleton CollectedHeap > instance is a bit too much. > Perhaps there is another way to share common code between Serial and > CMS but that might require a bit more thought. Yeah, definitely. I hit another difficulty: pretty much the same issues that I'm having with GenCollectedHeap/CMSHeap/CollectedHeap now show up with Generation and its subclasses.. How about we push the original patch that I've posted, and work from there? 
In fact, I *have* found some little things I would change (some more if (UseConcMarkSweepGC) branches in GenCollectedHeap that I have overlooked in my first pass...) Roman > > /Mikael > >> >> Roman >>> Hi Roman, >>> >>> thanks for putting this patch together, it is a great step forward! One >>> thung that (in my mind) would improve it even further is if we embed a >>> GenCollectedHeap in CMSHeap and then make CMSHeap inherit directly from >>> CollectedHeap. >>> >>> With this solution, the definition of CMSHeap would look like something >>> along the lines of: >>> >>> class CMSHeap : public CollectedHeap { >>> WorkGang* _wg; >>> GenCollectedHeap _gch; >>> >>> public: >>> CMSHeap(GenCollectorPolicy* policy) : >>> _wg(new WorkGang("GC Thread", ParallelGCThreads, true, true), >>> _gch(policy) { >>> _wg->initialize_workers(); >>> } >>> >>> // a bunch of "facade" methods >>> virtual bool supports_tlab_allocation() const { >>> return _gch->supports_tlab_allocation(); >>> } >>> >>> virtual size_t tlab_capacity(Thread* t) const { >>> return _gch->tlab_capacity(t); >>> } >>> }; >>> >>> With this approach, you would have to implement a bunch of "facade" >>> methods that just delegates to _gch, such as the methods >>> supports_tlab_allocation and tlab_capacity above. There are two reasons >>> why I prefer this approach: >>> 1. In the end we want CMSHeap to inherit from CollectedHeap anyway :) >>> 2. It makes it very clear which methods we gradually have to >>> re-implement in CMSHeap to eventually get rid of the _gch field >>> (the >>> end goal). This is much harder to see if CMSHeap inherits from >>> GenCollectedHeap (see more below). >>> >>> The second point will most likely cause some initial problems with >>> `protected` code in GenCollectedHeap. 
For example, as you noticed when >>> creating this patch, CMSHeap make use of a few `protected` fields and >>> methods from GenCollectedHeap, most notably: >>> - _process_strong_tasks >>> - process_roots() >>> - process_string_table_roots() >>> >>> It would be much better (IMO) to share this code via composition rather >>> than inheritance. In this particular case, I would prefer to create a >>> class StrongRootsProcessor that encapsulates the root processing logic. >>> Then GenCollectedHeap and CMSHeap can both contain an instance of >>> StrongRootsProcessor. >>> >>> What do you think of this approach? Do you have some spare cycles to >>> try >>> this approach out? >>> >>> Thanks, >>> Erik >>> >>> On 06/02/2017 10:55 AM, Roman Kennke wrote: >>>> Take this patch. It #ifdef ASSERT's a call to check_gen_kinds() >>>> that is >>>> only present in debug builds. >>>> >>>> >>>> http://cr.openjdk.java.net/~rkennke/8179387/webrev.01/ >>>> >>>> >>>> Roman >>>> >>>> Am 01.06.2017 um 22:50 schrieb Roman Kennke: >>>>> What $SUBJECT says. >>>>> >>>>> I went over genCollectedHeap.[hpp|cpp] and moved everything that I >>>>> could >>>>> find that is CMS-only into a new CMSHeap class. >>>>> >>>>> http://cr.openjdk.java.net/~rkennke/8179387/webrev.00/ >>>>> >>>>> >>>>> It is possible that I overlooked something there. There may be >>>>> code in >>>>> there that doesn't shout "CMS" at me, but is still intrinsically >>>>> CMS stuff. >>>>> >>>>> Also not that I have not removed that little part: >>>>> >>>>> always_do_update_barrier = UseConcMarkSweepGC; >>>>> >>>>> because I expect it to go away with Erik ?'s big refactoring. >>>>> >>>>> What do you think? 
>>>>> >>>>> Testing: hotspot_gc, specjvm, some little apps with >>>>> -XX:+UseConcMarkSweepGC >>>>> >>>>> Roman >>>>> >> From thomas.schatzl at oracle.com Mon Jul 3 09:16:50 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Mon, 03 Jul 2017 11:16:50 +0200 Subject: RFR: 8183281: Remove unnecessary call to increment_gc_time_stamp In-Reply-To: <646a5d9a-b6d3-82c9-3937-027c3193d4c0@oracle.com> References: <646a5d9a-b6d3-82c9-3937-027c3193d4c0@oracle.com> Message-ID: <1499073410.2802.0.camel@oracle.com> Hi, On Fri, 2017-06-30 at 11:37 +0200, Erik Helin wrote: > Hi all, > > the following small patch removes an unnecessary call to > increment_gc_time_stamp from > G1CollectedHeap::do_collection_pause_at_safepoint (and the long, > wrong > comment above the call). > > We already call increment_gc_time_stamp much earlier in > do_collection_pause_at_safepoint, which is enough. The reasons > outlined > in the comment motivating a second call are no longer true; the code > has > changed (but the comment has not). > > Bug: https://bugs.openjdk.java.net/browse/JDK-8183281 > Patch: see below > Testing: make hotspot > Looks good. Thomas From thomas.schatzl at oracle.com Mon Jul 3 09:53:32 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Mon, 03 Jul 2017 11:53:32 +0200 Subject: RFR (XXS): 8183394: Remove unused G1RemSet::n_workers() method Message-ID: <1499075612.2802.5.camel@oracle.com> Hi all, can I have a review for this trivial removal of an unused method? One Reviewer should be sufficient for this ;) CR: https://bugs.openjdk.java.net/browse/JDK-8183394 Webrev: http://cr.openjdk.java.net/~tschatzl/8183394/webrev/ Testing: Local compilation Thanks,
Thomas From thomas.schatzl at oracle.com Mon Jul 3 09:58:37 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Mon, 03 Jul 2017 11:58:37 +0200 Subject: RFR (XS): 8179677: Let G1Remset::_conc_refined_cards only count number of cards concurrently refined Message-ID: <1499075917.2802.8.camel@oracle.com> Hi all, can I have reviews for this small change that makes G1RemSet::_conc_refined_cards only count the number of concurrently refined cards (+ some trivial renaming of the variable)? The reason is that I plan to separately add the number of cards refined during GC soon. This has been suggested earlier in some internal discussion, and I agree. CR: https://bugs.openjdk.java.net/browse/JDK-8179677 Webrev: http://cr.openjdk.java.net/~tschatzl/8179677/webrev.1/ Testing: jprt Thanks, Thomas From thomas.schatzl at oracle.com Mon Jul 3 11:24:48 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Mon, 03 Jul 2017 13:24:48 +0200 Subject: RFR (XS): 8183397: Ensure consistent closure filtering during evacuation Message-ID: <1499081088.2802.29.camel@oracle.com> Hi all, can I have reviews for this change that fixes an observation made recently by Erik, i.e. that the "else" part of several evacuation closures inconsistently filters out non-cross-region references before checking whether the referenced object is in a humongous or ext region. This causes somewhat hard-to-diagnose performance issues, and earlier filtering does not hurt if done anyway. (Note that the current way of checking in all but the UpdateRS closure using HeapRegion::is_in_same_region() seems optimal. The only reason the other way is better in the UpdateRS closure is that the code needs the "to" HeapRegion pointer anyway.) CR: https://bugs.openjdk.java.net/browse/JDK-8183397 Webrev: http://cr.openjdk.java.net/~tschatzl/8183397/webrev/ Testing: jprt, performance regression analysis Thanks,
Thomas From thomas.schatzl at oracle.com Mon Jul 3 11:24:53 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Mon, 03 Jul 2017 13:24:53 +0200 Subject: RFR (S): 8179679: Rearrange filters before card scanning Message-ID: <1499081093.2802.30.camel@oracle.com> Hi all, please have a look at this change that rearranges the checks in the G1RemSet card scanning a bit in order to: - remove some redundant checking made possible recently with JDK-8177044 - group together similar checks (so that the compiler can more easily reuse some intermediate values) - minimize unnecessary card claiming CR: https://bugs.openjdk.java.net/browse/JDK-8179679 Webrev: http://cr.openjdk.java.net/~tschatzl/8179679/webrev.1/ (note: there has been a previous webrev, but without reviews; still there is a webrev.0_to_1 for the curious) Testing: jprt, performance regression analysis Thanks, Thomas From mikael.gerdin at oracle.com Mon Jul 3 11:54:54 2017 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Mon, 3 Jul 2017 13:54:54 +0200 Subject: RFR (XXS): 8183394: Remove unused G1RemSet::n_workers() method In-Reply-To: <1499075612.2802.5.camel@oracle.com> References: <1499075612.2802.5.camel@oracle.com> Message-ID: <866973a1-3698-e36e-e38d-8a7631fcf1c6@oracle.com> Hi Thomas, On 2017-07-03 11:53, Thomas Schatzl wrote: > Hi all, > > can I have a review for this trivial removal of an unused method? One > Reviewer should be sufficient for this ;) > > CR: > https://bugs.openjdk.java.net/browse/JDK-8183394 > Webrev: > http://cr.openjdk.java.net/~tschatzl/8183394/webrev/ Looks good and trivial enough to me.
/Mikael > Testing: > Local compilation > > Thanks, > Thomas > From mikael.gerdin at oracle.com Mon Jul 3 11:57:48 2017 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Mon, 3 Jul 2017 13:57:48 +0200 Subject: RFR (S): 8182703: Correct G1 barrier queue lock orderings In-Reply-To: <59510D5E.10009@oracle.com> References: <59510D5E.10009@oracle.com> Message-ID: Hi Erik, On 2017-06-26 15:34, Erik ?sterlund wrote: > Hi, > > Webrev: http://cr.openjdk.java.net/~eosterlund/8182703/webrev.02/ I think this change makes sense and I agree with your reasoning below. I'm leaning towards suggesting creating a named enum value for "access+1" to begin a move towards getting rid of adding and subtracting values from enums in this code. I don't have a good name for it, though. /Mikael > Bug: https://bugs.openjdk.java.net/browse/JDK-8182703 > > The G1 barrier queues have very awkward lock orderings for the following > reasons: > > 1) These queues may queue up things when performing a reference write or > resolving a jweak (intentionally or just happened to be jweak, even > though it looks like a jobject), which can happen in a lot of places in > the code. We resolve JNIHandles while holding special locks in many > places. We perform reference writes also in many places. Now the > unsuspecting hotspot developer might think that it is okay to resolve a > JNIHandle or perform a reference write while possibly holding a special > lock. But no. In some cases, object writes have been moved out of locks > and replaced with lock-free CAS, only to dodge the G1 write barrier > locks. I don't think the G1 lock ordering issues should shape the shared > code rather than the other way around. > 2) There is an issue that the shared queue locks have a "special" rank, > which is below the lock ranks used by the cbl monitor and free list > monitor. This leads to an issue when these locks have to be taken while > holding the shared queue locks. 
The current solution is to drop the > shared queue locks temporarily, introducing nasty data races. These > races are guarded, but the whole race seems very unnecessary. > > I argue that if the G1 write barrier queue locks were simply set > appropriately in the first place by analyzing what ranks they should > have, none of the above issues would exist. Therefore I propose this new > ordering. > > Specifically, I recognize that locks required for performing memory > accesses and resolving JNIHandles are more special than the "special" > rank. Therefore, this change introduces a new lock ordering category > called "access", which is to be used by barriers required to perform > memory accesses. In other words, by recognizing the rank is more special > than "special", we can remove "special" code to walk around making its > rank more "special". That seems desirable to me. The access locks need > to comply to the same constraints as the special locks: they may not > perform safepoint checks. > > The old lock ranks were: > > SATB_Q_FL_lock: special > SATB_Q_CBL_mon: leaf - 1 > Shared_SATB_Q_lock: leaf - 1 > > DirtyCardQ_FL_lock: special > DirtyCardQ_CBL_mon: leaf - 1 > Shared_DirtyCardQ_lock: leaf - 1 > > The new lock ranks are: > > SATB_Q_FL_lock: access (special - 2) > SATB_Q_CBL_mon: access (special - 2) > Shared_SATB_Q_lock: access + 1 (special - 1) > > DirtyCardQ_FL_lock: access (special - 2) > DirtyCardQ_CBL_mon: access (special - 2) > Shared_DirtyCardQ_lock: access + 1 (special - 1) > > Analysis: > > Each PtrQueue and PtrQueueSet group, SATB or DirtyCardQ have the same > group of locks. The free list lock, the completed buffer list monitor > and the shared queue lock. > > Observations: > 1) The free list lock and completed buffer list monitors (members of > PtrQueueSet) are disjoint. We never hold both of them at the same time. 
> Rationale: The free list lock is only used from > PtrQueueSet::allocate_buffer, PtrQueueSet::deallocate_buffer and > PtrQueueSet::reduce_free_list, and no callsite from there can be > expanded where the cbl monitor is acquired. So therefore it is > impossible to acquire the cbl monitor while holding the free list lock. > The opposite case of acquiring the free list lock while holding the cbl > monitor is also not possible; only the following places acquire the cbl > monitor: PtrQueueSet::enqueue_complete_buffer, > PtrQueueSet::merge_bufferlists, > PtrQueueSet::assert_completed_buffer_list_len_correct, > PtrQueueSet::notify_if_necessary, FreeIdSet::claim_par_id, > FreeIdSet::release_par_id, DirtyCardQueueSet::get_completed_buffer, > DirtyCardQueueSet::clear, > SATBMarkQueueSet::apply_closure_to_completed_buffer and > SATBMarkQueueSet::abandon_partial_marking. Again, neither of these paths > where the cbl monitor is held can expand callsites to a place where the > free list locks are held. Therefore it holds that the cbl monitor can > not be held while the free list lock is held, and the free list lock can > not be held while the cbl monitor is held. Therefore they are held > disjointly. > 2) We might hold the shared queue locks before acquiring the completed > buffer list monitor. (today we drop the shared queue lock then and > reacquire it later as a hack as already described) > 3) We do not acquire a shared queue lock while holding the free list > lock or completed buffer list monitor, as there is no reference from a > PtrQueueSet to its shared queue, so those code paths do not know how to > reach the shared PtrQueue to acquire its lock. The derived classes are > exceptions but they never use the shared queue lock while holding the > completed buffer list monitor or free list lock. DirtyCardQueueSet uses > the shared queue for concatenating logs (in a safepoint without holding > those locks). 
The SATBMarkQueueSet uses the shared queue for filtering > the buffers, fiddling with activeness, printing and resetting, all > without grabbing any locks. > 4) We do not acquire any other lock (above event) while holding the free > list lock or completed buffer list monitors. This was discovered by > manually expanding the call graphs from where these two locks are held. > > Derived constraints: > a) Because of observation 1, the free list lock and completed buffer > list monitors can have the same rank. > b) Because of observations 1 and 2, the shared queue lock ought to have > a rank higher than the ranks of the free list lock and the completed > buffer list monitors (not the case today). > c) Because of of observation 3 and 2, the free list lock and completed > buffer list monitors ought to have a rank lower than the rank of the > shared queue lock. > d) Because of observation 4 (and constraints a-c), all the barrier locks > should be below the "special" rank without violating any existing ranks. > > The proposed new lock ranks conform to the constraints derived from my > observations. It is worth noting that the potential relationship that > could break (and why they do not) are: > 1) If a lock is acquired from within the barriers that does not involve > the shared queue lock, the free list lock or the completed buffer list > monitor, we have now inverted their relationship as that other lock > would probably have a rank higher than or equal to "special". But due to > observation 4, there are no such cases. > 2) The relationship between the shared queue lock and the completed > buffer list monitor has been changed so both can be held at the same > time if the shared queue lock is acquired first (which it is). This is > arguably the way it should have been from the first place, and the old > solution had ugly hacks where we would drop the shared queue lock to not > run into the lock order assert (and only not to run into the lock order > assert, i.e. 
not to avoid potential deadlock) to ensure the locks are > not held at the same time. That code has now been removed, so that the > shared queue lock is still held when enqueueing completed buffers (no > dodgy dropping and reclaiming), and the code for handling the races due > to multiple concurrent enqueuers has also been removed and replaced with > an assertion that there simply should not be multiple concurrent > enqueuers. Since the shared queue lock is now held throughout the whole > operation, there will be no concurrent enqueuers. > 3) The completed buffer list monitor used to have a higher rank than the > free list lock. Now they have the same. Therefore, they could previously > allow them to be held at the same time if the cbl monitor was acquired > first. However, as discussed, there is no such case, and they ought to > have the same rank not to confuse their true disjointness. If anyone > insists we do not break this relationship despite the true disjointness, > I could consent to adding another access lock rank, like this: > http://cr.openjdk.java.net/~eosterlund/8182703/webrev.01/ but I think it > seems better to have the same rank since they are actually truly > disjoint and should remain disjoint. > > I do recognize that long term, we *might* want a lock-free solution or > something (not saying we do or do not). But until then, the ranks ought > to be corrected so that they do not cause these problems causing > everyone to bash their head against the awkward G1 lock ranks throughout > the code and make hacks around it. > > Testing: JPRT with hotspot all and lots of local testing. > > Thanks, > /Erik From thomas.schatzl at oracle.com Mon Jul 3 12:12:50 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Mon, 03 Jul 2017 14:12:50 +0200 Subject: RFR (S): 8183226: Remembered set summarization accesses not fully initialized java thread DCQS Message-ID: <1499083970.2802.33.camel@oracle.com> Hi all, ? 
can I get reviews for the following change that breaks a dependency cycle in G1RemSet initialization to fix a (currently benign) bug when printing remembered set summarization information? The problem is that G1RemSet initializes its internal remembered set summarization helper data structure in the constructor, which accesses some DCQS members before we call the initialize methods on the various global DCQS'es in G1CollectedHeap::initialize(). By splitting the initialization of the remembered set summarization into an extra method, this one can be called at the very end of G1CollectedHeap::initialize(), thus breaking the dependency. The bug is benign because the values accessed at that time are the same as the values after initialization. This also allows for grouping together the initialization of G1RemSet/DCQS/G1ConcurrentRefine related data structures more easily in G1CollectedHeap::initialize(). CR: https://bugs.openjdk.java.net/browse/JDK-8183226 Webrev: http://cr.openjdk.java.net/~tschatzl/8183226/webrev/ Testing: local testing running remembered set summarization manually, jprt Thanks, Thomas From erik.helin at oracle.com Mon Jul 3 12:44:18 2017 From: erik.helin at oracle.com (Erik Helin) Date: Mon, 3 Jul 2017 14:44:18 +0200 Subject: RFR (XXS): 8183394: Remove unused G1RemSet::n_workers() method In-Reply-To: <1499075612.2802.5.camel@oracle.com> References: <1499075612.2802.5.camel@oracle.com> Message-ID: <55027601-074b-b92a-7516-b08282291b70@oracle.com> On 07/03/2017 11:53 AM, Thomas Schatzl wrote: > Hi all, > > can I have a review for this trivial removal of an unused method? One > Reviewer should be sufficient for this ;) > > CR: > https://bugs.openjdk.java.net/browse/JDK-8183394 > Webrev: > http://cr.openjdk.java.net/~tschatzl/8183394/webrev/ Looks good, Reviewed. Thanks for cleaning this up!
Erik > Testing: > Local compilation > > Thanks, > Thomas > From erik.osterlund at oracle.com Mon Jul 3 12:53:58 2017 From: erik.osterlund at oracle.com (=?UTF-8?Q?Erik_=c3=96sterlund?=) Date: Mon, 3 Jul 2017 14:53:58 +0200 Subject: RFR (S): 8182703: Correct G1 barrier queue lock orderings In-Reply-To: References: <59510D5E.10009@oracle.com> Message-ID: <595A3E66.5050705@oracle.com> Hi Mikael, Thank you for the review! Regarding the use of + x in the current enum system for lock rankings, I agree that it is not a brilliant system and you feel a bit sad when your lock rank is "leaf+2". However, sometimes I feel like abstracting numbers with names can become confusing as well - even misleading. Like for example how leaf is no longer a leaf and how it is questionable whether special is really all that special any longer. When I thought about possible name for access + 0 and access + 1, I was thinking something in the lines of "perhaps access_inner/outer or access_leaf/nonleaf", but then that might get confusing if suddenly access will need 3 ranks for some reason and we get an "access_special" rank or something. Perhaps a different solution than enum names would be nice long-term for lock ranks and deadlock detection, but I believe that might be outside of the current scope for this change. Thanks, /Erik On 2017-07-03 13:57, Mikael Gerdin wrote: > Hi Erik, > > On 2017-06-26 15:34, Erik ?sterlund wrote: >> Hi, >> >> Webrev: http://cr.openjdk.java.net/~eosterlund/8182703/webrev.02/ > > I think this change makes sense and I agree with your reasoning below. > > I'm leaning towards suggesting creating a named enum value for > "access+1" to begin a move towards getting rid of adding and > subtracting values from enums in this code. I don't have a good name > for it, though. 
> > /Mikael > > >> Bug: https://bugs.openjdk.java.net/browse/JDK-8182703 >> >> The G1 barrier queues have very awkward lock orderings for the >> following reasons: >> >> 1) These queues may queue up things when performing a reference write >> or resolving a jweak (intentionally or just happened to be jweak, >> even though it looks like a jobject), which can happen in a lot of >> places in the code. We resolve JNIHandles while holding special locks >> in many places. We perform reference writes also in many places. Now >> the unsuspecting hotspot developer might think that it is okay to >> resolve a JNIHandle or perform a reference write while possibly >> holding a special lock. But no. In some cases, object writes have >> been moved out of locks and replaced with lock-free CAS, only to >> dodge the G1 write barrier locks. I don't think the G1 lock ordering >> issues should shape the shared code rather than the other way around. >> 2) There is an issue that the shared queue locks have a "special" >> rank, which is below the lock ranks used by the cbl monitor and free >> list monitor. This leads to an issue when these locks have to be >> taken while holding the shared queue locks. The current solution is >> to drop the shared queue locks temporarily, introducing nasty data >> races. These races are guarded, but the whole race seems very >> unnecessary. >> >> I argue that if the G1 write barrier queue locks were simply set >> appropriately in the first place by analyzing what ranks they should >> have, none of the above issues would exist. Therefore I propose this >> new ordering. >> >> Specifically, I recognize that locks required for performing memory >> accesses and resolving JNIHandles are more special than the "special" >> rank. Therefore, this change introduces a new lock ordering category >> called "access", which is to be used by barriers required to perform >> memory accesses. 
In other words, by recognizing the rank is more >> special than "special", we can remove "special" code to walk around >> making its rank more "special". That seems desirable to me. The >> access locks need to comply to the same constraints as the special >> locks: they may not perform safepoint checks. >> >> The old lock ranks were: >> >> SATB_Q_FL_lock: special >> SATB_Q_CBL_mon: leaf - 1 >> Shared_SATB_Q_lock: leaf - 1 >> >> DirtyCardQ_FL_lock: special >> DirtyCardQ_CBL_mon: leaf - 1 >> Shared_DirtyCardQ_lock: leaf - 1 >> >> The new lock ranks are: >> >> SATB_Q_FL_lock: access (special - 2) >> SATB_Q_CBL_mon: access (special - 2) >> Shared_SATB_Q_lock: access + 1 (special - 1) >> >> DirtyCardQ_FL_lock: access (special - 2) >> DirtyCardQ_CBL_mon: access (special - 2) >> Shared_DirtyCardQ_lock: access + 1 (special - 1) >> >> Analysis: >> >> Each PtrQueue and PtrQueueSet group, SATB or DirtyCardQ have the same >> group of locks. The free list lock, the completed buffer list monitor >> and the shared queue lock. >> >> Observations: >> 1) The free list lock and completed buffer list monitors (members of >> PtrQueueSet) are disjoint. We never hold both of them at the same time. >> Rationale: The free list lock is only used from >> PtrQueueSet::allocate_buffer, PtrQueueSet::deallocate_buffer and >> PtrQueueSet::reduce_free_list, and no callsite from there can be >> expanded where the cbl monitor is acquired. So therefore it is >> impossible to acquire the cbl monitor while holding the free list >> lock. 
The opposite case of acquiring the free list lock while holding >> the cbl monitor is also not possible; only the following places >> acquire the cbl monitor: PtrQueueSet::enqueue_complete_buffer, >> PtrQueueSet::merge_bufferlists, >> PtrQueueSet::assert_completed_buffer_list_len_correct, >> PtrQueueSet::notify_if_necessary, FreeIdSet::claim_par_id, >> FreeIdSet::release_par_id, DirtyCardQueueSet::get_completed_buffer, >> DirtyCardQueueSet::clear, >> SATBMarkQueueSet::apply_closure_to_completed_buffer and >> SATBMarkQueueSet::abandon_partial_marking. Again, neither of these >> paths where the cbl monitor is held can expand callsites to a place >> where the free list locks are held. Therefore it holds that the cbl >> monitor can not be held while the free list lock is held, and the >> free list lock can not be held while the cbl monitor is held. >> Therefore they are held disjointly. >> 2) We might hold the shared queue locks before acquiring the >> completed buffer list monitor. (today we drop the shared queue lock >> then and reacquire it later as a hack as already described) >> 3) We do not acquire a shared queue lock while holding the free list >> lock or completed buffer list monitor, as there is no reference from >> a PtrQueueSet to its shared queue, so those code paths do not know >> how to reach the shared PtrQueue to acquire its lock. The derived >> classes are exceptions but they never use the shared queue lock while >> holding the completed buffer list monitor or free list lock. >> DirtyCardQueueSet uses the shared queue for concatenating logs (in a >> safepoint without holding those locks). The SATBMarkQueueSet uses the >> shared queue for filtering the buffers, fiddling with activeness, >> printing and resetting, all without grabbing any locks. >> 4) We do not acquire any other lock (above event) while holding the >> free list lock or completed buffer list monitors. 
This was discovered >> by manually expanding the call graphs from where these two locks are >> held. >> >> Derived constraints: >> a) Because of observation 1, the free list lock and completed buffer >> list monitors can have the same rank. >> b) Because of observations 1 and 2, the shared queue lock ought to >> have a rank higher than the ranks of the free list lock and the >> completed buffer list monitors (not the case today). >> c) Because of of observation 3 and 2, the free list lock and >> completed buffer list monitors ought to have a rank lower than the >> rank of the shared queue lock. >> d) Because of observation 4 (and constraints a-c), all the barrier >> locks should be below the "special" rank without violating any >> existing ranks. >> >> The proposed new lock ranks conform to the constraints derived from >> my observations. It is worth noting that the potential relationship >> that could break (and why they do not) are: >> 1) If a lock is acquired from within the barriers that does not >> involve the shared queue lock, the free list lock or the completed >> buffer list monitor, we have now inverted their relationship as that >> other lock would probably have a rank higher than or equal to >> "special". But due to observation 4, there are no such cases. >> 2) The relationship between the shared queue lock and the completed >> buffer list monitor has been changed so both can be held at the same >> time if the shared queue lock is acquired first (which it is). This >> is arguably the way it should have been from the first place, and the >> old solution had ugly hacks where we would drop the shared queue lock >> to not run into the lock order assert (and only not to run into the >> lock order assert, i.e. not to avoid potential deadlock) to ensure >> the locks are not held at the same time. 
That code has now been >> removed, so that the shared queue lock is still held when enqueueing >> completed buffers (no dodgy dropping and reclaiming), and the code >> for handling the races due to multiple concurrent enqueuers has also >> been removed and replaced with an assertion that there simply should >> not be multiple concurrent enqueuers. Since the shared queue lock is >> now held throughout the whole operation, there will be no concurrent >> enqueuers. >> 3) The completed buffer list monitor used to have a higher rank than >> the free list lock. Now they have the same. Therefore, they could >> previously allow them to be held at the same time if the cbl monitor >> was acquired first. However, as discussed, there is no such case, and >> they ought to have the same rank not to confuse their true >> disjointness. If anyone insists we do not break this relationship >> despite the true disjointness, I could consent to adding another >> access lock rank, like this: >> http://cr.openjdk.java.net/~eosterlund/8182703/webrev.01/ but I think >> it seems better to have the same rank since they are actually truly >> disjoint and should remain disjoint. >> >> I do recognize that long term, we *might* want a lock-free solution >> or something (not saying we do or do not). But until then, the ranks >> ought to be corrected so that they do not cause these problems >> causing everyone to bash their head against the awkward G1 lock ranks >> throughout the code and make hacks around it. >> >> Testing: JPRT with hotspot all and lots of local testing. 
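The ordering invariant argued for in the analysis above can be illustrated with a toy rank checker (this is not HotSpot's Mutex code; all names and ranks here are illustrative). The rule it enforces is the one the new ranks rely on: a thread may only acquire a lock whose rank is strictly lower than the lowest rank it already holds, which is why Shared_DirtyCardQ_lock (access + 1) may be held while taking DirtyCardQ_CBL_mon (access), but not vice versa.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Toy lock-rank checker (illustrative only -- not HotSpot's Mutex class).
enum Rank { access = 0, access_plus_1 = 1, special = 2, leaf = 5 };

struct HeldLocks {
  std::vector<int> ranks;
  bool try_acquire(int rank) {
    if (!ranks.empty() &&
        rank >= *std::min_element(ranks.begin(), ranks.end())) {
      return false;  // would violate the ordering -> potential deadlock
    }
    ranks.push_back(rank);
    return true;
  }
  void release() { ranks.pop_back(); }
};

bool shared_then_cbl_ok() {
  HeldLocks t;
  bool a = t.try_acquire(access_plus_1);  // Shared_DirtyCardQ_lock first
  bool b = t.try_acquire(access);         // then DirtyCardQ_CBL_mon
  return a && b;
}

bool cbl_then_shared_rejected() {
  HeldLocks t;
  t.try_acquire(access);                // cbl monitor held...
  return !t.try_acquire(access_plus_1); // ...shared lock must be refused
}
```

Under the old ranks (shared queue lock and cbl monitor both at leaf - 1, free list lock at special), the first sequence already required the drop-and-reacquire hack; with the new ranks it is simply a legal descending acquisition.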
>> >> Thanks, >> /Erik From rkennke at redhat.com Mon Jul 3 13:08:22 2017 From: rkennke at redhat.com (Roman Kennke) Date: Mon, 3 Jul 2017 15:08:22 +0200 Subject: RFR: 8179387: Factor out CMS specific code from GenCollectedHeap into its own subclass In-Reply-To: <47e22e86-7d7c-606f-1936-346229f39ca2@oracle.com> References: <3521009f-6fab-4f8e-2375-b9d665a4c70b@redhat.com> <3d8b55a2-a787-3051-b351-ab9b0a24f5e0@redhat.com> <47e22e86-7d7c-606f-1936-346229f39ca2@oracle.com> Message-ID: In fact, my original plan was to also factor out the serial specific stuff into a new subclass. This means everything that is truly shared between SerialHeap and CMSHeap would remain in GenCollectedHeap (for now), and everything else would move down to the specific subclasses. Then we can see what remains shared, and what is GC specific, and go from there. What do you think? Roman On 03.07.2017 at 09:35, Mikael Gerdin wrote: > Hi Roman, > > On 2017-06-30 18:32, Roman Kennke wrote: >> I came across one problem using this approach: We will have 2 instances >> of CollectedHeap around, where there's usually only 1, and some code >> expects only 1. For example, in CollectedHeap constructor, we create new >> PerfData variables, and we now create them 2x, which leads to an assert >> being thrown. I suspect there is more code like that. >> >> I will attempt to refactor this a little more, maybe it's not that bad, >> but it's probably not worth spending too much time on it. > > I think refactoring the code to not expect a singleton CollectedHeap > instance is a bit too much. > Perhaps there is another way to share common code between Serial and > CMS but that might require a bit more thought. > > /Mikael > >> >> Roman >>> Hi Roman, >>> >>> thanks for putting this patch together, it is a great step forward! One >>> thing that (in my mind) would improve it even further is if we embed a >>> GenCollectedHeap in CMSHeap and then make CMSHeap inherit directly from >>> CollectedHeap.
>>> >>> With this solution, the definition of CMSHeap would look like something >>> along the lines of: >>> >>> class CMSHeap : public CollectedHeap { >>> WorkGang* _wg; >>> GenCollectedHeap _gch; >>> >>> public: >>> CMSHeap(GenCollectorPolicy* policy) : >>> _wg(new WorkGang("GC Thread", ParallelGCThreads, true, true), >>> _gch(policy) { >>> _wg->initialize_workers(); >>> } >>> >>> // a bunch of "facade" methods >>> virtual bool supports_tlab_allocation() const { >>> return _gch->supports_tlab_allocation(); >>> } >>> >>> virtual size_t tlab_capacity(Thread* t) const { >>> return _gch->tlab_capacity(t); >>> } >>> }; >>> >>> With this approach, you would have to implement a bunch of "facade" >>> methods that just delegates to _gch, such as the methods >>> supports_tlab_allocation and tlab_capacity above. There are two reasons >>> why I prefer this approach: >>> 1. In the end we want CMSHeap to inherit from CollectedHeap anyway :) >>> 2. It makes it very clear which methods we gradually have to >>> re-implement in CMSHeap to eventually get rid of the _gch field >>> (the >>> end goal). This is much harder to see if CMSHeap inherits from >>> GenCollectedHeap (see more below). >>> >>> The second point will most likely cause some initial problems with >>> `protected` code in GenCollectedHeap. For example, as you noticed when >>> creating this patch, CMSHeap make use of a few `protected` fields and >>> methods from GenCollectedHeap, most notably: >>> - _process_strong_tasks >>> - process_roots() >>> - process_string_table_roots() >>> >>> It would be much better (IMO) to share this code via composition rather >>> than inheritance. In this particular case, I would prefer to create a >>> class StrongRootsProcessor that encapsulates the root processing logic. >>> Then GenCollectedHeap and CMSHeap can both contain an instance of >>> StrongRootsProcessor. >>> >>> What do you think of this approach? Do you have some spare cycles to >>> try >>> this approach out? 
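The composition idea suggested above (a StrongRootsProcessor owned by value instead of inherited `protected` members) can be sketched like this; all class and method names are simplified stand-ins, not the actual HotSpot declarations:

```cpp
#include <cassert>
#include <string>

// Shared root-processing logic, owned by value rather than inherited
// (illustrative stand-in for the suggested StrongRootsProcessor).
class StrongRootsProcessor {
  int _roots_processed = 0;
public:
  void process_roots() { ++_roots_processed; }
  int roots_processed() const { return _roots_processed; }
};

class CollectedHeapLike {
public:
  virtual ~CollectedHeapLike() {}
  virtual std::string name() const = 0;
  virtual void collect() = 0;
};

class GenCollectedHeapLike : public CollectedHeapLike {
protected:
  StrongRootsProcessor _root_processor;  // composition, not inheritance
public:
  std::string name() const override { return "Serial"; }
  void collect() override { _root_processor.process_roots(); }
  int roots_processed() const { return _root_processor.roots_processed(); }
};

class CMSHeapLike : public CollectedHeapLike {
  StrongRootsProcessor _root_processor;  // its own instance of the shared logic
public:
  std::string name() const override { return "CMS"; }
  void collect() override { _root_processor.process_roots(); }
  int roots_processed() const { return _root_processor.roots_processed(); }
};

// Helper: run n collections on a fresh heap of type HeapT and report
// how many root-processing passes its processor saw.
template <typename HeapT>
int roots_after(int n) {
  HeapT heap;
  for (int i = 0; i < n; i++) heap.collect();
  return heap.roots_processed();
}
```

The point of the sketch is that neither heap class needs `protected` access to the other's internals: each owns the shared logic, which is exactly what makes the remaining coupling visible.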
>>> >>> Thanks, >>> Erik >>> >>> On 06/02/2017 10:55 AM, Roman Kennke wrote: >>>> Take this patch. It #ifdef ASSERT's a call to check_gen_kinds() >>>> that is >>>> only present in debug builds. >>>> >>>> >>>> http://cr.openjdk.java.net/~rkennke/8179387/webrev.01/ >>>> >>>> >>>> Roman >>>> >>>> Am 01.06.2017 um 22:50 schrieb Roman Kennke: >>>>> What $SUBJECT says. >>>>> >>>>> I went over genCollectedHeap.[hpp|cpp] and moved everything that I >>>>> could >>>>> find that is CMS-only into a new CMSHeap class. >>>>> >>>>> http://cr.openjdk.java.net/~rkennke/8179387/webrev.00/ >>>>> >>>>> >>>>> It is possible that I overlooked something there. There may be >>>>> code in >>>>> there that doesn't shout "CMS" at me, but is still intrinsically >>>>> CMS stuff. >>>>> >>>>> Also not that I have not removed that little part: >>>>> >>>>> always_do_update_barrier = UseConcMarkSweepGC; >>>>> >>>>> because I expect it to go away with Erik ?'s big refactoring. >>>>> >>>>> What do you think? >>>>> >>>>> Testing: hotspot_gc, specjvm, some little apps with >>>>> -XX:+UseConcMarkSweepGC >>>>> >>>>> Roman >>>>> >> From stefan.johansson at oracle.com Mon Jul 3 13:12:51 2017 From: stefan.johansson at oracle.com (Stefan Johansson) Date: Mon, 3 Jul 2017 15:12:51 +0200 Subject: RFR (XS): 8179677: Let G1Remset::_conc_refined_cards only count number of cards concurrently refined In-Reply-To: <1499075917.2802.8.camel@oracle.com> References: <1499075917.2802.8.camel@oracle.com> Message-ID: <4331c480-7c20-8d39-9ea9-7418a86a878d@oracle.com> On 2017-07-03 11:58, Thomas Schatzl wrote: > Hi all, > > can I have reviews for this small change that > makes G1Remset::_conc_refined_cards only count the number of > concurrently refined cards (+ some trivial renaming of the variable)? > > The reason is that I plan to add the number of refined cards during gc > as separately soon. This has been suggested earlier in some internal > discussion, and I agree. 
> > CR: > https://bugs.openjdk.java.net/browse/JDK-8179677 > Webrev: > http://cr.openjdk.java.net/~tschatzl/8179677/webrev.1/ Looks good, StefanJ > Testing: > jprt > > Thanks, > Thomas From stefan.johansson at oracle.com Mon Jul 3 14:27:14 2017 From: stefan.johansson at oracle.com (Stefan Johansson) Date: Mon, 3 Jul 2017 16:27:14 +0200 Subject: RFR (XS): 8183397: Ensure consistent closure filtering during evacuation In-Reply-To: <1499081088.2802.29.camel@oracle.com> References: <1499081088.2802.29.camel@oracle.com> Message-ID: <80f207b6-5458-10d7-f40b-8001887adef4@oracle.com> On 2017-07-03 13:24, Thomas Schatzl wrote: > Hi all, > > can I have reviews for this change that fixes an observation that has > been made recently by Erik, i.e. that the "else" part of several > evacuation closures inconsistently filters out non-cross-region > references before checking whether the referenced object is a humongous > or ext region. > > This causes somewhat hard to diagnose performance issues, and earlier > filtering does not hurt if done anyway. > > (Note that the current way of checking in all but the UpdateRS closure > using HeapRegion::is_in_same_region() seems optimal. 
The only reason > why the other way in the UpdateRS closure is better because the code > needs the "to" HeapRegion pointer anyway) > > CR: > https://bugs.openjdk.java.net/browse/JDK-8183397 > Webrev: > http://cr.openjdk.java.net/~tschatzl/8183397/webrev/ Looks good, StefanJ > Testing: > jprt, performance regression analysis > > Thanks, > Thomas From rkennke at redhat.com Mon Jul 3 14:39:34 2017 From: rkennke at redhat.com (Roman Kennke) Date: Mon, 3 Jul 2017 16:39:34 +0200 Subject: RFR: Parallelize safepoint cleanup In-Reply-To: <72d197f7-a99b-84bc-26f7-c9a84da26ccd@oracle.com> References: <46ad874e-eb41-7927-265a-40dea92dfe1e@oracle.com> <5711258b-99b0-e06f-ba6e-0b6b55d88345@redhat.com> <0e1e2779-9316-b756-6cc8-e0c8add14a94@oracle.com> <1910961c-11bd-0e86-dd03-4fce66b9969f@redhat.com> <2b466176-b688-53a8-bef9-c7ec2c8c745b@oracle.com> <42872a15-d26c-9798-c6a2-f3f7c945baf7@redhat.com> <5e7c7d00-4acd-bea3-3525-33dbd9159efb@oracle.com> <6f2c6de7-298b-bf14-ab1f-430c4acd43c9@redhat.com> <5cd676de-872d-6d4a-691b-da561173f7d0@oracle.com> <61d80e98-275f-b2b8-4ac7-6d5d03b047de@redhat.com> <676d3b56-cee0-b68a-d700-e43695355148@redhat.com> <1fbd2b4a-9aef-d6db-726e-929b6b466e4c@oracle.com> <08391C19-4675-475C-A30D-F10B364B5AF3@redhat.com> <9a882506-282a-ec74-27de-5b22e258e352@oracle.com> <47667919-0786-56a0-ebf9-d7c1b48766c2@redhat.com> <72d197f7-a99b-84bc-26f7-c9a84da26ccd@oracle.com> Message-ID: Hi Robbin, does this require another review? I am not sure about David Holmes? If not, I'm going to need a sponsor. 
Thanks and cheers, Roman Am 29.06.2017 um 21:27 schrieb Robbin Ehn: > Hi Roman, > > Thanks, > > There seem to be a performance gain vs old just running VM thread > (again shaky numbers, but an indication): > > Old code with, MonitorUsedDeflationThreshold=0, 0.003099s, avg of 10 > worsed cleanups 0.0213s > Do it VM thread, MonitorUsedDeflationThreshold=0, 0.002782s, avg of 10 > worsed cleanups 0.0173s > > I'm assuming that combining deflation and nmethods marking in same > pass is the reason for this. > Great! > > I'm happy, looks good! > > Thanks for fixing! > > /Robbin > > On 06/29/2017 08:25 PM, Roman Kennke wrote: >> I just did a run with gcbench. >> I am running: >> >> build/linux-x86_64-normal-server-release/images/jdk/bin/java -jar >> target/benchmarks.jar roots.Sync --jvmArgs "-Xmx8g -Xms8g >> -XX:ParallelSafepointCleanupThreads=1 -XX:-UseBiasedLocking --add-opens >> java.base/jdk.internal.misc=ALL-UNNAMED -XX:+PrintSafepointStatistics" >> -p size=500000 -wi 5 -i 5 -f 1 >> >> i.e. I am giving it 500,000 monitors per thread on 8 java threads. >> >> with VMThread I am getting: >> >> vmop [ threads: total >> initially_running wait_to_block ][ time: spin block sync >> cleanup vmop ] page_trap_count >> 0,646: G1IncCollectionPause [ >> 19 4 6 ][ 0 0 0 >> 158 225 ] 4 >> 1,073: G1IncCollectionPause [ >> 19 5 6 ][ 1 0 1 >> 159 174 ] 5 >> 1,961: G1IncCollectionPause [ >> 19 2 6 ][ 0 0 0 >> 130 66 ] 2 >> 2,202: G1IncCollectionPause [ >> 19 5 6 ][ 1 0 1 >> 127 70 ] 5 >> 2,445: G1IncCollectionPause [ >> 19 7 7 ][ 1 0 1 >> 127 66 ] 7 >> 2,684: G1IncCollectionPause [ >> 19 7 7 ][ 1 0 1 >> 127 66 ] 7 >> 3,371: G1IncCollectionPause [ >> 19 5 7 ][ 1 0 1 >> 127 74 ] 5 >> 3,619: G1IncCollectionPause [ >> 19 5 6 ][ 1 0 1 >> 127 66 ] 5 >> 3,857: G1IncCollectionPause [ >> 19 6 6 ][ 1 0 1 >> 126 68 ] 6 >> >> I.e. it gets to fairly consistent >120us for cleanup. 
>> >> With 4 safepoint cleanup threads I get: >> >> >> vmop [ threads: total >> initially_running wait_to_block ][ time: spin block sync >> cleanup vmop ] page_trap_count >> 0,650: G1IncCollectionPause [ >> 19 4 6 ][ 0 0 0 >> 63 197 ] 4 >> 0,951: G1IncCollectionPause [ >> 19 0 1 ][ 0 0 0 >> 64 151 ] 0 >> 1,214: G1IncCollectionPause [ >> 19 7 8 ][ 0 0 0 >> 62 93 ] 6 >> 1,942: G1IncCollectionPause [ >> 19 4 6 ][ 1 0 1 >> 59 71 ] 4 >> 2,118: G1IncCollectionPause [ >> 19 6 6 ][ 1 0 1 >> 59 72 ] 6 >> 2,296: G1IncCollectionPause [ >> 19 5 6 ][ 0 0 0 >> 59 69 ] 5 >> >> i.e. fairly consistently around 60 us (I think it's us?!) >> >> I grant you that I'm throwing way way more monitors at it. With just >> 12000 monitors per thread I get columns of 0s under cleanup. :-) >> >> Roman >> >> Here's with 1 tAm 29.06.2017 um 14:17 schrieb Robbin Ehn: >>> The test is using 24 threads (whatever that means), total number of >>> javathreads is 57 (including compiler, etc...). >>> >>> [29.186s][error][os ] Num threads:57 >>> [29.186s][error][os ] omInUseCount:0 >>> [29.186s][error][os ] omInUseCount:2064 >>> [29.187s][error][os ] omInUseCount:1861 >>> [29.188s][error][os ] omInUseCount:1058 >>> [29.188s][error][os ] omInUseCount:2 >>> [29.188s][error][os ] omInUseCount:577 >>> [29.189s][error][os ] omInUseCount:1443 >>> [29.189s][error][os ] omInUseCount:122 >>> [29.189s][error][os ] omInUseCount:47 >>> [29.189s][error][os ] omInUseCount:497 >>> [29.189s][error][os ] omInUseCount:16 >>> [29.189s][error][os ] omInUseCount:113 >>> [29.189s][error][os ] omInUseCount:5 >>> [29.189s][error][os ] omInUseCount:678 >>> [29.190s][error][os ] omInUseCount:105 >>> [29.190s][error][os ] omInUseCount:609 >>> [29.190s][error][os ] omInUseCount:286 >>> [29.190s][error][os ] omInUseCount:228 >>> [29.190s][error][os ] omInUseCount:1391 >>> [29.191s][error][os ] omInUseCount:1652 >>> [29.191s][error][os ] omInUseCount:325 >>> [29.191s][error][os ] omInUseCount:439 >>> [29.192s][error][os ] 
omInUseCount:994 >>> [29.192s][error][os ] omInUseCount:103 >>> [29.192s][error][os ] omInUseCount:2337 >>> [29.193s][error][os ] omInUseCount:0 >>> [29.193s][error][os ] omInUseCount:0 >>> [29.193s][error][os ] omInUseCount:0 >>> [29.193s][error][os ] omInUseCount:1 >>> [29.193s][error][os ] omInUseCount:1 >>> [29.193s][error][os ] omInUseCount:0 >>> [29.193s][error][os ] omInUseCount:1 >>> [29.193s][error][os ] omInUseCount:2 >>> [29.193s][error][os ] omInUseCount:1 >>> [29.193s][error][os ] omInUseCount:1 >>> [29.193s][error][os ] omInUseCount:0 >>> [29.193s][error][os ] omInUseCount:1 >>> [29.193s][error][os ] omInUseCount:0 >>> [29.193s][error][os ] omInUseCount:0 >>> [29.193s][error][os ] omInUseCount:0 >>> [29.193s][error][os ] omInUseCount:0 >>> [29.193s][error][os ] omInUseCount:0 >>> [29.193s][error][os ] omInUseCount:0 >>> [29.193s][error][os ] omInUseCount:0 >>> [29.193s][error][os ] omInUseCount:0 >>> [29.193s][error][os ] omInUseCount:0 >>> [29.193s][error][os ] omInUseCount:0 >>> [29.193s][error][os ] omInUseCount:0 >>> [29.193s][error][os ] omInUseCount:0 >>> [29.193s][error][os ] omInUseCount:0 >>> [29.193s][error][os ] omInUseCount:0 >>> [29.193s][error][os ] omInUseCount:0 >>> [29.193s][error][os ] omInUseCount:0 >>> [29.193s][error][os ] omInUseCount:0 >>> [29.193s][error][os ] omInUseCount:1 >>> [29.193s][error][os ] omInUseCount:0 >>> [29.193s][error][os ] omInUseCount:0 >>> >>> So in my setup even if you parallel the per thread in use monitors >>> work the synchronization overhead is still larger. >>> >>> /Robbin >>> >>> On 06/29/2017 01:42 PM, Roman Kennke wrote: >>>> How many Java threads are involved in monitor Inflation ? >>>> Parallelization is spread by Java threads (i.e. each worker claims >>>> and deflates monitors of 1 java thread per step). >>>> >>>> Roman >>>> >>>> Am 29. 
Juni 2017 12:49:58 MESZ schrieb Robbin Ehn >>>> : >>>> >>>> Hi Roman, >>>> >>>> I haven't had the time to test all scenarios, and the numbers are >>>> just an indication: >>>> >>>> Do it VM thread, MonitorUsedDeflationThreshold=0, 0.002782s avg, >>>> avg of 10 worsed cleanups 0.0173s >>>> Do it 4 workers, MonitorUsedDeflationThreshold=0, 0.002923s avg, >>>> avg of 10 worsed cleanups 0.0199s >>>> Do it VM thread, MonitorUsedDeflationThreshold=1, 0.001889s avg, >>>> avg of 10 worsed cleanups 0.0066s >>>> >>>> When MonitorUsedDeflationThreshold=0 we are talking about 120000 >>>> free monitors to deflate. >>>> And I get worse numbers doing the cleanup in 4 threads. >>>> >>>> Any idea why I see these numbers? >>>> >>>> Thanks, Robbin >>>> >>>> On 06/28/2017 10:23 PM, Roman Kennke wrote: >>>> >>>> >>>> >>>> On 06/27/2017 09:47 PM, Roman Kennke wrote: >>>> >>>> Hi Robbin, >>>> >>>> Ugh. Thanks for catching this. >>>> Problem was that I was accounting the thread-local >>>> deflations twice: >>>> once in thread-local processing (basically a leftover >>>> from my earlier >>>> attempt to implement this accounting) and then >>>> again in >>>> finish_deflate_idle_monitors(). Should be fixed here: >>>> >>>> >>>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.09/ >>>> >>>> >>>> >>>> >>>> Nit: >>>> safepoint.cpp : ParallelSPCleanupTask >>>> "const char* name = " is not needed and 1 is unused >>>> >>>> >>>> Sorry, I don't understand what you mean by this. I see code >>>> like this: >>>> >>>> const char* name = "deflating idle monitors"; >>>> >>>> and it is used a few lines below, even 2x. >>>> >>>> What's '1 is unused' ? >>>> >>>> >>>> Side question: which jtreg targets do you usually >>>> run? >>>> >>>> >>>> Right now I cherry pick directories from: hotspot/test/ >>>> >>>> I'm going to add a decent test group for local testing. >>>> >>>> That would be good! 
>>>> >>>> >>>> >>>> >>>> Trying: make test TEST=hotspot_all >>>> gives me *lots* of failures due to missing jcstress >>>> stuff (?!) >>>> And even other subsets seem to depend on several bits >>>> and pieces >>>> that I >>>> have no idea about. >>>> >>>> >>>> Yes, you need to use internal tool 'jib' java integrate >>>> build to get >>>> that work or you can set some environment where the >>>> jcstress >>>> application stuff is... >>>> >>>> Uhhh. We really do want a subset of tests that we can run >>>> reliably and >>>> that are self-contained, how else are people (without that >>>> jib thingy) >>>> supposed to do some sanity checking with their patches? ;-) >>>> >>>> I have a regression on ClassLoaderData root scanning, >>>> this should not >>>> be related, >>>> but I only have 3 patches which could cause this, if it's >>>> not >>>> something in the environment that have changed. >>>> >>>> Let me know if it's my patch :-) >>>> >>>> >>>> Also do not see any immediate performance gains (off vs 4 >>>> threads), it >>>> might be >>>> http://hg.openjdk.java.net/jdk10/hs/hotspot/rev/06994badeb24 >>>> , but I need to-do some more testing. I know you often >>>> run with none >>>> default GSI. >>>> >>>> >>>> First of all, during the course of this review I reduced the >>>> change from >>>> an actual implementation to a kind of framework, and it needs >>>> some >>>> separate changes in the GC to make use of it. Not sure if you >>>> added >>>> corresponding code in (e.g.) G1? >>>> >>>> Also, this is only really visible in code that makes >>>> excessive use of >>>> monitors, i.e. the one linked by Carsten's original patch, or >>>> the test >>>> org.openjdk.gcbench.roots.Synchronizers.test in gc-bench: >>>> >>>> http://icedtea.classpath.org/hg/gc-bench/ >>>> >>>> There are also some popular real-world apps that tend to do >>>> this. From >>>> the top off my head, Cassandra is such an application. >>>> >>>> Thanks, Roman >>>> >>>> >>>> I'll get back to you. 
>>>> >>>> Thanks, Robbin >>>> >>>> >>>> Roman >>>> >>>> Am 27.06.2017 um 16:51 schrieb Robbin Ehn: >>>> >>>> Hi Roman, >>>> >>>> There is something wrong in calculations: >>>> INFO: Deflate: InCirc=43 InUse=18 Scavenged=25 >>>> ForceMonitorScavenge=0 >>>> : pop=27051 free=215487 >>>> >>>> free is larger than population, have not had the >>>> time to dig into this. >>>> >>>> Thanks, Robbin >>>> >>>> On 06/22/2017 10:19 PM, Roman Kennke wrote: >>>> >>>> So here's the latest iteration of that patch: >>>> >>>> >>>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.08/ >>>> >>>> >>>> >>>> I checked and fixed all the counters. The >>>> problem here is that they >>>> are >>>> not updated in a single place >>>> (deflate_idle_monitors() ) but in >>>> several >>>> places, potentially by multiple threads. I >>>> split up deflation into >>>> prepare_.. and a finish_.. methods to >>>> initialize local and update >>>> global >>>> counters respectively, and pass around a >>>> counters object (allocated on >>>> stack) to the various code paths that use it. >>>> Updating the counters >>>> always happen under a lock, there's no need >>>> to do anything special >>>> with >>>> regards to concurrency. >>>> >>>> I also checked the nmethod marking, but there >>>> doesn't seem to be >>>> anything in that code that looks problematic >>>> under concurrency. The >>>> worst that can happen is that two threads >>>> write the same value into an >>>> nmethod field. I think we can live with >>>> that ;-) >>>> >>>> Good to go? >>>> >>>> Tested by running specjvm and jcstress >>>> fastdebug+release without >>>> issues. >>>> >>>> Roman >>>> >>>> Am 02.06.2017 um 12:39 schrieb Robbin Ehn: >>>> >>>> Hi Roman, >>>> >>>> On 06/02/2017 11:41 AM, Roman Kennke >>>> wrote: >>>> >>>> Hi David, >>>> thanks for reviewing. I'll be on >>>> vacation the next two weeks too, >>>> with >>>> only sporadic access to work stuff. 
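The counter handling described in the quoted message above (deflation split into prepare/finish steps, with a stack-allocated counters object published once under a lock) can be sketched as follows; the names and the "every third monitor is idle" rule are invented for illustration, not taken from the actual patch:

```cpp
#include <cassert>
#include <mutex>

// Global totals, only updated while holding g_counter_lock
// (illustrative stand-in for the real deflation statistics).
std::mutex g_counter_lock;
long g_total_scavenged = 0;

// Stack-allocated per-worker counters, as in the prepare_/finish_ split:
// workers accumulate locally with no synchronization, then publish once.
struct DeflateCounters {
  long scavenged = 0;
};

void deflate_for_thread(int monitors_to_scan, DeflateCounters* counters) {
  for (int i = 0; i < monitors_to_scan; i++) {
    if (i % 3 == 0) counters->scavenged++;  // pretend every third monitor is idle
  }
}

void finish_deflate(DeflateCounters* counters) {
  std::lock_guard<std::mutex> guard(g_counter_lock);  // single publish point
  g_total_scavenged += counters->scavenged;
}

long run_deflation(int num_workers, int monitors_per_thread) {
  // In the real patch each iteration runs on a separate safepoint cleanup
  // worker; done sequentially here to keep the sketch self-contained.
  for (int w = 0; w < num_workers; w++) {
    DeflateCounters counters;  // on the worker's stack
    deflate_for_thread(monitors_per_thread, &counters);
    finish_deflate(&counters);
  }
  return g_total_scavenged;
}
```

Because each worker touches only its own stack object until the final locked merge, the totals cannot be double-counted the way the earlier webrev accidentally did.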
>>>> Yes, exposure will not be as good as >>>> otherwise, but it's not totally >>>> untested either: the serial code path >>>> is the same as the >>>> parallel, the >>>> only difference is that it's not >>>> actually called by multiple >>>> threads. >>>> It's ok I think. >>>> >>>> I found two more issues that I think >>>> should be addressed: >>>> - There are some counters in >>>> deflate_idle_monitors() and I'm not >>>> sure I >>>> correctly handle them in the split-up >>>> and MT'ed thread-local/ global >>>> list deflation >>>> - nmethod marking seems to >>>> unconditionally poke true or something >>>> like >>>> that in nmethod fields. This doesn't >>>> hurt correctness-wise, but it's >>>> probably worth checking if it's >>>> already true, especially when doing >>>> this >>>> with multiple threads concurrently. >>>> >>>> I'll send an updated patch around >>>> later, I hope I can get to it >>>> today... >>>> >>>> >>>> I'll review that when you get it out. >>>> I think this looks as a reasonable step >>>> before we tackle this with a >>>> major effort, such as the JEP you and >>>> Carsten are doing. >>>> And another effort to 'fix' nmethods >>>> marking. >>>> >>>> Internal discussion yesterday led us to >>>> conclude that the runtime >>>> will probably need more threads. >>>> This would be a good driver to do a >>>> 'global' worker pool which serves >>>> both gc, runtime and safepoints with >>>> threads. >>>> >>>> >>>> Roman >>>> >>>> Hi Roman, >>>> >>>> I am about to disappear on an >>>> extended vacation so will let others >>>> pursue this. IIUC this is no longer >>>> an opt-in by the user at runtime, >>>> but >>>> an opt-in by the particular GC >>>> developers. Okay. My only concern >>>> with >>>> that is if Shenandoah is the only >>>> GC that currently opts in then >>>> this >>>> code is not going to get much >>>> testing and will be more prone to >>>> incidental breakage. >>>> >>>> >>>> As I mentioned before, it seems like Erik >>>> Ö
have some idea, maybe he >>>> can do this after his barrier patch. >>>> >>>> Thanks! >>>> >>>> /Robbin >>>> >>>> >>>> Cheers, >>>> David >>>> >>>> On 2/06/2017 2:21 AM, Roman >>>> Kennke wrote: >>>> >>>> Am 01.06.2017 um 17:50 >>>> schrieb Roman Kennke: >>>> >>>> Am 01.06.2017 um 14:18 >>>> schrieb Robbin Ehn: >>>> >>>> Hi Roman, >>>> >>>> On 06/01/2017 11:29 >>>> AM, Roman Kennke wrote: >>>> >>>> Am 31.05.2017 um >>>> 22:06 schrieb Robbin Ehn: >>>> >>>> Hi Roman, I >>>> agree that is really needed but: >>>> >>>> On 05/31/2017 >>>> 10:27 AM, Roman Kennke wrote: >>>> >>>> I >>>> realized that sharing workers with GC is not so easy. >>>> >>>> We need >>>> to be able to use the workers at a safepoint during >>>> >>>> concurrent >>>> GC work >>>> (which also uses the same workers). This does not >>>> only >>>> require >>>> that >>>> those workers be suspended, like e.g. >>>> >>>> SuspendibleThreadSet::yield(), but they need to be idle, i.e. >>>> have >>>> finished >>>> their tasks. This needs some careful handling to >>>> work >>>> without >>>> races: it >>>> requires a SuspendibleThreadSetJoiner around the >>>> >>>> corresponding >>>> >>>> run_task() call and also the tasks themselves need to join >>>> the >>>> STS and >>>> handle >>>> requests for safepoints not by yielding, but by >>>> leaving >>>> the >>>> task. >>>> This is >>>> far too peculiar for me to make the call to hook >>>> up GC >>>> workers >>>> for >>>> safepoint cleanup, and I thus removed those parts. I >>>> left the >>>> API in >>>> >>>> CollectedHeap in place. I think GC devs who know better >>>> about G1 >>>> and CMS >>>> should >>>> make that call, or else just use a separate thread >>>> pool. >>>> >>>> >>>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.05/ >>>> >>>> >>>> >>>> Is it ok >>>> now? 
>>>> >>>> I still think >>>> you should put the "Parallel Safepoint Cleanup" >>>> workers >>>> inside >>>> Shenandoah, >>>> so the >>>> SafepointSynchronizer only calls get_safepoint_workers, >>>> e.g.: >>>> >>>> >>>> _cleanup_workers = heap->get_safepoint_workers(); >>>> >>>> _num_cleanup_workers = _cleanup_workers != NULL ? >>>> >>>> _cleanup_workers->total_workers() : 1; >>>> >>>> ParallelSPCleanupTask cleanup(_cleanup_subtasks); >>>> >>>> StrongRootsScope srs(_num_cleanup_workers); >>>> if >>>> (_cleanup_workers != NULL) { >>>> >>>> _cleanup_workers->run_task(&cleanup, >>>> >>>> _num_cleanup_workers); >>>> } else { >>>> cleanup.work >>>> (0); >>>> } >>>> >>>> That way you >>>> don't even need your new flags, but it will be >>>> up to >>>> the >>>> other GCs to >>>> make their worker available >>>> or cheat with >>>> a separate workgang. >>>> >>>> I can do that, I >>>> don't mind. The question is, do we want that? >>>> >>>> The problem is that >>>> we do not want to haste such decision, we >>>> believe >>>> there is a better >>>> solution. >>>> I think you also >>>> would want another solution. >>>> But it's seems like >>>> such solution with 1 'global' thread pool >>>> either >>>> own by GC or the VM >>>> it self is quite the undertaking. >>>> Since this probably >>>> will not be done any time soon my >>>> suggestion is, >>>> to not hold you back >>>> (we also want this), just to make >>>> the code parallel and >>>> as an intermediate step ask the GC if it >>>> minds >>>> sharing it's thread. >>>> >>>> Now when Shenandoah >>>> is merged it's possible that e.g. G1 will >>>> share >>>> the code for a >>>> separate thread pool, do something of it's own or >>>> wait until the bigger >>>> question about thread pool(s) have been >>>> resolved. >>>> >>>> By adding a thread >>>> pool directly to the SafepointSynchronizer >>>> and >>>> flags for it we might >>>> limit our future options. >>>> >>>> I wouldn't call >>>> it 'cheating with a separate workgang' >>>> though. 
I >>>> see >>>> that both G1 and >>>> CMS suspend their worker threads at a >>>> safepoint. >>>> However: >>>> >>>> Yes it's not cheating >>>> but I want decent heuristics between e.g. >>>> number >>>> of concurrent marking >>>> threads and parallel safepoint threads >>>> since >>>> they compete for cpu >>>> time. >>>> As the code looks >>>> now, I think that decisions must be made by >>>> the >>>> GC. >>>> >>>> Ok, I see your point. I >>>> updated the proposed patch accordingly: >>>> >>>> >>>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.06/ >>>> >>>> >>>> >>>> Oops. Minor mistake there. >>>> Correction: >>>> >>>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.07/ >>>> >>>> >>>> >>>> (Removed 'class WorkGang' >>>> from safepoint.hpp, and forgot to add it >>>> into >>>> collectedHeap.hpp, resulting >>>> in build failure...) >>>> >>>> Roman >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> -- >>>> This message was sent from my Android device with K-9 Mail. >> >> From thomas.schatzl at oracle.com Mon Jul 3 15:04:42 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Mon, 03 Jul 2017 17:04:42 +0200 Subject: RFR (XS): 8179677: Let G1Remset::_conc_refined_cards only count number of cards concurrently refined In-Reply-To: <1499075917.2802.8.camel@oracle.com> References: <1499075917.2802.8.camel@oracle.com> Message-ID: <1499094282.2802.132.camel@oracle.com> Hi all, Erik asked for a few renamings and some additional comments. Here are the new webrevs: http://cr.openjdk.java.net/~tschatzl/8179677/webrev.1_to_2 (diff) http://cr.openjdk.java.net/~tschatzl/8179677/webrev.2 (full) Thanks, Thomas On Mon, 2017-07-03 at 11:58 +0200, Thomas Schatzl wrote: > Hi all, > > can I have reviews for this small change that > makes G1Remset::_conc_refined_cards only count the number of > concurrently refined cards (+ some trivial renaming of the variable)? > > The reason is that I plan to add the number of refined cards during > gc > as separately soon.
This has been suggested earlier in some internal > discussion, and I agree. > > CR: > https://bugs.openjdk.java.net/browse/JDK-8179677 > Webrev: > http://cr.openjdk.java.net/~tschatzl/8179677/webrev.1/ > Testing: > jprt > > Thanks, > Thomas From rkennke at redhat.com Mon Jul 3 15:05:29 2017 From: rkennke at redhat.com (Roman Kennke) Date: Mon, 3 Jul 2017 17:05:29 +0200 Subject: RFR: 8179387: Factor out CMS specific code from GenCollectedHeap into its own subclass In-Reply-To: <9a846161-c8ac-dedf-5952-f457d546fd9a@redhat.com> References: <3521009f-6fab-4f8e-2375-b9d665a4c70b@redhat.com> <3d8b55a2-a787-3051-b351-ab9b0a24f5e0@redhat.com> <47e22e86-7d7c-606f-1936-346229f39ca2@oracle.com> <9a846161-c8ac-dedf-5952-f457d546fd9a@redhat.com> Message-ID: <4d5e6af8-d975-7803-64c5-7295e0d56154@redhat.com> Am 03.07.2017 um 11:13 schrieb Roman Kennke: > Am 03.07.2017 um 09:35 schrieb Mikael Gerdin: >> Hi Roman, >> >> On 2017-06-30 18:32, Roman Kennke wrote: >>> I came across one problem using this approach: We will have 2 instances >>> of CollectedHeap around, where there's usually only 1, and some code >>> expects only 1. For example, in CollectedHeap constructor, we create new >>> PerfData variables, and we now create them 2x, which leads to an assert >>> being thrown. I suspect there is more code like that. >>> >>> I will attempt to refactor this a little more, maybe it's not that bad, >>> but it's probably not worth spending too much time on it. >> I think refactoring the code to not expect a singleton CollectedHeap >> instance is a bit too much. >> Perhaps there is another way to share common code between Serial and >> CMS but that might require a bit more thought. > Yeah, definitely. I hit another difficulty: pretty much the same issues > that I'm having with GenCollectedHeap/CMSHeap/CollectedHeap now show up > with Generation and its subclasses.. > > How about we push the original patch that I've posted, and work from > there?
In fact, I *have* found some little things I would change (some > more if (UseConcMarkSweepGC) branches in GenCollectedHeap that I have > overlooked in my first pass...) So here's the little change (two more places in genCollectedHeap.hpp where UseConcMarkSweepGC was used to alter behaviour): http://cr.openjdk.java.net/~rkennke/8179387/webrev.02/ Ok to push this? Roman From thomas.schatzl at oracle.com Mon Jul 3 15:22:01 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Mon, 03 Jul 2017 17:22:01 +0200 Subject: RFR (XXS): 8183394: Remove unused G1RemSet::n_workers() method In-Reply-To: <55027601-074b-b92a-7516-b08282291b70@oracle.com> References: <1499075612.2802.5.camel@oracle.com> <55027601-074b-b92a-7516-b08282291b70@oracle.com> Message-ID: <1499095321.2802.134.camel@oracle.com> Thanks Erik, Mikael for your reviews! Thomas On Mon, 2017-07-03 at 14:44 +0200, Erik Helin wrote: > On 07/03/2017 11:53 AM, Thomas Schatzl wrote: > > > > Hi all, > > > > can I have a review for this trivial removal of an unused method? > > One > > Reviewer should be sufficient for this ;) > > > > CR: > > https://bugs.openjdk.java.net/browse/JDK-8183394 > > Webrev: > > http://cr.openjdk.java.net/~tschatzl/8183394/webrev/ > Looks good, Reviewed. Thanks for cleaning this up! > Erik > > > > > Testing: > > Local compilation > > > > Thanks, > > Thomas > > From erik.helin at oracle.com Mon Jul 3 15:41:44 2017 From: erik.helin at oracle.com (Erik Helin) Date: Mon, 3 Jul 2017 17:41:44 +0200 Subject: RFR (XS): 8179677: Let G1Remset::_conc_refined_cards only count number of cards concurrently refined In-Reply-To: <1499094282.2802.132.camel@oracle.com> References: <1499075917.2802.8.camel@oracle.com> <1499094282.2802.132.camel@oracle.com> Message-ID: <47787650-1ade-d89c-29a8-3b8b6e4e8bd0@oracle.com> On 07/03/2017 05:04 PM, Thomas Schatzl wrote: > Hi all, > > Erik asked for a few renamings and some additional comments.
Here are > the new webrevs: > http://cr.openjdk.java.net/~tschatzl/8179677/webrev.1_to_2 (diff) > http://cr.openjdk.java.net/~tschatzl/8179677/webrev.2 (full) Looks good, Reviewed. Thanks Thomas! Erik > Thanks, > Thomas > > On Mon, 2017-07-03 at 11:58 +0200, Thomas Schatzl wrote: >> Hi all, >> >> can I have reviews for this small change that >> makes G1Remset::_conc_refined_cards only count the number of >> concurrently refined cards (+ some trivial renaming of the variable)? >> >> The reason is that I plan to add the number of refined cards during >> gc >> as separately soon. This has been suggested earlier in some internal >> discussion, and I agree. >> >> CR: >> https://bugs.openjdk.java.net/browse/JDK-8179677 >> Webrev: >> http://cr.openjdk.java.net/~tschatzl/8179677/webrev.1/ >> Testing: >> jprt >> >> Thanks, >> Thomas From robbin.ehn at oracle.com Tue Jul 4 07:11:50 2017 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Tue, 4 Jul 2017 09:11:50 +0200 Subject: RFR: Parallelize safepoint cleanup In-Reply-To: References: <5711258b-99b0-e06f-ba6e-0b6b55d88345@redhat.com> <0e1e2779-9316-b756-6cc8-e0c8add14a94@oracle.com> <1910961c-11bd-0e86-dd03-4fce66b9969f@redhat.com> <2b466176-b688-53a8-bef9-c7ec2c8c745b@oracle.com> <42872a15-d26c-9798-c6a2-f3f7c945baf7@redhat.com> <5e7c7d00-4acd-bea3-3525-33dbd9159efb@oracle.com> <6f2c6de7-298b-bf14-ab1f-430c4acd43c9@redhat.com> <5cd676de-872d-6d4a-691b-da561173f7d0@oracle.com> <61d80e98-275f-b2b8-4ac7-6d5d03b047de@redhat.com> <676d3b56-cee0-b68a-d700-e43695355148@redhat.com> <1fbd2b4a-9aef-d6db-726e-929b6b466e4c@oracle.com> <08391C19-4675-475C-A30D-F10B364B5AF3@redhat.com> <9a882506-282a-ec74-27de-5b22e258e352@oracle.com> <47667919-0786-56a0-ebf9-d7c1b48766c2@redhat.com> <72d197f7-a99b-84bc-26f7-c9a84da26ccd@oracle.com> Message-ID: Hi Roman, On 07/03/2017 04:39 PM, Roman Kennke wrote: > Hi Robbin, > > does this require another review? I am not sure about David Holmes? 
David is back in Aug, I think he was pretty okay with it, but I think we should get another review. Most of our people had an extended weekend and are back tomorrow. I'm soon off for 5 weeks and I really want this to be pushed before that. > > If not, I'm going to need a sponsor. I will of course take care of that! /Robbin > > Thanks and cheers, > Roman > > Am 29.06.2017 um 21:27 schrieb Robbin Ehn: >> Hi Roman, >> >> Thanks, >> >> There seem to be a performance gain vs old just running VM thread >> (again shaky numbers, but an indication): >> >> Old code with, MonitorUsedDeflationThreshold=0, 0.003099s, avg of 10 >> worsed cleanups 0.0213s >> Do it VM thread, MonitorUsedDeflationThreshold=0, 0.002782s, avg of 10 >> worsed cleanups 0.0173s >> >> I'm assuming that combining deflation and nmethods marking in same >> pass is the reason for this. >> Great! >> >> I'm happy, looks good! >> >> Thanks for fixing! >> >> /Robbin >> >> On 06/29/2017 08:25 PM, Roman Kennke wrote: >>> I just did a run with gcbench. >>> I am running: >>> >>> build/linux-x86_64-normal-server-release/images/jdk/bin/java -jar >>> target/benchmarks.jar roots.Sync --jvmArgs "-Xmx8g -Xms8g >>> -XX:ParallelSafepointCleanupThreads=1 -XX:-UseBiasedLocking --add-opens >>> java.base/jdk.internal.misc=ALL-UNNAMED -XX:+PrintSafepointStatistics" >>> -p size=500000 -wi 5 -i 5 -f 1 >>> >>> i.e. I am giving it 500,000 monitors per thread on 8 java threads.
>>> >>> with VMThread I am getting: >>> >>> vmop [ threads: total >>> initially_running wait_to_block ][ time: spin block sync >>> cleanup vmop ] page_trap_count >>> 0,646: G1IncCollectionPause [ >>> 19 4 6 ][ 0 0 0 >>> 158 225 ] 4 >>> 1,073: G1IncCollectionPause [ >>> 19 5 6 ][ 1 0 1 >>> 159 174 ] 5 >>> 1,961: G1IncCollectionPause [ >>> 19 2 6 ][ 0 0 0 >>> 130 66 ] 2 >>> 2,202: G1IncCollectionPause [ >>> 19 5 6 ][ 1 0 1 >>> 127 70 ] 5 >>> 2,445: G1IncCollectionPause [ >>> 19 7 7 ][ 1 0 1 >>> 127 66 ] 7 >>> 2,684: G1IncCollectionPause [ >>> 19 7 7 ][ 1 0 1 >>> 127 66 ] 7 >>> 3,371: G1IncCollectionPause [ >>> 19 5 7 ][ 1 0 1 >>> 127 74 ] 5 >>> 3,619: G1IncCollectionPause [ >>> 19 5 6 ][ 1 0 1 >>> 127 66 ] 5 >>> 3,857: G1IncCollectionPause [ >>> 19 6 6 ][ 1 0 1 >>> 126 68 ] 6 >>> >>> I.e. it gets to fairly consistent >120us for cleanup. >>> >>> With 4 safepoint cleanup threads I get: >>> >>> >>> vmop [ threads: total >>> initially_running wait_to_block ][ time: spin block sync >>> cleanup vmop ] page_trap_count >>> 0,650: G1IncCollectionPause [ >>> 19 4 6 ][ 0 0 0 >>> 63 197 ] 4 >>> 0,951: G1IncCollectionPause [ >>> 19 0 1 ][ 0 0 0 >>> 64 151 ] 0 >>> 1,214: G1IncCollectionPause [ >>> 19 7 8 ][ 0 0 0 >>> 62 93 ] 6 >>> 1,942: G1IncCollectionPause [ >>> 19 4 6 ][ 1 0 1 >>> 59 71 ] 4 >>> 2,118: G1IncCollectionPause [ >>> 19 6 6 ][ 1 0 1 >>> 59 72 ] 6 >>> 2,296: G1IncCollectionPause [ >>> 19 5 6 ][ 0 0 0 >>> 59 69 ] 5 >>> >>> i.e. fairly consistently around 60 us (I think it's us?!) >>> >>> I grant you that I'm throwing way way more monitors at it. With just >>> 12000 monitors per thread I get columns of 0s under cleanup. :-) >>> >>> Roman >>> >>> Here's with 1 tAm 29.06.2017 um 14:17 schrieb Robbin Ehn: >>>> The test is using 24 threads (whatever that means), total number of >>>> javathreads is 57 (including compiler, etc...). 
>>>> >>>> [29.186s][error][os ] Num threads:57 >>>> [29.186s][error][os ] omInUseCount:0 >>>> [29.186s][error][os ] omInUseCount:2064 >>>> [29.187s][error][os ] omInUseCount:1861 >>>> [29.188s][error][os ] omInUseCount:1058 >>>> [29.188s][error][os ] omInUseCount:2 >>>> [29.188s][error][os ] omInUseCount:577 >>>> [29.189s][error][os ] omInUseCount:1443 >>>> [29.189s][error][os ] omInUseCount:122 >>>> [29.189s][error][os ] omInUseCount:47 >>>> [29.189s][error][os ] omInUseCount:497 >>>> [29.189s][error][os ] omInUseCount:16 >>>> [29.189s][error][os ] omInUseCount:113 >>>> [29.189s][error][os ] omInUseCount:5 >>>> [29.189s][error][os ] omInUseCount:678 >>>> [29.190s][error][os ] omInUseCount:105 >>>> [29.190s][error][os ] omInUseCount:609 >>>> [29.190s][error][os ] omInUseCount:286 >>>> [29.190s][error][os ] omInUseCount:228 >>>> [29.190s][error][os ] omInUseCount:1391 >>>> [29.191s][error][os ] omInUseCount:1652 >>>> [29.191s][error][os ] omInUseCount:325 >>>> [29.191s][error][os ] omInUseCount:439 >>>> [29.192s][error][os ] omInUseCount:994 >>>> [29.192s][error][os ] omInUseCount:103 >>>> [29.192s][error][os ] omInUseCount:2337 >>>> [29.193s][error][os ] omInUseCount:0 >>>> [29.193s][error][os ] omInUseCount:0 >>>> [29.193s][error][os ] omInUseCount:0 >>>> [29.193s][error][os ] omInUseCount:1 >>>> [29.193s][error][os ] omInUseCount:1 >>>> [29.193s][error][os ] omInUseCount:0 >>>> [29.193s][error][os ] omInUseCount:1 >>>> [29.193s][error][os ] omInUseCount:2 >>>> [29.193s][error][os ] omInUseCount:1 >>>> [29.193s][error][os ] omInUseCount:1 >>>> [29.193s][error][os ] omInUseCount:0 >>>> [29.193s][error][os ] omInUseCount:1 >>>> [29.193s][error][os ] omInUseCount:0 >>>> [29.193s][error][os ] omInUseCount:0 >>>> [29.193s][error][os ] omInUseCount:0 >>>> [29.193s][error][os ] omInUseCount:0 >>>> [29.193s][error][os ] omInUseCount:0 >>>> [29.193s][error][os ] omInUseCount:0 >>>> [29.193s][error][os ] omInUseCount:0 >>>> [29.193s][error][os ] omInUseCount:0 >>>> 
[29.193s][error][os ] omInUseCount:0 >>>> [29.193s][error][os ] omInUseCount:0 >>>> [29.193s][error][os ] omInUseCount:0 >>>> [29.193s][error][os ] omInUseCount:0 >>>> [29.193s][error][os ] omInUseCount:0 >>>> [29.193s][error][os ] omInUseCount:0 >>>> [29.193s][error][os ] omInUseCount:0 >>>> [29.193s][error][os ] omInUseCount:0 >>>> [29.193s][error][os ] omInUseCount:0 >>>> [29.193s][error][os ] omInUseCount:1 >>>> [29.193s][error][os ] omInUseCount:0 >>>> [29.193s][error][os ] omInUseCount:0 >>>> >>>> So in my setup even if you parallel the per thread in use monitors >>>> work the synchronization overhead is still larger. >>>> >>>> /Robbin >>>> >>>> On 06/29/2017 01:42 PM, Roman Kennke wrote: >>>>> How many Java threads are involved in monitor Inflation ? >>>>> Parallelization is spread by Java threads (i.e. each worker claims >>>>> and deflates monitors of 1 java thread per step). >>>>> >>>>> Roman >>>>> >>>>> Am 29. Juni 2017 12:49:58 MESZ schrieb Robbin Ehn >>>>> : >>>>> >>>>> Hi Roman, >>>>> >>>>> I haven't had the time to test all scenarios, and the numbers are >>>>> just an indication: >>>>> >>>>> Do it VM thread, MonitorUsedDeflationThreshold=0, 0.002782s avg, >>>>> avg of 10 worsed cleanups 0.0173s >>>>> Do it 4 workers, MonitorUsedDeflationThreshold=0, 0.002923s avg, >>>>> avg of 10 worsed cleanups 0.0199s >>>>> Do it VM thread, MonitorUsedDeflationThreshold=1, 0.001889s avg, >>>>> avg of 10 worsed cleanups 0.0066s >>>>> >>>>> When MonitorUsedDeflationThreshold=0 we are talking about 120000 >>>>> free monitors to deflate. >>>>> And I get worse numbers doing the cleanup in 4 threads. >>>>> >>>>> Any idea why I see these numbers? >>>>> >>>>> Thanks, Robbin >>>>> >>>>> On 06/28/2017 10:23 PM, Roman Kennke wrote: >>>>> >>>>> >>>>> >>>>> On 06/27/2017 09:47 PM, Roman Kennke wrote: >>>>> >>>>> Hi Robbin, >>>>> >>>>> Ugh. Thanks for catching this. 
>>>>> Problem was that I was accounting the thread-local >>>>> deflations twice: >>>>> once in thread-local processing (basically a leftover >>>>> from my earlier >>>>> attempt to implement this accounting) and then >>>>> again in >>>>> finish_deflate_idle_monitors(). Should be fixed here: >>>>> >>>>> >>>>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.09/ >>>>> >>>>> >>>>> >>>>> >>>>> Nit: >>>>> safepoint.cpp : ParallelSPCleanupTask >>>>> "const char* name = " is not needed and 1 is unused >>>>> >>>>> >>>>> Sorry, I don't understand what you mean by this. I see code >>>>> like this: >>>>> >>>>> const char* name = "deflating idle monitors"; >>>>> >>>>> and it is used a few lines below, even 2x. >>>>> >>>>> What's '1 is unused' ? >>>>> >>>>> >>>>> Side question: which jtreg targets do you usually >>>>> run? >>>>> >>>>> >>>>> Right now I cherry pick directories from: hotspot/test/ >>>>> >>>>> I'm going to add a decent test group for local testing. >>>>> >>>>> That would be good! >>>>> >>>>> >>>>> >>>>> >>>>> Trying: make test TEST=hotspot_all >>>>> gives me *lots* of failures due to missing jcstress >>>>> stuff (?!) >>>>> And even other subsets seem to depend on several bits >>>>> and pieces >>>>> that I >>>>> have no idea about. >>>>> >>>>> >>>>> Yes, you need to use internal tool 'jib' java integrate >>>>> build to get >>>>> that work or you can set some environment where the >>>>> jcstress >>>>> application stuff is... >>>>> >>>>> Uhhh. We really do want a subset of tests that we can run >>>>> reliably and >>>>> that are self-contained, how else are people (without that >>>>> jib thingy) >>>>> supposed to do some sanity checking with their patches? ;-) >>>>> >>>>> I have a regression on ClassLoaderData root scanning, >>>>> this should not >>>>> be related, >>>>> but I only have 3 patches which could cause this, if it's >>>>> not >>>>> something in the environment that have changed. 
>>>>> >>>>> Let me know if it's my patch :-) >>>>> >>>>> >>>>> Also do not see any immediate performance gains (off vs 4 >>>>> threads), it >>>>> might be >>>>> http://hg.openjdk.java.net/jdk10/hs/hotspot/rev/06994badeb24 >>>>> , but I need to-do some more testing. I know you often >>>>> run with none >>>>> default GSI. >>>>> >>>>> >>>>> First of all, during the course of this review I reduced the >>>>> change from >>>>> an actual implementation to a kind of framework, and it needs >>>>> some >>>>> separate changes in the GC to make use of it. Not sure if you >>>>> added >>>>> corresponding code in (e.g.) G1? >>>>> >>>>> Also, this is only really visible in code that makes >>>>> excessive use of >>>>> monitors, i.e. the one linked by Carsten's original patch, or >>>>> the test >>>>> org.openjdk.gcbench.roots.Synchronizers.test in gc-bench: >>>>> >>>>> http://icedtea.classpath.org/hg/gc-bench/ >>>>> >>>>> There are also some popular real-world apps that tend to do >>>>> this. From >>>>> the top off my head, Cassandra is such an application. >>>>> >>>>> Thanks, Roman >>>>> >>>>> >>>>> I'll get back to you. >>>>> >>>>> Thanks, Robbin >>>>> >>>>> >>>>> Roman >>>>> >>>>> Am 27.06.2017 um 16:51 schrieb Robbin Ehn: >>>>> >>>>> Hi Roman, >>>>> >>>>> There is something wrong in calculations: >>>>> INFO: Deflate: InCirc=43 InUse=18 Scavenged=25 >>>>> ForceMonitorScavenge=0 >>>>> : pop=27051 free=215487 >>>>> >>>>> free is larger than population, have not had the >>>>> time to dig into this. >>>>> >>>>> Thanks, Robbin >>>>> >>>>> On 06/22/2017 10:19 PM, Roman Kennke wrote: >>>>> >>>>> So here's the latest iteration of that patch: >>>>> >>>>> >>>>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.08/ >>>>> >>>>> >>>>> >>>>> I checked and fixed all the counters. The >>>>> problem here is that they >>>>> are >>>>> not updated in a single place >>>>> (deflate_idle_monitors() ) but in >>>>> several >>>>> places, potentially by multiple threads. 
I >>>>> split up deflation into >>>>> prepare_.. and a finish_.. methods to >>>>> initialize local and update >>>>> global >>>>> counters respectively, and pass around a >>>>> counters object (allocated on >>>>> stack) to the various code paths that use it. >>>>> Updating the counters >>>>> always happen under a lock, there's no need >>>>> to do anything special >>>>> with >>>>> regards to concurrency. >>>>> >>>>> I also checked the nmethod marking, but there >>>>> doesn't seem to be >>>>> anything in that code that looks problematic >>>>> under concurrency. The >>>>> worst that can happen is that two threads >>>>> write the same value into an >>>>> nmethod field. I think we can live with >>>>> that ;-) >>>>> >>>>> Good to go? >>>>> >>>>> Tested by running specjvm and jcstress >>>>> fastdebug+release without >>>>> issues. >>>>> >>>>> Roman >>>>> >>>>> Am 02.06.2017 um 12:39 schrieb Robbin Ehn: >>>>> >>>>> Hi Roman, >>>>> >>>>> On 06/02/2017 11:41 AM, Roman Kennke >>>>> wrote: >>>>> >>>>> Hi David, >>>>> thanks for reviewing. I'll be on >>>>> vacation the next two weeks too, >>>>> with >>>>> only sporadic access to work stuff. >>>>> Yes, exposure will not be as good as >>>>> otherwise, but it's not totally >>>>> untested either: the serial code path >>>>> is the same as the >>>>> parallel, the >>>>> only difference is that it's not >>>>> actually called by multiple >>>>> threads. >>>>> It's ok I think. >>>>> >>>>> I found two more issues that I think >>>>> should be addressed: >>>>> - There are some counters in >>>>> deflate_idle_monitors() and I'm not >>>>> sure I >>>>> correctly handle them in the split-up >>>>> and MT'ed thread-local/ global >>>>> list deflation >>>>> - nmethod marking seems to >>>>> unconditionally poke true or something >>>>> like >>>>> that in nmethod fields. This doesn't >>>>> hurt correctness-wise, but it's >>>>> probably worth checking if it's >>>>> already true, especially when doing >>>>> this >>>>> with multiple threads concurrently. 
>>>>> >>>>> I'll send an updated patch around >>>>> later, I hope I can get to it >>>>> today... >>>>> >>>>> >>>>> I'll review that when you get it out. >>>>> I think this looks as a reasonable step >>>>> before we tackle this with a >>>>> major effort, such as the JEP you and >>>>> Carsten doing. >>>>> And another effort to 'fix' nmethods >>>>> marking. >>>>> >>>>> Internal discussion yesterday lead us to >>>>> conclude that the runtime >>>>> will probably need more threads. >>>>> This would be a good driver to do a >>>>> 'global' worker pool which serves >>>>> both gc, runtime and safepoints with >>>>> threads. >>>>> >>>>> >>>>> Roman >>>>> >>>>> Hi Roman, >>>>> >>>>> I am about to disappear on an >>>>> extended vacation so will let others >>>>> pursue this. IIUC this is longer >>>>> an opt-in by the user at runtime, >>>>> but >>>>> an opt-in by the particular GC >>>>> developers. Okay. My only concern >>>>> with >>>>> that is if Shenandoah is the only >>>>> GC that currently opts in then >>>>> this >>>>> code is not going to get much >>>>> testing and will be more prone to >>>>> incidental breakage. >>>>> >>>>> >>>>> As I mentioned before, it seem like Erik >>>>> ? have some idea, maybe he >>>>> can do this after his barrier patch. >>>>> >>>>> Thanks! >>>>> >>>>> /Robbin >>>>> >>>>> >>>>> Cheers, >>>>> David >>>>> >>>>> On 2/06/2017 2:21 AM, Roman >>>>> Kennke wrote: >>>>> >>>>> Am 01.06.2017 um 17:50 >>>>> schrieb Roman Kennke: >>>>> >>>>> Am 01.06.2017 um 14:18 >>>>> schrieb Robbin Ehn: >>>>> >>>>> Hi Roman, >>>>> >>>>> On 06/01/2017 11:29 >>>>> AM, Roman Kennke wrote: >>>>> >>>>> Am 31.05.2017 um >>>>> 22:06 schrieb Robbin Ehn: >>>>> >>>>> Hi Roman, I >>>>> agree that is really needed but: >>>>> >>>>> On 05/31/2017 >>>>> 10:27 AM, Roman Kennke wrote: >>>>> >>>>> I >>>>> realized that sharing workers with GC is not so easy. 
>>>>> >>>>> We need >>>>> to be able to use the workers at a safepoint during >>>>> >>>>> concurrent >>>>> GC work >>>>> (which also uses the same workers). This does not >>>>> only >>>>> require >>>>> that >>>>> those workers be suspended, like e.g. >>>>> >>>>> SuspendibleThreadSet::yield(), but they need to be idle, i.e. >>>>> have >>>>> finished >>>>> their tasks. This needs some careful handling to >>>>> work >>>>> without >>>>> races: it >>>>> requires a SuspendibleThreadSetJoiner around the >>>>> >>>>> corresponding >>>>> >>>>> run_task() call and also the tasks themselves need to join >>>>> the >>>>> STS and >>>>> handle >>>>> requests for safepoints not by yielding, but by >>>>> leaving >>>>> the >>>>> task. >>>>> This is >>>>> far too peculiar for me to make the call to hook >>>>> up GC >>>>> workers >>>>> for >>>>> safepoint cleanup, and I thus removed those parts. I >>>>> left the >>>>> API in >>>>> >>>>> CollectedHeap in place. I think GC devs who know better >>>>> about G1 >>>>> and CMS >>>>> should >>>>> make that call, or else just use a separate thread >>>>> pool. >>>>> >>>>> >>>>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.05/ >>>>> >>>>> >>>>> >>>>> Is it ok >>>>> now? >>>>> >>>>> I still think >>>>> you should put the "Parallel Safepoint Cleanup" >>>>> workers >>>>> inside >>>>> Shenandoah, >>>>> so the >>>>> SafepointSynchronizer only calls get_safepoint_workers, >>>>> e.g.: >>>>> >>>>> >>>>> _cleanup_workers = heap->get_safepoint_workers(); >>>>> >>>>> _num_cleanup_workers = _cleanup_workers != NULL ? 
>>>>> >>>>> _cleanup_workers->total_workers() : 1; >>>>> >>>>> ParallelSPCleanupTask cleanup(_cleanup_subtasks); >>>>> >>>>> StrongRootsScope srs(_num_cleanup_workers); >>>>> if >>>>> (_cleanup_workers != NULL) { >>>>> >>>>> _cleanup_workers->run_task(&cleanup, >>>>> >>>>> _num_cleanup_workers); >>>>> } else { >>>>> cleanup.work >>>>> (0); >>>>> } >>>>> >>>>> That way you >>>>> don't even need your new flags, but it will be >>>>> up to >>>>> the >>>>> other GCs to >>>>> make their worker available >>>>> or cheat with >>>>> a separate workgang. >>>>> >>>>> I can do that, I >>>>> don't mind. The question is, do we want that? >>>>> >>>>> The problem is that >>>>> we do not want to haste such decision, we >>>>> believe >>>>> there is a better >>>>> solution. >>>>> I think you also >>>>> would want another solution. >>>>> But it's seems like >>>>> such solution with 1 'global' thread pool >>>>> either >>>>> own by GC or the VM >>>>> it self is quite the undertaking. >>>>> Since this probably >>>>> will not be done any time soon my >>>>> suggestion is, >>>>> to not hold you back >>>>> (we also want this), just to make >>>>> the code parallel and >>>>> as an intermediate step ask the GC if it >>>>> minds >>>>> sharing it's thread. >>>>> >>>>> Now when Shenandoah >>>>> is merged it's possible that e.g. G1 will >>>>> share >>>>> the code for a >>>>> separate thread pool, do something of it's own or >>>>> wait until the bigger >>>>> question about thread pool(s) have been >>>>> resolved. >>>>> >>>>> By adding a thread >>>>> pool directly to the SafepointSynchronizer >>>>> and >>>>> flags for it we might >>>>> limit our future options. >>>>> >>>>> I wouldn't call >>>>> it 'cheating with a separate workgang' >>>>> though. I >>>>> see >>>>> that both G1 and >>>>> CMS suspend their worker threads at a >>>>> safepoint. >>>>> However: >>>>> >>>>> Yes it's not cheating >>>>> but I want decent heuristics between e.g. 
>>>>> number >>>>> of concurrent marking >>>>> threads and parallel safepoint threads >>>>> since >>>>> they compete for cpu >>>>> time. >>>>> As the code looks >>>>> now, I think that decisions must be made by >>>>> the >>>>> GC. >>>>> >>>>> Ok, I see your point. I >>>>> updated the proposed patch accordingly: >>>>> >>>>> >>>>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.06/ >>>>> >>>>> >>>>> >>>>> Oops. Minor mistake there. >>>>> Correction: >>>>> >>>>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.07/ >>>>> >>>>> >>>>> >>>>> (Removed 'class WorkGang' >>>>> from safepoint.hpp, and forgot to add it >>>>> into >>>>> collectedHeap.hpp, resulting >>>>> in build failure...) >>>>> >>>>> Roman >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> -- >>>>> This message was sent from my Android device with K-9 Mail. >>> >>> > From thomas.schatzl at oracle.com Tue Jul 4 07:17:08 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Tue, 04 Jul 2017 09:17:08 +0200 Subject: RFR (XS): 8179677: Let G1Remset::_conc_refined_cards only count number of cards concurrently refined In-Reply-To: <4331c480-7c20-8d39-9ea9-7418a86a878d@oracle.com> References: <1499075917.2802.8.camel@oracle.com> <4331c480-7c20-8d39-9ea9-7418a86a878d@oracle.com> Message-ID: <1499152628.2761.0.camel@oracle.com> Hi Stefan, On Mon, 2017-07-03 at 15:12 +0200, Stefan Johansson wrote: > > On 2017-07-03 11:58, Thomas Schatzl wrote: > > > > Hi all, > > > > can I have reviews for this small change that > > makes G1Remset::_conc_refined_cards only count the number of > > concurrently refined cards (+ some trivial renaming of the > > variable)? > > [...] > > > > CR: > > https://bugs.openjdk.java.net/browse/JDK-8179677 > > Webrev: > > http://cr.openjdk.java.net/~tschatzl/8179677/webrev.1/ > Looks good, > StefanJ thanks for your review.
Thomas From thomas.schatzl at oracle.com Tue Jul 4 07:17:59 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Tue, 04 Jul 2017 09:17:59 +0200 Subject: RFR (XS): 8179677: Let G1Remset::_conc_refined_cards only count number of cards concurrently refined In-Reply-To: <47787650-1ade-d89c-29a8-3b8b6e4e8bd0@oracle.com> References: <1499075917.2802.8.camel@oracle.com> <1499094282.2802.132.camel@oracle.com> <47787650-1ade-d89c-29a8-3b8b6e4e8bd0@oracle.com> Message-ID: <1499152679.2761.1.camel@oracle.com> Hi Erik, On Mon, 2017-07-03 at 17:41 +0200, Erik Helin wrote: > On 07/03/2017 05:04 PM, Thomas Schatzl wrote: > > > > Hi all, > > > > Erik asked for a few renamings and some additional comments. Here > > are > > the new webrevs: > > http://cr.openjdk.java.net/~tschatzl/8179677/webrev.1_to_2 (diff) > > http://cr.openjdk.java.net/~tschatzl/8179677/webrev.2 (full) > Looks good, Reviewed. Thanks Thomas! > Erik > thanks for your review. Thomas From mikael.gerdin at oracle.com Tue Jul 4 08:10:34 2017 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Tue, 4 Jul 2017 10:10:34 +0200 Subject: RFR (S): 8182703: Correct G1 barrier queue lock orderings In-Reply-To: <595A3E66.5050705@oracle.com> References: <59510D5E.10009@oracle.com> <595A3E66.5050705@oracle.com> Message-ID: <012c4b90-ab34-b683-a641-751714b53bcd@oracle.com> Hi Erik, On 2017-07-03 14:53, Erik Österlund wrote: > Hi Mikael, > > Thank you for the review! > > Regarding the use of + x in the current enum system for lock rankings, I > agree that it is not a brilliant system and you feel a bit sad when your > lock rank is "leaf+2". However, sometimes I feel like abstracting > numbers with names can become confusing as well - even misleading. Like > for example how leaf is no longer a leaf and how it is questionable > whether special is really all that special any longer.
> > When I thought about possible name for access + 0 and access + 1, I was > thinking something in the lines of "perhaps access_inner/outer or > access_leaf/nonleaf", but then that might get confusing if suddenly > access will need 3 ranks for some reason and we get an "access_special" > rank or something. I suppose you're right. Let's leave the values as you suggested. > > Perhaps a different solution than enum names would be nice long-term for > lock ranks and deadlock detection, but I believe that might be outside > of the current scope for this change. Agreed. /Mikael > > Thanks, > /Erik > > On 2017-07-03 13:57, Mikael Gerdin wrote: >> Hi Erik, >> >> On 2017-06-26 15:34, Erik Österlund wrote: >>> Hi, >>> >>> Webrev: http://cr.openjdk.java.net/~eosterlund/8182703/webrev.02/ >> >> I think this change makes sense and I agree with your reasoning below. >> >> I'm leaning towards suggesting creating a named enum value for >> "access+1" to begin a move towards getting rid of adding and >> subtracting values from enums in this code. I don't have a good name >> for it, though. >> >> /Mikael >> >> >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8182703 >>> >>> The G1 barrier queues have very awkward lock orderings for the >>> following reasons: >>> >>> 1) These queues may queue up things when performing a reference write >>> or resolving a jweak (intentionally or just happened to be jweak, >>> even though it looks like a jobject), which can happen in a lot of >>> places in the code. We resolve JNIHandles while holding special locks >>> in many places. We perform reference writes also in many places. Now >>> the unsuspecting hotspot developer might think that it is okay to >>> resolve a JNIHandle or perform a reference write while possibly >>> holding a special lock. But no. In some cases, object writes have >>> been moved out of locks and replaced with lock-free CAS, only to >>> dodge the G1 write barrier locks.
I don't think the G1 lock ordering >>> issues should shape the shared code rather than the other way around. >>> 2) There is an issue that the shared queue locks have a "special" >>> rank, which is below the lock ranks used by the cbl monitor and free >>> list monitor. This leads to an issue when these locks have to be >>> taken while holding the shared queue locks. The current solution is >>> to drop the shared queue locks temporarily, introducing nasty data >>> races. These races are guarded, but the whole race seems very >>> unnecessary. >>> >>> I argue that if the G1 write barrier queue locks were simply set >>> appropriately in the first place by analyzing what ranks they should >>> have, none of the above issues would exist. Therefore I propose this >>> new ordering. >>> >>> Specifically, I recognize that locks required for performing memory >>> accesses and resolving JNIHandles are more special than the "special" >>> rank. Therefore, this change introduces a new lock ordering category >>> called "access", which is to be used by barriers required to perform >>> memory accesses. In other words, by recognizing the rank is more >>> special than "special", we can remove "special" code to walk around >>> making its rank more "special". That seems desirable to me. The >>> access locks need to comply to the same constraints as the special >>> locks: they may not perform safepoint checks. 
>>> >>> The old lock ranks were: >>> >>> SATB_Q_FL_lock: special >>> SATB_Q_CBL_mon: leaf - 1 >>> Shared_SATB_Q_lock: leaf - 1 >>> >>> DirtyCardQ_FL_lock: special >>> DirtyCardQ_CBL_mon: leaf - 1 >>> Shared_DirtyCardQ_lock: leaf - 1 >>> >>> The new lock ranks are: >>> >>> SATB_Q_FL_lock: access (special - 2) >>> SATB_Q_CBL_mon: access (special - 2) >>> Shared_SATB_Q_lock: access + 1 (special - 1) >>> >>> DirtyCardQ_FL_lock: access (special - 2) >>> DirtyCardQ_CBL_mon: access (special - 2) >>> Shared_DirtyCardQ_lock: access + 1 (special - 1) >>> >>> Analysis: >>> >>> Each PtrQueue and PtrQueueSet group, SATB or DirtyCardQ have the same >>> group of locks. The free list lock, the completed buffer list monitor >>> and the shared queue lock. >>> >>> Observations: >>> 1) The free list lock and completed buffer list monitors (members of >>> PtrQueueSet) are disjoint. We never hold both of them at the same time. >>> Rationale: The free list lock is only used from >>> PtrQueueSet::allocate_buffer, PtrQueueSet::deallocate_buffer and >>> PtrQueueSet::reduce_free_list, and no callsite from there can be >>> expanded where the cbl monitor is acquired. So therefore it is >>> impossible to acquire the cbl monitor while holding the free list >>> lock. The opposite case of acquiring the free list lock while holding >>> the cbl monitor is also not possible; only the following places >>> acquire the cbl monitor: PtrQueueSet::enqueue_complete_buffer, >>> PtrQueueSet::merge_bufferlists, >>> PtrQueueSet::assert_completed_buffer_list_len_correct, >>> PtrQueueSet::notify_if_necessary, FreeIdSet::claim_par_id, >>> FreeIdSet::release_par_id, DirtyCardQueueSet::get_completed_buffer, >>> DirtyCardQueueSet::clear, >>> SATBMarkQueueSet::apply_closure_to_completed_buffer and >>> SATBMarkQueueSet::abandon_partial_marking. Again, neither of these >>> paths where the cbl monitor is held can expand callsites to a place >>> where the free list locks are held. 
Therefore it holds that the cbl >>> monitor can not be held while the free list lock is held, and the >>> free list lock can not be held while the cbl monitor is held. >>> Therefore they are held disjointly. >>> 2) We might hold the shared queue locks before acquiring the >>> completed buffer list monitor. (today we drop the shared queue lock >>> then and reacquire it later as a hack as already described) >>> 3) We do not acquire a shared queue lock while holding the free list >>> lock or completed buffer list monitor, as there is no reference from >>> a PtrQueueSet to its shared queue, so those code paths do not know >>> how to reach the shared PtrQueue to acquire its lock. The derived >>> classes are exceptions but they never use the shared queue lock while >>> holding the completed buffer list monitor or free list lock. >>> DirtyCardQueueSet uses the shared queue for concatenating logs (in a >>> safepoint without holding those locks). The SATBMarkQueueSet uses the >>> shared queue for filtering the buffers, fiddling with activeness, >>> printing and resetting, all without grabbing any locks. >>> 4) We do not acquire any other lock (above event) while holding the >>> free list lock or completed buffer list monitors. This was discovered >>> by manually expanding the call graphs from where these two locks are >>> held. >>> >>> Derived constraints: >>> a) Because of observation 1, the free list lock and completed buffer >>> list monitors can have the same rank. >>> b) Because of observations 1 and 2, the shared queue lock ought to >>> have a rank higher than the ranks of the free list lock and the >>> completed buffer list monitors (not the case today). >>> c) Because of observations 3 and 2, the free list lock and >>> completed buffer list monitors ought to have a rank lower than the >>> rank of the shared queue lock.
>>> d) Because of observation 4 (and constraints a-c), all the barrier >>> locks should be below the "special" rank without violating any >>> existing ranks. >>> >>> The proposed new lock ranks conform to the constraints derived from >>> my observations. It is worth noting that the potential relationship >>> that could break (and why they do not) are: >>> 1) If a lock is acquired from within the barriers that does not >>> involve the shared queue lock, the free list lock or the completed >>> buffer list monitor, we have now inverted their relationship as that >>> other lock would probably have a rank higher than or equal to >>> "special". But due to observation 4, there are no such cases. >>> 2) The relationship between the shared queue lock and the completed >>> buffer list monitor has been changed so both can be held at the same >>> time if the shared queue lock is acquired first (which it is). This >>> is arguably the way it should have been from the first place, and the >>> old solution had ugly hacks where we would drop the shared queue lock >>> to not run into the lock order assert (and only not to run into the >>> lock order assert, i.e. not to avoid potential deadlock) to ensure >>> the locks are not held at the same time. That code has now been >>> removed, so that the shared queue lock is still held when enqueueing >>> completed buffers (no dodgy dropping and reclaiming), and the code >>> for handling the races due to multiple concurrent enqueuers has also >>> been removed and replaced with an assertion that there simply should >>> not be multiple concurrent enqueuers. Since the shared queue lock is >>> now held throughout the whole operation, there will be no concurrent >>> enqueuers. >>> 3) The completed buffer list monitor used to have a higher rank than >>> the free list lock. Now they have the same. Therefore, they could >>> previously allow them to be held at the same time if the cbl monitor >>> was acquired first. 
However, as discussed, there is no such case, and >>> they ought to have the same rank not to confuse their true >>> disjointness. If anyone insists we do not break this relationship >>> despite the true disjointness, I could consent to adding another >>> access lock rank, like this: >>> http://cr.openjdk.java.net/~eosterlund/8182703/webrev.01/ but I think >>> it seems better to have the same rank since they are actually truly >>> disjoint and should remain disjoint. >>> >>> I do recognize that long term, we *might* want a lock-free solution >>> or something (not saying we do or do not). But until then, the ranks >>> ought to be corrected so that they do not cause these problems >>> causing everyone to bash their head against the awkward G1 lock ranks >>> throughout the code and make hacks around it. >>> >>> Testing: JPRT with hotspot all and lots of local testing. >>> >>> Thanks, >>> /Erik > From thomas.schatzl at oracle.com Tue Jul 4 08:24:23 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Tue, 04 Jul 2017 10:24:23 +0200 Subject: RFR (S): 8183128: Update RefineCardTableEntryClosure Message-ID: <1499156663.2761.6.camel@oracle.com> Hi all, can I get reviews for this change that renames and cleans up the use of RefineCardTableEntryClosure in the code? RefineCardTableEntryClosure is the closure that is applied by the concurrent refinement threads. This change renames it slightly to indicate its use (G1RefineCardConcurrentlyClosure) and moves it to the G1RemSet files close to the closure that we use for refinement/Update RS during GC. This change is dependent on "JDK-8183226: Remembered set summarization accesses not fully initialized java thread DCQS" which is also currently out for review - that change reorganizes G1CollectedHeap initialization so that the change can actually move the closure. CR: https://bugs.openjdk.java.net/browse/JDK-8183128 Webrev: http://cr.openjdk.java.net/~tschatzl/8183128/webrev/ Testing: jprt, local benchmarks Thanks,
Thomas From erik.osterlund at oracle.com Tue Jul 4 08:27:55 2017 From: erik.osterlund at oracle.com (Erik Österlund) Date: Tue, 4 Jul 2017 10:27:55 +0200 Subject: RFR (S): 8182703: Correct G1 barrier queue lock orderings In-Reply-To: <012c4b90-ab34-b683-a641-751714b53bcd@oracle.com> References: <59510D5E.10009@oracle.com> <595A3E66.5050705@oracle.com> <012c4b90-ab34-b683-a641-751714b53bcd@oracle.com> Message-ID: <595B518B.2030703@oracle.com> Hi Mikael, Thank you for the review. /Erik On 2017-07-04 10:10, Mikael Gerdin wrote: > Hi Erik, > > On 2017-07-03 14:53, Erik Österlund wrote: >> Hi Mikael, >> >> Thank you for the review! >> >> Regarding the use of + x in the current enum system for lock >> rankings, I agree that it is not a brilliant system and you feel a >> bit sad when your lock rank is "leaf+2". However, sometimes I feel >> like abstracting numbers with names can become confusing as well - >> even misleading. Like for example how leaf is no longer a leaf and >> how it is questionable whether special is really all that special any >> longer. >> >> When I thought about possible name for access + 0 and access + 1, I >> was thinking something in the lines of "perhaps access_inner/outer or >> access_leaf/nonleaf", but then that might get confusing if suddenly >> access will need 3 ranks for some reason and we get an >> "access_special" rank or something. > > I suppose you're right. Let's leave the values as you suggested. > >> >> Perhaps a different solution than enum names would be nice long-term >> for lock ranks and deadlock detection, but I believe that might be >> outside of the current scope for this change. > > Agreed. > /Mikael > >> >> Thanks, >> /Erik >> >> On 2017-07-03 13:57, Mikael Gerdin wrote: >>> Hi Erik, >>> >>> On 2017-06-26 15:34, Erik Österlund wrote: >>>> Hi, >>>> >>>> Webrev: http://cr.openjdk.java.net/~eosterlund/8182703/webrev.02/ >>> >>> I think this change makes sense and I agree with your reasoning below.
>>> >>> I'm leaning towards suggesting creating a named enum value for >>> "access+1" to begin a move towards getting rid of adding and >>> subtracting values from enums in this code. I don't have a good name >>> for it, though. >>> >>> /Mikael >>> >>> >>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8182703 >>>> >>>> The G1 barrier queues have very awkward lock orderings for the >>>> following reasons: >>>> >>>> 1) These queues may queue up things when performing a reference >>>> write or resolving a jweak (intentionally or just happened to be >>>> jweak, even though it looks like a jobject), which can happen in a >>>> lot of places in the code. We resolve JNIHandles while holding >>>> special locks in many places. We perform reference writes also in >>>> many places. Now the unsuspecting hotspot developer might think >>>> that it is okay to resolve a JNIHandle or perform a reference write >>>> while possibly holding a special lock. But no. In some cases, >>>> object writes have been moved out of locks and replaced with >>>> lock-free CAS, only to dodge the G1 write barrier locks. I don't >>>> think the G1 lock ordering issues should shape the shared code >>>> rather than the other way around. >>>> 2) There is an issue that the shared queue locks have a "special" >>>> rank, which is below the lock ranks used by the cbl monitor and >>>> free list monitor. This leads to an issue when these locks have to >>>> be taken while holding the shared queue locks. The current solution >>>> is to drop the shared queue locks temporarily, introducing nasty >>>> data races. These races are guarded, but the whole race seems very >>>> unnecessary. >>>> >>>> I argue that if the G1 write barrier queue locks were simply set >>>> appropriately in the first place by analyzing what ranks they >>>> should have, none of the above issues would exist. Therefore I >>>> propose this new ordering. 
>>>> >>>> Specifically, I recognize that locks required for performing memory >>>> accesses and resolving JNIHandles are more special than the >>>> "special" rank. Therefore, this change introduces a new lock >>>> ordering category called "access", which is to be used by barriers >>>> required to perform memory accesses. In other words, by recognizing >>>> the rank is more special than "special", we can remove "special" >>>> code to walk around making its rank more "special". That seems >>>> desirable to me. The access locks need to comply to the same >>>> constraints as the special locks: they may not perform safepoint >>>> checks. >>>> >>>> The old lock ranks were: >>>> >>>> SATB_Q_FL_lock: special >>>> SATB_Q_CBL_mon: leaf - 1 >>>> Shared_SATB_Q_lock: leaf - 1 >>>> >>>> DirtyCardQ_FL_lock: special >>>> DirtyCardQ_CBL_mon: leaf - 1 >>>> Shared_DirtyCardQ_lock: leaf - 1 >>>> >>>> The new lock ranks are: >>>> >>>> SATB_Q_FL_lock: access (special - 2) >>>> SATB_Q_CBL_mon: access (special - 2) >>>> Shared_SATB_Q_lock: access + 1 (special - 1) >>>> >>>> DirtyCardQ_FL_lock: access (special - 2) >>>> DirtyCardQ_CBL_mon: access (special - 2) >>>> Shared_DirtyCardQ_lock: access + 1 (special - 1) >>>> >>>> Analysis: >>>> >>>> Each PtrQueue and PtrQueueSet group, SATB or DirtyCardQ have the >>>> same group of locks. The free list lock, the completed buffer list >>>> monitor and the shared queue lock. >>>> >>>> Observations: >>>> 1) The free list lock and completed buffer list monitors (members >>>> of PtrQueueSet) are disjoint. We never hold both of them at the >>>> same time. >>>> Rationale: The free list lock is only used from >>>> PtrQueueSet::allocate_buffer, PtrQueueSet::deallocate_buffer and >>>> PtrQueueSet::reduce_free_list, and no callsite from there can be >>>> expanded where the cbl monitor is acquired. So therefore it is >>>> impossible to acquire the cbl monitor while holding the free list >>>> lock. 
The opposite case of acquiring the free list lock while >>>> holding the cbl monitor is also not possible; only the following >>>> places acquire the cbl monitor: >>>> PtrQueueSet::enqueue_complete_buffer, >>>> PtrQueueSet::merge_bufferlists, >>>> PtrQueueSet::assert_completed_buffer_list_len_correct, >>>> PtrQueueSet::notify_if_necessary, FreeIdSet::claim_par_id, >>>> FreeIdSet::release_par_id, DirtyCardQueueSet::get_completed_buffer, >>>> DirtyCardQueueSet::clear, >>>> SATBMarkQueueSet::apply_closure_to_completed_buffer and >>>> SATBMarkQueueSet::abandon_partial_marking. Again, neither of these >>>> paths where the cbl monitor is held can expand callsites to a place >>>> where the free list locks are held. Therefore it holds that the cbl >>>> monitor can not be held while the free list lock is held, and the >>>> free list lock can not be held while the cbl monitor is held. >>>> Therefore they are held disjointly. >>>> 2) We might hold the shared queue locks before acquiring the >>>> completed buffer list monitor. (today we drop the shared queue lock >>>> then and reacquire it later as a hack as already described) >>>> 3) We do not acquire a shared queue lock while holding the free >>>> list lock or completed buffer list monitor, as there is no >>>> reference from a PtrQueueSet to its shared queue, so those code >>>> paths do not know how to reach the shared PtrQueue to acquire its >>>> lock. The derived classes are exceptions but they never use the >>>> shared queue lock while holding the completed buffer list monitor >>>> or free list lock. DirtyCardQueueSet uses the shared queue for >>>> concatenating logs (in a safepoint without holding those locks). >>>> The SATBMarkQueueSet uses the shared queue for filtering the >>>> buffers, fiddling with activeness, printing and resetting, all >>>> without grabbing any locks. >>>> 4) We do not acquire any other lock (above event) while holding the >>>> free list lock or completed buffer list monitors. 
This was >>>> discovered by manually expanding the call graphs from where these >>>> two locks are held. >>>> >>>> Derived constraints: >>>> a) Because of observation 1, the free list lock and completed >>>> buffer list monitors can have the same rank. >>>> b) Because of observations 1 and 2, the shared queue lock ought to >>>> have a rank higher than the ranks of the free list lock and the >>>> completed buffer list monitors (not the case today). >>>> c) Because of observations 3 and 2, the free list lock and >>>> completed buffer list monitors ought to have a rank lower than the >>>> rank of the shared queue lock. >>>> d) Because of observation 4 (and constraints a-c), all the barrier >>>> locks should be below the "special" rank without violating any >>>> existing ranks. >>>> >>>> The proposed new lock ranks conform to the constraints derived from >>>> my observations. It is worth noting that the potential relationship >>>> that could break (and why they do not) are: >>>> 1) If a lock is acquired from within the barriers that does not >>>> involve the shared queue lock, the free list lock or the completed >>>> buffer list monitor, we have now inverted their relationship as >>>> that other lock would probably have a rank higher than or equal to >>>> "special". But due to observation 4, there are no such cases. >>>> 2) The relationship between the shared queue lock and the completed >>>> buffer list monitor has been changed so both can be held at the >>>> same time if the shared queue lock is acquired first (which it is). >>>> This is arguably the way it should have been from the first place, >>>> and the old solution had ugly hacks where we would drop the shared >>>> queue lock to not run into the lock order assert (and only not to >>>> run into the lock order assert, i.e. not to avoid potential >>>> deadlock) to ensure the locks are not held at the same time.
That >>>> code has now been removed, so that the shared queue lock is still >>>> held when enqueueing completed buffers (no dodgy dropping and >>>> reclaiming), and the code for handling the races due to multiple >>>> concurrent enqueuers has also been removed and replaced with an >>>> assertion that there simply should not be multiple concurrent >>>> enqueuers. Since the shared queue lock is now held throughout the >>>> whole operation, there will be no concurrent enqueuers. >>>> 3) The completed buffer list monitor used to have a higher rank >>>> than the free list lock. Now they have the same. Therefore, they >>>> could previously allow them to be held at the same time if the cbl >>>> monitor was acquired first. However, as discussed, there is no such >>>> case, and they ought to have the same rank not to confuse their >>>> true disjointness. If anyone insists we do not break this >>>> relationship despite the true disjointness, I could consent to >>>> adding another access lock rank, like this: >>>> http://cr.openjdk.java.net/~eosterlund/8182703/webrev.01/ but I >>>> think it seems better to have the same rank since they are actually >>>> truly disjoint and should remain disjoint. >>>> >>>> I do recognize that long term, we *might* want a lock-free solution >>>> or something (not saying we do or do not). But until then, the >>>> ranks ought to be corrected so that they do not cause these >>>> problems causing everyone to bash their head against the awkward G1 >>>> lock ranks throughout the code and make hacks around it. >>>> >>>> Testing: JPRT with hotspot all and lots of local testing. 
>>>> >>>> Thanks, >>>> /Erik >> From erik.helin at oracle.com Tue Jul 4 11:40:19 2017 From: erik.helin at oracle.com (Erik Helin) Date: Tue, 4 Jul 2017 13:40:19 +0200 Subject: RFR: 8183539: Remove G1RemSet::_into_cset_dirty_card_queue_set Message-ID: <1390aea1-d25a-a465-b0bf-c66490cf682a@oracle.com> Hi all, here comes a simple patch (just removing code) with a quite complicated justification :) So grab a cup of coffee, take out that good old pen and paper (it is almost impossible to convince yourself that this is correct without drawing) and enjoy the following little text: The G1RemSet::_into_cset_dirty_card_queue_set is no longer needed. It was originally added to keep track of cards with pointers into the collection set. In the case of evacuation failure, this set of cards would then be enqueued for refinement in order to construct/update remembered sets for regions that encountered evacuation failure (only regions in the collection set can encounter evacuation failure). However, this functionality is already provided by the call to G1ParScanThreadState::update_rs and the evac failure handling code. For pointers in regions outside of the collection set pointing into the collection set, we will always call G1ParScanThreadState::update_rs. G1ParScanThreadState::update_rs will enqueue the card containing the pointer pointing into the collection set onto G1CollectedHeap::_dirty_card_queue_set. So G1CollectedHeap::_dirty_card_queue_set will contain all the cards with pointers into the collection set (that are not themselves in the collection set). If an evacuation failure happens, then we will still trace through the object graph, calling do_oop_evac (but do_oop_evac will just return a pointer to the "from" object) for each object pointing into the collection set. This means that all cards in regions outside of the collection set that contain pointers into the collection set will end up on G1CollectedHeap::_dirty_card_queue_set.
For pointers in regions in the collection set pointing into the collection set, those will be handled by the evacuation failure handling code. The evacuation failure handling code will iterate over all objects in all regions that encountered an evacuation failure. If it encounters an object with a forwarding pointer pointing to itself, then it will enqueue the cards that contain that object's fields onto G1CollectedHeap::_dirty_card_queue_set. The two above paragraphs mean that after a collection, G1CollectedHeap::_dirty_card_queue_set will always contain all cards that contained pointers into the collection set. This is true for both a successful collection and a collection that encountered evacuation failure. However, these cards are exactly the cards that G1RemSet::_into_cset_dirty_card_queue_set contains, so we might as well remove the G1RemSet::_into_cset_dirty_card_queue_set. Patch: http://cr.openjdk.java.net/~ehelin/8183539/00/ Issue: https://bugs.openjdk.java.net/browse/JDK-8183539 Testing: make test TEST=hotspot_gc on Linux x86-64 fastdebug Thanks, Erik From mikael.gerdin at oracle.com Tue Jul 4 12:17:52 2017 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Tue, 4 Jul 2017 14:17:52 +0200 Subject: RFR: 8183539: Remove G1RemSet::_into_cset_dirty_card_queue_set In-Reply-To: <1390aea1-d25a-a465-b0bf-c66490cf682a@oracle.com> References: <1390aea1-d25a-a465-b0bf-c66490cf682a@oracle.com> Message-ID: Hi Erik, On 2017-07-04 13:40, Erik Helin wrote: > Hi all, > > here comes a simple patch (just removing code) with a quite complicated > justification :) So grab a cup of coffee, take out that good old pen and > paper (it is almost impossible to convince yourself that this is correct > without drawing) and enjoy the following little text: > > The G1RemSet::_into_cset_dirty_card_queue_set is no longer needed. It > was originally added to keep track of cards with pointers into the > collection set.
In the case of evacuation failure, this set of cards > would then be enqueued for refinement in order to construct/update > remembered sets for regions that encountered evacuation failure (only > regions in the collection set can encounter evacuation failure). > However, this functionality is already provided by the call to > G1ParScanThreadState::update_rs and the evac failure handling code. > > For pointers in regions outside of the collection set pointing into the > collection set, we will always call G1ParScanThreadState::update_rs. > G1ParScanThreadState::update_rs will enqueue the card containing the > pointer pointing into the collection set onto > G1CollectedHeap::_dirty_card_queue_set. So > G1CollectedHeap::_dirty_card_queue_set will contain all the cards with > pointers into the collection set (that are not themselves in the > collection set). If an evacuation failure happens, then we will still > trace through the object graph, calling do_oop_evac (but do_oop_evac > will just return a pointer to the "from" object) for each object > pointing into the collection set. This means that all cards in regions > outside of the collection set that contain pointers into the collection set > will end up on G1CollectedHeap::_dirty_card_queue_set. > > For pointers in regions in the collection set pointing into the > collection set, those will be handled by the evacuation failure handling > code. The evacuation failure handling code will iterate over all objects > in all regions that encountered an evacuation failure. If it encounters > an object with a forwarding pointer pointing to itself, then it will > enqueue the cards that contain that object's fields onto > G1CollectedHeap::_dirty_card_queue_set. > > The two above paragraphs mean that after a collection, > G1CollectedHeap::_dirty_card_queue_set will always contain all cards > that contained pointers into the collection set.
This is true for both a > successful collection and a collection that encountered evacuation > failure. However, these cards are exactly the cards that > G1RemSet::_into_cset_dirty_card_queue_set contains, so we might as well > remove the G1RemSet::_into_cset_dirty_card_queue_set. > > Patch: http://cr.openjdk.java.net/~ehelin/8183539/00/ > Issue: https://bugs.openjdk.java.net/browse/JDK-8183539 > Testing: make test TEST=hotspot_gc on Linux x86-64 fastdebug Do you know if any of the tests actually would have failed if rem set reconstruction after evacuation failure didn't work properly? I'd feel safer with this change if you ran with some verification code to ensure that the into_cset queue was always useless when evac failure occurs. Thanks /Mikael > > Thanks, > Erik From thomas.schatzl at oracle.com Tue Jul 4 15:24:56 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Tue, 04 Jul 2017 17:24:56 +0200 Subject: RFR: 8183539: Remove G1RemSet::_into_cset_dirty_card_queue_set In-Reply-To: <1390aea1-d25a-a465-b0bf-c66490cf682a@oracle.com> References: <1390aea1-d25a-a465-b0bf-c66490cf682a@oracle.com> Message-ID: <1499181896.2757.19.camel@oracle.com> Hi Erik, On Tue, 2017-07-04 at 13:40 +0200, Erik Helin wrote: > Hi all, > > here comes a simple patch (just removing code) with a quite > complicated justification :) So grab a cup of coffee, take out that > good old pen and paper (it is almost impossible to convince yourself > that this is correct without drawing) and enjoy the following little > text: > > [... long explanation...] > Patch: http://cr.openjdk.java.net/~ehelin/8183539/00/ > Issue: https://bugs.openjdk.java.net/browse/JDK-8183539 > Testing: make test TEST=hotspot_gc on Linux x86-64 fastdebug while I think the explanation is good, and actually we discussed this together, some more testing would be nice ;) Something like gcbasher with G1EvacuationFailureALot. Minor nit: g1RemSet.cpp:710 the "return" statement is superfluous.
(although I have already a change in mind that re-adds a return value ;)) Thanks, Thomas From thomas.schatzl at oracle.com Tue Jul 4 15:25:45 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Tue, 04 Jul 2017 17:25:45 +0200 Subject: RFR (XS): 8183127: UpdateRSetDeferred should not enqueue cards for NULL references Message-ID: <1499181945.2757.20.camel@oracle.com> Hi, can I have reviews for this change that adds a NULL-check in the UpdateRSetDeferred closure so that we do not enqueue cards with NULL references in it during evacuation failure? CR: https://bugs.openjdk.java.net/browse/JDK-8183127 Webrev: http://cr.openjdk.java.net/~tschatzl/8183127/webrev/ Testing: jprt, gcbasher with G1EvacuationFailureALot for 1/2 hour I think this amount of testing is sufficient as the reasoning for this change is not *that* complicated. Thanks, Thomas From thomas.schatzl at oracle.com Tue Jul 4 15:42:33 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Tue, 04 Jul 2017 17:42:33 +0200 Subject: RFR (S): 8179679: Rearrange filters before card scanning In-Reply-To: <1499081093.2802.30.camel@oracle.com> References: <1499081093.2802.30.camel@oracle.com> Message-ID: <1499182953.2757.21.camel@oracle.com> Hi all, On Mon, 2017-07-03 at 13:24 +0200, Thomas Schatzl wrote: > Hi all, > > please have a look at this change that rearranges the checks in the > G1RemSet card scanning a bit in order to: > Erik had a look at this change with the following comments: - rename card_region_idx -> region_idx_for_card - factor out the two calls to claim a card and dirty its region into a method - move calculation of "card_region" into the scan_card() method. - he pointed out that the change can use G1CollectedHeap::region_at() instead of G1CollectedHeap::heap_region_containing() as it is simpler.
- there has been another comment on why the change claims the card after checking whether the card is within the region's boundaries, and if that wouldn't be better performed right after the is_claimed check. Doing so will claim cards originating from stray remembered set entries into the current survivor regions as claimed, since we do not clear these regions later again (see G1ClearCardTableTask::work()) - their cards need to be "Young", and this is done during allocation of the region. This results in the card table verification failing later. I think we should think about changing the handling of survivor regions during the clear CT phase as part of a different CR. For now I added a comment. Webrev: http://cr.openjdk.java.net/~tschatzl/8179679/webrev.1_to_2 (diff) http://cr.openjdk.java.net/~tschatzl/8179679/webrev.2 (full) Testing: gcbasher Thanks, Thomas From thomas.schatzl at oracle.com Tue Jul 4 17:43:35 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Tue, 04 Jul 2017 19:43:35 +0200 Subject: RFR (S): 8182703: Correct G1 barrier queue lock orderings In-Reply-To: <59510D5E.10009@oracle.com> References: <59510D5E.10009@oracle.com> Message-ID: <1499190215.2423.3.camel@oracle.com> Hi, On Mon, 2017-06-26 at 15:34 +0200, Erik Österlund wrote: > Hi, > > Webrev: http://cr.openjdk.java.net/~eosterlund/8182703/webrev.02/ > Bug: https://bugs.openjdk.java.net/browse/JDK-8182703 > looks good apart from the comment at Monitor::event_types. It now contradicts itself from one sentence to the next ("special must be lowest" and then "oh no, after all access must be lowest"). Please try to find some better wording here :) > The G1 barrier queues have very awkward lock orderings for the > following reasons: > [...] > > I do recognize that long term, we *might* want a lock-free solution > or something (not saying we do or do not).
But until then, the ranks > ought to be corrected so that they do not cause these problems > causing everyone to bash their head against the awkward G1 lock ranks > throughout the code and make hacks around it. > > Testing: JPRT with hotspot all and lots of local testing. Thanks, Thomas From thomas.schatzl at oracle.com Tue Jul 4 18:14:04 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Tue, 04 Jul 2017 20:14:04 +0200 Subject: RFR (S): 8182703: Correct G1 barrier queue lock orderings In-Reply-To: <1499190215.2423.3.camel@oracle.com> References: <59510D5E.10009@oracle.com> <1499190215.2423.3.camel@oracle.com> Message-ID: <1499192044.2423.8.camel@oracle.com> Hi, On Tue, 2017-07-04 at 19:43 +0200, Thomas Schatzl wrote: > Hi, > > On Mon, 2017-06-26 at 15:34 +0200, Erik Österlund wrote: > > > > Hi, > > > > Webrev: http://cr.openjdk.java.net/~eosterlund/8182703/webrev.02/ > > Bug: https://bugs.openjdk.java.net/browse/JDK-8182703 > > > looks good apart from the comment at Monitor::event_types. It now > contradicts itself from one sentence to the next ("special must be > lowest" and then "oh no, after all access must be lowest"). Please > try to find some better wording here :) Some more comments about the comment added in this change: 96 // The rank access is reserved for locks that may be required to perform 97 // memory accesses that require special GC barriers, such as SATB barriers. 98 // Since memory accesses should be able to be performed pretty much anywhere 99 // in the code, that wannts being more special than the "special" rank. - s/wannts/requires in that comment. - I do not think the access lock rank is used for special GC barriers, at least the "SATB barrier" is a bad example. The SATB barrier is commonly the pre-write barrier in generated code, and the locks do not have a lot in common with write barriers.
Maybe the text wanted to give an example for why locks of this rank could be called at any time - because the lock might be taken as part of some SATB barrier code? Thanks, Thomas From rkennke at redhat.com Tue Jul 4 18:47:52 2017 From: rkennke at redhat.com (Roman Kennke) Date: Tue, 4 Jul 2017 20:47:52 +0200 Subject: RFR: 8179268: Factor out AdaptiveSizePolicy from top-level interfaces CollectorPolicy and CollectedHeap Message-ID: AdaptiveSizePolicy is not used/called from outside the GCs, and not all GCs need it. It makes sense to remove it from the CollectedHeap and CollectorPolicy interfaces and move it down to the actual subclasses that use it. I moved AdaptiveSizePolicyOutput to parallelScavengeHeap.hpp, it's only used/implemented in the parallel GC. Also, I made this class AllStatic (was StackObj). Tested by running hotspot_gc jtreg tests without regressions. http://cr.openjdk.java.net/~rkennke/8179268/webrev.00/ Roman From kim.barrett at oracle.com Wed Jul 5 02:00:26 2017 From: kim.barrett at oracle.com (Kim Barrett) Date: Tue, 4 Jul 2017 22:00:26 -0400 Subject: RFR (S): 8182703: Correct G1 barrier queue lock orderings In-Reply-To: <59510D5E.10009@oracle.com> References: <59510D5E.10009@oracle.com> Message-ID: <25F423D9-F8D5-4E62-8300-CCE106E70777@oracle.com> > On Jun 26, 2017, at 9:34 AM, Erik Österlund wrote: > > Hi, > > Webrev: http://cr.openjdk.java.net/~eosterlund/8182703/webrev.02/ > Bug: https://bugs.openjdk.java.net/browse/JDK-8182703 ------------------------------------------------------------------------------ src/share/vm/gc/g1/ptrQueue.cpp Removing unlock / relock around 78 qset()->enqueue_complete_buffer(node); I would prefer that this part of this changeset not be made at this time. This part isn't necessary for the main point of this changeset. It's a cleanup that is enabled by the lock rank changes, where the rank changes are required for other reasons. It also at least conflicts with, and probably breaks, a pending change of mine. 
(I have a largish stack of patches in this area that didn't quite make it into JDK 9 before the original FC date, and which I've been (all too slowly) trying to work my way through and bring into JDK 10.) The RFR says: > 2) There is an issue that the shared queue locks have a "special" > rank, which is below the lock ranks used by the cbl monitor and free > list monitor. This leads to an issue when these locks have to be taken > while holding the shared queue locks. The current solution is to drop > the shared queue locks temporarily, introducing nasty data > races. These races are guarded, but the whole race seems very > unnecessary. This isn't entirely accurate, as the shared queue locks are not "special" rank; the current lock ranks are described correctly later in the RFR. It's true there is an "interesting" bit of code there to temporarily drop the shared queue lock. I don't think it's harmful to do so, and it could have some small benefit now. More importantly, one of the changes in that aforementioned stack of patches puts more (possibly significantly more in some cases) work into that dropped-lock region. And if that idea ultimately doesn't pan out, simply removing the unlock/relock pair is not, IMO, the right way to clean things up; there is some additional refactoring that ought to be done. ------------------------------------------------------------------------------ The lock ranking changes look good. 
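To make the disagreement concrete, the unlock/relock pattern in question can be sketched in isolation. This is a simplified model with hypothetical names, using std::mutex as a stand-in for the VM's Monitor type; it is not the actual ptrQueue.cpp code:

```cpp
#include <mutex>
#include <vector>

// Simplified model of the enqueue path under discussion: with the old
// lock ranks, the completed-buffer-list lock could not be taken while
// holding the lower-ranked shared queue lock, so the queue lock was
// dropped around the enqueue and reacquired afterwards.
struct BufferSet {
  std::mutex cbl_lock;               // stands in for the cbl monitor
  std::vector<int> completed;

  void enqueue_complete_buffer(int node) {
    std::lock_guard<std::mutex> g(cbl_lock);
    completed.push_back(node);
  }
};

struct SharedQueue {
  std::mutex queue_lock;             // stands in for the shared queue lock
  BufferSet* qset = nullptr;
  int index = 0;

  void handle_full_buffer(int node) {
    std::unique_lock<std::mutex> g(queue_lock);
    // Drop the queue lock before taking the higher-ranked cbl lock...
    g.unlock();
    qset->enqueue_complete_buffer(node);
    g.lock();
    // ...and only mutate queue state again after reacquiring it. Any
    // state observed before the unlock may be stale by this point.
    index = 0;
  }
};
```

The races being guarded against live entirely in the window between unlock() and lock(): another thread may mutate the queue there, which is why anything computed before the drop has to be re-validated once the lock is reacquired.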
From mikael.gerdin at oracle.com Wed Jul 5 08:30:14 2017 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Wed, 5 Jul 2017 10:30:14 +0200 Subject: RFR (XS): 8183127: UpdateRSetDeferred should not enqueue cards for NULL references In-Reply-To: <1499181945.2757.20.camel@oracle.com> References: <1499181945.2757.20.camel@oracle.com> Message-ID: <727d03c9-c206-ce87-093c-3eee21a20049@oracle.com> Hi Thomas, On 2017-07-04 17:25, Thomas Schatzl wrote: > Hi, > > can I have reviews for this change that adds a NULL-check in the > UpdateRSetDeferred closure so that we do not enqueue cards with NULL > references in it during evacuation failure? > > CR: > https://bugs.openjdk.java.net/browse/JDK-8183127 > Webrev: > http://cr.openjdk.java.net/~tschatzl/8183127/webrev/ Looks good to me. I agree that the amount of testing seems sufficient. /Mikael > Testing: > jprt, gcbasher with G1EvacuationFailureALot for 1/2 hour > > I think this amount of testing is sufficient as the reasoning for this > change is not *that* complicated. > > Thanks, > Thomas > From erik.osterlund at oracle.com Wed Jul 5 08:51:45 2017 From: erik.osterlund at oracle.com (=?UTF-8?Q?Erik_=c3=96sterlund?=) Date: Wed, 5 Jul 2017 10:51:45 +0200 Subject: RFR (S): 8182703: Correct G1 barrier queue lock orderings In-Reply-To: <1499192044.2423.8.camel@oracle.com> References: <59510D5E.10009@oracle.com> <1499190215.2423.3.camel@oracle.com> <1499192044.2423.8.camel@oracle.com> Message-ID: <595CA8A1.4040101@oracle.com> Hi Thomas, On 2017-07-04 20:14, Thomas Schatzl wrote: > Hi, > > On Tue, 2017-07-04 at 19:43 +0200, Thomas Schatzl wrote: >> Hi, >> >> On Mon, 2017-06-26 at 15:34 +0200, Erik ?sterlund wrote: >>> Hi, >>> >>> Webrev: http://cr.openjdk.java.net/~eosterlund/8182703/webrev.02/ >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8182703 >>> >> looks good apart from the comment at Monitor::event_types. 
It now >> contradicts itself from one sentence to the next ("special must be >> lowest" and then "oh no, after all access must be lowest"). Please >> try to find some better wording here :) > Some more comments about the comment added in this change: > > 96 // The rank access is reserved for locks that may be required to > perform > 97 // memory accesses that require special GC barriers, such as > SATB barriers. > 98 // Since memory accesses should be able to be performed pretty > much anywhere > 99 // in the code, that wannts being more special than the > "special" rank. > > - s/wannts/requires in that comment. Fixed. > - I do not think the access lock rank is used for special GC barriers, > at least the "SATB barrier" is a bad example. The SATB barrier is > commonly the pre-write barrier in generated code, and the locks do not > have a lot in common with write barriers. > Maybe the text wanted to give an example for why locks of this rank > could be called at any time - because the lock might be taken as part > of some SATB barrier code? I do not understand why SATB is a bad example. Perhaps you could elaborate? It is specifically the SATB barriers that are the biggest issue for me and what made me want to make this change. It is required for both writes but also for all weak loads. And it is specifically the weak loads that give me the most headache and serves as the main motivator for this. These include resolving jweaks, looking up strings, keeping class holders alive on compiler threads when looking up metadata in ciEnv, and a whole bunch of other stuff. 
The current code for handling weak loads is full of hacks like these: { MutexLockerEx m(...); oop obj = load_weak_oop(...); } keep_alive(obj); return obj; ...where the keep alive barrier required by SATB for weak loads has been moved way out from the critical section (even multiple levels up in the call tree) due to lock ordering problems with G1 SATB barrier code that forbids this barrier while holding certain locks. For the new GC barrier interface that introduces declarative semantics, I need the barriers to be tightly bound to the access, and I need accesses to not be disallowed due to holding other VM locks. We already perform JNIHandles::resolve while holding "special" ranked locks today, and hopefully get away with it by making sure these resolutions can never be passed a jweak disguised as a jobject. But I do not want to require hotspot developers to have to think about whether what they are resolving could be weak and then have to consider that SATB barriers require locks with high ranks, requiring them to rewrite the code. Having said that, of course the post-write barriers are problematic as well, as I want it to be possible to perform stores without having to think about random G1 locks in a similar fashion. Speaking of which, I am entertaining the idea that perhaps the HeapRegionRemSet::_m lock ought to get the new access rank too. It seems to me like it could happen that a JavaThread performing a reference store decides to join in on concurrent refinement and has to take that _m lock when adding a reference to a remembered set. Therefore, this current "leaf" ranked lock could be acquired when performing a store on JavaThreads. The current leaf rank is not conflicting with my currently proposed changes, and I do not require it for refactoring weak loads. The reason is that: 1) Only JavaThreads help out with refinement, and only due to their local queue being full (not when e.g. a card could not be parsed and the shared queue is grabbed). 
2) JavaThreads do not acquire the shared queue lock before calling enqueue on the local queue in their barriers as they use their own local queue instead. 3) Because of 1 and 2, when collaborative concurrent refinement is called from the queue, the shared queue lock is not held. 4) The cbl monitor and free list lock are not held either when concurrent refinement is called from the queue. 5) Due to 3 and 4, no access locks from the queues are held when calling concurrent refinement helper code. Having said that, it would still be good for consistency to move that lock down to access too, so that JavaThread reference stores can be performed more freely in the code. If there are plans of getting rid of that lock from refinement, then I think I can live with the current leaf rank, but if there are no plans of getting rid of that lock from refinement, I think I should probably squeeze that lock order change into this change. Perhaps the rank should be changed meanwhile anyway. Do you agree about this? Thanks, /Erik > Thanks, > Thomas > From erik.osterlund at oracle.com Wed Jul 5 08:53:46 2017 From: erik.osterlund at oracle.com (=?UTF-8?Q?Erik_=c3=96sterlund?=) Date: Wed, 5 Jul 2017 10:53:46 +0200 Subject: RFR (S): 8182703: Correct G1 barrier queue lock orderings In-Reply-To: <1499190215.2423.3.camel@oracle.com> References: <59510D5E.10009@oracle.com> <1499190215.2423.3.camel@oracle.com> Message-ID: <595CA91A.5070901@oracle.com> Hi Thomas, Thanks for the review. On 2017-07-04 19:43, Thomas Schatzl wrote: > Hi, > > On Mon, 2017-06-26 at 15:34 +0200, Erik ?sterlund wrote: >> Hi, >> >> Webrev: http://cr.openjdk.java.net/~eosterlund/8182703/webrev.02/ >> Bug: https://bugs.openjdk.java.net/browse/JDK-8182703 >> > looks good apart from the comment at Monitor::event_types. It now > contradicts itself from one sentence to the next ("special must be > lowest" and then "oh no, after all access must be lowest"). Please try > to find some better wording here :) Agreed. 
Will fix and send out new webrev after I receive a reply to my reply to your other email. That turned into a more complicated sentence than I anticipated. Thanks, /Erik >> The G1 barrier queues have very awkward lock orderings for the >> following reasons: >> > [...] >> I do recognize that long term, we *might* want a lock-free solution >> or something (not saying we do or do not). But until then, the ranks >> ought to be corrected so that they do not cause these problems >> causing everyone to bash their head against the awkward G1 lock ranks >> throughout the code and make hacks around it. >> >> Testing: JPRT with hotspot all and lots of local testing. > Thanks, > Thomas > From erik.osterlund at oracle.com Wed Jul 5 10:24:00 2017 From: erik.osterlund at oracle.com (=?UTF-8?Q?Erik_=c3=96sterlund?=) Date: Wed, 5 Jul 2017 12:24:00 +0200 Subject: RFR (S): 8182703: Correct G1 barrier queue lock orderings In-Reply-To: <25F423D9-F8D5-4E62-8300-CCE106E70777@oracle.com> References: <59510D5E.10009@oracle.com> <25F423D9-F8D5-4E62-8300-CCE106E70777@oracle.com> Message-ID: <595CBE40.5050603@oracle.com> Hi Kim, Thank you for the review. On 2017-07-05 04:00, Kim Barrett wrote: >> On Jun 26, 2017, at 9:34 AM, Erik ?sterlund wrote: >> >> Hi, >> >> Webrev: http://cr.openjdk.java.net/~eosterlund/8182703/webrev.02/ >> Bug: https://bugs.openjdk.java.net/browse/JDK-8182703 > ------------------------------------------------------------------------------ > src/share/vm/gc/g1/ptrQueue.cpp > Removing unlock / relock around > 78 qset()->enqueue_complete_buffer(node); > > I would prefer that this part of this changeset not be made at this > time. > > This part isn't necessary for the main point of this changeset. It's > a cleanup that is enabled by the lock rank changes, where the rank > changes are required for other reasons. Okay. > It also at least conflicts with, and probably breaks, a pending change > of mine. 
(I have a largish stack of patches in this area that didn't > quite make it into JDK 9 before the original FC date, and which I've > been (all too slowly) trying to work my way through and bring into JDK > 10.) I agree that it would be possible to just correct the ranks while allowing the spaghetti synchronization code to remain in the code base. Here are some comments about that to me not so attractive idea: 1) I would really like to get rid of that code, because I think it is poor synchronization practice and its stated reason for existence is gone now. 2) I have to do *something* about that part in the current change, otherwise the comment motivating its existence will be incorrect after my lock rank change. There is no longer a legitimate motivation for doing that unlock and re-lock. So we have the choice of making a new made up motivation why we do this anyway, or to remove it. For me the choice is easily to remove it. 3) If some new actual motivation for dropping that lock arises later on down the road (which I am dubious about), then I do not see an issue with simply re-adding it then, when/if that becomes necessary again, with a new corresponding motivation added in appropriately. As far as your new changes go, I am curious what they do to motivate unlocking/re-locking this shared queue lock again. As outlined in my recent email to Thomas, we do not hold either of these queue locks when concurrent refinement helper code is called from GC barriers invoked from JavaThreads, even with my new changes. If it is in this code path that you will perform more work (just speculating), then that should be invariant of this cleanup. Therefore I would like to know, since I am asked to withdraw the code that cleans up the hacky spaghetti synchronization code, with the motivation that there might be a new reason for doing this later on, that we are at least certain that this unlock/re-lock will for sure be needed then. 
> The RFR says: > >> 2) There is an issue that the shared queue locks have a "special" >> rank, which is below the lock ranks used by the cbl monitor and free >> list monitor. This leads to an issue when these locks have to be taken >> while holding the shared queue locks. The current solution is to drop >> the shared queue locks temporarily, introducing nasty data >> races. These races are guarded, but the whole race seems very >> unnecessary. > This isn't entirely accurate, as the shared queue locks are not > "special" rank; the current lock ranks are described correctly later > in the RFR. Yes you are right. > It's true there is an "interesting" bit of code there to temporarily > drop the shared queue lock. I don't think it's harmful to do so, and > could have some small benefit now. More importantly, one of the > changes in that afore mentioned stack of patches puts more (possibly > significantly more in some cases) work into that dropped-lock region. > And if that idea ultimately doesn't pan out, simply removing the > unlock/relock pair is not, IMO, the right way to clean things up; > there is some additional refactoring that ought to be done. Could you please elaborate why you do not consider removing the unlock/lock due to incorrect lock ranks being the right cleanup after that very incorrect lock rank issue has been resolved? > ------------------------------------------------------------------------------ > > The lock ranking changes look good. Thanks, I am glad we agree about this. 
/Erik From mikael.gerdin at oracle.com Wed Jul 5 11:12:20 2017 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Wed, 5 Jul 2017 13:12:20 +0200 Subject: RFR: 8179268: Factor out AdaptiveSizePolicy from top-level interfaces CollectorPolicy and CollectedHeap In-Reply-To: References: Message-ID: <134884f8-7001-f0ee-9e57-9ec0b2520752@oracle.com> Hi Roman, On 2017-07-04 20:47, Roman Kennke wrote: > AdaptiveSizePolicy is not used/called from outside the GCs, and not all > GCs need them. It makes sense to remove it from the CollectedHeap and > CollectorPolicy interfaces and move them down to the actual subclasses > that used them. > > I moved AdaptiveSizePolicyOutput to parallelScavengeHeap.hpp, it's only > used/implemented in the parallel GC. Also, I made this class AllStatic > (was StackObj) > > Tested by running hotspot_gc jtreg tests without regressions. > > http://cr.openjdk.java.net/~rkennke/8179268/webrev.00/ Please correct me if I'm wrong here but it looks like all the non-G1 collectors set the _should_clear_all_soft_refs based on gc_overhead_limit_near. Perhaps the ClearedAllSoftRefs scoped object could be modified to only work with GenCollectorPolicy derived policies (which include parallel *shrugs*) and G1 should just stop worrying about _all_soft_refs_clear. Looking closer, I can't even find G1 code looking at that member so maybe it, too, should be moved to GenCollectorPolicy? What do you think? 
/Mikael > > > Roman > From erik.osterlund at oracle.com Wed Jul 5 11:39:48 2017 From: erik.osterlund at oracle.com (=?UTF-8?Q?Erik_=c3=96sterlund?=) Date: Wed, 5 Jul 2017 13:39:48 +0200 Subject: RFR (S): 8182703: Correct G1 barrier queue lock orderings In-Reply-To: <595CA91A.5070901@oracle.com> References: <59510D5E.10009@oracle.com> <1499190215.2423.3.camel@oracle.com> <595CA91A.5070901@oracle.com> Message-ID: <595CD004.6000206@oracle.com> Hi, Thomas and I discussed offline and came to the following conclusions: 1) Lowering the lock rank of HeapRegionRemSet::_m to access would be nice indeed, but probably deserves a separate RFE with further reasoning and analysis. Will stick to the queue-related lock ranks in this RFE. 2) We agree mostly about the comments, but I have a new webrev that hopefully has even more clear comments regarding the new access rank. Incremental webrev: http://cr.openjdk.java.net/~eosterlund/8182703/webrev.02_03/ Full webrev: http://cr.openjdk.java.net/~eosterlund/8182703/webrev.03/ Thanks for reviewing, and hope the new comments are satisfactory. /Erik On 2017-07-05 10:53, Erik ?sterlund wrote: > Hi Thomas, > > Thanks for the review. > > On 2017-07-04 19:43, Thomas Schatzl wrote: >> Hi, >> >> On Mon, 2017-06-26 at 15:34 +0200, Erik ?sterlund wrote: >>> Hi, >>> >>> Webrev: http://cr.openjdk.java.net/~eosterlund/8182703/webrev.02/ >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8182703 >>> >> looks good apart from the comment at Monitor::event_types. It now >> contradicts itself from one sentence to the next ("special must be >> lowest" and then "oh no, after all access must be lowest"). Please try >> to find some better wording here :) > > Agreed. Will fix and send out new webrev after I receive a reply to my > reply to your other email. That turned into a more complicated > sentence than I anticipated. > > Thanks, > /Erik > >>> The G1 barrier queues have very awkward lock orderings for the >>> following reasons: >>> >> [...] 
>>> I do recognize that long term, we *might* want a lock-free solution >>> or something (not saying we do or do not). But until then, the ranks >>> ought to be corrected so that they do not cause these problems >>> causing everyone to bash their head against the awkward G1 lock ranks >>> throughout the code and make hacks around it. >>> >>> Testing: JPRT with hotspot all and lots of local testing. >> Thanks, >> Thomas >> > From mikael.gerdin at oracle.com Wed Jul 5 11:58:14 2017 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Wed, 5 Jul 2017 13:58:14 +0200 Subject: RFR: 8179387: Factor out CMS specific code from GenCollectedHeap into its own subclass In-Reply-To: <4d5e6af8-d975-7803-64c5-7295e0d56154@redhat.com> References: <3521009f-6fab-4f8e-2375-b9d665a4c70b@redhat.com> <3d8b55a2-a787-3051-b351-ab9b0a24f5e0@redhat.com> <47e22e86-7d7c-606f-1936-346229f39ca2@oracle.com> <9a846161-c8ac-dedf-5952-f457d546fd9a@redhat.com> <4d5e6af8-d975-7803-64c5-7295e0d56154@redhat.com> Message-ID: Hi Roman, On 2017-07-03 17:05, Roman Kennke wrote: > Am 03.07.2017 um 11:13 schrieb Roman Kennke: >> Am 03.07.2017 um 09:35 schrieb Mikael Gerdin: >>> Hi Roman, >>> >>> On 2017-06-30 18:32, Roman Kennke wrote: >>>> I came across one problem using this approach: We will have 2 instances >>>> of CollectedHeap around, where there's usually only 1, and some code >>>> expects only 1. For example, in CollectedHeap constructor, we create new >>>> PerfData variables, and we now create them 2x, which leads to an assert >>>> being thrown. I suspect there is more code like that. >>>> >>>> I will attempt to refactor this a little more, maybe it's not that bad, >>>> but it's probably not worth spending too much time on it. >>> I think refactoring the code to not expect a singleton CollectedHeap >>> instance is a bit too much. >>> Perhaps there is another way to share common code between Serial and >>> CMS but that might require a bit more thought. >> Yeah, definitely. 
I hit another difficulty: pretty much the same issues >> that I'm having with GenCollectedHeap/CMSHeap/CollectedHeap now show up >> with Generation and its subclasses... >> >> How about we push the original patch that I've posted, and work from >> there? In fact, I *have* found some little things I would change (some >> more if (UseConcMarkSweepGC) branches in GenCollectedHeap that I have >> overlooked in my first pass...) > > So here's the little change (two more places in genCollectedHeap.hpp > where UseConcMarkSweepGC was used to alter behaviour): > > http://cr.openjdk.java.net/~rkennke/8179387/webrev.02/ > > > Ok to push this? I think this looks like a good step in the right direction! One thing I noticed is that you can put "enum GCH_strong_roots_tasks" inside of GenCollectedHeap to avoid tainting the global namespace with the enum members. Just above the declaration of _process_strong_tasks seems like an excellent location for the enum declaration :) This looks like it's not needed anymore. bool CMSHeap::should_do_concurrent_full_gc(GCCause::Cause cause) { if (!UseConcMarkSweepGC) { return false; } /Mikael > > Roman > From thomas.schatzl at oracle.com Wed Jul 5 12:37:26 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Wed, 05 Jul 2017 14:37:26 +0200 Subject: RFR (XS): 8183127: UpdateRSetDeferred should not enqueue cards for NULL references In-Reply-To: <727d03c9-c206-ce87-093c-3eee21a20049@oracle.com> References: <1499181945.2757.20.camel@oracle.com> <727d03c9-c206-ce87-093c-3eee21a20049@oracle.com> Message-ID: <1499258246.15955.3.camel@oracle.com> Hi Mikael, On Wed, 2017-07-05 at 10:30 +0200, Mikael Gerdin wrote: > Hi Thomas, > > On 2017-07-04 17:25, Thomas Schatzl wrote: > > > > Hi, > > > > can I have reviews for this change that adds a NULL-check in the > > UpdateRSetDeferred closure so that we do not enqueue cards with > > NULL > > references in it during evacuation failure? 
> > > > CR: > > https://bugs.openjdk.java.net/browse/JDK-8183127 > > Webrev: > > http://cr.openjdk.java.net/~tschatzl/8183127/webrev/ > Looks good to me. I agree that the amount of testing seems > sufficient. ? thanks for your review. Thomas From daniel.daugherty at oracle.com Wed Jul 5 18:30:37 2017 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Wed, 5 Jul 2017 12:30:37 -0600 Subject: RFR: Parallelize safepoint cleanup In-Reply-To: <61d80e98-275f-b2b8-4ac7-6d5d03b047de@redhat.com> References: <486b5a72-bef8-4ebc-2729-3fe3aa3ab3b9@oracle.com> <5c80f8df-27c9-f9a9-dc6d-47f9c6019a61@redhat.com> <46ad874e-eb41-7927-265a-40dea92dfe1e@oracle.com> <5711258b-99b0-e06f-ba6e-0b6b55d88345@redhat.com> <0e1e2779-9316-b756-6cc8-e0c8add14a94@oracle.com> <1910961c-11bd-0e86-dd03-4fce66b9969f@redhat.com> <2b466176-b688-53a8-bef9-c7ec2c8c745b@oracle.com> <42872a15-d26c-9798-c6a2-f3f7c945baf7@redhat.com> <5e7c7d00-4acd-bea3-3525-33dbd9159efb@oracle.com> <6f2c6de7-298b-bf14-ab1f-430c4acd43c9@redhat.com> <5cd676de-872d-6d4a-691b-da561173f7d0@oracle.com> <61d80e98-275f-b2b8-4ac7-6d5d03b047de@redhat.com> Message-ID: On 6/27/17 1:47 PM, Roman Kennke wrote: > Hi Robbin, > > Ugh. Thanks for catching this. > Problem was that I was accounting the thread-local deflations twice: > once in thread-local processing (basically a leftover from my earlier > attempt to implement this accounting) and then again in > finish_deflate_idle_monitors(). Should be fixed here: > > http://cr.openjdk.java.net/~rkennke/8180932/webrev.09/ > Are you thinking that this fix resolves all three bugs: 8132849 Increased stop time in cleanup phase because of single-threaded walk of thread stacks in NMethodSweeper::mark_active_nmethods() 8153224 Monitor deflation prolong safepoints 8180932 Parallelize safepoint cleanup JDK-8132849 is assigned to Tobias; it would be good to get Tobias' review of this fix also. 
General comments: - Please don't forget to update Copyright years as needed before pushing src/share/vm/gc/shared/collectedHeap.hpp No comments. src/share/vm/runtime/safepoint.hpp L78: enum SafepointCleanupTasks { You might want to add a comment here: // The enums are listed in the order of the tasks when done serially. src/share/vm/runtime/safepoint.cpp L556: ! thread->is_Code_cache_sweeper_thread()) { L581: if (! _subtasks.is_task_claimed(SafepointSynchronize::SAFEPOINT_CLEANUP_DEFLATE_MONITORS)) { L589: if (! _subtasks.is_task_claimed(SafepointSynchronize::SAFEPOINT_CLEANUP_UPDATE_INLINE_CACHES)) { L597: if (! _subtasks.is_task_claimed(SafepointSynchronize::SAFEPOINT_CLEANUP_COMPILATION_POLICY)) { L605: if (! _subtasks.is_task_claimed(SafepointSynchronize::SAFEPOINT_CLEANUP_SYMBOL_TABLE_REHASH)) { L615: if (! _subtasks.is_task_claimed(SafepointSynchronize::SAFEPOINT_CLEANUP_STRING_TABLE_REHASH)) { L625: if (! _subtasks.is_task_claimed(SafepointSynchronize::SAFEPOINT_CLEANUP_CLD_PURGE)) { nit: HotSpot style doesn't usually have a space after unary '!'. L638: // Various cleaning tasks that should be done periodically at safepoints L641: // Prepare for monitor deflation nit: Please add a period to the end of these sentences. src/share/vm/runtime/sweeper.hpp No comments. src/share/vm/runtime/sweeper.cpp L205: // TODO: Is this really needed? L206: OrderAccess::storestore(); That's a good question. 
Looks like that storestore() was added by this changeset: $ hg log -r 5357 src/share/vm/runtime/sweeper.cpp changeset: 5357:510fbd28919c user: anoll date: Fri Sep 27 10:50:55 2013 +0200 summary: 8020151: PSR:PERF Large performance regressions when code cache is filled The changeset is not small and it looks like two OrderAccess::storestore() calls were added (and one load_ptr_acquire() was deleted): $ hg diff -r 5356 -r 5357 | grep OrderAccess + OrderAccess::storestore(); - nmethod *code = (nmethod *)OrderAccess::load_ptr_acquire(&_code); + OrderAccess::storestore(); It could be that the storestore() is matching an existing OrderAccess operation or it could have been added in an abundance of caution. We definitely need a Compiler team person to take a look here. src/share/vm/runtime/synchronizer.hpp L36: int nInuse; // currently associated with objects L37: int nInCirculation; // extant L38: int nScavenged; // reclaimed nit: Please add one more space before '//' on L36,L37. src/share/vm/runtime/synchronizer.cpp L1663: // Walk a given monitor list, and deflate idle monitors L1664: // The given list could be a per-thread list or a global list L1665: // Caller acquires gListLock L1666: int ObjectSynchronizer::deflate_monitor_list(ObjectMonitor** listHeadp, L1802: int deflated_count = deflate_monitor_list(thread->omInUseList_addr(), &freeHeadp, &freeTailp); L1804: Thread::muxAcquire(&gListLock, "scavenge - return"); The above deflate_monitor_list() now occurs outside of the gListLock where the old code held the gListLock for this call. Yes, it is operating on the thread local list, but what keeps two different worker threads from trying to deflate_monitor_list() on the same JavaThread at the same time? Update: OK, so it looks like when we're doing parallel cleanup, each worker thread cleans up thread local monitors for the JavaThreads. I don't know this WorkGang stuff, but are these distinct threads from the JavaThreads? 
Or is each JavaThread "borrowed" to do its down monitor cleanup while we're at the safepoint? (How in the world would that idea work? Maybe I need more coffee here...) Without the gListLock, I don't see how the worker threads avoid conflicting over the same thread local list. Minimally, the comment on L1665 needs updating. L1697: counters->nInuse = 0; // currently associated with objects L1698: counters->nInCirculation = 0; // extant L1699: counters->nScavenged = 0; // reclaimed nit: Please add one more space before '//' on L1697, L1698. old L1698: int nInuse = 0; old L1713: int inUse = 0; Nice catch here. I've read this code countless times and missed this bug until now. It explains why some of my Java monitor testing had odd "in use" counts. L1797: if (! MonitorInUseLists) return; nit: HotSpot style doesn't usually have a space after unary '!'. L1808: thread->omInUseCount-= deflated_count; nit: Please add a space before '-='. src/share/vm/runtime/thread.hpp No comments. src/share/vm/runtime/thread.cpp No comments. This is very nice work and a great cleanup for a complicated part of the system. David Simms did some recent work on the MonitorInUseLists stuff. If he has time, it might be good for him to take a quick look at this changeset, but I don't know his summer vacation schedule so that may not be possible. The only comment I need resolved is about the locking for the thread local deflate_monitor_list() call. Everything else is minor. Dan > > Side question: which jtreg targets do you usually run? > > Trying: make test TEST=hotspot_all > gives me *lots* of failures due to missing jcstress stuff (?!) > And even other subsets seem to depend on several bits and pieces that I > have no idea about. 
> > Roman > > Am 27.06.2017 um 16:51 schrieb Robbin Ehn: >> Hi Roman, >> >> There is something wrong in calculations: >> INFO: Deflate: InCirc=43 InUse=18 Scavenged=25 ForceMonitorScavenge=0 >> : pop=27051 free=215487 >> >> free is larger than population, have not had the time to dig into this. >> >> Thanks, Robbin >> >> On 06/22/2017 10:19 PM, Roman Kennke wrote: >>> So here's the latest iteration of that patch: >>> >>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.08/ >>> >>> >>> I checked and fixed all the counters. The problem here is that they are >>> not updated in a single place (deflate_idle_monitors() ) but in several >>> places, potentially by multiple threads. I split up deflation into >>> prepare_.. and a finish_.. methods to initialize local and update global >>> counters respectively, and pass around a counters object (allocated on >>> stack) to the various code paths that use it. Updating the counters >>> always happen under a lock, there's no need to do anything special with >>> regards to concurrency. >>> >>> I also checked the nmethod marking, but there doesn't seem to be >>> anything in that code that looks problematic under concurrency. The >>> worst that can happen is that two threads write the same value into an >>> nmethod field. I think we can live with that ;-) >>> >>> Good to go? >>> >>> Tested by running specjvm and jcstress fastdebug+release without issues. >>> >>> Roman >>> >>> Am 02.06.2017 um 12:39 schrieb Robbin Ehn: >>>> Hi Roman, >>>> >>>> On 06/02/2017 11:41 AM, Roman Kennke wrote: >>>>> Hi David, >>>>> thanks for reviewing. I'll be on vacation the next two weeks too, with >>>>> only sporadic access to work stuff. >>>>> Yes, exposure will not be as good as otherwise, but it's not totally >>>>> untested either: the serial code path is the same as the parallel, the >>>>> only difference is that it's not actually called by multiple threads. >>>>> It's ok I think. 
>>>>> >>>>> I found two more issues that I think should be addressed: >>>>> - There are some counters in deflate_idle_monitors() and I'm not >>>>> sure I >>>>> correctly handle them in the split-up and MT'ed thread-local/ global >>>>> list deflation >>>>> - nmethod marking seems to unconditionally poke true or something like >>>>> that in nmethod fields. This doesn't hurt correctness-wise, but it's >>>>> probably worth checking if it's already true, especially when doing >>>>> this >>>>> with multiple threads concurrently. >>>>> >>>>> I'll send an updated patch around later, I hope I can get to it >>>>> today... >>>> I'll review that when you get it out. >>>> I think this looks as a reasonable step before we tackle this with a >>>> major effort, such as the JEP you and Carsten doing. >>>> And another effort to 'fix' nmethods marking. >>>> >>>> Internal discussion yesterday lead us to conclude that the runtime >>>> will probably need more threads. >>>> This would be a good driver to do a 'global' worker pool which serves >>>> both gc, runtime and safepoints with threads. >>>> >>>>> Roman >>>>> >>>>>> Hi Roman, >>>>>> >>>>>> I am about to disappear on an extended vacation so will let others >>>>>> pursue this. IIUC this is longer an opt-in by the user at runtime, >>>>>> but >>>>>> an opt-in by the particular GC developers. Okay. My only concern with >>>>>> that is if Shenandoah is the only GC that currently opts in then this >>>>>> code is not going to get much testing and will be more prone to >>>>>> incidental breakage. >>>> As I mentioned before, it seem like Erik ? have some idea, maybe he >>>> can do this after his barrier patch. >>>> >>>> Thanks! 
>>>> >>>> /Robbin >>>> >>>>>> Cheers, >>>>>> David >>>>>> >>>>>> On 2/06/2017 2:21 AM, Roman Kennke wrote: >>>>>>> Am 01.06.2017 um 17:50 schrieb Roman Kennke: >>>>>>>> Am 01.06.2017 um 14:18 schrieb Robbin Ehn: >>>>>>>>> Hi Roman, >>>>>>>>> >>>>>>>>> On 06/01/2017 11:29 AM, Roman Kennke wrote: >>>>>>>>>> Am 31.05.2017 um 22:06 schrieb Robbin Ehn: >>>>>>>>>>> Hi Roman, I agree that is really needed but: >>>>>>>>>>> >>>>>>>>>>> On 05/31/2017 10:27 AM, Roman Kennke wrote: >>>>>>>>>>>> I realized that sharing workers with GC is not so easy. >>>>>>>>>>>> >>>>>>>>>>>> We need to be able to use the workers at a safepoint during >>>>>>>>>>>> concurrent >>>>>>>>>>>> GC work (which also uses the same workers). This does not only >>>>>>>>>>>> require >>>>>>>>>>>> that those workers be suspended, like e.g. >>>>>>>>>>>> SuspendibleThreadSet::yield(), but they need to be idle, i.e. >>>>>>>>>>>> have >>>>>>>>>>>> finished their tasks. This needs some careful handling to work >>>>>>>>>>>> without >>>>>>>>>>>> races: it requires a SuspendibleThreadSetJoiner around the >>>>>>>>>>>> corresponding >>>>>>>>>>>> run_task() call and also the tasks themselves need to join the >>>>>>>>>>>> STS and >>>>>>>>>>>> handle requests for safepoints not by yielding, but by leaving >>>>>>>>>>>> the >>>>>>>>>>>> task. >>>>>>>>>>>> This is far too peculiar for me to make the call to hook up GC >>>>>>>>>>>> workers >>>>>>>>>>>> for safepoint cleanup, and I thus removed those parts. I >>>>>>>>>>>> left the >>>>>>>>>>>> API in >>>>>>>>>>>> CollectedHeap in place. I think GC devs who know better >>>>>>>>>>>> about G1 >>>>>>>>>>>> and CMS >>>>>>>>>>>> should make that call, or else just use a separate thread pool. >>>>>>>>>>>> >>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.05/ >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Is it ok now? 
>>>>>>>>>>> I still think you should put the "Parallel Safepoint Cleanup" >>>>>>>>>>> workers >>>>>>>>>>> inside Shenandoah, >>>>>>>>>>> so the SafepointSynchronizer only calls get_safepoint_workers, >>>>>>>>>>> e.g.: >>>>>>>>>>> >>>>>>>>>>> _cleanup_workers = heap->get_safepoint_workers(); >>>>>>>>>>> _num_cleanup_workers = _cleanup_workers != NULL ? >>>>>>>>>>> _cleanup_workers->total_workers() : 1; >>>>>>>>>>> ParallelSPCleanupTask cleanup(_cleanup_subtasks); >>>>>>>>>>> StrongRootsScope srs(_num_cleanup_workers); >>>>>>>>>>> if (_cleanup_workers != NULL) { >>>>>>>>>>> _cleanup_workers->run_task(&cleanup, >>>>>>>>>>> _num_cleanup_workers); >>>>>>>>>>> } else { >>>>>>>>>>> cleanup.work(0); >>>>>>>>>>> } >>>>>>>>>>> >>>>>>>>>>> That way you don't even need your new flags, but it will be >>>>>>>>>>> up to >>>>>>>>>>> the >>>>>>>>>>> other GCs to make their worker available >>>>>>>>>>> or cheat with a separate workgang. >>>>>>>>>> I can do that, I don't mind. The question is, do we want that? >>>>>>>>> The problem is that we do not want to haste such decision, we >>>>>>>>> believe >>>>>>>>> there is a better solution. >>>>>>>>> I think you also would want another solution. >>>>>>>>> But it's seems like such solution with 1 'global' thread pool >>>>>>>>> either >>>>>>>>> own by GC or the VM it self is quite the undertaking. >>>>>>>>> Since this probably will not be done any time soon my >>>>>>>>> suggestion is, >>>>>>>>> to not hold you back (we also want this), just to make >>>>>>>>> the code parallel and as an intermediate step ask the GC if it >>>>>>>>> minds >>>>>>>>> sharing it's thread. >>>>>>>>> >>>>>>>>> Now when Shenandoah is merged it's possible that e.g. G1 will >>>>>>>>> share >>>>>>>>> the code for a separate thread pool, do something of it's own or >>>>>>>>> wait until the bigger question about thread pool(s) have been >>>>>>>>> resolved. 
>>>>>>>>> >>>>>>>>> By adding a thread pool directly to the SafepointSynchronizer and >>>>>>>>> flags for it we might limit our future options. >>>>>>>>> >>>>>>>>>> I wouldn't call it 'cheating with a separate workgang' though. I >>>>>>>>>> see >>>>>>>>>> that both G1 and CMS suspend their worker threads at a safepoint. >>>>>>>>>> However: >>>>>>>>> Yes it's not cheating but I want decent heuristics between e.g. >>>>>>>>> number >>>>>>>>> of concurrent marking threads and parallel safepoint threads since >>>>>>>>> they compete for cpu time. >>>>>>>>> As the code looks now, I think that decisions must be made by the >>>>>>>>> GC. >>>>>>>> Ok, I see your point. I updated the proposed patch accordingly: >>>>>>>> >>>>>>>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.06/ >>>>>>>> >>>>>>> Oops. Minor mistake there. Correction: >>>>>>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.07/ >>>>>>> >>>>>>> >>>>>>> (Removed 'class WorkGang' from safepoint.hpp, and forgot to add it >>>>>>> into >>>>>>> collectedHeap.hpp, resulting in build failure...) >>>>>>> >>>>>>> Roman >>>>>>> > From rkennke at redhat.com Wed Jul 5 21:17:51 2017 From: rkennke at redhat.com (Roman Kennke) Date: Wed, 5 Jul 2017 23:17:51 +0200 Subject: RFR: Parallelize safepoint cleanup In-Reply-To: References: <486b5a72-bef8-4ebc-2729-3fe3aa3ab3b9@oracle.com> <5c80f8df-27c9-f9a9-dc6d-47f9c6019a61@redhat.com> <46ad874e-eb41-7927-265a-40dea92dfe1e@oracle.com> <5711258b-99b0-e06f-ba6e-0b6b55d88345@redhat.com> <0e1e2779-9316-b756-6cc8-e0c8add14a94@oracle.com> <1910961c-11bd-0e86-dd03-4fce66b9969f@redhat.com> <2b466176-b688-53a8-bef9-c7ec2c8c745b@oracle.com> <42872a15-d26c-9798-c6a2-f3f7c945baf7@redhat.com> <5e7c7d00-4acd-bea3-3525-33dbd9159efb@oracle.com> <6f2c6de7-298b-bf14-ab1f-430c4acd43c9@redhat.com> <5cd676de-872d-6d4a-691b-da561173f7d0@oracle.com> <61d80e98-275f-b2b8-4ac7-6d5d03b047de@redhat.com> Message-ID: <8def1665-1fb3-c7a2-bc0d-0b63601a0c56@redhat.com> Am 05.07.2017 um 20:30 schrieb Daniel D. 
Daugherty: > On 6/27/17 1:47 PM, Roman Kennke wrote: >> Hi Robbin, >> >> Ugh. Thanks for catching this. >> Problem was that I was accounting the thread-local deflations twice: >> once in thread-local processing (basically a leftover from my earlier >> attempt to implement this accounting) and then again in >> finish_deflate_idle_monitors(). Should be fixed here: >> >> http://cr.openjdk.java.net/~rkennke/8180932/webrev.09/ >> > > Are you thinking that this fix resolves all three bugs: > > 8132849 Increased stop time in cleanup phase because of > single-threaded > walk of thread stacks in > NMethodSweeper::mark_active_nmethods() Yes. It requires additional support code by a GC though to become actually multithreaded. > 8153224 Monitor deflation prolong safepoints Yes. But there's more that we want to do: - deflate monitors during GC thread scanning (this is a huge winner) - ultimately, deflate monitors concurrently (a JEP is on the way to address this) > 8180932 Parallelize safepoint cleanup Yes :-) > JDK-8132849 is assigned to Tobias; it would be good to get Tobias' > review of this fix also. Ok, I will reach out to him. > General comments: > - Please don't forget to update Copyright years as needed before > pushing Fixed. > > src/share/vm/runtime/safepoint.hpp > L78: enum SafepointCleanupTasks { > You might want to add a comment here: > // The enums are listed in the order of the tasks when > done serially. Good idea. Done. > src/share/vm/runtime/safepoint.cpp > L556: ! thread->is_Code_cache_sweeper_thread()) { > L581: if (! > _subtasks.is_task_claimed(SafepointSynchronize::SAFEPOINT_CLEANUP_DEFLATE_MONITORS)) > { > L589: if (! > _subtasks.is_task_claimed(SafepointSynchronize::SAFEPOINT_CLEANUP_UPDATE_INLINE_CACHES)) > { > L597: if (! > _subtasks.is_task_claimed(SafepointSynchronize::SAFEPOINT_CLEANUP_COMPILATION_POLICY)) > { > L605: if (! > _subtasks.is_task_claimed(SafepointSynchronize::SAFEPOINT_CLEANUP_SYMBOL_TABLE_REHASH)) > { > L615: if (! 
> _subtasks.is_task_claimed(SafepointSynchronize::SAFEPOINT_CLEANUP_STRING_TABLE_REHASH)) > { > L625: if (! > _subtasks.is_task_claimed(SafepointSynchronize::SAFEPOINT_CLEANUP_CLD_PURGE)) > { > nit: HotSpot style doesn't usually have a space after unary '!'. Ok, thanks! I didn't know that. Is there a document that describes the Hotspot style? Because, from the top of my head, I can name 3 source files all in entirely different styles ;-) > > L638: // Various cleaning tasks that should be done periodically > at safepoints > L641: // Prepare for monitor deflation > nit: Please add a period to the end of these sentences. > Done. > src/share/vm/runtime/sweeper.cpp > L205: // TODO: Is this really needed? > L206: OrderAccess::storestore(); > That's a good question. Looks like that storestore() was > added by this changeset: > > $ hg log -r 5357 src/share/vm/runtime/sweeper.cpp > changeset: 5357:510fbd28919c > user: anoll > date: Fri Sep 27 10:50:55 2013 +0200 > summary: 8020151: PSR:PERF Large performance regressions > when code cache is filled > > The changeset is not small and it looks like two > OrderAccess::storestore() calls were added (and one > load_ptr_acquire() was deleted): > > $ hg diff -r 5356 -r 5357 | grep OrderAccess > + OrderAccess::storestore(); > - nmethod *code = (nmethod > *)OrderAccess::load_ptr_acquire(&_code); > + OrderAccess::storestore(); > > It could be that the storestore() is matching an existing > OrderAccess operation or it could have been added in an > abundance of caution. We definitely need a Compiler team > person to take a look here. I looked around a little bit. As far as I can tell, all compiler threads are stopped at a safepoint there. And I don't see anything else that uses the affected fields during the safepoint. There's a fence() before resuming safepointed threads. I think it should be safe without storestore(), but would like to get confirmation from compiler team too. 
> src/share/vm/runtime/synchronizer.hpp > L36: int nInuse; // currently associated with objects > L37: int nInCirculation; // extant > L38: int nScavenged; // reclaimed > nit: Please add one more space before '//' on L36,L37. Oops. Done. > src/share/vm/runtime/synchronizer.cpp > L1663: // Walk a given monitor list, and deflate idle monitors > L1664: // The given list could be a per-thread list or a global list > L1665: // Caller acquires gListLock > L1666: int > ObjectSynchronizer::deflate_monitor_list(ObjectMonitor** listHeadp, > > L1802: int deflated_count = > deflate_monitor_list(thread->omInUseList_addr(), &freeHeadp, &freeTailp); > L1804: Thread::muxAcquire(&gListLock, "scavenge - return"); > The above deflate_monitor_list() now occurs outside of the > gListLock where the old code held the gListLock for this call. > > Yes, it is operating on the thread local list, but what keeps > two different worker threads from trying to > deflate_monitor_list() > on the same JavaThread at the same time? The mechanics in Threads::parallel_java_threads_do() (which I adapted from Threads::possibly_parallel_oops_do()) ensure that each worker thread claims a Java thread before processing it. This ensures that each Java thread is processed by exactly one worker thread. > Without the gListLock, I don't see how the worker threads > avoid conflicting over the same thread local list. Minimally, > the comment on L1665 needs updating. Okidoki, I added those blocks there: // In the case of parallel processing of thread local monitor lists, // work is done by Threads::parallel_threads_do() which ensures that // each Java thread is processed by exactly one worker thread, and // thus avoid conflicts that would arise when worker threads would // process the same monitor lists concurrently. // // See also ParallelSPCleanupTask and // SafepointSynchronizer::do_cleanup_tasks() in safepoint.cpp and // Threads::parallel_java_threads_do() in thread.cpp. 
> > L1697: counters->nInuse = 0; // currently associated > with objects > L1698: counters->nInCirculation = 0; // extant > L1699: counters->nScavenged = 0; // reclaimed > nit: Please add one more space before '//' on L1697, L1698. Done. > old L1698: int nInuse = 0; > old L1713: int inUse = 0; > Nice catch here. I've read this code countless times and missed > this bug until now. It explains why some of my Java monitor > testing > had odd "in use" counts. Hmm. I am not aware of a bug there. the inUse declaration was unused, that is all (I think..) > L1797: if (! MonitorInUseLists) return; > nit: HotSpot style doesn't usually have a space after unary '!'. Done. > L1808: thread->omInUseCount-= deflated_count; > nit: Please add a space before '-='. Done. Also some lines up: gOmInUseCount-= deflated_count; > The only comment I need resolved is about the locking for the thread > local deflate_monitor_list() call. Everything else is minor. Thanks so much for the thorough review! So here's revision #11: http://cr.openjdk.java.net/~rkennke/8180932/webrev.10/ Roman -------------- next part -------------- An HTML attachment was scrubbed... URL: From daniel.daugherty at oracle.com Wed Jul 5 23:30:42 2017 From: daniel.daugherty at oracle.com (Daniel D. 
Daugherty) Date: Wed, 5 Jul 2017 17:30:42 -0600 Subject: RFR: Parallelize safepoint cleanup In-Reply-To: <8def1665-1fb3-c7a2-bc0d-0b63601a0c56@redhat.com> References: <486b5a72-bef8-4ebc-2729-3fe3aa3ab3b9@oracle.com> <5c80f8df-27c9-f9a9-dc6d-47f9c6019a61@redhat.com> <46ad874e-eb41-7927-265a-40dea92dfe1e@oracle.com> <5711258b-99b0-e06f-ba6e-0b6b55d88345@redhat.com> <0e1e2779-9316-b756-6cc8-e0c8add14a94@oracle.com> <1910961c-11bd-0e86-dd03-4fce66b9969f@redhat.com> <2b466176-b688-53a8-bef9-c7ec2c8c745b@oracle.com> <42872a15-d26c-9798-c6a2-f3f7c945baf7@redhat.com> <5e7c7d00-4acd-bea3-3525-33dbd9159efb@oracle.com> <6f2c6de7-298b-bf14-ab1f-430c4acd43c9@redhat.com> <5cd676de-872d-6d4a-691b-da561173f7d0@oracle.com> <61d80e98-275f-b2b8-4ac7-6d5d03b047de@redhat.com> <8def1665-1fb3-c7a2-bc0d-0b63601a0c56@redhat.com> Message-ID: <2770cc80-3dfe-4c0b-7e64-36778d82fbae@oracle.com> On 7/5/17 3:17 PM, Roman Kennke wrote: > Am 05.07.2017 um 20:30 schrieb Daniel D. Daugherty: >> On 6/27/17 1:47 PM, Roman Kennke wrote: >>> Hi Robbin, >>> >>> Ugh. Thanks for catching this. >>> Problem was that I was accounting the thread-local deflations twice: >>> once in thread-local processing (basically a leftover from my earlier >>> attempt to implement this accounting) and then again in >>> finish_deflate_idle_monitors(). Should be fixed here: >>> >>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.09/ >>> >> >> Are you thinking that this fix resolves all three bugs: >> >> 8132849 Increased stop time in cleanup phase because of >> single-threaded >> walk of thread stacks in >> NMethodSweeper::mark_active_nmethods() > Yes. It requires additional support code by a GC though to become > actually multithreaded. >> 8153224 Monitor deflation prolong safepoints > Yes. 
But there's more that we want to do: > - deflate monitors during GC thread scanning (this is a huge winner) > - ultimately, deflate monitors concurrently (a JEP is on the way to > address this) > >> 8180932 Parallelize safepoint cleanup > Yes :-) > >> JDK-8132849 is assigned to Tobias; it would be good to get Tobias' >> review of this fix also. > Ok, I will reach out to him. > >> General comments: >> - Please don't forget to update Copyright years as needed before >> pushing > Fixed. >> >> src/share/vm/runtime/safepoint.hpp >> L78: enum SafepointCleanupTasks { >> You might want to add a comment here: >> // The enums are listed in the order of the tasks when >> done serially. > Good idea. Done. >> src/share/vm/runtime/safepoint.cpp >> L556: ! thread->is_Code_cache_sweeper_thread()) { >> L581: if (! >> _subtasks.is_task_claimed(SafepointSynchronize::SAFEPOINT_CLEANUP_DEFLATE_MONITORS)) >> { >> L589: if (! >> _subtasks.is_task_claimed(SafepointSynchronize::SAFEPOINT_CLEANUP_UPDATE_INLINE_CACHES)) >> { >> L597: if (! >> _subtasks.is_task_claimed(SafepointSynchronize::SAFEPOINT_CLEANUP_COMPILATION_POLICY)) >> { >> L605: if (! >> _subtasks.is_task_claimed(SafepointSynchronize::SAFEPOINT_CLEANUP_SYMBOL_TABLE_REHASH)) >> { >> L615: if (! >> _subtasks.is_task_claimed(SafepointSynchronize::SAFEPOINT_CLEANUP_STRING_TABLE_REHASH)) >> { >> L625: if (! >> _subtasks.is_task_claimed(SafepointSynchronize::SAFEPOINT_CLEANUP_CLD_PURGE)) >> { >> nit: HotSpot style doesn't usually have a space after unary '!'. > Ok, thanks! I didn't know that. Is there a document that describes the > Hotspot style? There is such a document: https://wiki.openjdk.java.net/display/HotSpot/StyleGuide I believe John Rose is the usual maintainer of the doc... > Because, from the top of my head, I can name 3 source files all in > entirely different styles ;-) True, very true... unfortunately. I don't know if John's doc mentions it, but a general rule is to follow the prevailing style in the file. 
Sometime this is impossible because sometimes we see multiple styles in the same file (and we pull our hair out)... >> >> L638: // Various cleaning tasks that should be done periodically >> at safepoints >> L641: // Prepare for monitor deflation >> nit: Please add a period to the end of these sentences. >> > Done. >> src/share/vm/runtime/sweeper.cpp >> L205: // TODO: Is this really needed? >> L206: OrderAccess::storestore(); >> That's a good question. Looks like that storestore() was >> added by this changeset: >> >> $ hg log -r 5357 src/share/vm/runtime/sweeper.cpp >> changeset: 5357:510fbd28919c >> user: anoll >> date: Fri Sep 27 10:50:55 2013 +0200 >> summary: 8020151: PSR:PERF Large performance regressions >> when code cache is filled >> >> The changeset is not small and it looks like two >> OrderAccess::storestore() calls were added (and one >> load_ptr_acquire() was deleted): >> >> $ hg diff -r 5356 -r 5357 | grep OrderAccess >> + OrderAccess::storestore(); >> - nmethod *code = (nmethod >> *)OrderAccess::load_ptr_acquire(&_code); >> + OrderAccess::storestore(); >> >> It could be that the storestore() is matching an existing >> OrderAccess operation or it could have been added in an >> abundance of caution. We definitely need a Compiler team >> person to take a look here. > I looked around a little bit. As far as I can tell, all compiler > threads are stopped at a safepoint there. And I don't see anything > else that uses the affected fields during the safepoint. There's a > fence() before resuming safepointed threads. I think it should be safe > without storestore(), but would like to get confirmation from compiler > team too. Good idea! :-) >> src/share/vm/runtime/synchronizer.hpp >> L36: int nInuse; // currently associated with objects >> L37: int nInCirculation; // extant >> L38: int nScavenged; // reclaimed >> nit: Please add one more space before '//' on L36,L37. > Oops. Done. 
>> src/share/vm/runtime/synchronizer.cpp >> L1663: // Walk a given monitor list, and deflate idle monitors >> L1664: // The given list could be a per-thread list or a global list >> L1665: // Caller acquires gListLock >> L1666: int >> ObjectSynchronizer::deflate_monitor_list(ObjectMonitor** listHeadp, >> >> L1802: int deflated_count = >> deflate_monitor_list(thread->omInUseList_addr(), &freeHeadp, >> &freeTailp); >> L1804: Thread::muxAcquire(&gListLock, "scavenge - return"); >> The above deflate_monitor_list() now occurs outside of the >> gListLock where the old code held the gListLock for this call. >> >> Yes, it is operating on the thread local list, but what keeps >> two different worker threads from trying to >> deflate_monitor_list() >> on the same JavaThread at the same time? > The mechanics in Threads::parallel_java_threads_do() (which I adapted > from Threads::possibly_parallel_oops_do()) ensure that each worker > thread claims a Java thread before processing it. This ensures that > each Java thread is processed by exactly one worker thread. Cool. No race there. >> Without the gListLock, I don't see how the worker threads >> avoid conflicting over the same thread local list. Minimally, >> the comment on L1665 needs updating. > Okidoki, I added those blocks there: > > // In the case of parallel processing of thread local monitor lists, > // work is done by Threads::parallel_threads_do() which ensures that > // each Java thread is processed by exactly one worker thread, and > // thus avoid conflicts that would arise when worker threads would > // process the same monitor lists concurrently. > // > // See also ParallelSPCleanupTask and > // SafepointSynchronizer::do_cleanup_tasks() in safepoint.cpp and > // Threads::parallel_java_threads_do() in thread.cpp. I like the comment. (Others may find it wordy, but my comments are often thought to be wordy...) 
> >> L1697: counters->nInuse = 0; // currently associated >> with objects >> L1698: counters->nInCirculation = 0; // extant >> L1699: counters->nScavenged = 0; // reclaimed >> nit: Please add one more space before '//' on L1697, L1698. > Done. >> old L1698: int nInuse = 0; >> old L1713: int inUse = 0; >> Nice catch here. I've read this code countless times and missed >> this bug until now. It explains why some of my Java monitor >> testing >> had odd "in use" counts. > Hmm. I am not aware of a bug there. the inUse declaration was unused, > that is all (I think..) You would have thought that when I pasted the two lines into the comment, I would have noticed the difference in the names... sigh... >> L1797: if (! MonitorInUseLists) return; >> nit: HotSpot style doesn't usually have a space after unary '!'. > Done. >> L1808: thread->omInUseCount-= deflated_count; >> nit: Please add a space before '-='. > Done. Also some lines up: > > gOmInUseCount-= deflated_count; > >> The only comment I need resolved is about the locking for the thread >> local deflate_monitor_list() call. Everything else is minor. > > Thanks so much for the thorough review! > > So here's revision #11: > > http://cr.openjdk.java.net/~rkennke/8180932/webrev.10/ > > > Roman src/share/vm/runtime/synchronizer.cpp L1664: // Caller acquires gListLock. The new stuff you added below the existing comment is fine. However, that existing comment is still wrong because the caller doesn't always acquire gListLock. Perhaps: // Caller acquires gListLock when operating on a global list. Thanks for making the changes. Thumbs up! Dan -------------- next part -------------- An HTML attachment was scrubbed...
URL: From erik.helin at oracle.com Thu Jul 6 08:06:25 2017 From: erik.helin at oracle.com (Erik Helin) Date: Thu, 6 Jul 2017 10:06:25 +0200 Subject: RFR (S): 8179679: Rearrange filters before card scanning In-Reply-To: <1499182953.2757.21.camel@oracle.com> References: <1499081093.2802.30.camel@oracle.com> <1499182953.2757.21.camel@oracle.com> Message-ID: <51aa3d5a-70d9-da28-12cd-2a05e949c4f9@oracle.com> Hi Thomas, looks good to me, Reviewed. Thanks, Erik On 07/04/2017 05:42 PM, Thomas Schatzl wrote: > Hi all, > > On Mon, 2017-07-03 at 13:24 +0200, Thomas Schatzl wrote: >> Hi all, >> >> please have a look at this change that rearranges the checks in the >> G1RemSet card scanning a bit in order to: >> > > Erik had a look at this change with the following comments: > > - rename card_region_idx -> region_idx_for_card > - factor out the two calls to claim a card and dirty its region into a > method > - move calculation of "card_region" into the scan_card() method. > - he pointed out that the change can use G1CollectedHeap::region_at() > instead of G1CollectedHeap::heap_region_containing() as it is simpler. > - there has been another comment on why the change claims the card > after checking whether the card is within the region's boundaries, and > if that wouldn't be better performed right after the is_claimed check. > > Doing so will claim cards originating from stray remembered set entries > into the current survivor regions as claimed, since we do not clear > these regions later again (see G1ClearCardTableTask::work()) - their > cards need to be "Young", and this is done during allocation of the > region. > > This results in the card table verification to fail later. > > I think if we should think of changing the handling of survivor regions > during the clear CT phase as part of a different CR. For now I added a > comment. 
> > Webrev: > http://cr.openjdk.java.net/~tschatzl/8179679/webrev.1_to_2 (diff) > http://cr.openjdk.java.net/~tschatzl/8179679/webrev.2 (full) > Testing: > gcbasher > > Thanks, > Thomas > From erik.helin at oracle.com Thu Jul 6 08:12:26 2017 From: erik.helin at oracle.com (Erik Helin) Date: Thu, 6 Jul 2017 10:12:26 +0200 Subject: RFR (XS): 8183127: UpdateRSetDeferred should not enqueue cards for NULL references In-Reply-To: <1499181945.2757.20.camel@oracle.com> References: <1499181945.2757.20.camel@oracle.com> Message-ID: On 07/04/2017 05:25 PM, Thomas Schatzl wrote: > Hi, > > can I have reviews for this change that adds a NULL-check in the > UpdateRSetDeferred closure so that we do not enqueue cards with NULL > references in it during evacuation failure? > > CR: > https://bugs.openjdk.java.net/browse/JDK-8183127 > Webrev: > http://cr.openjdk.java.net/~tschatzl/8183127/webrev/ Looks good, Reviewed. Thanks, Erik > Testing: > jprt, gcbasher with G1EvacuationFailureALot for 1/2 hour > > I think this amount of testing is sufficient as the reasoning for this > change is not *that* complicated. > > Thanks, > Thomas > From erik.helin at oracle.com Thu Jul 6 08:20:42 2017 From: erik.helin at oracle.com (Erik Helin) Date: Thu, 6 Jul 2017 10:20:42 +0200 Subject: RFR (XS): 8183397: Ensure consistent closure filtering during evacuation In-Reply-To: <1499081088.2802.29.camel@oracle.com> References: <1499081088.2802.29.camel@oracle.com> Message-ID: <64943738-9d9f-0d88-b44e-9a9ec0812f33@oracle.com> On 07/03/2017 01:24 PM, Thomas Schatzl wrote: > Hi all, Hi Thomas, > can I have reviews for this change that fixes an observation that has > been made recently by Erik, i.e. that the "else" part of several > evacuation closures inconsistently filters out non-cross-region > references before checking whether the referenced object is a humongous > or ext region. > > This causes somewhat hard to diagnose performance issues, and earlier > filtering does not hurt if done anyway. 
> > (Note that the current way of checking in all but the UpdateRS closure > using HeapRegion::is_in_same_region() seems optimal. The only reason > why the other way in the UpdateRS closure is better because the code > needs the "to" HeapRegion pointer anyway) > > CR: > https://bugs.openjdk.java.net/browse/JDK-8183397 > Webrev: > http://cr.openjdk.java.net/~tschatzl/8183397/webrev/ - } else if (in_cset_state.is_humongous()) { + } else { + if (in_cset_state.is_humongous()) { Why change `else if` to `else { if (...) {` here? Does it result in the compiler generating faster code for this case? Thanks, Erik > Testing: > jprt, performance regression analysis > > Thanks, > Thomas > From thomas.schatzl at oracle.com Thu Jul 6 08:28:21 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Thu, 06 Jul 2017 10:28:21 +0200 Subject: RFR (XS): 8183397: Ensure consistent closure filtering during evacuation In-Reply-To: <64943738-9d9f-0d88-b44e-9a9ec0812f33@oracle.com> References: <1499081088.2802.29.camel@oracle.com> <64943738-9d9f-0d88-b44e-9a9ec0812f33@oracle.com> Message-ID: <1499329701.2760.3.camel@oracle.com> Hi Erik, On Thu, 2017-07-06 at 10:20 +0200, Erik Helin wrote: > On 07/03/2017 01:24 PM, Thomas Schatzl wrote: > > > > Hi all, > Hi Thomas, > > > > > ? can I have reviews for this change that fixes an observation that > > has > > been made recently by Erik, i.e. that the "else" part of several > > evacuation closures inconsistently filters out non-cross-region > > references before checking whether the referenced object is a > > humongous > > or ext region. > > > > This causes somewhat hard to diagnose performance issues, and > > earlier > > filtering does not hurt if done anyway. > > > > (Note that the current way of checking in all but the UpdateRS > > closure > > using HeapRegion::is_in_same_region() seems optimal. 
The only > > reason > > why the other way in the UpdateRS closure is better because the > > code > > needs the "to" HeapRegion pointer anyway) > > > > CR: > > https://bugs.openjdk.java.net/browse/JDK-8183397 > > Webrev: > > http://cr.openjdk.java.net/~tschatzl/8183397/webrev/ > -  } else if (in_cset_state.is_humongous()) { > +  } else { > +    if (in_cset_state.is_humongous()) { > > Why change `else if` to `else { if (...) {` here? Does it result in > the > compiler generating faster code for this case? No. It only makes this do_oop_*() method look similar in structure to our do_oop_*() methods in the closures. I.e. if (in_cset_state.is_in_cset()) { // do stuff for refs into cset } else { // expanding handle_non_cset_obj_common() if (state.is_humongous()) { } else ... } I felt this improves overall readability, but this may only be because I have been working in this code a lot recently. I can revert this change. Thanks for your review, Thomas From thomas.schatzl at oracle.com Thu Jul 6 08:29:08 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Thu, 06 Jul 2017 10:29:08 +0200 Subject: RFR (XS): 8183127: UpdateRSetDeferred should not enqueue cards for NULL references In-Reply-To: References: <1499181945.2757.20.camel@oracle.com> Message-ID: <1499329748.2760.4.camel@oracle.com> Hi Erik, On Thu, 2017-07-06 at 10:12 +0200, Erik Helin wrote: > On 07/04/2017 05:25 PM, Thomas Schatzl wrote: > > > > [...] > > CR: > > https://bugs.openjdk.java.net/browse/JDK-8183127 > > Webrev: > > http://cr.openjdk.java.net/~tschatzl/8183127/webrev/ > Looks good, Reviewed. > > Thanks, > Erik Thanks for your review,
Thomas From stefan.johansson at oracle.com Thu Jul 6 08:46:15 2017 From: stefan.johansson at oracle.com (Stefan Johansson) Date: Thu, 6 Jul 2017 10:46:15 +0200 Subject: RFR (S): 8179679: Rearrange filters before card scanning In-Reply-To: <51aa3d5a-70d9-da28-12cd-2a05e949c4f9@oracle.com> References: <1499081093.2802.30.camel@oracle.com> <1499182953.2757.21.camel@oracle.com> <51aa3d5a-70d9-da28-12cd-2a05e949c4f9@oracle.com> Message-ID: On 2017-07-06 10:06, Erik Helin wrote: > Hi Thomas, > > looks good to me, Reviewed. +1 Nice cleanup, StefanJ > Thanks, > Erik > > On 07/04/2017 05:42 PM, Thomas Schatzl wrote: >> Hi all, >> >> On Mon, 2017-07-03 at 13:24 +0200, Thomas Schatzl wrote: >>> Hi all, >>> >>> please have a look at this change that rearranges the checks in the >>> G1RemSet card scanning a bit in order to: >>> >> Erik had a look at this change with the following comments: >> >> - rename card_region_idx -> region_idx_for_card >> - factor out the two calls to claim a card and dirty its region into a >> method >> - move calculation of "card_region" into the scan_card() method. >> - he pointed out that the change can use G1CollectedHeap::region_at() >> instead of G1CollectedHeap::heap_region_containing() as it is simpler. >> - there has been another comment on why the change claims the card >> after checking whether the card is within the region's boundaries, and >> if that wouldn't be better performed right after the is_claimed check. >> >> Doing so will claim cards originating from stray remembered set entries >> into the current survivor regions as claimed, since we do not clear >> these regions later again (see G1ClearCardTableTask::work()) - their >> cards need to be "Young", and this is done during allocation of the >> region. >> >> This results in the card table verification to fail later. >> >> I think if we should think of changing the handling of survivor regions >> during the clear CT phase as part of a different CR. For now I added a >> comment. 
>> >> Webrev: >> http://cr.openjdk.java.net/~tschatzl/8179679/webrev.1_to_2 (diff) >> http://cr.openjdk.java.net/~tschatzl/8179679/webrev.2 (full) >> Testing: >> gcbasher >> >> Thanks, >> Thomas >> From mikael.gerdin at oracle.com Thu Jul 6 09:20:32 2017 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Thu, 6 Jul 2017 11:20:32 +0200 Subject: RFR (M) 8183923: Get rid of FreeBlockDictionary and dithering Message-ID: <6ac9125f-87ac-4211-b90c-29526b6aae29@oracle.com> Hi all, Please review this cleanup inspired by looking at Roman's CMS cleanup :) FreeBlockDictionary is an old abstraction for multiple CMS freelist datastructures which never appear to have been implemented, getting rid of it also simplifies some code in Metaspace so it's not all CMS stuff. Testing: jprt Bug: https://bugs.openjdk.java.net/browse/JDK-8183923 Webrev: http://cr.openjdk.java.net/~mgerdin/8183923/webrev.0/index.html Thanks /Mikael From thomas.schatzl at oracle.com Thu Jul 6 10:08:35 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Thu, 06 Jul 2017 12:08:35 +0200 Subject: RFR (S): 8179679: Rearrange filters before card scanning In-Reply-To: References: <1499081093.2802.30.camel@oracle.com> <1499182953.2757.21.camel@oracle.com> <51aa3d5a-70d9-da28-12cd-2a05e949c4f9@oracle.com> Message-ID: <1499335715.2760.6.camel@oracle.com> On Thu, 2017-07-06 at 10:46 +0200, Stefan Johansson wrote: > > On 2017-07-06 10:06, Erik Helin wrote: > > > > Hi Thomas, > > > > looks good to me, Reviewed. > +1 > Thanks for your reviews Stefan and Erik! 
Thomas From tobias.hartmann at oracle.com Thu Jul 6 10:14:26 2017 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Thu, 6 Jul 2017 12:14:26 +0200 Subject: RFR: Parallelize safepoint cleanup In-Reply-To: References: <486b5a72-bef8-4ebc-2729-3fe3aa3ab3b9@oracle.com> <5c80f8df-27c9-f9a9-dc6d-47f9c6019a61@redhat.com> <46ad874e-eb41-7927-265a-40dea92dfe1e@oracle.com> <5711258b-99b0-e06f-ba6e-0b6b55d88345@redhat.com> <0e1e2779-9316-b756-6cc8-e0c8add14a94@oracle.com> <1910961c-11bd-0e86-dd03-4fce66b9969f@redhat.com> <2b466176-b688-53a8-bef9-c7ec2c8c745b@oracle.com> <42872a15-d26c-9798-c6a2-f3f7c945baf7@redhat.com> <5e7c7d00-4acd-bea3-3525-33dbd9159efb@oracle.com> <6f2c6de7-298b-bf14-ab1f-430c4acd43c9@redhat.com> <5cd676de-872d-6d4a-691b-da561173f7d0@oracle.com> <61d80e98-275f-b2b8-4ac7-6d5d03b047de@redhat.com> Message-ID: <0e44cc90-b384-0820-93d6-a70d22c501c3@oracle.com> Hi, On 05.07.2017 20:30, Daniel D. Daugherty wrote: > JDK-8132849 is assigned to Tobias; it would be good to get Tobias' > review of this fix also. Thanks for the notification. The sweeper/safepoint changes look good to me! > src/share/vm/runtime/sweeper.cpp > L205: // TODO: Is this really needed? > L206: OrderAccess::storestore(); > That's a good question. Looks like that storestore() was > added by this changeset: > > $ hg log -r 5357 src/share/vm/runtime/sweeper.cpp > changeset: 5357:510fbd28919c > user: anoll > date: Fri Sep 27 10:50:55 2013 +0200 > summary: 8020151: PSR:PERF Large performance regressions when code cache is filled > > The changeset is not small and it looks like two > OrderAccess::storestore() calls were added (and one > load_ptr_acquire() was deleted): > > $ hg diff -r 5356 -r 5357 | grep OrderAccess > + OrderAccess::storestore(); > - nmethod *code = (nmethod *)OrderAccess::load_ptr_acquire(&_code); > + OrderAccess::storestore(); > > It could be that the storestore() is matching an existing > OrderAccess operation or it could have been added in an > abundance of caution. 
We definitely need a Compiler team > person to take a look here. Unfortunately, I'm also not sure if that barrier is required. Looking at the old RFR thread: http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2013-September/011588.html It seems that Igor V. suggested this: "You definitely need a store-store barrier for non-TSO architectures after the mark_as_seen_on_stack() call on line 1360. Otherwise it still can be reordered by the CPU with respect to the following state assignment. Also neither of these state variables are volatile in nmethod, so even the compiler may reorder the stores." http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2013-September/011729.html The requested OrderAccess::storestore() was added to nmethod::make_not_entrant_or_zombie() but seems like Albert also added one to NMethodSweeper::mark_active_nmethods(). I'll ping Igor, maybe he knows more. Thanks, Tobias From erik.helin at oracle.com Thu Jul 6 12:52:27 2017 From: erik.helin at oracle.com (Erik Helin) Date: Thu, 6 Jul 2017 14:52:27 +0200 Subject: RFR (S): 8183128: Update RefineCardTableEntryClosure In-Reply-To: <1499156663.2761.6.camel@oracle.com> References: <1499156663.2761.6.camel@oracle.com> Message-ID: <08286762-411b-3079-9802-814c806af946@oracle.com> Hi Thomas, On 07/04/2017 10:24 AM, Thomas Schatzl wrote: > Hi all, > > can I get reviews for this change that renames and cleans up the use > of RefineCardTableEntryClosure in the code? > > RefineCardTableEntryClosure is the closure that is applied by the > concurrent refinement threads. This change renames it slightly to > indicate its use (G1RefineCardConcurrentlyClosure) and moves it to the > G1RemSet files close to the closure that we use for refinement/Update > RS during GC. great cleanup! Looking at the code, what do you think about moving G1RefineCardConcurrentlyClosure into concurrentG1RefineThread.cpp (and make it a private class to ConcurrentG1RefineThread)? 
AFAICS, ConcurrentG1RefineThread is the only code using this closure. If we do it this way, then we can actually make DirtyCardQueueSet::apply_closure_to_completed_buffer a template method, taking the Closure as a template parameter, as in: template <typename Closure> bool apply_closure_to_completed_buffer(Closure* cl, uint worker_i, size_t stop_at, bool during_pause) This means that closures could get inlined, which doesn't mean that much for G1RefineCardConcurrentlyClosure, but could give a small boost for G1RefineCardClosure (for that to work, G1CollectedHeap::iterate_dirty_card_closure must take a G1RefineCardClosure, but that is ok, because that is the only closure type we pass to that method). Also, you do not need the forward declaration in G1CollectedHeap, it will not make use of this closure then :) If you want to "go the extra mile", then you can also pass a G1RemSet* as an argument to the G1RefineCardConcurrentlyClosure constructor and store it in a field, to avoid accessing the G1CollectedHeap via the singleton: G1CollectedHeap::heap()->g1_rem_set()->refine_card_concurrently(card_ptr, worker_i); (plus, G1RefineCardConcurrentlyClosure only needs a G1RemSet* pointer anyway ;)) Thanks, Erik > This change is dependent on "JDK-8183226: Remembered set summarization > accesses not fully initialized java thread DCQS" which is also > currently out for review - that change reorganizes G1CollectedHeap > initialization so that the change can actually move the closure. 
> > CR: > https://bugs.openjdk.java.net/browse/JDK-8183128 > Webrev: > http://cr.openjdk.java.net/~tschatzl/8183128/webrev/ > Testing: > jprt, local benchmarks > > Thanks, > Thomas > From rkennke at redhat.com Thu Jul 6 13:18:07 2017 From: rkennke at redhat.com (Roman Kennke) Date: Thu, 6 Jul 2017 15:18:07 +0200 Subject: RFR: Parallelize safepoint cleanup In-Reply-To: <0e44cc90-b384-0820-93d6-a70d22c501c3@oracle.com> References: <486b5a72-bef8-4ebc-2729-3fe3aa3ab3b9@oracle.com> <5c80f8df-27c9-f9a9-dc6d-47f9c6019a61@redhat.com> <46ad874e-eb41-7927-265a-40dea92dfe1e@oracle.com> <5711258b-99b0-e06f-ba6e-0b6b55d88345@redhat.com> <0e1e2779-9316-b756-6cc8-e0c8add14a94@oracle.com> <1910961c-11bd-0e86-dd03-4fce66b9969f@redhat.com> <2b466176-b688-53a8-bef9-c7ec2c8c745b@oracle.com> <42872a15-d26c-9798-c6a2-f3f7c945baf7@redhat.com> <5e7c7d00-4acd-bea3-3525-33dbd9159efb@oracle.com> <6f2c6de7-298b-bf14-ab1f-430c4acd43c9@redhat.com> <5cd676de-872d-6d4a-691b-da561173f7d0@oracle.com> <61d80e98-275f-b2b8-4ac7-6d5d03b047de@redhat.com> <0e44cc90-b384-0820-93d6-a70d22c501c3@oracle.com> Message-ID: Am 06.07.2017 um 12:14 schrieb Tobias Hartmann: > Hi, > > On 05.07.2017 20:30, Daniel D. Daugherty wrote: >> JDK-8132849 is assigned to Tobias; it would be good to get Tobias' >> review of this fix also. > Thanks for the notification. The sweeper/safepoint changes look good to me! Thanks! I guess I'm going to need a sponsor when the orderAccess::storestore() issue is resolved. 
I'd say *if* we decide to keep the storestore() as conservative measure, it makes sense to also add it to the parallel processing routines like this: diff --git a/src/share/vm/runtime/safepoint.cpp b/src/share/vm/runtime/safepoint.cpp --- a/src/share/vm/runtime/safepoint.cpp +++ b/src/share/vm/runtime/safepoint.cpp @@ -550,6 +550,12 @@ _counters(counters), _nmethod_cl(NMethodSweeper::prepare_mark_active_nmethods()) {} + ~ParallelSPCleanupThreadClosure() { + // This is here to be consistent with sweeper.cpp NMethodSweeper::mark_active_nmethods(). + // TODO: Is this really needed? + OrderAccess::storestore(); + } + void do_thread(Thread* thread) { ObjectSynchronizer::deflate_thread_local_monitors(thread, _counters); if (_nmethod_cl != NULL && thread->is_Java_thread() && I've included this in the following (final?) webrev: http://cr.openjdk.java.net/~rkennke/8180932/webrev.11/ (I've also added Tobias to Reviewed-by: list... if anybody wants to sponsor it as-is, simply grab the changeset from here: http://cr.openjdk.java.net/~rkennke/8180932/webrev.11/hotspot.changeset ) Cheers, Roman >> src/share/vm/runtime/sweeper.cpp >> L205: // TODO: Is this really needed? >> L206: OrderAccess::storestore(); >> That's a good question. 
Looks like that storestore() was >> added by this changeset: >> >> $ hg log -r 5357 src/share/vm/runtime/sweeper.cpp >> changeset: 5357:510fbd28919c >> user: anoll >> date: Fri Sep 27 10:50:55 2013 +0200 >> summary: 8020151: PSR:PERF Large performance regressions when code cache is filled >> >> The changeset is not small and it looks like two >> OrderAccess::storestore() calls were added (and one >> load_ptr_acquire() was deleted): >> >> $ hg diff -r 5356 -r 5357 | grep OrderAccess >> + OrderAccess::storestore(); >> - nmethod *code = (nmethod *)OrderAccess::load_ptr_acquire(&_code); >> + OrderAccess::storestore(); >> >> It could be that the storestore() is matching an existing >> OrderAccess operation or it could have been added in an >> abundance of caution. We definitely need a Compiler team >> person to take a look here. > Unfortunately, I'm also not sure if that barrier is required. Looking at the old RFR thread: > http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2013-September/011588.html > > It seems that Igor V. suggested this: > "You definitely need a store-store barrier for non-TSO architectures after the mark_as_seen_on_stack() call on line 1360. Otherwise it still can be reordered by the CPU with respect to the following state assignment. Also neither of these state variables are volatile in nmethod, so even the compiler may reorder the stores." > http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2013-September/011729.html > > The requested OrderAccess::storestore() was added to nmethod::make_not_entrant_or_zombie() but seems like Albert also added one to NMethodSweeper::mark_active_nmethods(). > > I'll ping Igor, maybe he knows more. 
> > Thanks, > Tobias From mikael.gerdin at oracle.com Thu Jul 6 13:48:57 2017 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Thu, 6 Jul 2017 15:48:57 +0200 Subject: RFR: 8179387: Factor out CMS specific code from GenCollectedHeap into its own subclass In-Reply-To: References: <3521009f-6fab-4f8e-2375-b9d665a4c70b@redhat.com> <3d8b55a2-a787-3051-b351-ab9b0a24f5e0@redhat.com> <47e22e86-7d7c-606f-1936-346229f39ca2@oracle.com> <9a846161-c8ac-dedf-5952-f457d546fd9a@redhat.com> <4d5e6af8-d975-7803-64c5-7295e0d56154@redhat.com> Message-ID: Hi Roman, On 2017-07-05 13:58, Mikael Gerdin wrote: > Hi Roman, > > On 2017-07-03 17:05, Roman Kennke wrote: >> Am 03.07.2017 um 11:13 schrieb Roman Kennke: >>> Am 03.07.2017 um 09:35 schrieb Mikael Gerdin: >>>> Hi Roman, >>>> >>>> On 2017-06-30 18:32, Roman Kennke wrote: >>>>> I came across one problem using this approach: We will have 2 >>>>> instances >>>>> of CollectedHeap around, where there's usually only 1, and some code >>>>> expects only 1. For example, in CollectedHeap constructor, we >>>>> create new >>>>> PerfData variables, and we now create them 2x, which leads to an >>>>> assert >>>>> being thrown. I suspect there is more code like that. >>>>> >>>>> I will attempt to refactor this a little more, maybe it's not that >>>>> bad, >>>>> but it's probably not worth spending too much time on it. >>>> I think refactoring the code to not expect a singleton CollectedHeap >>>> instance is a bit too much. >>>> Perhaps there is another way to share common code between Serial and >>>> CMS but that might require a bit more thought. >>> Yeah, definitely. I hit another difficulty: pretty much the same issues >>> that I'm having with GenCollectedHeap/CMSHeap/CollectedHeap now show up >>> with Generation and its subclasses.. >>> >>> How about we push the original patch that I've posted, and work from >>> there? 
In fact, I *have* found some little things I would change (some >>> more if (UseConcMarkSweepGC) branches in GenCollectedHeap that I have >>> overlooked in my first pass...) >> >> So here's the little change (two more places in genCollectedHeap.hpp >> where UseConcMarkSweepGC was used to alter behaviour: >> >> http://cr.openjdk.java.net/~rkennke/8179387/webrev.02/ >> >> >> Ok to push this? I just realized that your change doesn't build on Windows since you didn't #include "precompiled.hpp" in cmsHeap.cpp. MSVC is really picky about that. /Mikael > > I think this looks like a good step in the right direction! > One thing I noticed is that you can put "enum GCH_strong_roots_tasks" > inside of GenCollectedHeap to avoid tainting the global namespace with > the enum members. Just above the declaration of _process_strong_tasks > seems like an excellent location for the enum declaration :) > > This looks like it's not needed anymore. > bool CMSHeap::should_do_concurrent_full_gc(GCCause::Cause cause) { > if (!UseConcMarkSweepGC) { > return false; > } > > /Mikael > >> >> Roman >> From rkennke at redhat.com Thu Jul 6 16:23:39 2017 From: rkennke at redhat.com (Roman Kennke) Date: Thu, 6 Jul 2017 18:23:39 +0200 Subject: RFR: 8179387: Factor out CMS specific code from GenCollectedHeap into its own subclass In-Reply-To: References: <3521009f-6fab-4f8e-2375-b9d665a4c70b@redhat.com> <3d8b55a2-a787-3051-b351-ab9b0a24f5e0@redhat.com> <47e22e86-7d7c-606f-1936-346229f39ca2@oracle.com> <9a846161-c8ac-dedf-5952-f457d546fd9a@redhat.com> <4d5e6af8-d975-7803-64c5-7295e0d56154@redhat.com> Message-ID: <13358626-e399-e352-1711-587416621aac@redhat.com> Am 06.07.2017 um 15:48 schrieb Mikael Gerdin: > Hi Roman, > > On 2017-07-05 13:58, Mikael Gerdin wrote: >> Hi Roman, >> >> On 2017-07-03 17:05, Roman Kennke wrote: >>> Am 03.07.2017 um 11:13 schrieb Roman Kennke: >>>> Am 03.07.2017 um 09:35 schrieb Mikael Gerdin: >>>>> Hi Roman, >>>>> >>>>> On 2017-06-30 18:32, Roman Kennke wrote: >>>>>> I came 
across one problem using this approach: We will have 2 >>>>>> instances >>>>>> of CollectedHeap around, where there's usually only 1, and some code >>>>>> expects only 1. For example, in CollectedHeap constructor, we >>>>>> create new >>>>>> PerfData variables, and we now create them 2x, which leads to an >>>>>> assert >>>>>> being thrown. I suspect there is more code like that. >>>>>> >>>>>> I will attempt to refactor this a little more, maybe it's not >>>>>> that bad, >>>>>> but it's probably not worth spending too much time on it. >>>>> I think refactoring the code to not expect a singleton CollectedHeap >>>>> instance is a bit too much. >>>>> Perhaps there is another way to share common code between Serial and >>>>> CMS but that might require a bit more thought. >>>> Yeah, definitely. I hit another difficulty: pretty much the same >>>> issues >>>> that I'm having with GenCollectedHeap/CMSHeap/CollectedHeap now >>>> show up >>>> with Generation and its subclasses.. >>>> >>>> How about we push the original patch that I've posted, and work from >>>> there? In fact, I *have* found some little things I would change (some >>>> more if (UseConcMarkSweepGC) branches in GenCollectedHeap that I have >>>> overlooked in my first pass...) >>> >>> So here's the little change (two more places in genCollectedHeap.hpp >>> where UseConcMarkSweepGC was used to alter behaviour: >>> >>> http://cr.openjdk.java.net/~rkennke/8179387/webrev.02/ >>> >>> >>> Ok to push this? > > I just realized that your change doesn't build on Windows since you > didn't #include "precompiled.hpp" in cmsHeap.cpp. MSVC is really picky > about that. > /Mikael Uhhh. 
Ok, here's revision #3 with precompiled added in: http://cr.openjdk.java.net/~rkennke/8179387/webrev.03/ Roman From igor.veresov at oracle.com Thu Jul 6 16:47:01 2017 From: igor.veresov at oracle.com (Igor Veresov) Date: Thu, 6 Jul 2017 09:47:01 -0700 Subject: RFR: Parallelize safepoint cleanup In-Reply-To: <0e44cc90-b384-0820-93d6-a70d22c501c3@oracle.com> References: <486b5a72-bef8-4ebc-2729-3fe3aa3ab3b9@oracle.com> <5c80f8df-27c9-f9a9-dc6d-47f9c6019a61@redhat.com> <46ad874e-eb41-7927-265a-40dea92dfe1e@oracle.com> <5711258b-99b0-e06f-ba6e-0b6b55d88345@redhat.com> <0e1e2779-9316-b756-6cc8-e0c8add14a94@oracle.com> <1910961c-11bd-0e86-dd03-4fce66b9969f@redhat.com> <2b466176-b688-53a8-bef9-c7ec2c8c745b@oracle.com> <42872a15-d26c-9798-c6a2-f3f7c945baf7@redhat.com> <5e7c7d00-4acd-bea3-3525-33dbd9159efb@oracle.com> <6f2c6de7-298b-bf14-ab1f-430c4acd43c9@redhat.com> <5cd676de-872d-6d4a-691b-da561173f7d0@oracle.com> <61d80e98-275f-b2b8-4ac7-6d5d03b047de@redhat.com> <0e44cc90-b384-0820-93d6-a70d22c501c3@oracle.com> Message-ID: <20E06CEC-38CA-41AE-99DB-17EF22A3C5CC@oracle.com> > On Jul 6, 2017, at 3:14 AM, Tobias Hartmann wrote: > > Hi, > > On 05.07.2017 20:30, Daniel D. Daugherty wrote: >> JDK-8132849 is assigned to Tobias; it would be good to get Tobias' >> review of this fix also. > > Thanks for the notification. The sweeper/safepoint changes look good to me! > >> src/share/vm/runtime/sweeper.cpp >> L205: // TODO: Is this really needed? >> L206: OrderAccess::storestore(); >> That's a good question. 
Looks like that storestore() was >> added by this changeset: >> >> $ hg log -r 5357 src/share/vm/runtime/sweeper.cpp >> changeset: 5357:510fbd28919c >> user: anoll >> date: Fri Sep 27 10:50:55 2013 +0200 >> summary: 8020151: PSR:PERF Large performance regressions when code cache is filled >> >> The changeset is not small and it looks like two >> OrderAccess::storestore() calls were added (and one >> load_ptr_acquire() was deleted): >> >> $ hg diff -r 5356 -r 5357 | grep OrderAccess >> + OrderAccess::storestore(); >> - nmethod *code = (nmethod *)OrderAccess::load_ptr_acquire(&_code); >> + OrderAccess::storestore(); >> >> It could be that the storestore() is matching an existing >> OrderAccess operation or it could have been added in an >> abundance of caution. We definitely need a Compiler team >> person to take a look here. > > Unfortunately, I'm also not sure if that barrier is required. Looking at the old RFR thread: > http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2013-September/011588.html > > It seems that Igor V. suggested this: > "You definitely need a store-store barrier for non-TSO architectures after the mark_as_seen_on_stack() call on line 1360. Otherwise it still can be reordered by the CPU with respect to the following state assignment. Also neither of these state variables are volatile in nmethod, so even the compiler may reorder the stores." > http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2013-September/011729.html > > The requested OrderAccess::storestore() was added to nmethod::make_not_entrant_or_zombie() but seems like Albert also added one to NMethodSweeper::mark_active_nmethods(). > > I'll ping Igor, maybe he knows more. I think the reason is explained in the comment: // Must happen before state change. Otherwise we have a race condition in // nmethod::can_not_entrant_be_converted(). I.e., a method can immediately // transition its state from 'not_entrant' to 'zombie' without having to wait // for stack scanning. 
if (state == not_entrant) { mark_as_seen_on_stack(); OrderAccess::storestore(); } // Change state _state = state; Although can_not_entrant_be_converted() is now called can_convert_to_zombie(). The scenario can go like this: 1. We're setting the state to not_entrant. But the _state assignment happens before setting the traversal count in mark_as_seen_on_stack(). 2. While we're doing this, the sweeper scans nmethods and is in process_compiled_method(): } else if (cm->is_not_entrant()) { // If there are no current activations of this method on the // stack we can safely convert it to a zombie method if (cm->can_convert_to_zombie()) { // Clear ICStubs to prevent back patching stubs of zombie or flushed // nmethods during the next safepoint (see ICStub::finalize). { MutexLocker cl(CompiledIC_lock); cm->clear_ic_stubs(); } // Code cache state change is tracked in make_zombie() cm->make_zombie(); So if the state change happens before setting the traversal mark, the sweeper can go ahead and make it a zombie. Makes sense? Or am I missing something? igor > > Thanks, > Tobias -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From rkennke at redhat.com Thu Jul 6 16:53:48 2017 From: rkennke at redhat.com (Roman Kennke) Date: Thu, 6 Jul 2017 18:53:48 +0200 Subject: RFR: Parallelize safepoint cleanup In-Reply-To: <20E06CEC-38CA-41AE-99DB-17EF22A3C5CC@oracle.com> References: <486b5a72-bef8-4ebc-2729-3fe3aa3ab3b9@oracle.com> <5c80f8df-27c9-f9a9-dc6d-47f9c6019a61@redhat.com> <46ad874e-eb41-7927-265a-40dea92dfe1e@oracle.com> <5711258b-99b0-e06f-ba6e-0b6b55d88345@redhat.com> <0e1e2779-9316-b756-6cc8-e0c8add14a94@oracle.com> <1910961c-11bd-0e86-dd03-4fce66b9969f@redhat.com> <2b466176-b688-53a8-bef9-c7ec2c8c745b@oracle.com> <42872a15-d26c-9798-c6a2-f3f7c945baf7@redhat.com> <5e7c7d00-4acd-bea3-3525-33dbd9159efb@oracle.com> <6f2c6de7-298b-bf14-ab1f-430c4acd43c9@redhat.com> <5cd676de-872d-6d4a-691b-da561173f7d0@oracle.com> <61d80e98-275f-b2b8-4ac7-6d5d03b047de@redhat.com> <0e44cc90-b384-0820-93d6-a70d22c501c3@oracle.com> <20E06CEC-38CA-41AE-99DB-17EF22A3C5CC@oracle.com> Message-ID: <58f2278e-b95c-4ec2-4f7d-9fefa3a281e4@redhat.com> Am 06.07.2017 um 18:47 schrieb Igor Veresov: > >> On Jul 6, 2017, at 3:14 AM, Tobias Hartmann >> > wrote: >> >> Hi, >> >> On 05.07.2017 20:30, Daniel D. Daugherty wrote: >>> JDK-8132849 is assigned to Tobias; it would be good to get Tobias' >>> review of this fix also. >> >> Thanks for the notification. The sweeper/safepoint changes look good >> to me! >> >>> src/share/vm/runtime/sweeper.cpp >>> L205: // TODO: Is this really needed? >>> L206: OrderAccess::storestore(); >>> That's a good question. 
Looks like that storestore() was >>> added by this changeset: >>> >>> $ hg log -r 5357 src/share/vm/runtime/sweeper.cpp >>> changeset: 5357:510fbd28919c >>> user: anoll >>> date: Fri Sep 27 10:50:55 2013 +0200 >>> summary: 8020151: PSR:PERF Large performance regressions >>> when code cache is filled >>> >>> The changeset is not small and it looks like two >>> OrderAccess::storestore() calls were added (and one >>> load_ptr_acquire() was deleted): >>> >>> $ hg diff -r 5356 -r 5357 | grep OrderAccess >>> + OrderAccess::storestore(); >>> - nmethod *code = (nmethod >>> *)OrderAccess::load_ptr_acquire(&_code); >>> + OrderAccess::storestore(); >>> >>> It could be that the storestore() is matching an existing >>> OrderAccess operation or it could have been added in an >>> abundance of caution. We definitely need a Compiler team >>> person to take a look here. >> >> Unfortunately, I'm also not sure if that barrier is required. Looking >> at the old RFR thread: >> http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2013-September/011588.html >> >> It seems that Igor V. suggested this: >> "You definitely need a store-store barrier for non-TSO architectures >> after the mark_as_seen_on_stack() call on line 1360. Otherwise it >> still can be reordered by the CPU with respect to the following state >> assignment. Also neither of these state variables are volatile in >> nmethod, so even the compiler may reorder the stores." >> http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2013-September/011729.html >> >> The requested OrderAccess::storestore() was added to >> nmethod::make_not_entrant_or_zombie() but seems like Albert also >> added one to NMethodSweeper::mark_active_nmethods(). >> >> I'll ping Igor, maybe he knows more. > > > I think the reason is explained in the comment: > > // Must happen before state change. Otherwise we have a race > condition in > // nmethod::can_not_entrant_be_converted(). 
I.e., a method can > immediately > // transition its state from 'not_entrant' to 'zombie' without > having to wait > // for stack scanning. > if (state == not_entrant) { > mark_as_seen_on_stack(); > OrderAccess::storestore(); > } > > // Change state > _state = state; > > Although can_not_entrant_be_converted() is now called > can_convert_to_zombie(). The scenario can go like this: > 1. We're setting the state to not_entrant. But the _state assignment > happens before setting the traversal count in mark_as_seen_on_stack(). > 2. While we're doing this, the sweeper scans nmethods and is in > process_compiled_method(): > > } else if (cm->is_not_entrant()) { > // If there are no current activations of this method on the > // stack we can safely convert it to a zombie method > if (cm->can_convert_to_zombie()) { > // Clear ICStubs to prevent back patching stubs of zombie or flushed > // nmethods during the next safepoint (see ICStub::finalize). > { > MutexLocker cl(CompiledIC_lock); > cm->clear_ic_stubs(); > } > // Code cache state change is tracked in make_zombie() > cm->make_zombie(); > > > So if state change happens before setting the traversal mark, the > sweeper can go ahead and make it a zombie. > > > Makes sense? Or am I missing something? I have probably not fully dug into the code. As far as I can see: - sweeper thread runs outside safepoint - VMThread (which is doing the nmethod marking in the case that I'm looking at) runs while all other threads (incl. the sweeper) are holding still. In between we have a guaranteed fence(). There should be no need for a storestore() (at least in sweeper.cpp... in nmethod.cpp it seems to actually make sense as you pointed out above). *However* it doesn't really hurt to OrderAccess::storestore() there... so play it conservative and leave it in, as RFR'd in my last patch? Roman > > igor > >> >> Thanks, >> Tobias -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From daniel.daugherty at oracle.com Thu Jul 6 17:14:38 2017 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Thu, 6 Jul 2017 11:14:38 -0600 Subject: RFR: Parallelize safepoint cleanup In-Reply-To: <58f2278e-b95c-4ec2-4f7d-9fefa3a281e4@redhat.com> References: <5c80f8df-27c9-f9a9-dc6d-47f9c6019a61@redhat.com> <46ad874e-eb41-7927-265a-40dea92dfe1e@oracle.com> <5711258b-99b0-e06f-ba6e-0b6b55d88345@redhat.com> <0e1e2779-9316-b756-6cc8-e0c8add14a94@oracle.com> <1910961c-11bd-0e86-dd03-4fce66b9969f@redhat.com> <2b466176-b688-53a8-bef9-c7ec2c8c745b@oracle.com> <42872a15-d26c-9798-c6a2-f3f7c945baf7@redhat.com> <5e7c7d00-4acd-bea3-3525-33dbd9159efb@oracle.com> <6f2c6de7-298b-bf14-ab1f-430c4acd43c9@redhat.com> <5cd676de-872d-6d4a-691b-da561173f7d0@oracle.com> <61d80e98-275f-b2b8-4ac7-6d5d03b047de@redhat.com> <0e44cc90-b384-0820-93d6-a70d22c501c3@oracle.com> <20E06CEC-38CA-41AE-99DB-17EF22A3C5CC@oracle.com> <58f2278e-b95c-4ec2-4f7d-9fefa3a281e4@redhat.com> Message-ID: On 7/6/17 10:53 AM, Roman Kennke wrote: > Am 06.07.2017 um 18:47 schrieb Igor Veresov: >> >>> On Jul 6, 2017, at 3:14 AM, Tobias Hartmann >>> > wrote: >>> >>> Hi, >>> >>> On 05.07.2017 20:30, Daniel D. Daugherty wrote: >>>> JDK-8132849 is assigned to Tobias; it would be good to get Tobias' >>>> review of this fix also. >>> >>> Thanks for the notification. The sweeper/safepoint changes look good >>> to me! >>> >>>> src/share/vm/runtime/sweeper.cpp >>>> L205: // TODO: Is this really needed? >>>> L206: OrderAccess::storestore(); >>>> That's a good question. 
Looks like that storestore() was >>>> added by this changeset: >>>> >>>> $ hg log -r 5357 src/share/vm/runtime/sweeper.cpp >>>> changeset: 5357:510fbd28919c >>>> user: anoll >>>> date: Fri Sep 27 10:50:55 2013 +0200 >>>> summary: 8020151: PSR:PERF Large performance regressions >>>> when code cache is filled >>>> >>>> The changeset is not small and it looks like two >>>> OrderAccess::storestore() calls were added (and one >>>> load_ptr_acquire() was deleted): >>>> >>>> $ hg diff -r 5356 -r 5357 | grep OrderAccess >>>> + OrderAccess::storestore(); >>>> - nmethod *code = (nmethod >>>> *)OrderAccess::load_ptr_acquire(&_code); >>>> + OrderAccess::storestore(); >>>> >>>> It could be that the storestore() is matching an existing >>>> OrderAccess operation or it could have been added in an >>>> abundance of caution. We definitely need a Compiler team >>>> person to take a look here. >>> >>> Unfortunately, I'm also not sure if that barrier is required. >>> Looking at the old RFR thread: >>> http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2013-September/011588.html >>> >>> It seems that Igor V. suggested this: >>> "You definitely need a store-store barrier for non-TSO architectures >>> after the mark_as_seen_on_stack() call on line 1360. Otherwise it >>> still can be reordered by the CPU with respect to the following >>> state assignment. Also neither of these state variables are volatile >>> in nmethod, so even the compiler may reorder the stores." >>> http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2013-September/011729.html >>> >>> The requested OrderAccess::storestore() was added to >>> nmethod::make_not_entrant_or_zombie() but seems like Albert also >>> added one to NMethodSweeper::mark_active_nmethods(). >>> >>> I'll ping Igor, maybe he knows more. >> >> >> I think the reason is explained in the comment: >> >> // Must happen before state change. Otherwise we have a race condition in >> // nmethod::can_not_entrant_be_converted(). 
I.e., a method can >> immediately >> // transition its state from 'not_entrant' to 'zombie' without having >> to wait >> // for stack scanning. >> if (state == not_entrant) { >> mark_as_seen_on_stack(); >> OrderAccess::storestore(); >> } >> >> // Change state >> _state = state; >> >> Although can_not_entrant_be_converted() is now called >> can_convert_to_zombie(). The scenario can go like this: >> 1. We're setting the state to not_entrant. But the _state assignment >> happens before setting the traversal count in mark_as_seen_on_stack(). >> 2. While we're doing this, the sweeper scans nmethods and is in >> process_compiled_method(): >> >> } else if (cm->is_not_entrant()) { >> // If there are no current activations of this method on the >> // stack we can safely convert it to a zombie method >> if (cm->can_convert_to_zombie()) { >> // Clear ICStubs to prevent back patching stubs of zombie or flushed >> // nmethods during the next safepoint (see ICStub::finalize). >> { >> MutexLocker cl(CompiledIC_lock); >> cm->clear_ic_stubs(); >> } >> // Code cache state change is tracked in make_zombie() >> cm->make_zombie(); >> >> >> So if state change happens before setting the traversal mark, the >> sweeper can go ahead and make it a zombie. >> >> >> Makes sense? Or am I missing something? > > I have probably not fully dug into the code. As far as I can see: > - sweeper thread runs outside safepoint > - VMThread (which is doing the nmethod marking in the case that I'm > looking at) runs while all other threads (incl. the sweeper) are > holding still. > > In between we have a guaranteed fence(). > > There should be no need for a storestore() (at least in sweeper.cpp... > in nmethod.cpp it seems to actually make sense as you pointed out > above). *However* it doesn't really hurt to OrderAccess::storestore() > there... so play it conservative and leave it in, as RFR'd in my last > patch? 
If we are going to have the OrderAccess::storestore() calls, then we have to have a proper comment explaining why they are needed. Unfortunately, the OrderAccess::storestore() call that was added by anoll to src/share/vm/runtime/sweeper.cpp back in 2013 was not properly documented and we're bumping into that with this review. I'm not happy about this change: + ~ParallelSPCleanupThreadClosure() { + // This is here to be consistent with sweeper.cpp NMethodSweeper::mark_active_nmethods(). + // TODO: Is this really needed? + OrderAccess::storestore(); + } because we're adding an OrderAccess::storestore() to be consistent with an OrderAccess::storestore() that's not properly documented which is only increasing the technical debt. So a couple of things above don't make sense to me: > - sweeper thread runs outside safepoint > - VMThread (which is doing the nmethod marking in the case that > I'm looking at) runs while all other threads (incl. the sweeper) > is holding still. and: > There should be no need for a storestore() (at least in sweeper.cpp... If the sweeper thread is running "outside safepoint", then how is the sweeper thread "holding still" while the VMThread is doing the nmethod marking? Those two points are contradictory. If the sweeper thread is indeed executing outside a safepoint, then a storestore() is needed for its memory changes to be seen by the VMThread which is doing things in parallel. That means that the comment that sweeper.cpp doesn't need the storestore() is also contradictory. So what do you mean by this comment: > - sweeper thread runs outside safepoint and once we know that we can figure out the rest... Dan > > Roman > >> >> igor >> >> >>> >>> Thanks, >>> Tobias >> > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From igor.veresov at oracle.com Thu Jul 6 18:02:27 2017 From: igor.veresov at oracle.com (Igor Veresov) Date: Thu, 6 Jul 2017 11:02:27 -0700 Subject: RFR: Parallelize safepoint cleanup In-Reply-To: <58f2278e-b95c-4ec2-4f7d-9fefa3a281e4@redhat.com> References: <486b5a72-bef8-4ebc-2729-3fe3aa3ab3b9@oracle.com> <5c80f8df-27c9-f9a9-dc6d-47f9c6019a61@redhat.com> <46ad874e-eb41-7927-265a-40dea92dfe1e@oracle.com> <5711258b-99b0-e06f-ba6e-0b6b55d88345@redhat.com> <0e1e2779-9316-b756-6cc8-e0c8add14a94@oracle.com> <1910961c-11bd-0e86-dd03-4fce66b9969f@redhat.com> <2b466176-b688-53a8-bef9-c7ec2c8c745b@oracle.com> <42872a15-d26c-9798-c6a2-f3f7c945baf7@redhat.com> <5e7c7d00-4acd-bea3-3525-33dbd9159efb@oracle.com> <6f2c6de7-298b-bf14-ab1f-430c4acd43c9@redhat.com> <5cd676de-872d-6d4a-691b-da561173f7d0@oracle.com> <61d80e98-275f-b2b8-4ac7-6d5d03b047de@redhat.com> <0e44cc90-b384-0820-93d6-a70d22c501c3@oracle.com> <20E06CEC-38CA-41AE-99DB-17EF22A3C5CC@oracle.com> <58f2278e-b95c-4ec2-4f7d-9fefa3a281e4@redhat.com> Message-ID: > On Jul 6, 2017, at 9:53 AM, Roman Kennke wrote: > > Am 06.07.2017 um 18:47 schrieb Igor Veresov: >> >>> On Jul 6, 2017, at 3:14 AM, Tobias Hartmann > wrote: >>> >>> Hi, >>> >>> On 05.07.2017 20:30, Daniel D. Daugherty wrote: >>>> JDK-8132849 is assigned to Tobias; it would be good to get Tobias' >>>> review of this fix also. >>> >>> Thanks for the notification. The sweeper/safepoint changes look good to me! >>> >>>> src/share/vm/runtime/sweeper.cpp >>>> L205: // TODO: Is this really needed? >>>> L206: OrderAccess::storestore(); >>>> That's a good question. 
Looks like that storestore() was >>>> added by this changeset: >>>> >>>> $ hg log -r 5357 src/share/vm/runtime/sweeper.cpp >>>> changeset: 5357:510fbd28919c >>>> user: anoll >>>> date: Fri Sep 27 10:50:55 2013 +0200 >>>> summary: 8020151: PSR:PERF Large performance regressions when code cache is filled >>>> >>>> The changeset is not small and it looks like two >>>> OrderAccess::storestore() calls were added (and one >>>> load_ptr_acquire() was deleted): >>>> >>>> $ hg diff -r 5356 -r 5357 | grep OrderAccess >>>> + OrderAccess::storestore(); >>>> - nmethod *code = (nmethod *)OrderAccess::load_ptr_acquire(&_code); >>>> + OrderAccess::storestore(); >>>> >>>> It could be that the storestore() is matching an existing >>>> OrderAccess operation or it could have been added in an >>>> abundance of caution. We definitely need a Compiler team >>>> person to take a look here. >>> >>> Unfortunately, I'm also not sure if that barrier is required. Looking at the old RFR thread: >>> http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2013-September/011588.html >>> >>> It seems that Igor V. suggested this: >>> "You definitely need a store-store barrier for non-TSO architectures after the mark_as_seen_on_stack() call on line 1360. Otherwise it still can be reordered by the CPU with respect to the following state assignment. Also neither of these state variables are volatile in nmethod, so even the compiler may reorder the stores." >>> http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2013-September/011729.html >>> >>> The requested OrderAccess::storestore() was added to nmethod::make_not_entrant_or_zombie() but seems like Albert also added one to NMethodSweeper::mark_active_nmethods(). >>> >>> I'll ping Igor, maybe he knows more. >> >> >> I think the reason is explained in the comment: >> >> // Must happen before state change. Otherwise we have a race condition in >> // nmethod::can_not_entrant_be_converted(). 
I.e., a method can immediately >> // transition its state from 'not_entrant' to 'zombie' without having to wait >> // for stack scanning. >> if (state == not_entrant) { >> mark_as_seen_on_stack(); >> OrderAccess::storestore(); >> } >> >> // Change state >> _state = state; >> >> Although can_not_entrant_be_converted() is now called can_convert_to_zombie(). The scenario can go like this: >> 1. We're setting the state to not_entrant. But the _state assignment happens before setting the traversal count in mark_as_seen_on_stack(). >> 2. While we're doing this, the sweeper scans nmethods and is in process_compiled_method(): >> >> } else if (cm->is_not_entrant()) { >> // If there are no current activations of this method on the >> // stack we can safely convert it to a zombie method >> if (cm->can_convert_to_zombie()) { >> // Clear ICStubs to prevent back patching stubs of zombie or flushed >> // nmethods during the next safepoint (see ICStub::finalize). >> { >> MutexLocker cl(CompiledIC_lock); >> cm->clear_ic_stubs(); >> } >> // Code cache state change is tracked in make_zombie() >> cm->make_zombie(); >> >> >> So if state change happens before setting the traversal mark, the sweeper can go ahead and make it a zombie. >> >> >> Makes sense? Or am I missing something? > > I have probably not fully dug into the code. As far as I can see: > - sweeper thread runs outside safepoint > - VMThread (which is doing the nmethod marking in the case that I'm looking at) runs while all other threads (incl. the sweeper) is holding still. > > In between we have a guaranteed fence(). > > There should be no need for a storestore() (at least in sweeper.cpp... in nmethod.cpp it seems to actually make sense as you pointed out above). *However* it doesn't really hurt to OrderAccess::storestore() there... so play it conservative and leave it in, as RFR'd in my last patch? > A method can be made not entrant outside of a safepoint. And as you say sweeper thread runs outside safepoint too. 
That?s why there is a problem. igor > Roman > >> >> igor >> >> >>> >>> Thanks, >>> Tobias >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rkennke at redhat.com Thu Jul 6 18:05:01 2017 From: rkennke at redhat.com (Roman Kennke) Date: Thu, 6 Jul 2017 20:05:01 +0200 Subject: RFR: Parallelize safepoint cleanup In-Reply-To: References: <5c80f8df-27c9-f9a9-dc6d-47f9c6019a61@redhat.com> <46ad874e-eb41-7927-265a-40dea92dfe1e@oracle.com> <5711258b-99b0-e06f-ba6e-0b6b55d88345@redhat.com> <0e1e2779-9316-b756-6cc8-e0c8add14a94@oracle.com> <1910961c-11bd-0e86-dd03-4fce66b9969f@redhat.com> <2b466176-b688-53a8-bef9-c7ec2c8c745b@oracle.com> <42872a15-d26c-9798-c6a2-f3f7c945baf7@redhat.com> <5e7c7d00-4acd-bea3-3525-33dbd9159efb@oracle.com> <6f2c6de7-298b-bf14-ab1f-430c4acd43c9@redhat.com> <5cd676de-872d-6d4a-691b-da561173f7d0@oracle.com> <61d80e98-275f-b2b8-4ac7-6d5d03b047de@redhat.com> <0e44cc90-b384-0820-93d6-a70d22c501c3@oracle.com> <20E06CEC-38CA-41AE-99DB-17EF22A3C5CC@oracle.com> <58f2278e-b95c-4ec2-4f7d-9fefa3a281e4@redhat.com> Message-ID: <623c0dbf-9210-7c63-3546-4314c7d47f85@redhat.com> > > I'm not happy about this change: > > + ~ParallelSPCleanupThreadClosure() { > + // This is here to be consistent with sweeper.cpp > NMethodSweeper::mark_active_nmethods(). > + // TODO: Is this really needed? > + OrderAccess::storestore(); > + } > > because we're adding an OrderAccess::storestore() to be consistent > with an OrderAccess::storestore() that's not properly documented > which is only increasing the technical debt. > > So a couple of things above don't make sense to me: > > > - sweeper thread runs outside safepoint > > - VMThread (which is doing the nmethod marking in the case that > > I'm looking at) runs while all other threads (incl. the sweeper) > > is holding still. > > and: > > > There should be no need for a storestore() (at least in sweeper.cpp... Either one or the other are running. 
Either the VMThread is marking nmethods (during safepoint) or the sweeper threads are running (outside safepoint). Between the two phases, there is a guaranteed OrderAccess::fence() (see safepoint.cpp). Therefore, no storestore() should be necessary. >From Igor's comment I can see how it happened though: Apparently there *is* a race in sweeper's own concurrent processing (concurrent with compiler threads, as far as I understand). And there's a call to nmethod::mark_as_seen_on_stack() after which a storestore() is required (as per Igor's explanation). So the logic probably was: we have mark_as_seen_on_stack() followed by storestore() here, so let's also put a storestore() in the other places that call mark_as_seen_on_stack(), one of which happens to be the safepoint cleanup code that we're discussing. (why the storestore() hasn't been put right into mark_as_seen_on_stack() I don't understand). In short, one storestore() really was necessary, the other looks like it has been put there 'for consistency' or just conservatively. But it shouldn't be necessary in the safepoint cleanup code that we're discussing. So what should we do? Remove the storestore() for good? Refactor the code so that both paths at least call the storestore() in the same place? (E.g. make mark_active_nmethods() use the closure and call storestore() in the dtor as proposed?) Roman > > If the sweeper thread is running "outside safepoint", then how is > the sweeper thread "holding still" while the VMThread is doing the > nmethod marking? Those two points are contradictory. > > If the sweeper thread is indeed executing outside a safepoint, then > a storestore() is needed for its memory changes to be seen by the > VMThread which is doing things in parallel. That means that the > comment that sweeper.cpp doesn't need the storestore() is also > contradictory. > > So what do you mean by this comment: > > > - sweeper thread runs outside safepoint > > and once we know that we can figure out the rest... 
> > Dan > > >> >> Roman >> >>> >>> igor >>> >>> >>>> >>>> Thanks, >>>> Tobias >>> >> > From kim.barrett at oracle.com Thu Jul 6 20:11:47 2017 From: kim.barrett at oracle.com (Kim Barrett) Date: Thu, 6 Jul 2017 16:11:47 -0400 Subject: RFR (S): 8182703: Correct G1 barrier queue lock orderings In-Reply-To: <25F423D9-F8D5-4E62-8300-CCE106E70777@oracle.com> References: <59510D5E.10009@oracle.com> <25F423D9-F8D5-4E62-8300-CCE106E70777@oracle.com> Message-ID: <6D1B2CB2-366E-4DBD-9F2E-672325459343@oracle.com> > On Jul 4, 2017, at 10:00 PM, Kim Barrett wrote: > The lock ranking changes look good. I'm going to retract that. How does these new lock rankings interact with various assertions that rank() == or != Mutex::special? I'm not sure those places handle these new ranks properly. (I'm not sure those places handle Mutex::event rank properly either.) From kim.barrett at oracle.com Thu Jul 6 20:15:43 2017 From: kim.barrett at oracle.com (Kim Barrett) Date: Thu, 6 Jul 2017 16:15:43 -0400 Subject: RFR (S): 8182703: Correct G1 barrier queue lock orderings In-Reply-To: <6D1B2CB2-366E-4DBD-9F2E-672325459343@oracle.com> References: <59510D5E.10009@oracle.com> <25F423D9-F8D5-4E62-8300-CCE106E70777@oracle.com> <6D1B2CB2-366E-4DBD-9F2E-672325459343@oracle.com> Message-ID: <6B5ACCE3-CA0C-41C9-A45A-C79467FB8CE2@oracle.com> > On Jul 6, 2017, at 4:11 PM, Kim Barrett wrote: > >> On Jul 4, 2017, at 10:00 PM, Kim Barrett wrote: >> The lock ranking changes look good. > > I'm going to retract that. > > How does these new lock rankings interact with various assertions that > rank() == or != Mutex::special? I'm not sure those places handle > these new ranks properly. (I'm not sure those places handle > Mutex::event rank properly either.) And maybe this change needs to be discussed on hotspot-dev rather than hotspot-gc-dev. 
From robbin.ehn at oracle.com Thu Jul 6 21:02:50 2017 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Thu, 6 Jul 2017 23:02:50 +0200 Subject: RFR: Parallelize safepoint cleanup In-Reply-To: <623c0dbf-9210-7c63-3546-4314c7d47f85@redhat.com> References: <5c80f8df-27c9-f9a9-dc6d-47f9c6019a61@redhat.com> <46ad874e-eb41-7927-265a-40dea92dfe1e@oracle.com> <5711258b-99b0-e06f-ba6e-0b6b55d88345@redhat.com> <0e1e2779-9316-b756-6cc8-e0c8add14a94@oracle.com> <1910961c-11bd-0e86-dd03-4fce66b9969f@redhat.com> <2b466176-b688-53a8-bef9-c7ec2c8c745b@oracle.com> <42872a15-d26c-9798-c6a2-f3f7c945baf7@redhat.com> <5e7c7d00-4acd-bea3-3525-33dbd9159efb@oracle.com> <6f2c6de7-298b-bf14-ab1f-430c4acd43c9@redhat.com> <5cd676de-872d-6d4a-691b-da561173f7d0@oracle.com> <61d80e98-275f-b2b8-4ac7-6d5d03b047de@redhat.com> <0e44cc90-b384-0820-93d6-a70d22c501c3@oracle.com> <20E06CEC-38CA-41AE-99DB-17EF22A3C5CC@oracle.com> <58f2278e-b95c-4ec2-4f7d-9fefa3a281e4@redhat.com> <623c0dbf-9210-7c63-3546-4314c7d47f85@redhat.com> Message-ID: <29521e46-a5e8-5ff0-23a2-22eeee145389@oracle.com> Hi, Far down -> On 07/06/2017 08:05 PM, Roman Kennke wrote: > >> >> I'm not happy about this change: >> >> + ~ParallelSPCleanupThreadClosure() { >> + // This is here to be consistent with sweeper.cpp >> NMethodSweeper::mark_active_nmethods(). >> + // TODO: Is this really needed? >> + OrderAccess::storestore(); >> + } >> >> because we're adding an OrderAccess::storestore() to be consistent >> with an OrderAccess::storestore() that's not properly documented >> which is only increasing the technical debt. >> >> So a couple of things above don't make sense to me: >> >>> - sweeper thread runs outside safepoint >>> - VMThread (which is doing the nmethod marking in the case that >>> I'm looking at) runs while all other threads (incl. the sweeper) >>> is holding still. >> >> and: >> >>> There should be no need for a storestore() (at least in sweeper.cpp... > > Either one or the other are running. 
Either the VMThread is marking > nmethods (during safepoint) or the sweeper threads are running (outside > safepoint). Between the two phases, there is a guaranteed > OrderAccess::fence() (see safepoint.cpp). Therefore, no storestore() > should be necessary. > > From Igor's comment I can see how it happened though: Apparently there > *is* a race in sweeper's own concurrent processing (concurrent with > compiler threads, as far as I understand). And there's a call to > nmethod::mark_as_seen_on_stack() after which a storestore() is required > (as per Igor's explanation). So the logic probably was: we have > mark_as_seen_on_stack() followed by storestore() here, so let's also put > a storestore() in the other places that call mark_as_seen_on_stack(), > one of which happens to be the safepoint cleanup code that we're > discussing. (why the storestore() hasn't been put right into > mark_as_seen_on_stack() I don't understand). In short, one storestore() > really was necessary, the other looks like it has been put there 'for > consistency' or just conservatively. But it shouldn't be necessary in > the safepoint cleanup code that we're discussing. > > So what should we do? Remove the storestore() for good? Refactor the > code so that both paths at least call the storestore() in the same > place? (E.g. make mark_active_nmethods() use the closure and call > storestore() in the dtor as proposed?) I took a quick look, maybe I'm missing some stuff but: So there is a slight optimization when not running sweeper to skip compiler barrier/fence in stw. 
Don't think that matter, so I propose something like: - long stack_traversal_mark() { return _stack_traversal_mark; } - void set_stack_traversal_mark(long l) { _stack_traversal_mark = l; } + long stack_traversal_mark() { return OrderAccess::load_acquire(&_stack_traversal_mark); } + void set_stack_traversal_mark(long l) { OrderAccess::release_store(&_stack_traversal_mark, l); } Maybe make _stack_traversal_mark volatile also, just as a marking that it is concurrent accessed. And remove both storestore. "Also neither of these state variables are volatile in nmethod, so even the compiler may reorder the stores" Fortunately at least _state is volatile now. I think _state also should use la/rs semantics instead, but that's another story. Thanks, Robbin > > > Roman > > >> >> If the sweeper thread is running "outside safepoint", then how is >> the sweeper thread "holding still" while the VMThread is doing the >> nmethod marking? Those two points are contradictory. >> >> If the sweeper thread is indeed executing outside a safepoint, then >> a storestore() is needed for its memory changes to be seen by the >> VMThread which is doing things in parallel. That means that the >> comment that sweeper.cpp doesn't need the storestore() is also >> contradictory. >> >> So what do you mean by this comment: >> >>> - sweeper thread runs outside safepoint >> >> and once we know that we can figure out the rest... 
>> >> Dan >> >> >>> >>> Roman >>> >>>> >>>> igor >>>> >>>> >>>>> >>>>> Thanks, >>>>> Tobias >>>> >>> >> > From email.sundarms at gmail.com Thu Jul 6 23:03:17 2017 From: email.sundarms at gmail.com (Sundara Mohan M) Date: Thu, 6 Jul 2017 16:03:17 -0700 Subject: Adaptive size policy and MemoryPoolMXBean Message-ID: Hi, I am trying to understand how will be values returned by memorypoolmxbean change when adaptive size is used For ex, CMS GC memory pool mx bean returned values *Name: Par Eden Space, usage: init = 71630848(69952K) used = 2865272(2798K) committed = 71630848(69952K) max = 286326784(279616K)* *Name: Par Survivor Space, usage: init = 8912896(8704K) used = 0(0K) committed = 8912896(8704K) max = 35782656(34944K)* *Name: CMS Old Gen, usage: init = 178978816(174784K) used = 0(0K) committed = 178978816(174784K) max = 715849728(699072K)* *G1GC *memory pool mx bean returned values *Name: G1 Eden Space, usage: init = 27262976(26624K) used = 0(0K) committed = 27262976(26624K) max = -1(-1K)* *Name: G1 Survivor Space, usage: init = 0(0K) used = 0(0K) committed = 0(0K) max = -1(-1K)* *Name: G1 Old Gen, usage: init = 241172480(235520K) used = 0(0K) committed = 241172480(235520K) max = 1073741824(1048576K)* This is the value returned after starting jvm, will this(specifically committed or max) value updated after adaptive size is applied? Motivation: I am trying to understand how i can use this info (got to know about this from https://techblug.wordpress.com/2011/07/21/detecting-low-memory-in-java-part-2/) detect low memory and drop some objects in my map. Thanks, Sundar -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From kim.barrett at oracle.com Thu Jul 6 23:29:47 2017 From: kim.barrett at oracle.com (Kim Barrett) Date: Thu, 6 Jul 2017 19:29:47 -0400 Subject: RFR (M) 8183923: Get rid of FreeBlockDictionary and dithering In-Reply-To: <6ac9125f-87ac-4211-b90c-29526b6aae29@oracle.com> References: <6ac9125f-87ac-4211-b90c-29526b6aae29@oracle.com> Message-ID: > On Jul 6, 2017, at 5:20 AM, Mikael Gerdin wrote: > > Hi all, > > Please review this cleanup inspired by looking at Roman's CMS cleanup :) > > FreeBlockDictionary is an old abstraction for multiple CMS freelist datastructures which never appear to have been implemented, getting rid of it also simplifies some code in Metaspace so it's not all CMS stuff. > > Testing: jprt > Bug: https://bugs.openjdk.java.net/browse/JDK-8183923 > Webrev: http://cr.openjdk.java.net/~mgerdin/8183923/webrev.0/index.html > > Thanks > /Mikael Looks good. From rkennke at redhat.com Fri Jul 7 08:53:44 2017 From: rkennke at redhat.com (Roman Kennke) Date: Fri, 7 Jul 2017 10:53:44 +0200 Subject: RFR: 8179268: Factor out AdaptiveSizePolicy from top-level interfaces CollectorPolicy and CollectedHeap In-Reply-To: <134884f8-7001-f0ee-9e57-9ec0b2520752@oracle.com> References: <134884f8-7001-f0ee-9e57-9ec0b2520752@oracle.com> Message-ID: <8764a8c9-995c-fec2-9c98-e35f43ccd4d6@redhat.com> Am 05.07.2017 um 13:12 schrieb Mikael Gerdin: > Hi Roman, > > On 2017-07-04 20:47, Roman Kennke wrote: >> AdaptiveSizePolicy is not used/called from outside the GCs, and not all >> GCs need them. It makes sense to remove it from the CollectedHeap and >> CollectorPolicy interfaces and move them down to the actual subclasses >> that used them. >> >> I moved AdaptiveSizePolicyOutput to parallelScavengeHeap.hpp, it's only >> used/implemented in the parallel GC. Also, I made this class AllStatic >> (was StackObj) >> >> Tested by running hotspot_gc jtreg tests without regressions. 
>> >> http://cr.openjdk.java.net/~rkennke/8179268/webrev.00/ > > Please correct me if I'm wrong here but it looks like all the non-G1 > collectors set the _should_clear_all_soft_refs based on > gc_overhead_limit_near. > Perhaps the ClearedAllSoftRefs scoped object could be modified to only > work with GenCollectorPolicy derived policies (which include parallel > *shrugs*) and G1 should just stop worrying about _all_soft_refs_clear. > Looking closer, I can't even find G1 code looking at that member so > maybe it, too, should be moved to GenCollectorPolicy? I can't find any place where should_clear_all_soft_refs() would become true for G1. And, as you mention, G1 doesn't even look at all_soft_refs_clear() either. I removed those parts from G1, and moved all soft_refs stuff down to GenCollectorPolicy. I also changed the way the casting accessors as_generation_policy() etc work: the as_* accessors now crash with ShouldNotReachHere() when called for the wrong policy type, and the is_* accessors now return constant true/false based on their type (so that it doesn't crash with ShouldNotReachHere() ..). I think this is more useful than the way it's been done before. http://cr.openjdk.java.net/~rkennke/8179268/webrev.01/ Tested by: hotspot_gc jtreg tests. What do you think? 
Roman From rkennke at redhat.com Fri Jul 7 09:10:46 2017 From: rkennke at redhat.com (Roman Kennke) Date: Fri, 7 Jul 2017 11:10:46 +0200 Subject: RFR: Parallelize safepoint cleanup In-Reply-To: References: <5c80f8df-27c9-f9a9-dc6d-47f9c6019a61@redhat.com> <46ad874e-eb41-7927-265a-40dea92dfe1e@oracle.com> <5711258b-99b0-e06f-ba6e-0b6b55d88345@redhat.com> <0e1e2779-9316-b756-6cc8-e0c8add14a94@oracle.com> <1910961c-11bd-0e86-dd03-4fce66b9969f@redhat.com> <2b466176-b688-53a8-bef9-c7ec2c8c745b@oracle.com> <42872a15-d26c-9798-c6a2-f3f7c945baf7@redhat.com> <5e7c7d00-4acd-bea3-3525-33dbd9159efb@oracle.com> <6f2c6de7-298b-bf14-ab1f-430c4acd43c9@redhat.com> <5cd676de-872d-6d4a-691b-da561173f7d0@oracle.com> <61d80e98-275f-b2b8-4ac7-6d5d03b047de@redhat.com> <0e44cc90-b384-0820-93d6-a70d22c501c3@oracle.com> <20E06CEC-38CA-41AE-99DB-17EF22A3C5CC@oracle.com> <58f2278e-b95c-4ec2-4f7d-9fefa3a281e4@redhat.com> Message-ID: <2b4ea576-5133-4d5e-6fdb-1d60f40ec037@redhat.com> > > I'm not happy about this change: > > + ~ParallelSPCleanupThreadClosure() { > + // This is here to be consistent with sweeper.cpp > NMethodSweeper::mark_active_nmethods(). > + // TODO: Is this really needed? > + OrderAccess::storestore(); > + } > > because we're adding an OrderAccess::storestore() to be consistent > with an OrderAccess::storestore() that's not properly documented > which is only increasing the technical debt. > > So a couple of things above don't make sense to me: > > > - sweeper thread runs outside safepoint > > - VMThread (which is doing the nmethod marking in the case that > > I'm looking at) runs while all other threads (incl. the sweeper) > > is holding still. > > and: > > > There should be no need for a storestore() (at least in sweeper.cpp... Either one or the other are running. Either the VMThread is marking nmethods (during safepoint) or the sweeper threads are running (outside safepoint). Between the two phases, there is a guaranteed OrderAccess::fence() (see safepoint.cpp). 
Therefore, no storestore() should be necessary. >From Igor's comment I can see how it happened though: Apparently there *is* a race in sweeper's own concurrent processing (concurrent with compiler threads, as far as I understand). And there's a call to nmethod::mark_as_seen_on_stack() after which a storestore() is required (as per Igor's explanation). So the logic probably was: we have mark_as_seen_on_stack() followed by storestore() here, so let's also put a storestore() in the other places that call mark_as_seen_on_stack(), one of which happens to be the safepoint cleanup code that we're discussing. (why the storestore() hasn't been put right into mark_as_seen_on_stack() I don't understand). In short, one storestore() really was necessary, the other looks like it has been put there 'for consistency' or just conservatively. But it shouldn't be necessary in the safepoint cleanup code that we're discussing. So what should we do? Remove the storestore() for good? Refactor the code so that both paths at least call the storestore() in the same place? (E.g. make mark_active_nmethods() use the closure and call storestore() in the dtor as proposed?) Roman > > If the sweeper thread is running "outside safepoint", then how is > the sweeper thread "holding still" while the VMThread is doing the > nmethod marking? Those two points are contradictory. > > If the sweeper thread is indeed executing outside a safepoint, then > a storestore() is needed for its memory changes to be seen by the > VMThread which is doing things in parallel. That means that the > comment that sweeper.cpp doesn't need the storestore() is also > contradictory. > > So what do you mean by this comment: > > > - sweeper thread runs outside safepoint > > and once we know that we can figure out the rest... 
> > Dan > > >> >> Roman >> >>> >>> igor >>> >>> >>>> >>>> Thanks, >>>> Tobias >>> >> > From rkennke at redhat.com Fri Jul 7 10:51:38 2017 From: rkennke at redhat.com (Roman Kennke) Date: Fri, 7 Jul 2017 12:51:38 +0200 Subject: RFR: Parallelize safepoint cleanup In-Reply-To: <29521e46-a5e8-5ff0-23a2-22eeee145389@oracle.com> References: <46ad874e-eb41-7927-265a-40dea92dfe1e@oracle.com> <5711258b-99b0-e06f-ba6e-0b6b55d88345@redhat.com> <0e1e2779-9316-b756-6cc8-e0c8add14a94@oracle.com> <1910961c-11bd-0e86-dd03-4fce66b9969f@redhat.com> <2b466176-b688-53a8-bef9-c7ec2c8c745b@oracle.com> <42872a15-d26c-9798-c6a2-f3f7c945baf7@redhat.com> <5e7c7d00-4acd-bea3-3525-33dbd9159efb@oracle.com> <6f2c6de7-298b-bf14-ab1f-430c4acd43c9@redhat.com> <5cd676de-872d-6d4a-691b-da561173f7d0@oracle.com> <61d80e98-275f-b2b8-4ac7-6d5d03b047de@redhat.com> <0e44cc90-b384-0820-93d6-a70d22c501c3@oracle.com> <20E06CEC-38CA-41AE-99DB-17EF22A3C5CC@oracle.com> <58f2278e-b95c-4ec2-4f7d-9fefa3a281e4@redhat.com> <623c0dbf-9210-7c63-3546-4314c7d47f85@redhat.com> <29521e46-a5e8-5ff0-23a2-22eeee145389@oracle.com> Message-ID: <4445a727-060b-70f9-c8db-e9f70faae3d5@redhat.com> Hi Robbin, > > Far down -> > > On 07/06/2017 08:05 PM, Roman Kennke wrote: >> >>> >>> I'm not happy about this change: >>> >>> + ~ParallelSPCleanupThreadClosure() { >>> + // This is here to be consistent with sweeper.cpp >>> NMethodSweeper::mark_active_nmethods(). >>> + // TODO: Is this really needed? >>> + OrderAccess::storestore(); >>> + } >>> >>> because we're adding an OrderAccess::storestore() to be consistent >>> with an OrderAccess::storestore() that's not properly documented >>> which is only increasing the technical debt. >>> >>> So a couple of things above don't make sense to me: >>> >>>> - sweeper thread runs outside safepoint >>>> - VMThread (which is doing the nmethod marking in the case that >>>> I'm looking at) runs while all other threads (incl. the sweeper) >>>> is holding still. 
>>> >>> and: >>> >>>> There should be no need for a storestore() (at least in sweeper.cpp... >> >> Either one or the other are running. Either the VMThread is marking >> nmethods (during safepoint) or the sweeper threads are running (outside >> safepoint). Between the two phases, there is a guaranteed >> OrderAccess::fence() (see safepoint.cpp). Therefore, no storestore() >> should be necessary. >> >> From Igor's comment I can see how it happened though: Apparently there >> *is* a race in sweeper's own concurrent processing (concurrent with >> compiler threads, as far as I understand). And there's a call to >> nmethod::mark_as_seen_on_stack() after which a storestore() is required >> (as per Igor's explanation). So the logic probably was: we have >> mark_as_seen_on_stack() followed by storestore() here, so let's also put >> a storestore() in the other places that call mark_as_seen_on_stack(), >> one of which happens to be the safepoint cleanup code that we're >> discussing. (why the storestore() hasn't been put right into >> mark_as_seen_on_stack() I don't understand). In short, one storestore() >> really was necessary, the other looks like it has been put there 'for >> consistency' or just conservatively. But it shouldn't be necessary in >> the safepoint cleanup code that we're discussing. >> >> So what should we do? Remove the storestore() for good? Refactor the >> code so that both paths at least call the storestore() in the same >> place? (E.g. make mark_active_nmethods() use the closure and call >> storestore() in the dtor as proposed?) > > I took a quick look, maybe I'm missing some stuff but: > > So there is a slight optimization when not running sweeper to skip > compiler barrier/fence in stw. 
> > Don't think that matter, so I propose something like: > - long stack_traversal_mark() { return > _stack_traversal_mark; } > - void set_stack_traversal_mark(long l) { > _stack_traversal_mark = l; } > + long stack_traversal_mark() { return > OrderAccess::load_acquire(&_stack_traversal_mark); } > + void set_stack_traversal_mark(long l) { > OrderAccess::release_store(&_stack_traversal_mark, l); } > > Maybe make _stack_traversal_mark volatile also, just as a marking that > it is concurrent accessed. > And remove both storestore. > > "Also neither of these state variables are volatile in nmethod, so > even the compiler may reorder the stores" > Fortunately at least _state is volatile now. > > I think _state also should use la/rs semantics instead, but that's > another story. Like this? http://cr.openjdk.java.net/~rkennke/8180932/webrev.12/ Roman From erik.helin at oracle.com Fri Jul 7 11:16:40 2017 From: erik.helin at oracle.com (Erik Helin) Date: Fri, 7 Jul 2017 13:16:40 +0200 Subject: RFR (S): 8183226: Remembered set summarization accesses not fully initialized java thread DCQS In-Reply-To: <1499083970.2802.33.camel@oracle.com> References: <1499083970.2802.33.camel@oracle.com> Message-ID: <0fb5d7cf-49b3-84a1-97b4-cdd53f0173e3@oracle.com> On 07/03/2017 02:12 PM, Thomas Schatzl wrote: > Hi all, Hi Thomas, > can I get reviews for the following change that breaks some > dependency cycle in g1remset initialization to fix some (at this time > benign) bug when printing remembered set summarization information? > > The problem is that G1Remset initializes its internal remembered set > summarization helper data structure in the constructor, which accesses > some DCQS members before we call the initialize methods on the various > global DCQS'es in G1CollectedHeap::initialize(). > By splitting the initialization of the remembered set summarization > into an extra method, this one can be called at the very end of > G1CollectedHeap::initialize(), thus breaking the dependency. 
I think there is an easier way to achieve this :) The default
constructor for G1RemSetSummary sets up almost all fields, and if we
make it really set up _all_ fields, then I believe we are good:
- G1RemSetSummary::_num_vtimes can be set up in the constructor,
  because the number of entries only depends on
  ConcurrentG1Refine::thread_num(), which is a static function that
  only returns G1ConcRefinementThreads.
- G1RemSetSummary::_rs_threads_vtimes can be allocated in the
  constructor.
- The value for _rs_threads_vtimes can be initialized to 0, since the
  accumulated virtual time for each concurrent refinement thread
  should be 0 (since they haven't even started yet).
- Same reasoning as above goes for _sampling_thread_vtime.
- _rem_set can be NULL

With the above changes, G1RemSet will call the default constructor
(same as it currently does). The call to
_prev_period_summary.initialize() will be removed from the G1RemSet
constructor.

With the above changes, G1RemSetSummary::G1RemSetSummary() has no
dependencies on any other class, and is still initialized to the
correct values. I think this is all that is needed to solve this
problem.

The rest, below this line, is just existing code that could really
benefit from a cleanup :)

The G1RemSetSummary::initialize method is no longer needed,
G1RemSetSummary can now instead have a constructor taking a G1RemSet*
as argument. That constructor will do what G1RemSetSummary::initialize
does today.
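The constructor-based fix Erik outlines above can be sketched as a small standalone model. This is a simplified illustration, not the HotSpot code: `refinement_thread_num()` is a hypothetical stand-in for ConcurrentG1Refine::thread_num(), and the real class carries more state.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for ConcurrentG1Refine::thread_num(); in HotSpot this
// is a static function that only reads the G1ConcRefinementThreads flag, so
// it is safe to call before the heap/DCQS structures are initialized.
static std::size_t refinement_thread_num() { return 4; }

class G1RemSet; // only referenced via pointer; may be null before init

// Simplified model: the default constructor sets up *all* fields, so no
// separate initialize() step is needed and there is no ordering dependency
// on DCQS/heap initialization.
class G1RemSetSummary {
  const G1RemSet* _rem_set;               // null until a G1RemSet exists
  std::size_t _num_vtimes;                // depends only on a static function
  std::vector<double> _rs_threads_vtimes; // all zero: threads not started yet
  double _sampling_thread_vtime;

public:
  G1RemSetSummary()
    : _rem_set(nullptr),
      _num_vtimes(refinement_thread_num()),
      _rs_threads_vtimes(refinement_thread_num(), 0.0),
      _sampling_thread_vtime(0.0) {}

  std::size_t num_vtimes() const { return _num_vtimes; }
  double rs_thread_vtime(std::size_t i) const { return _rs_threads_vtimes[i]; }
  double sampling_thread_vtime() const { return _sampling_thread_vtime; }
};
```

A default-constructed summary is then valid immediately, which is exactly the property that breaks the initialization cycle.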
In G1RemSet::print_periodic_summary_info, the code can then look like:

G1RemSetSummary current(this);
_prev_period_summary.subtract(&current);

For extra, extra bonus points, we should make
G1RemSetSummary::subtract_from work the other way around, so that the
above code reads:

G1RemSetSummary current(this);
current.subtract(_prev_period_summary); // current -= prev

instead of what the code does today:

prev.subtract_from(current); // prev = current - prev

which to me reads completely backwards :)

Finally, it would be very nice for G1RemSetSummary to get a proper copy
constructor, so that the last line in print_periodic_summary:

_prev_period_summary.set(&current);

can just become:

_prev_period_summary = current;

(G1RemSetSummary::set is just a copy-constructor in disguise)

You don't need to do all the cleanups, but I think having a fully
functioning default constructor is a better way to solve this problem,
rather than shuffling the call to initialize around.

What do you think?

Thanks,
Erik

> Benign because the values accessed at that time have the same values as
> the values after initialization.
>
> This also allows for grouping together the initialization of
> G1RemSet/DCQS/G1ConcurrentRefine related data structures more easily in
> G1CollectedHeap::initialize().
> > CR: > https://bugs.openjdk.java.net/browse/JDK-8183226 > Webrev: > http://cr.openjdk.java.net/~tschatzl/8183226/webrev/ > Testing: > local testing running remembered set summarization manually, jprt > > Thanks, > Thomas > From robbin.ehn at oracle.com Fri Jul 7 11:23:30 2017 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Fri, 7 Jul 2017 13:23:30 +0200 Subject: RFR: Parallelize safepoint cleanup In-Reply-To: <4445a727-060b-70f9-c8db-e9f70faae3d5@redhat.com> References: <5711258b-99b0-e06f-ba6e-0b6b55d88345@redhat.com> <0e1e2779-9316-b756-6cc8-e0c8add14a94@oracle.com> <1910961c-11bd-0e86-dd03-4fce66b9969f@redhat.com> <2b466176-b688-53a8-bef9-c7ec2c8c745b@oracle.com> <42872a15-d26c-9798-c6a2-f3f7c945baf7@redhat.com> <5e7c7d00-4acd-bea3-3525-33dbd9159efb@oracle.com> <6f2c6de7-298b-bf14-ab1f-430c4acd43c9@redhat.com> <5cd676de-872d-6d4a-691b-da561173f7d0@oracle.com> <61d80e98-275f-b2b8-4ac7-6d5d03b047de@redhat.com> <0e44cc90-b384-0820-93d6-a70d22c501c3@oracle.com> <20E06CEC-38CA-41AE-99DB-17EF22A3C5CC@oracle.com> <58f2278e-b95c-4ec2-4f7d-9fefa3a281e4@redhat.com> <623c0dbf-9210-7c63-3546-4314c7d47f85@redhat.com> <29521e46-a5e8-5ff0-23a2-22eeee145389@oracle.com> <4445a727-060b-70f9-c8db-e9f70faae3d5@redhat.com> Message-ID: <37755fec-05b9-8d2c-7eb9-8849393c7485@oracle.com> Hi Roman, On 07/07/2017 12:51 PM, Roman Kennke wrote: > Hi Robbin, > >> >> Far down -> >> >> On 07/06/2017 08:05 PM, Roman Kennke wrote: >>> >>>> >>>> I'm not happy about this change: >>>> >>>> + ~ParallelSPCleanupThreadClosure() { >>>> + // This is here to be consistent with sweeper.cpp >>>> NMethodSweeper::mark_active_nmethods(). >>>> + // TODO: Is this really needed? >>>> + OrderAccess::storestore(); >>>> + } >>>> >>>> because we're adding an OrderAccess::storestore() to be consistent >>>> with an OrderAccess::storestore() that's not properly documented >>>> which is only increasing the technical debt. 
>>>> >>>> So a couple of things above don't make sense to me: >>>> >>>>> - sweeper thread runs outside safepoint >>>>> - VMThread (which is doing the nmethod marking in the case that >>>>> I'm looking at) runs while all other threads (incl. the sweeper) >>>>> is holding still. >>>> >>>> and: >>>> >>>>> There should be no need for a storestore() (at least in sweeper.cpp... >>> >>> Either one or the other are running. Either the VMThread is marking >>> nmethods (during safepoint) or the sweeper threads are running (outside >>> safepoint). Between the two phases, there is a guaranteed >>> OrderAccess::fence() (see safepoint.cpp). Therefore, no storestore() >>> should be necessary. >>> >>> From Igor's comment I can see how it happened though: Apparently there >>> *is* a race in sweeper's own concurrent processing (concurrent with >>> compiler threads, as far as I understand). And there's a call to >>> nmethod::mark_as_seen_on_stack() after which a storestore() is required >>> (as per Igor's explanation). So the logic probably was: we have >>> mark_as_seen_on_stack() followed by storestore() here, so let's also put >>> a storestore() in the other places that call mark_as_seen_on_stack(), >>> one of which happens to be the safepoint cleanup code that we're >>> discussing. (why the storestore() hasn't been put right into >>> mark_as_seen_on_stack() I don't understand). In short, one storestore() >>> really was necessary, the other looks like it has been put there 'for >>> consistency' or just conservatively. But it shouldn't be necessary in >>> the safepoint cleanup code that we're discussing. >>> >>> So what should we do? Remove the storestore() for good? Refactor the >>> code so that both paths at least call the storestore() in the same >>> place? (E.g. make mark_active_nmethods() use the closure and call >>> storestore() in the dtor as proposed?) 
>> >> I took a quick look, maybe I'm missing some stuff but: >> >> So there is a slight optimization when not running sweeper to skip >> compiler barrier/fence in stw. >> >> Don't think that matter, so I propose something like: >> - long stack_traversal_mark() { return >> _stack_traversal_mark; } >> - void set_stack_traversal_mark(long l) { >> _stack_traversal_mark = l; } >> + long stack_traversal_mark() { return >> OrderAccess::load_acquire(&_stack_traversal_mark); } >> + void set_stack_traversal_mark(long l) { >> OrderAccess::release_store(&_stack_traversal_mark, l); } >> >> Maybe make _stack_traversal_mark volatile also, just as a marking that >> it is concurrent accessed. >> And remove both storestore. >> >> "Also neither of these state variables are volatile in nmethod, so >> even the compiler may reorder the stores" >> Fortunately at least _state is volatile now. >> >> I think _state also should use la/rs semantics instead, but that's >> another story. > > Like this? > > http://cr.openjdk.java.net/~rkennke/8180932/webrev.12/ > Yes, exactly, I like this! Dan? Igor ? Tobias? Thanks Roman! BTW I'm going on vacation (5w) in a few hours, but I will follow this thread/changeset to the end! /Robbin > > Roman > From erik.helin at oracle.com Fri Jul 7 12:23:21 2017 From: erik.helin at oracle.com (Erik Helin) Date: Fri, 7 Jul 2017 14:23:21 +0200 Subject: RFR (M) 8183923: Get rid of FreeBlockDictionary and dithering In-Reply-To: <6ac9125f-87ac-4211-b90c-29526b6aae29@oracle.com> References: <6ac9125f-87ac-4211-b90c-29526b6aae29@oracle.com> Message-ID: On 07/06/2017 11:20 AM, Mikael Gerdin wrote: > Hi all, > > Please review this cleanup inspired by looking at Roman's CMS cleanup :) > > FreeBlockDictionary is an old abstraction for multiple CMS freelist > datastructures which never appear to have been implemented, getting rid > of it also simplifies some code in Metaspace so it's not all CMS stuff. 
> > Testing: jprt > Bug: https://bugs.openjdk.java.net/browse/JDK-8183923 > Webrev: http://cr.openjdk.java.net/~mgerdin/8183923/webrev.0/index.html Looks good, Reviewed. Thanks for cleaning this up Mikael! Erik > Thanks > /Mikael From erik.helin at oracle.com Fri Jul 7 12:35:21 2017 From: erik.helin at oracle.com (Erik Helin) Date: Fri, 7 Jul 2017 14:35:21 +0200 Subject: RFR: 8179387: Factor out CMS specific code from GenCollectedHeap into its own subclass In-Reply-To: <13358626-e399-e352-1711-587416621aac@redhat.com> References: <3521009f-6fab-4f8e-2375-b9d665a4c70b@redhat.com> <3d8b55a2-a787-3051-b351-ab9b0a24f5e0@redhat.com> <47e22e86-7d7c-606f-1936-346229f39ca2@oracle.com> <9a846161-c8ac-dedf-5952-f457d546fd9a@redhat.com> <4d5e6af8-d975-7803-64c5-7295e0d56154@redhat.com> <13358626-e399-e352-1711-587416621aac@redhat.com> Message-ID: <27af0ad2-fe78-3536-2143-996dd42583ab@oracle.com> On 07/06/2017 06:23 PM, Roman Kennke wrote: >>>> Ok to push this? >> >> I just realized that your change doesn't build on Windows since you >> didn't #include "precompiled.hpp" in cmsHeap.cpp. MSVC is really picky >> about that. >> /Mikael > > Uhhh. > Ok, here's revision #3 with precompiled added in: > > http://cr.openjdk.java.net/~rkennke/8179387/webrev.03/ > Hi Roman, I just started looking :) I think GenCollectedHeap::gc_prologue and GenCollectedHeap::gc_epilogue should be virtual, and always_do_update_barrier = UseConcMarkSweepGC moved down CMSHeap::gc_epilogue. What do you think? 
Thanks, Erik > Roman > From erik.helin at oracle.com Fri Jul 7 13:07:02 2017 From: erik.helin at oracle.com (Erik Helin) Date: Fri, 7 Jul 2017 15:07:02 +0200 Subject: RFR (XS): 8183397: Ensure consistent closure filtering during evacuation In-Reply-To: <1499329701.2760.3.camel@oracle.com> References: <1499081088.2802.29.camel@oracle.com> <64943738-9d9f-0d88-b44e-9a9ec0812f33@oracle.com> <1499329701.2760.3.camel@oracle.com> Message-ID: <9c538d19-e9e9-cb61-640a-7476d2e0c725@oracle.com> On 07/06/2017 10:28 AM, Thomas Schatzl wrote: > Hi Erik, > > On Thu, 2017-07-06 at 10:20 +0200, Erik Helin wrote: >> On 07/03/2017 01:24 PM, Thomas Schatzl wrote: >>> >>> Hi all, >> Hi Thomas, >> >>> >>> can I have reviews for this change that fixes an observation that >>> has >>> been made recently by Erik, i.e. that the "else" part of several >>> evacuation closures inconsistently filters out non-cross-region >>> references before checking whether the referenced object is a >>> humongous >>> or ext region. >>> >>> This causes somewhat hard to diagnose performance issues, and >>> earlier >>> filtering does not hurt if done anyway. >>> >>> (Note that the current way of checking in all but the UpdateRS >>> closure >>> using HeapRegion::is_in_same_region() seems optimal. The only >>> reason >>> why the other way in the UpdateRS closure is better because the >>> code >>> needs the "to" HeapRegion pointer anyway) >>> >>> CR: >>> https://bugs.openjdk.java.net/browse/JDK-8183397 >>> Webrev: >>> http://cr.openjdk.java.net/~tschatzl/8183397/webrev/ >> - } else if (in_cset_state.is_humongous()) { >> + } else { >> + if (in_cset_state.is_humongous()) { >> >> Why change `else if` to `else { if (...) {` here? Does it result in >> the >> compiler generating faster code for this case? > > no. It only makes this do_oop_*() method look similar in structure to > our do_oop_*() methods in the closures. > > I.e. 
> > if (in_cset.state.is_in_cset()) { > // do stuff for refs into cset > } else { > // expanding handle_non_cset_obj_common() > if (state.is_humongous()) { > } else ... > } > > I felt this improves overall readability, but this may only be because > I have been working in this code a lot recently. I can revert this > change. Yeah, I suspected this was your reasoning. IMO, the code is a bit too spread out for this to work here, a reader of g1ParScanThreadState.inline.hpp might not be aware of the idioms used is g1OopClosures.inline.hpp. So, for me, please use `else if` in g1ParScanThreadState.inline.hpp :) I do not need to re-review that change. Great work Thomas, thanks! Erik > Thanks for your review, > Thomas > From rkennke at redhat.com Fri Jul 7 13:21:06 2017 From: rkennke at redhat.com (Roman Kennke) Date: Fri, 7 Jul 2017 15:21:06 +0200 Subject: RFR: 8179387: Factor out CMS specific code from GenCollectedHeap into its own subclass In-Reply-To: <27af0ad2-fe78-3536-2143-996dd42583ab@oracle.com> References: <3521009f-6fab-4f8e-2375-b9d665a4c70b@redhat.com> <3d8b55a2-a787-3051-b351-ab9b0a24f5e0@redhat.com> <47e22e86-7d7c-606f-1936-346229f39ca2@oracle.com> <9a846161-c8ac-dedf-5952-f457d546fd9a@redhat.com> <4d5e6af8-d975-7803-64c5-7295e0d56154@redhat.com> <13358626-e399-e352-1711-587416621aac@redhat.com> <27af0ad2-fe78-3536-2143-996dd42583ab@oracle.com> Message-ID: <4bc53aaa-b98a-8a61-73bf-d30ac3f402b8@redhat.com> Am 07.07.2017 um 14:35 schrieb Erik Helin: > On 07/06/2017 06:23 PM, Roman Kennke wrote: >>>>> Ok to push this? >>> >>> I just realized that your change doesn't build on Windows since you >>> didn't #include "precompiled.hpp" in cmsHeap.cpp. MSVC is really picky >>> about that. >>> /Mikael >> >> Uhhh. 
>> Ok, here's revision #3 with precompiled added in:
>>
>> http://cr.openjdk.java.net/~rkennke/8179387/webrev.03/
>>
>
> Hi Roman,
>
> I just started looking :) I think GenCollectedHeap::gc_prologue and
> GenCollectedHeap::gc_epilogue should be virtual, and
> always_do_update_barrier = UseConcMarkSweepGC moved down
> CMSHeap::gc_epilogue.
>
> What do you think?

Yes, I have seen that. My original plan was to leave it as is because
I know that Erik Ö. is working on a big barrier set refactoring that
would remove this code anyway. However, it doesn't really matter,
here's the cleaned up patch:

http://cr.openjdk.java.net/~rkennke/8179387/webrev.04/

Roman

From erik.osterlund at oracle.com Fri Jul 7 15:17:39 2017
From: erik.osterlund at oracle.com (=?UTF-8?Q?Erik_=c3=96sterlund?=)
Date: Fri, 7 Jul 2017 17:17:39 +0200
Subject: RFR (S): 8182703: Correct G1 barrier queue lock orderings
In-Reply-To: <6B5ACCE3-CA0C-41C9-A45A-C79467FB8CE2@oracle.com>
References: <59510D5E.10009@oracle.com>
 <25F423D9-F8D5-4E62-8300-CCE106E70777@oracle.com>
 <6D1B2CB2-366E-4DBD-9F2E-672325459343@oracle.com>
 <6B5ACCE3-CA0C-41C9-A45A-C79467FB8CE2@oracle.com>
Message-ID: <595FA613.7090306@oracle.com>

Hi Kim,

Added hotspot-dev as requested.

To answer your worries we must first understand what invariant these
checks for 'special' locks seek to achieve. The invariant is, AFAIK,
that locks with a rank of 'special' and below (now including 'access'
as well as 'event' that was already there before) must *not* check for
safepoints when grabbed from JavaThreads. Safepoint checks translate
to performing ThreadBlockInVM which must *not* be performed in
'special' or below ranked locks. The reason this is necessary is that
those locks must be usable from safepoint-unsafe places, e.g. leaf
calls, where we can not yield to the safepoint synchronizer. I believe
that must have been what the name 'special' originated from, correct
me if I'm wrong.
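As a rough sketch of the invariant just stated, the rule reduces to a small predicate. The rank values below are invented for illustration; HotSpot's actual Mutex ranks differ, and only their relative ordering matters here.

```cpp
#include <cassert>

// Invented rank values; the only property that matters here is the
// ordering access < event < special < leaf < ...
enum Rank { access = 0, event = 1, special = 2, leaf = 3, nonleaf = 4 };

// Monitor::lock() (the path *with* a safepoint check) asserts
// rank() > Mutex::special, so a safepoint-checking acquire from a
// JavaThread is only legal for locks ranked strictly above 'special'.
static bool lock_with_safepoint_check_allowed(Rank r) {
  return r > special;
}

// lock_without_safepoint_check() performs no such check; locks at or
// below 'special' must use this path (or try_lock()) from JavaThreads.
static bool lock_without_safepoint_check_allowed(Rank) {
  return true;
}
```

Under this rule, the new 'access' rank inherits the same restriction that 'special' and 'event' already had.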
Since locking in safepoint-unsafe places potentially blocks the safepoint synchronizer, a deadlock would arise if a thread, 'Thread A', acquires a special lock with Java/VM thread state. But that special lock is held by another thread, 'Thread B', that has blocked on a non-special lock, because it yielded to the safepoint synchronizer on the VM thread, that is concurrently synchronizing a safepoint. However, the safepoint synchronizer is blocked waiting for the thread acquiring the special lock to yield. But it never will. In the end, 'Thread A' waits for 'Thread B' to release the special lock, and 'Thread B' waits for the VMThread that is safepoint synchronizing, and the VMThread is waiting for 'Thread A' - a deadlock. With the provided invariant however, it is impossible to acquire non-safepoint non-special locks while already holding 'special' and below locks. Therefore, by checking for the condition of the invariant, we will be certain such a deadlock can not happen, as that would violate the usual lock rank ordering that we all know about - one can not acquire a lock that has a rank higher than locks already acquired. First of all, before we delve deeper into this rabbit hole. Note that if our new 'access' locks are used properly, e.g. do not perform safepoint checks or call code that needs to have a safepoint-safe state under the lock, we are good. The G1 barrier code conforms to those restrictions and the code has since forever been called in leaf calls while in Java thread state, without oop maps, making it a not safepoint-safe state. So that is all good. The remaining question to answer is, had we done crazy things while holding those access locks, would the deadlock detection system have detected that? To answer that question we must examine whether that invariant holds or not. There are three paths for locking a Mutex/Monitor, try_lock(), lock() and lock_without_safepoint_check(). 
Among these, only lock() violates the invariant for 'special' and below locks taken from a JavaThread. It is the only locking operation that performs a safepoint check. try_lock() instead returns false if the lock could not be acquired, and lock_without_safepoint_check does not check for safepoints. MutexLocker and MutexLockerEx are abstractions around Mutex that boil down to calls to either lock() or lock_without_safepoint_check(). So if lock() catches the broken invariant, all locking in hotspot will somehow catch it. Let's examine what happens in lock(). share/vm/runtime/mutex.cpp:933:21: assert(rank() > Mutex::special, "Potential deadlock with special or lesser rank mutex"); This is the check in Monitor::lock() *with* safepoint check. It definitely catches illegal uses of lock() on 'special', 'event' and 'access' ranked locks on JavaThreads. Since we always catch misuse of special and below ranked locks here, the deadlock detection system works correctly even for 'event' and the new 'access' ranked locks. The other checks are mostly redundant and will eventually boil down to this check. Examples: share/vm/runtime/mutexLocker.hpp:165:29: assert(mutex->rank() != Mutex::special share/vm/runtime/mutexLocker.hpp:173:29: assert(mutex->rank() != Mutex::special, These two are constructors for MutexLocker which will call lock(). Therefore, this check is redundant. The 'event' and 'access' ranks will both miss this redundant check, but subsequently run into the check in lock(), which is the one that matters and still catches the broken invariant. share/vm/runtime/mutexLocker.hpp:208:30: assert(mutex->rank() > Mutex::special || no_safepoint_check, This check in MutexLockerEx checks that either the lock should be over special or it must not check for safepoints. This works as intended with 'event' and 'access' locks. They are forced to perform a lock without safepoint check, and hence not enter the lock() path of Mutex. 
However, if they did, it would still be redundantly sanity checked
there too.

share/vm/runtime/mutex.cpp:1384:23: || rank() == Mutex::special,
"wrong thread state for using locks");
share/vm/runtime/mutex.cpp:1389:30: debug_only(if (rank() !=
Mutex::special) \

These two checks are found in Monitor::check_prelock_state(), which is
called from lock() and try_lock(). As for lock(), we already concluded
that using lock() *at all* on a special lock from a JavaThread will be
found out and an assert will trigger. So these checks for special
locks seem to be a bit redundant.

As for the try_lock() case, they even seem wrong. Checking for valid
safepoint state for a locking operation that can't block seems wrong.
But that is invariant of whether the lock is special or not. It just
should not check for safepoint-safe states on try_lock().

share/vm/runtime/thread.cpp:903:27: cur->rank() == Mutex::special) {

This check is in Thread::check_for_valid_safepoint_state() where it
walks all the monitors and makes sure that we don't have any of
{Threads_lock, Compile_lock, VMOperationRequest_lock,
VMOperationQueue_lock} and allow_vm_block() or it contains any special
lock. This is performed when e.g. allocating etc. This check should
arguably check for special *and below* ranked locks that should not be
acquired while in a safepoint-safe state. However, those access locks
return true on allow_vm_block(), and therefore will correctly be
detected as dangerous had we done crazy things under these locks.

All in all, I believe that the deadlock detection system has some
redundant, and some confusing checks that involve the lock rank
Mutex::special. But I do believe that it works and would detect
deadlocks, but could do with some reworking to make it more explicit.
And that is invariant of the new access rank and applies equally to
the event rank.
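The rank-ordering rule these asserts ultimately back up — one cannot acquire a lock ranked higher than (or equal to) a lock already held — can be modeled in a few lines. This is a toy model with invented integer ranks, not the actual Monitor implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Toy model of lock rank ordering: a thread may only acquire a lock
// whose rank is strictly below the lowest rank it already holds.
// Violating acquires are refused, which is what rules out the cyclic
// wait that produces a deadlock.
struct ToyThread {
  std::vector<int> held_ranks;

  bool try_acquire(int rank) {
    if (!held_ranks.empty() &&
        rank >= *std::min_element(held_ranks.begin(), held_ranks.end())) {
      return false; // would violate rank ordering -> potential deadlock
    }
    held_ranks.push_back(rank);
    return true;
  }
};
```

With this discipline, any set of threads acquires locks in strictly decreasing rank order, so no cycle of "A holds X, waits for Y; B holds Y, waits for X" can form.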
However, since these access locks play well with the current deadlock
detection as they do not do anything illegal, and since even if use of
these locks did indeed do illegal things, it would still be detected
by the deadlock detection system, it is reasonable to say that
refactoring the deadlock detection system is a separate RFE.
Specifically: clarifying the deadlock detection system by removing
redundant checks, not checking for safepoint-safe state in try_lock(),
as well as explicitly listing special and below ranked locks as
illegal when verifying Thread::check_for_valid_safepoint_state(),
regardless of whether allow_vm_block() is true or not.

Sounds like a separate RFE to me!

Thanks,
/Erik

On 2017-07-06 22:15, Kim Barrett wrote:
>> On Jul 6, 2017, at 4:11 PM, Kim Barrett wrote:
>>
>>> On Jul 4, 2017, at 10:00 PM, Kim Barrett wrote:
>>> The lock ranking changes look good.
>> I'm going to retract that.
>>
>> How does these new lock rankings interact with various assertions that
>> rank() == or != Mutex::special? I'm not sure those places handle
>> these new ranks properly. (I'm not sure those places handle
>> Mutex::event rank properly either.)
> And maybe this change needs to be discussed on hotspot-dev rather than hotspot-gc-dev.
> From igor.veresov at oracle.com Fri Jul 7 18:09:01 2017 From: igor.veresov at oracle.com (Igor Veresov) Date: Fri, 7 Jul 2017 11:09:01 -0700 Subject: RFR: Parallelize safepoint cleanup In-Reply-To: <37755fec-05b9-8d2c-7eb9-8849393c7485@oracle.com> References: <5711258b-99b0-e06f-ba6e-0b6b55d88345@redhat.com> <0e1e2779-9316-b756-6cc8-e0c8add14a94@oracle.com> <1910961c-11bd-0e86-dd03-4fce66b9969f@redhat.com> <2b466176-b688-53a8-bef9-c7ec2c8c745b@oracle.com> <42872a15-d26c-9798-c6a2-f3f7c945baf7@redhat.com> <5e7c7d00-4acd-bea3-3525-33dbd9159efb@oracle.com> <6f2c6de7-298b-bf14-ab1f-430c4acd43c9@redhat.com> <5cd676de-872d-6d4a-691b-da561173f7d0@oracle.com> <61d80e98-275f-b2b8-4ac7-6d5d03b047de@redhat.com> <0e44cc90-b384-0820-93d6-a70d22c501c3@oracle.com> <20E06CEC-38CA-41AE-99DB-17EF22A3C5CC@oracle.com> <58f2278e-b95c-4ec2-4f7d-9fefa3a281e4@redhat.com> <623c0dbf-9210-7c63-3546-4314c7d47f85@redhat.com> <29521e46-a5e8-5ff0-23a2-22eeee145389@oracle.com> <4445a727-060b-70f9-c8db-e9f70faae3d5@redhat.com> <37755fec-05b9-8d2c-7eb9-8849393c7485@oracle.com> Message-ID: > On Jul 7, 2017, at 4:23 AM, Robbin Ehn wrote: > > Hi Roman, > > On 07/07/2017 12:51 PM, Roman Kennke wrote: >> Hi Robbin, >>> >>> Far down -> >>> >>> On 07/06/2017 08:05 PM, Roman Kennke wrote: >>>> >>>>> >>>>> I'm not happy about this change: >>>>> >>>>> + ~ParallelSPCleanupThreadClosure() { >>>>> + // This is here to be consistent with sweeper.cpp >>>>> NMethodSweeper::mark_active_nmethods(). >>>>> + // TODO: Is this really needed? >>>>> + OrderAccess::storestore(); >>>>> + } >>>>> >>>>> because we're adding an OrderAccess::storestore() to be consistent >>>>> with an OrderAccess::storestore() that's not properly documented >>>>> which is only increasing the technical debt. 
>>>>> >>>>> So a couple of things above don't make sense to me: >>>>> >>>>>> - sweeper thread runs outside safepoint >>>>>> - VMThread (which is doing the nmethod marking in the case that >>>>>> I'm looking at) runs while all other threads (incl. the sweeper) >>>>>> is holding still. >>>>> >>>>> and: >>>>> >>>>>> There should be no need for a storestore() (at least in sweeper.cpp... >>>> >>>> Either one or the other are running. Either the VMThread is marking >>>> nmethods (during safepoint) or the sweeper threads are running (outside >>>> safepoint). Between the two phases, there is a guaranteed >>>> OrderAccess::fence() (see safepoint.cpp). Therefore, no storestore() >>>> should be necessary. >>>> >>>> From Igor's comment I can see how it happened though: Apparently there >>>> *is* a race in sweeper's own concurrent processing (concurrent with >>>> compiler threads, as far as I understand). And there's a call to >>>> nmethod::mark_as_seen_on_stack() after which a storestore() is required >>>> (as per Igor's explanation). So the logic probably was: we have >>>> mark_as_seen_on_stack() followed by storestore() here, so let's also put >>>> a storestore() in the other places that call mark_as_seen_on_stack(), >>>> one of which happens to be the safepoint cleanup code that we're >>>> discussing. (why the storestore() hasn't been put right into >>>> mark_as_seen_on_stack() I don't understand). In short, one storestore() >>>> really was necessary, the other looks like it has been put there 'for >>>> consistency' or just conservatively. But it shouldn't be necessary in >>>> the safepoint cleanup code that we're discussing. >>>> >>>> So what should we do? Remove the storestore() for good? Refactor the >>>> code so that both paths at least call the storestore() in the same >>>> place? (E.g. make mark_active_nmethods() use the closure and call >>>> storestore() in the dtor as proposed?) 
>>> >>> I took a quick look, maybe I'm missing some stuff but: >>> >>> So there is a slight optimization when not running sweeper to skip >>> compiler barrier/fence in stw. >>> >>> Don't think that matter, so I propose something like: >>> - long stack_traversal_mark() { return >>> _stack_traversal_mark; } >>> - void set_stack_traversal_mark(long l) { >>> _stack_traversal_mark = l; } >>> + long stack_traversal_mark() { return >>> OrderAccess::load_acquire(&_stack_traversal_mark); } >>> + void set_stack_traversal_mark(long l) { >>> OrderAccess::release_store(&_stack_traversal_mark, l); } >>> >>> Maybe make _stack_traversal_mark volatile also, just as a marking that >>> it is concurrent accessed. >>> And remove both storestore. >>> >>> "Also neither of these state variables are volatile in nmethod, so >>> even the compiler may reorder the stores" >>> Fortunately at least _state is volatile now. >>> >>> I think _state also should use la/rs semantics instead, but that's >>> another story. >> Like this? >> http://cr.openjdk.java.net/~rkennke/8180932/webrev.12/ >> > > Yes, exactly, I like this! > Dan? Igor ? Tobias? > That seems correct. igor > Thanks Roman! > > BTW I'm going on vacation (5w) in a few hours, but I will follow this thread/changeset to the end! > > /Robbin > >> Roman -------------- next part -------------- An HTML attachment was scrubbed... URL: From daniel.daugherty at oracle.com Sat Jul 8 02:46:09 2017 From: daniel.daugherty at oracle.com (Daniel D. 
Daugherty) Date: Fri, 7 Jul 2017 20:46:09 -0600 Subject: RFR: Parallelize safepoint cleanup In-Reply-To: References: <1910961c-11bd-0e86-dd03-4fce66b9969f@redhat.com> <2b466176-b688-53a8-bef9-c7ec2c8c745b@oracle.com> <42872a15-d26c-9798-c6a2-f3f7c945baf7@redhat.com> <5e7c7d00-4acd-bea3-3525-33dbd9159efb@oracle.com> <6f2c6de7-298b-bf14-ab1f-430c4acd43c9@redhat.com> <5cd676de-872d-6d4a-691b-da561173f7d0@oracle.com> <61d80e98-275f-b2b8-4ac7-6d5d03b047de@redhat.com> <0e44cc90-b384-0820-93d6-a70d22c501c3@oracle.com> <20E06CEC-38CA-41AE-99DB-17EF22A3C5CC@oracle.com> <58f2278e-b95c-4ec2-4f7d-9fefa3a281e4@redhat.com> <623c0dbf-9210-7c63-3546-4314c7d47f85@redhat.com> <29521e46-a5e8-5ff0-23a2-22eeee145389@oracle.com> <4445a727-060b-70f9-c8db-e9f70faae3d5@redhat.com> <37755fec-05b9-8d2c-7eb9-8849393c7485@oracle.com> Message-ID: <1c976ae5-2893-9e7c-d588-1c1e4da447e4@oracle.com> On 7/7/17 12:09 PM, Igor Veresov wrote: > >> On Jul 7, 2017, at 4:23 AM, Robbin Ehn > > wrote: >> >> Hi Roman, >> >> On 07/07/2017 12:51 PM, Roman Kennke wrote: >>> Hi Robbin, >>>> >>>> Far down -> >>>> >>>> On 07/06/2017 08:05 PM, Roman Kennke wrote: >>>>> >>>>>> >>>>>> I'm not happy about this change: >>>>>> >>>>>> + ~ParallelSPCleanupThreadClosure() { >>>>>> + // This is here to be consistent with sweeper.cpp >>>>>> NMethodSweeper::mark_active_nmethods(). >>>>>> + // TODO: Is this really needed? >>>>>> + OrderAccess::storestore(); >>>>>> + } >>>>>> >>>>>> because we're adding an OrderAccess::storestore() to be consistent >>>>>> with an OrderAccess::storestore() that's not properly documented >>>>>> which is only increasing the technical debt. >>>>>> >>>>>> So a couple of things above don't make sense to me: >>>>>> >>>>>>> - sweeper thread runs outside safepoint >>>>>>> - VMThread (which is doing the nmethod marking in the case that >>>>>>> I'm looking at) runs while all other threads (incl. the sweeper) >>>>>>> is holding still. 
>>>>>> >>>>>> and: >>>>>> >>>>>>> There should be no need for a storestore() (at least in >>>>>>> sweeper.cpp... >>>>> >>>>> Either one or the other are running. Either the VMThread is marking >>>>> nmethods (during safepoint) or the sweeper threads are running >>>>> (outside >>>>> safepoint). Between the two phases, there is a guaranteed >>>>> OrderAccess::fence() (see safepoint.cpp). Therefore, no storestore() >>>>> should be necessary. >>>>> >>>>> From Igor's comment I can see how it happened though: Apparently >>>>> there >>>>> *is* a race in sweeper's own concurrent processing (concurrent with >>>>> compiler threads, as far as I understand). And there's a call to >>>>> nmethod::mark_as_seen_on_stack() after which a storestore() is >>>>> required >>>>> (as per Igor's explanation). So the logic probably was: we have >>>>> mark_as_seen_on_stack() followed by storestore() here, so let's >>>>> also put >>>>> a storestore() in the other places that call mark_as_seen_on_stack(), >>>>> one of which happens to be the safepoint cleanup code that we're >>>>> discussing. (why the storestore() hasn't been put right into >>>>> mark_as_seen_on_stack() I don't understand). In short, one >>>>> storestore() >>>>> really was necessary, the other looks like it has been put there 'for >>>>> consistency' or just conservatively. But it shouldn't be necessary in >>>>> the safepoint cleanup code that we're discussing. >>>>> >>>>> So what should we do? Remove the storestore() for good? Refactor the >>>>> code so that both paths at least call the storestore() in the same >>>>> place? (E.g. make mark_active_nmethods() use the closure and call >>>>> storestore() in the dtor as proposed?) >>>> >>>> I took a quick look, maybe I'm missing some stuff but: >>>> >>>> So there is a slight optimization when not running sweeper to skip >>>> compiler barrier/fence in stw. 
>>>> >>>> Don't think that matter, so I propose something like: >>>> - long stack_traversal_mark() { return >>>> _stack_traversal_mark; } >>>> - void set_stack_traversal_mark(long l) { >>>> _stack_traversal_mark = l; } >>>> + long stack_traversal_mark() { return >>>> OrderAccess::load_acquire(&_stack_traversal_mark); } >>>> + void set_stack_traversal_mark(long l) { >>>> OrderAccess::release_store(&_stack_traversal_mark, l); } >>>> >>>> Maybe make _stack_traversal_mark volatile also, just as a marking that >>>> it is concurrent accessed. >>>> And remove both storestore. >>>> >>>> "Also neither of these state variables are volatile in nmethod, so >>>> even the compiler may reorder the stores" >>>> Fortunately at least _state is volatile now. >>>> >>>> I think _state also should use la/rs semantics instead, but that's >>>> another story. >>> Like this? >>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.12/ >>> >>> >> >> Yes, exactly, I like this! >> Dan? Igor ? Tobias? >> > > That seems correct. > > igor I concur. And it gets rid of my complaint about the uncommented OrderAccess::storestore(). The deltas since webrev.10 (the last one I reviewed fully): src/share/vm/code/nmethod.hpp No comments. src/share/vm/runtime/sweeper.cpp No comments. src/share/vm/runtime/vmStructs.cpp No comments. Thumbs up! Again, very nice work on this change! Dan > >> Thanks Roman! >> >> BTW I'm going on vacation (5w) in a few hours, but I will follow this >> thread/changeset to the end! >> >> /Robbin >> >>> Roman > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From rkennke at redhat.com Mon Jul 10 10:38:50 2017 From: rkennke at redhat.com (Roman Kennke) Date: Mon, 10 Jul 2017 12:38:50 +0200 Subject: RFR: Parallelize safepoint cleanup In-Reply-To: References: <1910961c-11bd-0e86-dd03-4fce66b9969f@redhat.com> <2b466176-b688-53a8-bef9-c7ec2c8c745b@oracle.com> <42872a15-d26c-9798-c6a2-f3f7c945baf7@redhat.com> <5e7c7d00-4acd-bea3-3525-33dbd9159efb@oracle.com> <6f2c6de7-298b-bf14-ab1f-430c4acd43c9@redhat.com> <5cd676de-872d-6d4a-691b-da561173f7d0@oracle.com> <61d80e98-275f-b2b8-4ac7-6d5d03b047de@redhat.com> <0e44cc90-b384-0820-93d6-a70d22c501c3@oracle.com> <20E06CEC-38CA-41AE-99DB-17EF22A3C5CC@oracle.com> <58f2278e-b95c-4ec2-4f7d-9fefa3a281e4@redhat.com> <623c0dbf-9210-7c63-3546-4314c7d47f85@redhat.com> <29521e46-a5e8-5ff0-23a2-22eeee145389@oracle.com> <4445a727-060b-70f9-c8db-e9f70faae3d5@redhat.com> <37755fec-05b9-8d2c-7eb9-8849393c7485@oracle.com> Message-ID: <57cddde0-60e6-366e-489e-f6f9534e3ed9@redhat.com> Ok, so I guess I need a sponsor for this now: http://cr.openjdk.java.net/~rkennke/8180932/webrev.12/ Roman Am 07.07.2017 um 20:09 schrieb Igor Veresov: > >> On Jul 7, 2017, at 4:23 AM, Robbin Ehn > > wrote: >> >> Hi Roman, >> >> On 07/07/2017 12:51 PM, Roman Kennke wrote: >>> Hi Robbin, >>>> >>>> Far down -> >>>> >>>> On 07/06/2017 08:05 PM, Roman Kennke wrote: >>>>> >>>>>> >>>>>> I'm not happy about this change: >>>>>> >>>>>> + ~ParallelSPCleanupThreadClosure() { >>>>>> + // This is here to be consistent with sweeper.cpp >>>>>> NMethodSweeper::mark_active_nmethods(). >>>>>> + // TODO: Is this really needed? >>>>>> + OrderAccess::storestore(); >>>>>> + } >>>>>> >>>>>> because we're adding an OrderAccess::storestore() to be consistent >>>>>> with an OrderAccess::storestore() that's not properly documented >>>>>> which is only increasing the technical debt. 
>>>>>> >>>>>> So a couple of things above don't make sense to me: >>>>>> >>>>>>> - sweeper thread runs outside safepoint >>>>>>> - VMThread (which is doing the nmethod marking in the case that >>>>>>> I'm looking at) runs while all other threads (incl. the sweeper) >>>>>>> is holding still. >>>>>> >>>>>> and: >>>>>> >>>>>>> There should be no need for a storestore() (at least in >>>>>>> sweeper.cpp... >>>>> >>>>> Either one or the other are running. Either the VMThread is marking >>>>> nmethods (during safepoint) or the sweeper threads are running >>>>> (outside >>>>> safepoint). Between the two phases, there is a guaranteed >>>>> OrderAccess::fence() (see safepoint.cpp). Therefore, no storestore() >>>>> should be necessary. >>>>> >>>>> From Igor's comment I can see how it happened though: Apparently >>>>> there >>>>> *is* a race in sweeper's own concurrent processing (concurrent with >>>>> compiler threads, as far as I understand). And there's a call to >>>>> nmethod::mark_as_seen_on_stack() after which a storestore() is >>>>> required >>>>> (as per Igor's explanation). So the logic probably was: we have >>>>> mark_as_seen_on_stack() followed by storestore() here, so let's >>>>> also put >>>>> a storestore() in the other places that call mark_as_seen_on_stack(), >>>>> one of which happens to be the safepoint cleanup code that we're >>>>> discussing. (why the storestore() hasn't been put right into >>>>> mark_as_seen_on_stack() I don't understand). In short, one >>>>> storestore() >>>>> really was necessary, the other looks like it has been put there 'for >>>>> consistency' or just conservatively. But it shouldn't be necessary in >>>>> the safepoint cleanup code that we're discussing. >>>>> >>>>> So what should we do? Remove the storestore() for good? Refactor the >>>>> code so that both paths at least call the storestore() in the same >>>>> place? (E.g. make mark_active_nmethods() use the closure and call >>>>> storestore() in the dtor as proposed?) 
>>>> >>>> I took a quick look, maybe I'm missing some stuff but: >>>> >>>> So there is a slight optimization when not running sweeper to skip >>>> compiler barrier/fence in stw. >>>> >>>> Don't think that matter, so I propose something like: >>>> - long stack_traversal_mark() { return >>>> _stack_traversal_mark; } >>>> - void set_stack_traversal_mark(long l) { >>>> _stack_traversal_mark = l; } >>>> + long stack_traversal_mark() { return >>>> OrderAccess::load_acquire(&_stack_traversal_mark); } >>>> + void set_stack_traversal_mark(long l) { >>>> OrderAccess::release_store(&_stack_traversal_mark, l); } >>>> >>>> Maybe make _stack_traversal_mark volatile also, just as a marking that >>>> it is concurrent accessed. >>>> And remove both storestore. >>>> >>>> "Also neither of these state variables are volatile in nmethod, so >>>> even the compiler may reorder the stores" >>>> Fortunately at least _state is volatile now. >>>> >>>> I think _state also should use la/rs semantics instead, but that's >>>> another story. >>> Like this? >>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.12/ >>> >>> >> >> Yes, exactly, I like this! >> Dan? Igor ? Tobias? >> > > That seems correct. > > igor > >> Thanks Roman! >> >> BTW I'm going on vacation (5w) in a few hours, but I will follow this >> thread/changeset to the end! >> >> /Robbin >> >>> Roman > -------------- next part -------------- An HTML attachment was scrubbed... URL: From thomas.schatzl at oracle.com Mon Jul 10 12:15:47 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Mon, 10 Jul 2017 14:15:47 +0200 Subject: RFR (S): 8183226: Remembered set summarization accesses not fully initialized java thread DCQS In-Reply-To: <0fb5d7cf-49b3-84a1-97b4-cdd53f0173e3@oracle.com> References: <1499083970.2802.33.camel@oracle.com> <0fb5d7cf-49b3-84a1-97b4-cdd53f0173e3@oracle.com> Message-ID: <1499688947.2793.21.camel@oracle.com> Hi Erik (and Stefan), ? thanks for your review. 
On Fri, 2017-07-07 at 13:16 +0200, Erik Helin wrote: > On 07/03/2017 02:12 PM, Thomas Schatzl wrote: > > > > Hi all, > Hi Thomas, > > > > > can I get reviews for the following change that breaks some > > dependency cycle in g1remset initialization to fix some (at this > > time benign) bug when printing remembered set summarization > > information? > > > > The problem is that G1Remset initializes its internal remembered > > set summarization helper data structure in the constructor, which > > accesses some DCQS members before we call the initialize methods on > > the various global DCQS'es in G1CollectedHeap::initialize(). > > By splitting the initialization of the remembered set summarization > > into an extra method, this one can be called at the very end of > > G1CollectedHeap::initialize(), thus breaking the dependency. > I think there is an easier way to achieve this :) The default > constructor for G1RemSetSummary sets up almost all fields, and if we > make > it really set up _all_ fields, then I believe we are good: > - G1RemSetSummary::_num_vtimes can be set up in the constructor, > because the number of entries only depends on > ConcurrentG1Refine::thread_num(), which is a static function that > only returns G1ConcRefinementThreads. > - G1RemSetSummary::_rs_threads_vtimes can be allocated in the > constructor. > - The value for _rs_threads_vtimes can be initialized to 0, since the > accumulated virtual time for each concurrent refinement thread > should be 0 (since they haven't even started yet). > - Same reasoning as above goes for _sampling_thread_vtime. > - _rem_set can be NULL > > With the above changes, G1RemSet will call the default constructor > (same as it currently does). The call to > _prev_period_summary.initialize() will be removed from the G1RemSet > constructor. > > With the above changes, G1RemSetSummary::G1RemSetSummary() has no > dependencies on any other class, and is still initialized to the > correct values. 
I think this is all that is needed to solve this > problem. Fine with me. > > The rest, below this line, is just existing code that could really > benefit from a cleanup :) > > The G1RemSetSummary::initialize method is no longer needed, > G1RemSetSummary can now instead have a constructor taking a G1RemSet* > as argument. That constructor will do what > G1RemSetSummary::initialize does today. Unfortunately, no. We can't pass "this" in the constructor as the compilers will complain about possible use of uninitialized class (or so). But we can always get the G1RemSet pointer from the global variables, as implemented. > In G1RemSet::print_periodic_summary_info, the code can then look > like: > > G1RemSetSummary current(this); > _prev_period_summary.subtract(&current); > > For extra, extra bonus points, we should make > G1RemSetSummary::subtract_from work the other way around, so that the > above code reads: > > G1RemSetSummary current(this); > current.subtract(_prev_period_summary); // current -= prev > > instead of what the code does today: > > prev.subtract_from(current); // prev = current - prev > > which to me reads completely backwards :) > I think there has been some reason I do not remember right now why this has been done that way. But I agree. > Finally, it would be very nice for G1RemSetSummary to get a proper > copy constructor, so that the last line in print_periodic_summary: > > _prev_period_summary.set(&current); > > can just become: > > _prev_period_summary = current; > > (G1RemSetSummary::set is just a copy-constructor in disguise) I remember trying to avoid adding a copy constructor for fear of being "too complicated" and unusual for Hotspot code. > You don't need to do all the cleanups, but I think having a fully > functioning default constructor is a better way to solve this > problem, rather than shuffling the call to initialize around. What do > you think? Let's defer the other suggested cleanups to a different CR. 
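The "current -= prev" direction and the copy assignment that Erik suggests can be illustrated with a deliberately simplified stand-in for G1RemSetSummary; a single field and hypothetical helper replace the real members, so this is only a sketch of the shape, not the webrev code:

```cpp
#include <cassert>

// Minimal sketch: only the forwards-reading subtract and the copy
// assignment that would replace G1RemSetSummary::set() are shown.
class RemSetSummary {
  double _sampling_thread_vtime;
public:
  explicit RemSetSummary(double vtime = 0.0)
    : _sampling_thread_vtime(vtime) {}

  // current.subtract(prev) reads as "current -= prev"
  void subtract(const RemSetSummary& prev) {
    _sampling_thread_vtime -= prev._sampling_thread_vtime;
  }

  double sampling_thread_vtime() const { return _sampling_thread_vtime; }
};

// Hypothetical periodic-summary step: compute the delta since the last
// period and remember the current snapshot for the next round.
double delta_since(RemSetSummary& prev_period, double now_vtime) {
  RemSetSummary current(now_vtime);
  RemSetSummary diff = current;   // implicit copy constructor
  diff.subtract(prev_period);     // diff = current - prev, read forwards
  prev_period = current;          // plain copy assignment replaces set()
  return diff.sampling_thread_vtime();
}
```

With only plain members, the compiler-generated copy constructor and assignment already do the right thing, which is the point of retiring a hand-written set().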
In the following webrev I also added StefanJ's suggestion to extract concurrent refinement initialization into a separate method. (I do not really understand why that method is actually returning an error code: all error conditions in ConcurrentG1Refine call vm_shutdown_during_initialization() anyway - even that seems superfluous: failing to allocate memory shuts down the VM already). Webrevs: http://cr.openjdk.java.net/~tschatzl/8183226/webrev.0_to_1/ (diff) http://cr.openjdk.java.net/~tschatzl/8183226/webrev.1/ (full) Thanks, Thomas From per.liden at oracle.com Mon Jul 10 12:23:47 2017 From: per.liden at oracle.com (Per Liden) Date: Mon, 10 Jul 2017 14:23:47 +0200 Subject: RFR: 8179268: Factor out AdaptiveSizePolicy from top-level interfaces CollectorPolicy and CollectedHeap In-Reply-To: <8764a8c9-995c-fec2-9c98-e35f43ccd4d6@redhat.com> References: <134884f8-7001-f0ee-9e57-9ec0b2520752@oracle.com> <8764a8c9-995c-fec2-9c98-e35f43ccd4d6@redhat.com> Message-ID: <26a925e4-295c-0178-9586-6ddf96a64a54@oracle.com> Hi Roman, On 2017-07-07 10:53, Roman Kennke wrote: > Am 05.07.2017 um 13:12 schrieb Mikael Gerdin: >> Hi Roman, >> >> On 2017-07-04 20:47, Roman Kennke wrote: >>> AdaptiveSizePolicy is not used/called from outside the GCs, and not all >>> GCs need them. It makes sense to remove it from the CollectedHeap and >>> CollectorPolicy interfaces and move them down to the actual subclasses >>> that used them. >>> >>> I moved AdaptiveSizePolicyOutput to parallelScavengeHeap.hpp, it's only >>> used/implemented in the parallel GC. Also, I made this class AllStatic >>> (was StackObj) Thanks for cleaning this up. May I suggest that the changes related to adaptive size policy are kept in one patch and the soft reference clearing stuff in another. >>> >>> Tested by running hotspot_gc jtreg tests without regressions. 
>>> >>> http://cr.openjdk.java.net/~rkennke/8179268/webrev.00/ >> >> Please correct me if I'm wrong here but it looks like all the non-G1 >> collectors set the _should_clear_all_soft_refs based on >> gc_overhead_limit_near. >> Perhaps the ClearedAllSoftRefs scoped object could be modified to only >> work with GenCollectorPolicy derived policies (which include parallel >> *shrugs*) and G1 should just stop worrying about _all_soft_refs_clear. >> Looking closer, I can't even find G1 code looking at that member so >> maybe it, too, should be moved to GenCollectorPolicy? > I can't find any place where should_clear_all_soft_refs() would become > true for G1. For G1 it becomes true when calling WB_FullGC, so your patch changes the behavior for G1 here. WB_FullGC is meant to clear soft refs, but I looked through the tests and can't find any that currently depend on this behavior (but I could have missed it). So, I see two options here: 1) We change the behavior of WB_FullGC to not guarantee any clearing of soft refs, in which case WB_FullGC should never call set_should_clear_all_soft_refs() for any GC. Having WB_FullGC clear soft refs in GCs but not others seems arbitrary and I can't see the value in that. or 2) We keep the current behavior of WB_FullGC (i.e. always clear soft refs). This of course makes the move of set_should_clear_all_soft_refs() to GenCollectorPolicy problematic. We could consider changing CollectedHeap::collect() to also take a "bool clear_soft_ref", or we could say that it's up to each collector to do the right thing when they get called with GCCause::_wb_full_gc. cheers, Per > And, as you mention, G1 doesn't even look at > all_soft_refs_clear() either. I removed those parts from G1, and moved > all soft_refs stuff down to GenCollectorPolicy. 
> > I also changed the way the casting accessors as_generation_policy() etc > work: the as_* accessors now crash with ShouldNotReachHere() when called > for the wrong policy type, and the is_* accessors now return constant > true/false based on their type (so that it doesn't crash with > ShouldNotReachHere() ..). I think this is more useful than the way it's > been done before. > > http://cr.openjdk.java.net/~rkennke/8179268/webrev.01/ > > > > Tested by: hotspot_gc jtreg tests. > > What do you think? > > Roman > From thomas.schatzl at oracle.com Mon Jul 10 12:52:34 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Mon, 10 Jul 2017 14:52:34 +0200 Subject: RFR (S): 8183128: Update RefineCardTableEntryClosure In-Reply-To: <08286762-411b-3079-9802-814c806af946@oracle.com> References: <1499156663.2761.6.camel@oracle.com> <08286762-411b-3079-9802-814c806af946@oracle.com> Message-ID: <1499691154.2793.26.camel@oracle.com> Hi Erik, thanks for your review. On Thu, 2017-07-06 at 14:52 +0200, Erik Helin wrote: > Hi Thomas, > > On 07/04/2017 10:24 AM, Thomas Schatzl wrote: > > > > Hi all, > > > > can I get reviews for this change that renames and cleans up the > > use > > of RefineCardTableEntryClosure in the code? > > > > RefineCardTableEntryClosure is the closure that is applied by the > > concurrent refinement threads. This change renames it slightly to > > indicate its use (G1RefineCardConcurrentlyClosure) and moves it to > > the G1RemSet files close to the closure that we use for > > refinement/Update RS during GC. > great cleanup! Looking at the code, what do you think about moving > G1RefineCardConcurrentlyClosure into concurrentG1RefineThread.cpp > (and make it a private class to ConcurrentG1RefineThread)? AFAICS, > ConcurrentG1RefineThread is the only code using this closure. > There are also other users of that closure, e.g. the DCQS's need a reference to it during initialization. 
However, by moving the G1RefineCardConcurrentlyClosure and some refactoring there are (imho) some gains in encapsulation as we discussed. > If we do it this way, then we can actually make > DirtyCardQueueSet::apply_closure_to_completed_buffer a template > method, taking the Closure as a template parameter, as in: > template <typename Closure> > bool apply_closure_to_completed_buffer(Closure* cl, > uint worker_i, > size_t stop_at, > bool during_pause) > This means that closures could get inlined, which doesn't mean that > much for G1RefineCardConcurrentlyClosure, but could give a small > boost for G1RefineCardClosure (for that to work, > G1CollectedHeap::iterate_dirty_card_closure must take a > G1RefineCardClosure, but that is ok, because that is the only closure > type we pass to that method). > > Also, you do not need the forward declaration in G1CollectedHeap, it > will not make use of this closure then :) > > If you want to "go the extra mile", then you can also pass a > G1RemSet* as an argument to the G1RefineCardConcurrentlyClosure > constructor and store it in a field, to avoid accessing the > G1CollectedHeap via the singleton: > G1CollectedHeap::heap()->g1_rem_set()->refine_card_concurrently(card_ptr, > worker_i); (plus, G1RefineCardConcurrentlyClosure only needs a > G1RemSet* pointer anyway ;)) I think these perf improvements should be targeted in a different CR. :) The change already doubled in size... Webrevs for current changes: http://cr.openjdk.java.net/~tschatzl/8183128/webrev.0_to_1 (diff) http://cr.openjdk.java.net/~tschatzl/8183128/webrev.1 (full) Thanks, 
Thomas From thomas.schatzl at oracle.com Mon Jul 10 13:01:20 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Mon, 10 Jul 2017 15:01:20 +0200 Subject: RFR: 8177544: Restructure G1 Full GC code In-Reply-To: References: <62d1f02b-1fc0-ffcf-b8e0-e88ebacecebe@oracle.com> <1497346566.2829.33.camel@oracle.com> Message-ID: <1499691680.2793.29.camel@oracle.com> Hi Stefan, On Wed, 2017-06-14 at 16:45 +0200, Stefan Johansson wrote: > Thanks Thomas for reviewing, > > On 2017-06-13 11:36, Thomas Schatzl wrote: > > > > Hi, > > > > ???thanks for your hard work on the parallel full gc that starts > > with this refactoring :) > :) > > > > On Thu, 2017-06-08 at 14:35 +0200, Stefan Johansson wrote: > > > > > > Hi, > > > > > > Please review this enhancement: > > > https://bugs.openjdk.java.net/browse/JDK-8177544 > > > > > > Webrev: > > > http://cr.openjdk.java.net/~sjohanss/8177544/hotspot.00/ > > > > > [... lots of suggested changes from me...] Thanks for these changes. > > > > Actually, if it were for me, I would put the whole full gc setup > > and > > teardown into a separate class/file. > > > > Have public gc_prologue()/collect()/gc_epilogue() methods where > > gc_prologue() is the first part of do_full_collection_inner() until > > application of the G1SerialCollector, collect() the instantiation > > and application of G1SerialCollector, and gc_epilogue() the > > remainder. > > > > E.g. in G1CollectedHeap we only have the calls to these three > > methods (there is no need to have all three). > > > > At least I think it would help a lot if all that full gc stuff > > would be separate physically from do-all-G1CollectedHeap. > > With the G1FullGCScope there is almost no reference to > > G1CollectedHeap afaics. > > > > (There is _allocator->init_mutator_alloc_region() call) > I see your point and I think it would be good. But as we discussed > over chat, might be something to look at once everything else in this > area is done. Will create a RFE for this. 
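The gc_prologue()/collect()/gc_epilogue() split that Thomas outlines above could, very roughly, take the shape of a scope object whose constructor and destructor bracket the collection. The names here are hypothetical illustrations, not the G1FullGCScope code from the webrev:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Records the phase order so the bracketing is visible.
std::vector<std::string> g_phases;

// Setup runs in the constructor, teardown in the destructor, so the
// full-GC pre/post work lives outside the heap class proper.
class FullGCScope {
public:
  FullGCScope()  { g_phases.push_back("prologue"); }
  ~FullGCScope() { g_phases.push_back("epilogue"); }
};

void do_full_collection() {
  FullGCScope scope;              // prologue runs here
  g_phases.push_back("collect");  // the serial collector would run here
}                                 // epilogue runs when scope is destroyed
```

Making the pre/post work an RAII scope also guarantees the epilogue runs on every exit path from the collection, which a manually paired prologue/epilogue call would not.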
Yes, that's fine. > > > - g1CollectedHeap.hpp: please try to sort the definitions of the > > new methods in order of calling them. > Done. > > Here are updated webrevs: > Full: http://cr.openjdk.java.net/~sjohanss/8177544/hotspot.01/ > Inc: http://cr.openjdk.java.net/~sjohanss/8177544/hotspot.00-01/ > Looks good to me. Sorry for the late reply. Thanks, Thomas From erik.helin at oracle.com Mon Jul 10 13:13:10 2017 From: erik.helin at oracle.com (Erik Helin) Date: Mon, 10 Jul 2017 15:13:10 +0200 Subject: RFR: 8179387: Factor out CMS specific code from GenCollectedHeap into its own subclass In-Reply-To: <4bc53aaa-b98a-8a61-73bf-d30ac3f402b8@redhat.com> References: <3521009f-6fab-4f8e-2375-b9d665a4c70b@redhat.com> <3d8b55a2-a787-3051-b351-ab9b0a24f5e0@redhat.com> <47e22e86-7d7c-606f-1936-346229f39ca2@oracle.com> <9a846161-c8ac-dedf-5952-f457d546fd9a@redhat.com> <4d5e6af8-d975-7803-64c5-7295e0d56154@redhat.com> <13358626-e399-e352-1711-587416621aac@redhat.com> <27af0ad2-fe78-3536-2143-996dd42583ab@oracle.com> <4bc53aaa-b98a-8a61-73bf-d30ac3f402b8@redhat.com> Message-ID: On 07/07/2017 03:21 PM, Roman Kennke wrote: > Am 07.07.2017 um 14:35 schrieb Erik Helin: >> On 07/06/2017 06:23 PM, Roman Kennke wrote: >>>>>> Ok to push this? >>>> >>>> I just realized that your change doesn't build on Windows since you >>>> didn't #include "precompiled.hpp" in cmsHeap.cpp. MSVC is really picky >>>> about that. >>>> /Mikael >>> >>> Uhhh. >>> Ok, here's revision #3 with precompiled added in: >>> >>> http://cr.openjdk.java.net/~rkennke/8179387/webrev.03/ >>> >> >> Hi Roman, >> >> I just started looking :) I think GenCollectedHeap::gc_prologue and >> GenCollectedHeap::gc_epilogue should be virtual, and >> always_do_update_barrier = UseConcMarkSweepGC moved down >> CMSHeap::gc_epilogue. >> >> What do you think? > > Yes, I have seen that. My original plan was to leave it as is because I > know that Erik Ö. 
is working on a big barrier set refactoring that would > remove this code anyway. However, it doesn't really matter, here's the > cleaned up patch: > > http://cr.openjdk.java.net/~rkennke/8179387/webrev.04/ > A few comments: cmsHeap.hpp: - you are missing quite a few #includes, but it works since genCollectedHeap.hpp #includes a whole lot of stuff. Not necessary to fix now, because the "missing #include" will start to pop up when someone tries to break apart GenCollectedHeap into smaller pieces. - why are gc_prologue and gc_epilogue protected in CMSHeap? Can't they be private in CMSHeap? - there are two `private:` blocks, please use only one `private:` block. - one extra newline here: 32 class CMSHeap : public GenCollectedHeap { 33 - one extra newline here: 46 47 cmsHeap.cpp: - one extra newline here: 36 CMSHeap::CMSHeap(GenCollectorPolicy *policy) : GenCollectedHeap(policy) { 37 - one extra newline here: 65 66 - do you need to use `this` here? 87 this->GenCollectedHeap::print_on_error(st); Isn't it enough to just GenCollectedHeap::print_on_error(st)? - one extra newline here: 92 bool CMSHeap::create_cms_collector() { 93 - this is pre-existing, but since we are copying code, do we want to clean it up? 104 if (collector == NULL || !collector->completed_initialization()) { 105 if (collector) { 106 delete collector; // Be nice in embedded situation 107 } 108 vm_shutdown_during_initialization("Could not create CMS collector"); 109 return false; 110 } The collector == NULL check is not needed here. CMSCollector derives from CHeapObj and CHeapObj::operator new will by default do vm_exit_out_of_memory if the returned memory is NULL. The check can just be: if (!collector->completed_initialization()) { vm_shutdown_during_initialization("Could not create CMS collector"); return false; } return true; - maybe skip the // success comment here: 111 return true; // success - is it possible to end up in CMSHeap::should_do_concurrent_full_gc() if we are not using CMS? 
As in: 123 bool CMSHeap::should_do_concurrent_full_gc(GCCause::Cause cause) { 124 if (!UseConcMarkSweepGC) { 125 return false; 126 } - one extra newline here: 135 136 genCollectedHeap.hpp: - I don't think you have to make _skip_header_HeapWords protected. Instead I think we can make skip_header_HeapWords() virtual, make it return 0 in GenCollectedHeap and return CMSCollector::skip_header_HeapWords in CMSHeap and just remove the _skip_header_HeapWords variable. - do you really need #ifdef ASSERT around check_gen_kinds? - can you make GCH_strong_roots_tasks a protected enum in GenCollectedHeap? As in class GenCollectedHeap : public CollectedHeap { protected: enum StrongRootTasks { GCH_PS_Universe_oops_do, }; }; Have you thought about vmStructs.cpp, does it need any changes? Thanks, Erik > Roman > From rkennke at redhat.com Mon Jul 10 13:36:29 2017 From: rkennke at redhat.com (Roman Kennke) Date: Mon, 10 Jul 2017 15:36:29 +0200 Subject: RFR: 8179268: Factor out AdaptiveSizePolicy from top-level interfaces CollectorPolicy and CollectedHeap In-Reply-To: <26a925e4-295c-0178-9586-6ddf96a64a54@oracle.com> References: <134884f8-7001-f0ee-9e57-9ec0b2520752@oracle.com> <8764a8c9-995c-fec2-9c98-e35f43ccd4d6@redhat.com> <26a925e4-295c-0178-9586-6ddf96a64a54@oracle.com> Message-ID: <0263e88c-0618-1c54-7a51-f8163e4b7e09@redhat.com> Am 10.07.2017 um 14:23 schrieb Per Liden: > Hi Roman, > > On 2017-07-07 10:53, Roman Kennke wrote: >> Am 05.07.2017 um 13:12 schrieb Mikael Gerdin: >>> Hi Roman, >>> >>> On 2017-07-04 20:47, Roman Kennke wrote: >>>> AdaptiveSizePolicy is not used/called from outside the GCs, and not >>>> all >>>> GCs need them. It makes sense to remove it from the CollectedHeap and >>>> CollectorPolicy interfaces and move them down to the actual subclasses >>>> that used them. >>>> >>>> I moved AdaptiveSizePolicyOutput to parallelScavengeHeap.hpp, it's >>>> only >>>> used/implemented in the parallel GC. 
Also, I made this class AllStatic >>>> (was StackObj) > > Thanks for cleaning this up. > > May I suggest that the changes related to adaptive size policy are kept > in one patch and the soft reference clearing stuff in another. Ok... so we can go back to review the first revision of the patch and deal with the softrefs stuff in a followup? http://cr.openjdk.java.net/~rkennke/8179268/webrev.00/ > > For G1 it becomes true when calling WB_FullGC, so your patch changes > the behavior for G1 here. WB_FullGC is meant to clear soft refs, but I > looked through the tests and can't find any that currently depend on > this behavior (but I could have missed it). So, I see two options here: > > 1) We change the behavior of WB_FullGC to not guarantee any clearing > of soft refs, in which case WB_FullGC should never call > set_should_clear_all_soft_refs() for any GC. Having WB_FullGC clear > soft refs in GCs but not others seems arbitrary and I can't see the > value in that. > > or > > 2) We keep the current behavior of WB_FullGC (i.e. always clear soft > refs). This of course makes the move of > set_should_clear_all_soft_refs() to GenCollectorPolicy problematic. We > could consider changing CollectedHeap::collect() to also take a "bool > clear_soft_ref", or we could say that it's up to each collector to do > the right thing when they get called with GCCause::_wb_full_gc. Ok. I'd argue it's up to the GC. I am not totally familiar with the WB stuff, but I'd expect it to do something similar to what would happen if applications call the usual API, which is, in this case, System.gc(), which goes through JVM_GC() which in turn calls heap->collect() *without* setting the set_should_clear_all_soft_refs(). Right? In any case, if we don't want this stuff under this enhancement ID, then we'll discuss it under the followup ID, right? 
Roman From erik.helin at oracle.com Mon Jul 10 13:37:00 2017 From: erik.helin at oracle.com (Erik Helin) Date: Mon, 10 Jul 2017 15:37:00 +0200 Subject: RFR (S): 8183128: Update RefineCardTableEntryClosure In-Reply-To: <1499691154.2793.26.camel@oracle.com> References: <1499156663.2761.6.camel@oracle.com> <08286762-411b-3079-9802-814c806af946@oracle.com> <1499691154.2793.26.camel@oracle.com> Message-ID: <02a0acb5-2632-d7d7-18d6-41242c4d9dac@oracle.com> On 07/10/2017 02:52 PM, Thomas Schatzl wrote: > ... > > I think these perf improvements should be targeted in a different CR. > :) The change already doubled in size... Alright, let me take care of that, once you have pushed this :) > Webrevs for current changes: > http://cr.openjdk.java.net/~tschatzl/8183128/webrev.0_to_1 (diff) > http://cr.openjdk.java.net/~tschatzl/8183128/webrev.1 (full) Looks very good, thank you Thomas! Reviewed! Erik > Thanks, > Thomas > From stefan.johansson at oracle.com Mon Jul 10 13:52:17 2017 From: stefan.johansson at oracle.com (Stefan Johansson) Date: Mon, 10 Jul 2017 15:52:17 +0200 Subject: RFR (S): 8183128: Update RefineCardTableEntryClosure In-Reply-To: <02a0acb5-2632-d7d7-18d6-41242c4d9dac@oracle.com> References: <1499156663.2761.6.camel@oracle.com> <08286762-411b-3079-9802-814c806af946@oracle.com> <1499691154.2793.26.camel@oracle.com> <02a0acb5-2632-d7d7-18d6-41242c4d9dac@oracle.com> Message-ID: <799ec7a0-9c28-4ba8-4541-8611754667c0@oracle.com> Hi Thomas, On 2017-07-10 15:37, Erik Helin wrote: > On 07/10/2017 02:52 PM, Thomas Schatzl wrote: >> ... > > >> I think these perf improvements should be targeted in a different CR. >> :) The change already doubled in size... > > Alright, let me take care of that, once you have pushed this :) > >> Webrevs for current changes: >> http://cr.openjdk.java.net/~tschatzl/8183128/webrev.0_to_1 (diff) >> http://cr.openjdk.java.net/~tschatzl/8183128/webrev.1 (full) > > Looks very good, thank you Thomas! Reviewed! 
Looks good, StefanJ > Erik > >> Thanks, >> Thomas >> From per.liden at oracle.com Mon Jul 10 13:59:31 2017 From: per.liden at oracle.com (Per Liden) Date: Mon, 10 Jul 2017 15:59:31 +0200 Subject: RFR: 8179268: Factor out AdaptiveSizePolicy from top-level interfaces CollectorPolicy and CollectedHeap In-Reply-To: <0263e88c-0618-1c54-7a51-f8163e4b7e09@redhat.com> References: <134884f8-7001-f0ee-9e57-9ec0b2520752@oracle.com> <8764a8c9-995c-fec2-9c98-e35f43ccd4d6@redhat.com> <26a925e4-295c-0178-9586-6ddf96a64a54@oracle.com> <0263e88c-0618-1c54-7a51-f8163e4b7e09@redhat.com> Message-ID: <7c177480-a74c-45b8-af92-47b1d4cbb46e@oracle.com> Hi, On 2017-07-10 15:36, Roman Kennke wrote: > Am 10.07.2017 um 14:23 schrieb Per Liden: >> Hi Roman, >> >> On 2017-07-07 10:53, Roman Kennke wrote: >>> Am 05.07.2017 um 13:12 schrieb Mikael Gerdin: >>>> Hi Roman, >>>> >>>> On 2017-07-04 20:47, Roman Kennke wrote: >>>>> AdaptiveSizePolicy is not used/called from outside the GCs, and not >>>>> all >>>>> GCs need them. It makes sense to remove it from the CollectedHeap and >>>>> CollectorPolicy interfaces and move them down to the actual subclasses >>>>> that used them. >>>>> >>>>> I moved AdaptiveSizePolicyOutput to parallelScavengeHeap.hpp, it's >>>>> only >>>>> used/implemented in the parallel GC. Also, I made this class AllStatic >>>>> (was StackObj) >> >> Thanks for cleaning this up. >> >> May I suggest that the changes related to adaptive size policy is kept >> in one patch and the soft reference clearing stuff in another. > > Ok... so we can go back to review the first revision of the patch and > deal with the softrefs stuff in a followup? Sounds good, I'll reply to your first mail separately. > > http://cr.openjdk.java.net/~rkennke/8179268/webrev.00/ > > > > >> >> For G1 it becomes true when calling WB_FullGC, so your patch changes >> the behavior for G1 here. 
WB_FullGC is meant to clear soft refs, but I >> looked through the tests and can't find any that currently depend on >> this behavior (but I could have missed it). So, I see two options here: >> >> 1) We change the behavior of WB_FullGC to not guarantee any clearing >> of soft refs, in which case WB_FullGC should never call >> set_should_clear_all_soft_refs() for any GC. Having WB_FullGC clear >> soft refs in GCs but not others seems arbitrary and I can't see the >> value in that. >> >> or >> >> 2) We keep the current behavior of WB_FullGC (i.e. always clear soft >> refs). This of course makes the move of >> set_should_clear_all_soft_refs() to GenCollectorPolicy problematic. We >> could consider changing CollectedHeap::collect() to also take a "bool >> clear_soft_ref", or we could say that it's up to each collector to do >> the right thing when they get called with GCCause::_wb_full_gc. > > Ok. > I'd argue it's up to the GC. I'm fine with that, as long as we make sure all GCs actually do the same thing so that the meaning of GCCause::_wb_full_gc doesn't differ from GC to GC. > I am not totally familiar with the WB stuff, > but I'd expect it to do something similar to what would happen if > applications call the usual API, which is, in this case, System.gc(), > which goes through JVM_GC() which in turn calls heap->collect() > *without* setting the set_should_clear_all_soft_refs(). Right? The WB interface is for whitebox testing, i.e. an interface for tests that need to tell the GC to do something more specific than just "System.gc()". For example, "do a young GC" (WB_YoungGC) or "clear all soft refs and do a full GC" (WB_FullGC). > > In any case, if we don't want this stuff under this enhancement ID, then > we'll discuss it under the followup ID, right? Sounds good! Thanks! 
/Per > > Roman > From rkennke at redhat.com Mon Jul 10 14:10:59 2017 From: rkennke at redhat.com (Roman Kennke) Date: Mon, 10 Jul 2017 16:10:59 +0200 Subject: RFR: 8179387: Factor out CMS specific code from GenCollectedHeap into its own subclass In-Reply-To: References: <3521009f-6fab-4f8e-2375-b9d665a4c70b@redhat.com> <3d8b55a2-a787-3051-b351-ab9b0a24f5e0@redhat.com> <47e22e86-7d7c-606f-1936-346229f39ca2@oracle.com> <9a846161-c8ac-dedf-5952-f457d546fd9a@redhat.com> <4d5e6af8-d975-7803-64c5-7295e0d56154@redhat.com> <13358626-e399-e352-1711-587416621aac@redhat.com> <27af0ad2-fe78-3536-2143-996dd42583ab@oracle.com> <4bc53aaa-b98a-8a61-73bf-d30ac3f402b8@redhat.com> Message-ID: <666af7f2-27e9-48c6-91e4-eaefa5289e18@redhat.com> Am 10.07.2017 um 15:13 schrieb Erik Helin: > On 07/07/2017 03:21 PM, Roman Kennke wrote: >> Am 07.07.2017 um 14:35 schrieb Erik Helin: >>> On 07/06/2017 06:23 PM, Roman Kennke wrote: >>>>>>> Ok to push this? >>>>> >>>>> I just realized that your change doesn't build on Windows since you >>>>> didn't #include "precompiled.hpp" in cmsHeap.cpp. MSVC is really >>>>> picky >>>>> about that. >>>>> /Mikael >>>> >>>> Uhhh. >>>> Ok, here's revision #3 with precompiled added in: >>>> >>>> http://cr.openjdk.java.net/~rkennke/8179387/webrev.03/ >>>> >>> >>> Hi Roman, >>> >>> I just started looking :) I think GenCollectedHeap::gc_prologue and >>> GenCollectedHeap::gc_epilogue should be virtual, and >>> always_do_update_barrier = UseConcMarkSweepGC moved down >>> CMSHeap::gc_epilogue. >>> >>> What do you think? >> >> Yes, I have seen that. My original plan was to leave it as is because I >> know that Erik ?. is working on a big barrier set refactoring that would >> remove this code anyway. 
However, it doesn't really matter, here's the >> cleaned up patch: >> >> http://cr.openjdk.java.net/~rkennke/8179387/webrev.04/ >> > > A few comments: > > cmsHeap.hpp: > - you are missing quite a few #includes, but it works since > genCollectedHeap.hpp #includes a whole lot of stuff. Not necessary to > fix now, because the "missing #include" will start to pop up when > someone tries to break apart GenCollectedHeap into smaller pieces. Right. I always try to minimize includes, especially in header files (they are bound to proliferate later anyway). In addition to that, if a class is only referenced as pointer, I avoid includes and use forward class definition instead. > > - why are gc_prologue and gc_epilogue protected in CMSHeap? Can't they > be private in CMSHeap? They are virtual and protected in GenCollectedHeap and called by GenCollectedHeap. Makes sense to also make them protected in CMSHeap? Or am I missing something? > - there are two `private:` blocks, please use only one `private:` > block. > Fixed. > - one extra newline here: > 32 class CMSHeap : public GenCollectedHeap { > 33 > > - one extra newline here: > 46 > 47 > > cmsHeap.cpp: > - one extra newline here: > 36 CMSHeap::CMSHeap(GenCollectorPolicy *policy) : > GenCollectedHeap(policy) { > 37 > > - one extra newline here: > 65 > 66 > Removed all of them. > - do you need to use `this` here? > 87 this->GenCollectedHeap::print_on_error(st); > > Isn't it enough to just GenCollectedHeap::print_on_error(st)? Yes, it is. Just a habit of mine to make it more readable (to me). Fixed it. > - one extra newline here: > 92 bool CMSHeap::create_cms_collector() { > 93 Fixed. > - this is pre-existing, but since we are copying code, do we want to > clean it up? 
> 104 if (collector == NULL || > !collector->completed_initialization()) { > 105 if (collector) { > 106 delete collector; // Be nice in embedded situation > 107 } > 108 vm_shutdown_during_initialization("Could not create CMS > collector"); > 109 return false; > 110 } > > The collector == NULL check is not needed here. CMSCollector derives > from CHeapObj and CHeapObj::operator new will by default do > vm_exit_out_of_memory if the returned memory is NULL. The check can > just be: > > if (!collector->completed_initialization()) { > vm_shutdown_during_initialization("Could not create CMS collector"); > return false; > } > return true; > Ok, good point. Fixed. > - maybe skip the // success comment here: > 111 return true; // success That was probably pre-existing too. Should be thankful that it did not say return true; // return true :-P > - is it possible to end up in CMSHeap::should_do_concurrent_full_gc() > if we are not using CMS? As in: > 123 bool CMSHeap::should_do_concurrent_full_gc(GCCause::Cause cause) { > 124 if (!UseConcMarkSweepGC) { > 125 return false; > 126 } > Duh. Fixed. > - one extra newline here: > 135 > 136 > > genCollectedHeap.hpp: > - I don't think you have to make _skip_header_HeapWords protected. > Instead I think we can make skip_header_HeapWords() virtual, make it > return 0 in GenCollectedHeap and return > CMSCollector::skip_header_HeapWords in CMSHeap and just remove the > _skip_header_HeapWords variable. Great catch! I love it when refactoring leads to simplifications... Fixed. > - do you really need #ifdef ASSERT around check_gen_kinds? > No, not really. > - can you make GCH_strong_roots_tasks a protected enum in > GenCollectedHeap? As in > class GenCollectedHeap : public CollectedHeap { > protected: > enum StrongRootTasks { > GCH_PS_Universe_oops_do, > }; > }; > Good idea. Done. > Have you thought about vmStructs.cpp, does it need any changes? No. I don't really know what needs to go in there.
I added: declare_constant(CollectedHeap::CMSHeap) \ just so that it's there next to the other heap types. Not sure what else may be needed, if anything? http://cr.openjdk.java.net/~rkennke/8179387/webrev.05/ Better now? Roman From per.liden at oracle.com Mon Jul 10 14:54:04 2017 From: per.liden at oracle.com (Per Liden) Date: Mon, 10 Jul 2017 16:54:04 +0200 Subject: RFR: 8179268: Factor out AdaptiveSizePolicy from top-level interfaces CollectorPolicy and CollectedHeap In-Reply-To: References: Message-ID: <50cb4b58-623c-04c2-f6c5-cfb1bd0a3b1f@oracle.com> Hi, On 2017-07-04 20:47, Roman Kennke wrote: > AdaptiveSizePolicy is not used/called from outside the GCs, and not all > GCs need them. It makes sense to remove it from the CollectedHeap and > CollectorPolicy interfaces and move them down to the actual subclasses > that used them. > > I moved AdaptiveSizePolicyOutput to parallelScavengeHeap.hpp, it's only > used/implemented in the parallel GC. Also, I made this class AllStatic > (was StackObj) AdaptiveSizePolicyOutput::print() is actually called from runtime/java.cpp also, so it's used outside of ParallelGC. I'm fine with moving it, but we should have the proper #includes in java.cpp. (Your patch doesn't actually build in its current form. I suspect you're using precompiled headers which have a tendency to hide a lot of errors caused by missing includes) > > Tested by running hotspot_gc jtreg tests without regressions. > > http://cr.openjdk.java.net/~rkennke/8179268/webrev.00/ collectorPolicy.hpp: -------------------- 258 void cleared_all_soft_refs(); Please declare this virtual too (that's the best we can do to signal intent until we have C++11/override) collectorPolicy.cpp: -------------------- 224 this->CollectorPolicy::cleared_all_soft_refs(); Please remove "this->" to match the super-call style used in other places in this file. Btw, I can sponsor the patch if you want. 
cheers, Per > > > Roman > From rkennke at redhat.com Mon Jul 10 16:35:40 2017 From: rkennke at redhat.com (Roman Kennke) Date: Mon, 10 Jul 2017 18:35:40 +0200 Subject: RFR: 8179268: Factor out AdaptiveSizePolicy from top-level interfaces CollectorPolicy and CollectedHeap In-Reply-To: <50cb4b58-623c-04c2-f6c5-cfb1bd0a3b1f@oracle.com> References: <50cb4b58-623c-04c2-f6c5-cfb1bd0a3b1f@oracle.com> Message-ID: <073ad956-f475-f3c4-cac8-42bfa1329565@redhat.com> Hi Per, thanks for the review! > >> AdaptiveSizePolicy is not used/called from outside the GCs, and not all >> GCs need them. It makes sense to remove it from the CollectedHeap and >> CollectorPolicy interfaces and move them down to the actual subclasses >> that used them. >> >> I moved AdaptiveSizePolicyOutput to parallelScavengeHeap.hpp, it's only >> used/implemented in the parallel GC. Also, I made this class AllStatic >> (was StackObj) > > AdaptiveSizePolicyOutput::print() is actually called from > runtime/java.cpp also, so it's used outside of ParallelGC. I'm fine > with moving it, but we should have the proper #includes in java.cpp. > > (Your patch doesn't actually build in its current form. I suspect > you're using precompiled headers which have a tendency to hide a lot > of errors caused by missing includes) > I added the include. >> >> Tested by running hotspot_gc jtreg tests without regressions. >> >> http://cr.openjdk.java.net/~rkennke/8179268/webrev.00/ > > collectorPolicy.hpp: > -------------------- > 258 void cleared_all_soft_refs(); > > Please declare this virtual too (that's the best we can do to signal > intent until we have C++11/override) > Ok. > > collectorPolicy.cpp: > -------------------- > 224 this->CollectorPolicy::cleared_all_soft_refs(); > > Please remove "this->" to match the super-call style used in other > places in this file. ok. > > Btw, I can sponsor the patch if you want. 
Find the updated webrev here: http://cr.openjdk.java.net/~rkennke/8179268/webrev.03/ Cheers, Roman > > cheers, > Per > >> >> >> Roman >> From robbin.ehn at oracle.com Mon Jul 10 18:50:15 2017 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Mon, 10 Jul 2017 20:50:15 +0200 Subject: RFR: Parallelize safepoint cleanup In-Reply-To: <57cddde0-60e6-366e-489e-f6f9534e3ed9@redhat.com> References: <2b466176-b688-53a8-bef9-c7ec2c8c745b@oracle.com> <42872a15-d26c-9798-c6a2-f3f7c945baf7@redhat.com> <5e7c7d00-4acd-bea3-3525-33dbd9159efb@oracle.com> <6f2c6de7-298b-bf14-ab1f-430c4acd43c9@redhat.com> <5cd676de-872d-6d4a-691b-da561173f7d0@oracle.com> <61d80e98-275f-b2b8-4ac7-6d5d03b047de@redhat.com> <0e44cc90-b384-0820-93d6-a70d22c501c3@oracle.com> <20E06CEC-38CA-41AE-99DB-17EF22A3C5CC@oracle.com> <58f2278e-b95c-4ec2-4f7d-9fefa3a281e4@redhat.com> <623c0dbf-9210-7c63-3546-4314c7d47f85@redhat.com> <29521e46-a5e8-5ff0-23a2-22eeee145389@oracle.com> <4445a727-060b-70f9-c8db-e9f70faae3d5@redhat.com> <37755fec-05b9-8d2c-7eb9-8849393c7485@oracle.com> <57cddde0-60e6-366e-489e-f6f9534e3ed9@redhat.com> Message-ID: <6fa761bc-8feb-74e6-9a54-8a65ab81203b@oracle.com> I'll start a push now. /Robbin On 2017-07-10 12:38, Roman Kennke wrote: > Ok, so I guess I need a sponsor for this now: > > http://cr.openjdk.java.net/~rkennke/8180932/webrev.12/ > > > Roman > > Am 07.07.2017 um 20:09 schrieb Igor Veresov: >> >>> On Jul 7, 2017, at 4:23 AM, Robbin Ehn >> > wrote: >>> >>> Hi Roman, >>> >>> On 07/07/2017 12:51 PM, Roman Kennke wrote: >>>> Hi Robbin, >>>>> >>>>> Far down -> >>>>> >>>>> On 07/06/2017 08:05 PM, Roman Kennke wrote: >>>>>> >>>>>>> >>>>>>> I'm not happy about this change: >>>>>>> >>>>>>> + ~ParallelSPCleanupThreadClosure() { >>>>>>> + // This is here to be consistent with sweeper.cpp >>>>>>> NMethodSweeper::mark_active_nmethods(). >>>>>>> + // TODO: Is this really needed? 
>>>>>>> + OrderAccess::storestore(); >>>>>>> + } >>>>>>> >>>>>>> because we're adding an OrderAccess::storestore() to be consistent >>>>>>> with an OrderAccess::storestore() that's not properly documented >>>>>>> which is only increasing the technical debt. >>>>>>> >>>>>>> So a couple of things above don't make sense to me: >>>>>>> >>>>>>>> - sweeper thread runs outside safepoint >>>>>>>> - VMThread (which is doing the nmethod marking in the case that >>>>>>>> I'm looking at) runs while all other threads (incl. the sweeper) >>>>>>>> is holding still. >>>>>>> >>>>>>> and: >>>>>>> >>>>>>>> There should be no need for a storestore() (at least in >>>>>>>> sweeper.cpp... >>>>>> >>>>>> Either one or the other are running. Either the VMThread is marking >>>>>> nmethods (during safepoint) or the sweeper threads are running >>>>>> (outside >>>>>> safepoint). Between the two phases, there is a guaranteed >>>>>> OrderAccess::fence() (see safepoint.cpp). Therefore, no storestore() >>>>>> should be necessary. >>>>>> >>>>>> From Igor's comment I can see how it happened though: Apparently >>>>>> there >>>>>> *is* a race in sweeper's own concurrent processing (concurrent with >>>>>> compiler threads, as far as I understand). And there's a call to >>>>>> nmethod::mark_as_seen_on_stack() after which a storestore() is >>>>>> required >>>>>> (as per Igor's explanation). So the logic probably was: we have >>>>>> mark_as_seen_on_stack() followed by storestore() here, so let's >>>>>> also put >>>>>> a storestore() in the other places that call mark_as_seen_on_stack(), >>>>>> one of which happens to be the safepoint cleanup code that we're >>>>>> discussing. (why the storestore() hasn't been put right into >>>>>> mark_as_seen_on_stack() I don't understand). In short, one >>>>>> storestore() >>>>>> really was necessary, the other looks like it has been put there 'for >>>>>> consistency' or just conservatively. 
But it shouldn't be necessary in >>>>>> the safepoint cleanup code that we're discussing. >>>>>> >>>>>> So what should we do? Remove the storestore() for good? Refactor the >>>>>> code so that both paths at least call the storestore() in the same >>>>>> place? (E.g. make mark_active_nmethods() use the closure and call >>>>>> storestore() in the dtor as proposed?) >>>>> >>>>> I took a quick look, maybe I'm missing some stuff but: >>>>> >>>>> So there is a slight optimization when not running sweeper to skip >>>>> compiler barrier/fence in stw. >>>>> >>>>> Don't think that matter, so I propose something like: >>>>> - long stack_traversal_mark() { return >>>>> _stack_traversal_mark; } >>>>> - void set_stack_traversal_mark(long l) { >>>>> _stack_traversal_mark = l; } >>>>> + long stack_traversal_mark() { return >>>>> OrderAccess::load_acquire(&_stack_traversal_mark); } >>>>> + void set_stack_traversal_mark(long l) { >>>>> OrderAccess::release_store(&_stack_traversal_mark, l); } >>>>> >>>>> Maybe make _stack_traversal_mark volatile also, just as a marking that >>>>> it is concurrent accessed. >>>>> And remove both storestore. >>>>> >>>>> "Also neither of these state variables are volatile in nmethod, so >>>>> even the compiler may reorder the stores" >>>>> Fortunately at least _state is volatile now. >>>>> >>>>> I think _state also should use la/rs semantics instead, but that's >>>>> another story. >>>> Like this? >>>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.12/ >>>> >>>> >>> >>> Yes, exactly, I like this! >>> Dan? Igor ? Tobias? >>> >> >> That seems correct. >> >> igor >> >>> Thanks Roman! >>> >>> BTW I'm going on vacation (5w) in a few hours, but I will follow this >>> thread/changeset to the end! 
>>> >>> /Robbin >>> >>>> Roman >> > From robbin.ehn at oracle.com Mon Jul 10 19:22:59 2017 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Mon, 10 Jul 2017 21:22:59 +0200 Subject: RFR: Parallelize safepoint cleanup In-Reply-To: <6fa761bc-8feb-74e6-9a54-8a65ab81203b@oracle.com> References: <2b466176-b688-53a8-bef9-c7ec2c8c745b@oracle.com> <42872a15-d26c-9798-c6a2-f3f7c945baf7@redhat.com> <5e7c7d00-4acd-bea3-3525-33dbd9159efb@oracle.com> <6f2c6de7-298b-bf14-ab1f-430c4acd43c9@redhat.com> <5cd676de-872d-6d4a-691b-da561173f7d0@oracle.com> <61d80e98-275f-b2b8-4ac7-6d5d03b047de@redhat.com> <0e44cc90-b384-0820-93d6-a70d22c501c3@oracle.com> <20E06CEC-38CA-41AE-99DB-17EF22A3C5CC@oracle.com> <58f2278e-b95c-4ec2-4f7d-9fefa3a281e4@redhat.com> <623c0dbf-9210-7c63-3546-4314c7d47f85@redhat.com> <29521e46-a5e8-5ff0-23a2-22eeee145389@oracle.com> <4445a727-060b-70f9-c8db-e9f70faae3d5@redhat.com> <37755fec-05b9-8d2c-7eb9-8849393c7485@oracle.com> <57cddde0-60e6-366e-489e-f6f9534e3ed9@redhat.com> <6fa761bc-8feb-74e6-9a54-8a65ab81203b@oracle.com> Message-ID: <5af9855a-652e-64f0-af83-e8f5962247ca@oracle.com> Hi, unfortunately the push failed on 32-bit. (looks like _stack_traversal_mark should be jlong, I feel a bit guilty) I do not have anytime to look at this, so here is the error. 
/Robbin make[3]: Leaving directory '/opt/jprt/T/P1/185117.rehn/s/hotspot/make' make/Main.gmk:263: recipe for target 'hotspot-client-libs' failed In file included from /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29:0, from /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33, from /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28: /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp: In member function 'long int nmethod::stack_traversal_mark()': /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:399:108: error: call of overloaded 'load_acquire(volatile long int*)' is ambiguous /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:399:108: note: candidates are: In file included from /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/typeArrayOop.hpp:30:0, from /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/constantPool.hpp:32, from /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/method.hpp:34, from /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/frame.hpp:28, from /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/codeBlob.hpp:31, from /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/compiledMethod.hpp:28, from /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:28, from /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29, from /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33, from /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28: /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:57:17: note: static jint OrderAccess::load_acquire(const volatile jint*) /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:57:17: note: no known conversion for argument 1 from 'volatile long int*' to 'const volatile jint* {aka const volatile int*}' In file included 
from /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/typeArrayOop.hpp:30:0, from /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/constantPool.hpp:32, from /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/method.hpp:34, from /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/frame.hpp:28, from /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/codeBlob.hpp:31, from /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/compiledMethod.hpp:28, from /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:28, from /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29, from /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33, from /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28: /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:63:17: note: static juint OrderAccess::load_acquire(const volatile juint*) /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:63:17: note: no known conversion for argument 1 from 'volatile long int*' to 'const volatile juint* {aka const volatile unsigned int*}' In file included from /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29:0, from /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33, from /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28: /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp: In member function 'void nmethod::set_stack_traversal_mark(long int)': /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:400:105: error: call of overloaded 'release_store(volatile long int*, long int&)' is ambiguous /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:400:105: note: candidates are: In file included from /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/typeArrayOop.hpp:30:0, from 
/opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/constantPool.hpp:32, from /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/method.hpp:34, from /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/frame.hpp:28, from /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/codeBlob.hpp:31, from /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/compiledMethod.hpp:28, from /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:28, from /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29, from /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33, from /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28: /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:71:17: note: static void OrderAccess::release_store(volatile jint*, jint) /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:71:17: note: no known conversion for argument 1 from 'volatile long int*' to 'volatile jint* {aka volatile int*}' /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:77:17: note: static void OrderAccess::release_store(volatile juint*, juint) /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:77:17: note: no known conversion for argument 1 from 'volatile long int*' to 'volatile juint* {aka volatile unsigned int*}' On 2017-07-10 20:50, Robbin Ehn wrote: > I'll start a push now. 
> > /Robbin > > On 2017-07-10 12:38, Roman Kennke wrote: >> Ok, so I guess I need a sponsor for this now: >> >> http://cr.openjdk.java.net/~rkennke/8180932/webrev.12/ >> >> >> Roman >> >> Am 07.07.2017 um 20:09 schrieb Igor Veresov: >>> >>>> On Jul 7, 2017, at 4:23 AM, Robbin Ehn >>> > wrote: >>>> >>>> Hi Roman, >>>> >>>> On 07/07/2017 12:51 PM, Roman Kennke wrote: >>>>> Hi Robbin, >>>>>> >>>>>> Far down -> >>>>>> >>>>>> On 07/06/2017 08:05 PM, Roman Kennke wrote: >>>>>>> >>>>>>>> >>>>>>>> I'm not happy about this change: >>>>>>>> >>>>>>>> + ~ParallelSPCleanupThreadClosure() { >>>>>>>> + // This is here to be consistent with sweeper.cpp >>>>>>>> NMethodSweeper::mark_active_nmethods(). >>>>>>>> + // TODO: Is this really needed? >>>>>>>> + OrderAccess::storestore(); >>>>>>>> + } >>>>>>>> >>>>>>>> because we're adding an OrderAccess::storestore() to be consistent >>>>>>>> with an OrderAccess::storestore() that's not properly documented >>>>>>>> which is only increasing the technical debt. >>>>>>>> >>>>>>>> So a couple of things above don't make sense to me: >>>>>>>> >>>>>>>>> - sweeper thread runs outside safepoint >>>>>>>>> - VMThread (which is doing the nmethod marking in the case that >>>>>>>>> I'm looking at) runs while all other threads (incl. the >>>>>>>>> sweeper) >>>>>>>>> is holding still. >>>>>>>> >>>>>>>> and: >>>>>>>> >>>>>>>>> There should be no need for a storestore() (at least in >>>>>>>>> sweeper.cpp... >>>>>>> >>>>>>> Either one or the other are running. Either the VMThread is marking >>>>>>> nmethods (during safepoint) or the sweeper threads are running >>>>>>> (outside >>>>>>> safepoint). Between the two phases, there is a guaranteed >>>>>>> OrderAccess::fence() (see safepoint.cpp). Therefore, no storestore() >>>>>>> should be necessary. 
>>>>>>> >>>>>>> From Igor's comment I can see how it happened though: Apparently >>>>>>> there >>>>>>> *is* a race in sweeper's own concurrent processing (concurrent with >>>>>>> compiler threads, as far as I understand). And there's a call to >>>>>>> nmethod::mark_as_seen_on_stack() after which a storestore() is >>>>>>> required >>>>>>> (as per Igor's explanation). So the logic probably was: we have >>>>>>> mark_as_seen_on_stack() followed by storestore() here, so let's >>>>>>> also put >>>>>>> a storestore() in the other places that call >>>>>>> mark_as_seen_on_stack(), >>>>>>> one of which happens to be the safepoint cleanup code that we're >>>>>>> discussing. (why the storestore() hasn't been put right into >>>>>>> mark_as_seen_on_stack() I don't understand). In short, one >>>>>>> storestore() >>>>>>> really was necessary, the other looks like it has been put there >>>>>>> 'for >>>>>>> consistency' or just conservatively. But it shouldn't be >>>>>>> necessary in >>>>>>> the safepoint cleanup code that we're discussing. >>>>>>> >>>>>>> So what should we do? Remove the storestore() for good? Refactor the >>>>>>> code so that both paths at least call the storestore() in the same >>>>>>> place? (E.g. make mark_active_nmethods() use the closure and call >>>>>>> storestore() in the dtor as proposed?) >>>>>> >>>>>> I took a quick look, maybe I'm missing some stuff but: >>>>>> >>>>>> So there is a slight optimization when not running sweeper to skip >>>>>> compiler barrier/fence in stw. 
>>>>>> >>>>>> Don't think that matter, so I propose something like: >>>>>> - long stack_traversal_mark() { return >>>>>> _stack_traversal_mark; } >>>>>> - void set_stack_traversal_mark(long l) { >>>>>> _stack_traversal_mark = l; } >>>>>> + long stack_traversal_mark() { return >>>>>> OrderAccess::load_acquire(&_stack_traversal_mark); } >>>>>> + void set_stack_traversal_mark(long l) { >>>>>> OrderAccess::release_store(&_stack_traversal_mark, l); } >>>>>> >>>>>> Maybe make _stack_traversal_mark volatile also, just as a marking >>>>>> that >>>>>> it is concurrent accessed. >>>>>> And remove both storestore. >>>>>> >>>>>> "Also neither of these state variables are volatile in nmethod, so >>>>>> even the compiler may reorder the stores" >>>>>> Fortunately at least _state is volatile now. >>>>>> >>>>>> I think _state also should use la/rs semantics instead, but that's >>>>>> another story. >>>>> Like this? >>>>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.12/ >>>>> >>>>> >>>> >>>> Yes, exactly, I like this! >>>> Dan? Igor ? Tobias? >>>> >>> >>> That seems correct. >>> >>> igor >>> >>>> Thanks Roman! >>>> >>>> BTW I'm going on vacation (5w) in a few hours, but I will follow this >>>> thread/changeset to the end! 
>>>> >>>> /Robbin >>>> >>>>> Roman >>> >> From rkennke at redhat.com Mon Jul 10 20:07:59 2017 From: rkennke at redhat.com (Roman Kennke) Date: Mon, 10 Jul 2017 22:07:59 +0200 Subject: RFR: Parallelize safepoint cleanup In-Reply-To: <5af9855a-652e-64f0-af83-e8f5962247ca@oracle.com> References: <42872a15-d26c-9798-c6a2-f3f7c945baf7@redhat.com> <5e7c7d00-4acd-bea3-3525-33dbd9159efb@oracle.com> <6f2c6de7-298b-bf14-ab1f-430c4acd43c9@redhat.com> <5cd676de-872d-6d4a-691b-da561173f7d0@oracle.com> <61d80e98-275f-b2b8-4ac7-6d5d03b047de@redhat.com> <0e44cc90-b384-0820-93d6-a70d22c501c3@oracle.com> <20E06CEC-38CA-41AE-99DB-17EF22A3C5CC@oracle.com> <58f2278e-b95c-4ec2-4f7d-9fefa3a281e4@redhat.com> <623c0dbf-9210-7c63-3546-4314c7d47f85@redhat.com> <29521e46-a5e8-5ff0-23a2-22eeee145389@oracle.com> <4445a727-060b-70f9-c8db-e9f70faae3d5@redhat.com> <37755fec-05b9-8d2c-7eb9-8849393c7485@oracle.com> <57cddde0-60e6-366e-489e-f6f9534e3ed9@redhat.com> <6fa761bc-8feb-74e6-9a54-8a65ab81203b@oracle.com> <5af9855a-652e-64f0-af83-e8f5962247ca@oracle.com> Message-ID: <266bd634-b1a5-0f93-733a-22faf5e785f3@redhat.com> Ugh. I changed the field and accessors and a few related entries (vmStructs..) to jlong. I am doing this blindly... I have no way to test 32bit here. It does build for me ;-) http://cr.openjdk.java.net/~rkennke/8180932/webrev.13/ Roman Am 10.07.2017 um 21:22 schrieb Robbin Ehn: > Hi, unfortunately the push failed on 32-bit. > > (looks like _stack_traversal_mark should be jlong, I feel a bit guilty) > > I do not have anytime to look at this, so here is the error. 
> > /Robbin > > make[3]: Leaving directory '/opt/jprt/T/P1/185117.rehn/s/hotspot/make' > make/Main.gmk:263: recipe for target 'hotspot-client-libs' failed > In file included from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29:0, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28: > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp: In > member function 'long int nmethod::stack_traversal_mark()': > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:399:108: > error: call of overloaded 'load_acquire(volatile long int*)' is ambiguous > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:399:108: > note: candidates are: > In file included from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/typeArrayOop.hpp:30:0, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/constantPool.hpp:32, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/method.hpp:34, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/frame.hpp:28, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/codeBlob.hpp:31, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/compiledMethod.hpp:28, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:28, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28: > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:57:17: > note: static jint OrderAccess::load_acquire(const volatile jint*) > > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:57:17: > note: no known conversion for argument 1 from 
'volatile long int*' > to 'const volatile jint* {aka const volatile int*}' > In file included from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/typeArrayOop.hpp:30:0, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/constantPool.hpp:32, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/method.hpp:34, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/frame.hpp:28, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/codeBlob.hpp:31, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/compiledMethod.hpp:28, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:28, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28: > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:63:17: > note: static juint OrderAccess::load_acquire(const volatile juint*) > > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:63:17: > note: no known conversion for argument 1 from 'volatile long int*' > to 'const volatile juint* {aka const volatile unsigned int*}' > In file included from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29:0, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28: > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp: In > member function 'void nmethod::set_stack_traversal_mark(long int)': > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:400:105: > error: call of overloaded 'release_store(volatile long int*, long > int&)' is ambiguous > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:400:105: 
> note: candidates are: > In file included from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/typeArrayOop.hpp:30:0, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/constantPool.hpp:32, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/method.hpp:34, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/frame.hpp:28, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/codeBlob.hpp:31, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/compiledMethod.hpp:28, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:28, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28: > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:71:17: > note: static void OrderAccess::release_store(volatile jint*, jint) > > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:71:17: > note: no known conversion for argument 1 from 'volatile long int*' > to 'volatile jint* {aka volatile int*}' > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:77:17: > note: static void OrderAccess::release_store(volatile juint*, juint) > > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:77:17: > note: no known conversion for argument 1 from 'volatile long int*' > to 'volatile juint* {aka volatile unsigned int*}' > > On 2017-07-10 20:50, Robbin Ehn wrote: >> I'll start a push now. 
>> >> /Robbin >> >> On 2017-07-10 12:38, Roman Kennke wrote: >>> Ok, so I guess I need a sponsor for this now: >>> >>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.12/ >>> >>> >>> Roman >>> >>> Am 07.07.2017 um 20:09 schrieb Igor Veresov: >>>> >>>>> On Jul 7, 2017, at 4:23 AM, Robbin Ehn >>>> > wrote: >>>>> >>>>> Hi Roman, >>>>> >>>>> On 07/07/2017 12:51 PM, Roman Kennke wrote: >>>>>> Hi Robbin, >>>>>>> >>>>>>> Far down -> >>>>>>> >>>>>>> On 07/06/2017 08:05 PM, Roman Kennke wrote: >>>>>>>> >>>>>>>>> >>>>>>>>> I'm not happy about this change: >>>>>>>>> >>>>>>>>> + ~ParallelSPCleanupThreadClosure() { >>>>>>>>> + // This is here to be consistent with sweeper.cpp >>>>>>>>> NMethodSweeper::mark_active_nmethods(). >>>>>>>>> + // TODO: Is this really needed? >>>>>>>>> + OrderAccess::storestore(); >>>>>>>>> + } >>>>>>>>> >>>>>>>>> because we're adding an OrderAccess::storestore() to be >>>>>>>>> consistent >>>>>>>>> with an OrderAccess::storestore() that's not properly documented >>>>>>>>> which is only increasing the technical debt. >>>>>>>>> >>>>>>>>> So a couple of things above don't make sense to me: >>>>>>>>> >>>>>>>>>> - sweeper thread runs outside safepoint >>>>>>>>>> - VMThread (which is doing the nmethod marking in the case that >>>>>>>>>> I'm looking at) runs while all other threads (incl. the >>>>>>>>>> sweeper) >>>>>>>>>> is holding still. >>>>>>>>> >>>>>>>>> and: >>>>>>>>> >>>>>>>>>> There should be no need for a storestore() (at least in >>>>>>>>>> sweeper.cpp... >>>>>>>> >>>>>>>> Either one or the other are running. Either the VMThread is >>>>>>>> marking >>>>>>>> nmethods (during safepoint) or the sweeper threads are running >>>>>>>> (outside >>>>>>>> safepoint). Between the two phases, there is a guaranteed >>>>>>>> OrderAccess::fence() (see safepoint.cpp). Therefore, no >>>>>>>> storestore() >>>>>>>> should be necessary. 
>>>>>>>> >>>>>>>> From Igor's comment I can see how it happened though: Apparently >>>>>>>> there >>>>>>>> *is* a race in sweeper's own concurrent processing (concurrent >>>>>>>> with >>>>>>>> compiler threads, as far as I understand). And there's a call to >>>>>>>> nmethod::mark_as_seen_on_stack() after which a storestore() is >>>>>>>> required >>>>>>>> (as per Igor's explanation). So the logic probably was: we have >>>>>>>> mark_as_seen_on_stack() followed by storestore() here, so let's >>>>>>>> also put >>>>>>>> a storestore() in the other places that call >>>>>>>> mark_as_seen_on_stack(), >>>>>>>> one of which happens to be the safepoint cleanup code that we're >>>>>>>> discussing. (why the storestore() hasn't been put right into >>>>>>>> mark_as_seen_on_stack() I don't understand). In short, one >>>>>>>> storestore() >>>>>>>> really was necessary, the other looks like it has been put there >>>>>>>> 'for >>>>>>>> consistency' or just conservatively. But it shouldn't be >>>>>>>> necessary in >>>>>>>> the safepoint cleanup code that we're discussing. >>>>>>>> >>>>>>>> So what should we do? Remove the storestore() for good? >>>>>>>> Refactor the >>>>>>>> code so that both paths at least call the storestore() in the same >>>>>>>> place? (E.g. make mark_active_nmethods() use the closure and call >>>>>>>> storestore() in the dtor as proposed?) >>>>>>> >>>>>>> I took a quick look, maybe I'm missing some stuff but: >>>>>>> >>>>>>> So there is a slight optimization when not running sweeper to skip >>>>>>> compiler barrier/fence in stw. 
>>>>>>> >>>>>>> Don't think that matter, so I propose something like: >>>>>>> - long stack_traversal_mark() { return >>>>>>> _stack_traversal_mark; } >>>>>>> - void set_stack_traversal_mark(long l) { >>>>>>> _stack_traversal_mark = l; } >>>>>>> + long stack_traversal_mark() { return >>>>>>> OrderAccess::load_acquire(&_stack_traversal_mark); } >>>>>>> + void set_stack_traversal_mark(long l) { >>>>>>> OrderAccess::release_store(&_stack_traversal_mark, l); } >>>>>>> >>>>>>> Maybe make _stack_traversal_mark volatile also, just as a marking >>>>>>> that >>>>>>> it is concurrent accessed. >>>>>>> And remove both storestore. >>>>>>> >>>>>>> "Also neither of these state variables are volatile in nmethod, so >>>>>>> even the compiler may reorder the stores" >>>>>>> Fortunately at least _state is volatile now. >>>>>>> >>>>>>> I think _state also should use la/rs semantics instead, but that's >>>>>>> another story. >>>>>> Like this? >>>>>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.12/ >>>>>> >>>>>> >>>>> >>>>> Yes, exactly, I like this! >>>>> Dan? Igor ? Tobias? >>>>> >>>> >>>> That seems correct. >>>> >>>> igor >>>> >>>>> Thanks Roman! >>>>> >>>>> BTW I'm going on vacation (5w) in a few hours, but I will follow this >>>>> thread/changeset to the end! >>>>> >>>>> /Robbin >>>>> >>>>>> Roman >>>> >>> From shade at redhat.com Mon Jul 10 20:14:07 2017 From: shade at redhat.com (Aleksey Shipilev) Date: Mon, 10 Jul 2017 22:14:07 +0200 Subject: RFC: Epsilon GC JEP Message-ID: <67f6d4a2-d129-1491-4906-473586dc6680@redhat.com> Hi, I would like to solicit feedback on Epsilon GC JEP: https://bugs.openjdk.java.net/browse/JDK-8174901 http://openjdk.java.net/jeps/8174901 The JEP text should be pretty self-contained, but we can certainly add more points after the discussion happens. For the last few months, there were quite a few instances where Epsilon proved a good vehicle to do GC performance research, especially on object locality and code generation fronts. 
I think it also serves as the trivial target for Erik's/Roman's GC interface work. The implementation and tests are there in the Sandbox, for those who are curious. Thanks, -Aleksey From kim.barrett at oracle.com Mon Jul 10 21:33:09 2017 From: kim.barrett at oracle.com (Kim Barrett) Date: Mon, 10 Jul 2017 17:33:09 -0400 Subject: RFR (S): 8182703: Correct G1 barrier queue lock orderings In-Reply-To: <595FA613.7090306@oracle.com> References: <59510D5E.10009@oracle.com> <25F423D9-F8D5-4E62-8300-CCE106E70777@oracle.com> <6D1B2CB2-366E-4DBD-9F2E-672325459343@oracle.com> <6B5ACCE3-CA0C-41C9-A45A-C79467FB8CE2@oracle.com> <595FA613.7090306@oracle.com> Message-ID: <5B0584AA-C49D-4426-AD02-3D23AF45F0CD@oracle.com> On 2017-07-06 22:15, Kim Barrett wrote: >> On Jul 6, 2017, at 4:11 PM, Kim Barrett wrote: >> >>> On Jul 4, 2017, at 10:00 PM, Kim Barrett wrote: >>> The lock ranking changes look good. >> I'm going to retract that. >> >> How do these new lock rankings interact with various assertions that >> rank() == or != Mutex::special? I'm not sure those places handle >> these new ranks properly. (I'm not sure those places handle >> Mutex::event rank properly either.) > On Jul 7, 2017, at 11:17 AM, Erik Österlund wrote: > > [...] > All in all, I believe that the deadlock detection system has some redundant, and some confusing checks that involve the lock rank Mutex::special. But I do believe that it works and would detect deadlocks, but could do with some reworking to make it more explicit. And that is invariant of the new access rank and applies equally to the event rank. 
> However, since these access locks play well with the current deadlock detection as they do not do anything illegal, and even if use of these locks did indeed do illegal things, it would still be detected by the deadlock detection system, it is reasonable to say that refactoring the deadlock detection system is a separate RFE? > > Specifically, clarifying the deadlock detection system by removing redundant checks, not checking for safepoint-safe state in try_lock as well as explicitly listing special and below locks as illegal when verifying Thread::check_for_valid_safepoint_state(), regardless of whether allow_vm_block() is true or not. Sounds like a separate RFE to me! Thanks for the additional analysis. I agree that so long as one does what one is supposed to (e.g. these locks always need to avoid safepoint checks), there won't be any undesired assertions. And I also agree there won't be any bad consequences (e.g. incorrect code possibly slipping through) from misuse, though the indicative failures might not always be where one might prefer. I don't think the redundant checks are necessarily bad, as they make it more obvious to future readers what the requirements are at various levels. However, I agree it should be a separate RFE to do some cleanup in this area, particularly where [non-]equality with Mutex::special ought to be an ordered comparison. 
From kim.barrett at oracle.com Tue Jul 11 02:19:04 2017 From: kim.barrett at oracle.com (Kim Barrett) Date: Mon, 10 Jul 2017 22:19:04 -0400 Subject: RFR (S): 8182703: Correct G1 barrier queue lock orderings In-Reply-To: <595CBE40.5050603@oracle.com> References: <59510D5E.10009@oracle.com> <25F423D9-F8D5-4E62-8300-CCE106E70777@oracle.com> <595CBE40.5050603@oracle.com> Message-ID: <6FFC2106-D260-481D-B8C3-DDA849926F23@oracle.com> > On Jul 5, 2017, at 6:24 AM, Erik Österlund wrote: > On 2017-07-05 04:00, Kim Barrett wrote: >>> On Jun 26, 2017, at 9:34 AM, Erik Österlund wrote: >>> >>> Hi, >>> >>> Webrev: http://cr.openjdk.java.net/~eosterlund/8182703/webrev.02/ >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8182703 >> ------------------------------------------------------------------------------ >> src/share/vm/gc/g1/ptrQueue.cpp >> Removing unlock / relock around >> 78 qset()->enqueue_complete_buffer(node); >> >> I would prefer that this part of this changeset not be made at this >> time. >> >> This part isn't necessary for the main point of this changeset. It's >> a cleanup that is enabled by the lock rank changes, where the rank >> changes are required for other reasons. > > Okay. > >> It also at least conflicts with, and probably breaks, a pending change >> of mine. (I have a largish stack of patches in this area that didn't >> quite make it into JDK 9 before the original FC date, and which I've >> been (all too slowly) trying to work my way through and bring into JDK >> 10.) > > I agree that it would be possible to just correct the ranks while allowing the spaghetti synchronization code to remain in the code base. Here are some comments about that to me not so attractive idea: > 1) I would really like to get rid of that code, because I think it is poor synchronization practice and its stated reason for existence is gone now. 
> 2) I have to do *something* about that part in the current change, otherwise the comment motivating its existence will be incorrect after my lock rank change. There is no longer a legitimate motivation for doing that unlock and re-lock. So we have the choice of making a new made up motivation why we do this anyway, or to remove it. For me the choice is easily to remove it. Or leave it be for now, to avoid knowingly creating more work for someone else by inflicting merge conflicts or other breakage on them. (But see below.) If the occasional out of date comment was the worst of the problems we faced, that would be pretty fabulous. > 3) If some new actual motivation for dropping that lock arises later on down the road (which I am dubious about), then I do not see an issue with simply re-adding it then, when/if that becomes necessary again, with a new corresponding motivation added in appropriately. > > As far as your new changes go, I am curious what they do to motivate unlocking/re-locking this shared queue lock again. As outlined in my recent email to Thomas, we do not hold either of these queue locks when concurrent refinement helper code is called from GC barriers invoked from JavaThreads, even with my new changes. If it is in this code path that you will perform more work (just speculating), then that should be invariant of this cleanup. One possibility I was thinking of was the buffer filtering step. I mis-remembered and thought that wasn't done for the (locked) shared queues, and that one of my pending changes was to change that. (It's been over a year since I worked on those changes, and haven't had time to really page them back in.) But I now see that we already do the filtering of the shared SATB queue (dirty card queues don't presently have any filtering, but might in the future) while holding its lock. This suggests a potential (though seemingly hard to avoid) fragility resulting from the lowered lock rank. 
The present SATB filtering doesn't seem to acquire any locks, but it's a non-trivial amount of code spread over multiple files, so would be easy to miss something or break it in that respect. Reducing the lock ranks requires being very careful with the SATB filtering code. The "mutator" help for dirty card queue processing is not presently done for the shared queue, but I think it could be today. I'm less sure about that with lowered queue lock ranks; I *think* there aren't any relevant locks there (other than the very rare shared queue lock in refine_card_concurrently), but that's a substantially larger and more complex amount of code than SATB queue filtering. It looks like something along this line is part of my pending changes. That would certainly be broken by the proposed removal of the temporary unlocking. At the time I was working on it, it seemed like having that little unlocking dance simplified things elsewhere. I can cope with the merge conflict (especially since it *is* a merge conflict and not silent breakage that I may have forgotten about by the time I get back to it), though I would prefer not to have to. (I can also think of some reasons why this might not be worth doing or even a bad idea, and don't recall right now what I may have done to address those.) But while looking at the mutator helper, I realized there may be a different problem. Lowering these lock ranks may not be sufficient to allow enqueue in "arbitrary" lock contexts. The difficulty is that in the mutator help case (only applies for dirty card queue right now, and currently only for a Java thread dealing with its thread-local queue), the allocation of the temporary worker_id is done under the CBL lock (which is ok), but if there isn't a free worker_id, it *waits* for one, and that's not ok in an arbitrary lock context. 
Right now, we should not be able to hit that wait while not holding "critical" locks, because the present CBL rank is too high to (safely) be in enqueue in such a context. But lowering the CBL rank is not sufficient to enqueue while holding critical locks; that potential wait also needs to be eliminated. (This is assuming there's a place where a Java thread can need an enqueue while holding a critical lock. I don't have such a place in mind, but proving it can never happen now or in the future seems hard, and contrary to the intent of the proposed lock rank changes.) Eliminating that wait doesn't need to be part of this change, but seems like it might be required before taking advantage of the change to move some potentially enqueuing operations. It shouldn't be too hard to eliminate the wait, but it's a somewhat fundamental behavioral change. The present mechanism places a real choke hold on the mutator when concurrent refinement can't keep up. Without a blocking operation in there, the mutator could overwhelm concurrent refinement, leading to longer pauses. Not that said choke hold is all that pleasant either. From per.liden at oracle.com Tue Jul 11 06:34:21 2017 From: per.liden at oracle.com (Per Liden) Date: Tue, 11 Jul 2017 08:34:21 +0200 Subject: RFR: 8179268: Factor out AdaptiveSizePolicy from top-level interfaces CollectorPolicy and CollectedHeap In-Reply-To: <073ad956-f475-f3c4-cac8-42bfa1329565@redhat.com> References: <50cb4b58-623c-04c2-f6c5-cfb1bd0a3b1f@oracle.com> <073ad956-f475-f3c4-cac8-42bfa1329565@redhat.com> Message-ID: Hi, On 2017-07-10 18:35, Roman Kennke wrote: > Hi Per, > > thanks for the review! > >> >>> AdaptiveSizePolicy is not used/called from outside the GCs, and not all >>> GCs need them. It makes sense to remove it from the CollectedHeap and >>> CollectorPolicy interfaces and move them down to the actual subclasses >>> that used them. 
>>> >>> I moved AdaptiveSizePolicyOutput to parallelScavengeHeap.hpp, it's only >>> used/implemented in the parallel GC. Also, I made this class AllStatic >>> (was StackObj) >> >> AdaptiveSizePolicyOutput::print() is actually called from >> runtime/java.cpp also, so it's used outside of ParallelGC. I'm fine >> with moving it, but we should have the proper #includes in java.cpp. >> >> (Your patch doesn't actually build in its current form. I suspect >> you're using precompiled headers which have a tendency to hide a lot >> of errors caused by missing includes) >> > I added the include. > >>> >>> Tested by running hotspot_gc jtreg tests without regressions. >>> >>> http://cr.openjdk.java.net/~rkennke/8179268/webrev.00/ >> >> collectorPolicy.hpp: >> -------------------- >> 258 void cleared_all_soft_refs(); >> >> Please declare this virtual too (that's the best we can do to signal >> intent until we have C++11/override) >> > Ok. > >> >> collectorPolicy.cpp: >> -------------------- >> 224 this->CollectorPolicy::cleared_all_soft_refs(); >> >> Please remove "this->" to match the super-call style used in other >> places in this file. > > ok. > > >> >> Btw, I can sponsor the patch if you want. > > Find the updated webrev here: > > http://cr.openjdk.java.net/~rkennke/8179268/webrev.03/ > Looks good! 
(Awaiting a second review before I can push) cheers, Per > > Cheers, > Roman > >> >> cheers, >> Per >> >>> >>> >>> Roman >>> > From thomas.schatzl at oracle.com Tue Jul 11 07:27:54 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Tue, 11 Jul 2017 09:27:54 +0200 Subject: RFR (S): 8183226: Remembered set summarization accesses not fully initialized java thread DCQS In-Reply-To: <1499688947.2793.21.camel@oracle.com> References: <1499083970.2802.33.camel@oracle.com> <0fb5d7cf-49b3-84a1-97b4-cdd53f0173e3@oracle.com> <1499688947.2793.21.camel@oracle.com> Message-ID: <1499758074.3483.4.camel@oracle.com> Hi again, On Mon, 2017-07-10 at 14:15 +0200, Thomas Schatzl wrote: > Hi Erik (and Stefan), > > thanks for your review. > > On Fri, 2017-07-07 at 13:16 +0200, Erik Helin wrote: > > > > On 07/03/2017 02:12 PM, Thomas Schatzl wrote: > > > can I get reviews for the following change that breaks some > > > dependency cycle in g1remset initialization to fix some (at this > > > time benign) bug when printing remembered set summarization > > > information? > > > > > > The problem is that G1Remset initializes its internal remembered > > > [...] > > You don't need to do all the cleanups, but I think having a fully > > functioning default constructor is a better way to solve this > > problem, rather than shuffling the call to initialize around. What > > do > > you think? > Let's defer the other suggested cleanups to a different CR. > > In the following webrev I also added StefanJ's suggestion to extract > concurrent refinement initialization into a separate method. > (I do not really understand why that method is actually returning an > error code: all error conditions in ConcurrentG1Refine call > vm_shutdown_during_initialization() anyway - even that seems > superfluous: failing to allocate memory shuts down the VM already). 
> > Webrevs: > http://cr.openjdk.java.net/~tschatzl/8183226/webrev.0_to_1/ (diff) > http://cr.openjdk.java.net/~tschatzl/8183226/webrev.1/ (full) > Erik pointed out that by having two constructors, one taking a G1RemSet, we can save a few more lines of code, avoiding the G1RemSetSummary::initialize() method completely. :) Here is an implementation of this idea. Webrevs: http://cr.openjdk.java.net/~tschatzl/8183226/webrev.1_to_2/ (diff) http://cr.openjdk.java.net/~tschatzl/8183226/webrev.2/ (full) Thanks, Thomas From stefan.johansson at oracle.com Tue Jul 11 08:05:00 2017 From: stefan.johansson at oracle.com (Stefan Johansson) Date: Tue, 11 Jul 2017 10:05:00 +0200 Subject: RFR (S): 8183226: Remembered set summarization accesses not fully initialized java thread DCQS In-Reply-To: <1499758074.3483.4.camel@oracle.com> References: <1499083970.2802.33.camel@oracle.com> <0fb5d7cf-49b3-84a1-97b4-cdd53f0173e3@oracle.com> <1499688947.2793.21.camel@oracle.com> <1499758074.3483.4.camel@oracle.com> Message-ID: On 2017-07-11 09:27, Thomas Schatzl wrote: > Hi again, > > On Mon, 2017-07-10 at 14:15 +0200, Thomas Schatzl wrote: >> Hi Erik (and Stefan), >> >> thanks for your review. >> >> On Fri, 2017-07-07 at 13:16 +0200, Erik Helin wrote: >>> On 07/03/2017 02:12 PM, Thomas Schatzl wrote: >>>> can I get reviews for the following change that breaks some >>>> dependency cycle in g1remset initialization to fix some (at this >>>> time benign) bug when printing remembered set summarization >>>> information? >>>> >>>> The problem is that G1Remset initializes its internal remembered >>>> [...] >>> You don't need to do all the cleanups, but I think having a fully >>> functioning default constructor is a better way to solve this >>> problem, rather than shuffling the call to initialize around. What >>> do >>> you think? >> Let's defer the other suggested cleanups to a different CR. 
>> >> In the following webrev I also added StefanJ's suggestion to extract >> concurrent refinement initialization into a separate method. >> (I do not really understand why that method is actually returning an >> error code: all error conditions in ConcurrentG1Refine call >> vm_shutdown_during_initialization() anyway - even that seems >> superfluous: failing to allocate memory shuts down the VM already). >> >> Webrevs: >> http://cr.openjdk.java.net/~tschatzl/8183226/webrev.0_to_1/ (diff) >> http://cr.openjdk.java.net/~tschatzl/8183226/webrev.1/ (full) >> > Erik pointed out that by having two constructors, one taking a > G1RemSet, we can save a few more lines of code, avoiding the > G1RemSetSummary::initialize() method completely. :) > > Here is an implementation of this idea. > > Webrevs: > http://cr.openjdk.java.net/~tschatzl/8183226/webrev.1_to_2/ (diff) > http://cr.openjdk.java.net/~tschatzl/8183226/webrev.2/ (full) Looks good, StefanJ > Thanks, > Thomas > From erik.osterlund at oracle.com Tue Jul 11 10:28:44 2017 From: erik.osterlund at oracle.com (Erik Österlund) Date: Tue, 11 Jul 2017 12:28:44 +0200 Subject: RFR (S): 8182703: Correct G1 barrier queue lock orderings In-Reply-To: <5B0584AA-C49D-4426-AD02-3D23AF45F0CD@oracle.com> References: <59510D5E.10009@oracle.com> <25F423D9-F8D5-4E62-8300-CCE106E70777@oracle.com> <6D1B2CB2-366E-4DBD-9F2E-672325459343@oracle.com> <6B5ACCE3-CA0C-41C9-A45A-C79467FB8CE2@oracle.com> <595FA613.7090306@oracle.com> <5B0584AA-C49D-4426-AD02-3D23AF45F0CD@oracle.com> Message-ID: <5964A85C.6080807@oracle.com> On 2017-07-10 23:33, Kim Barrett wrote: > On 2017-07-06 22:15, Kim Barrett wrote: >>> On Jul 6, 2017, at 4:11 PM, Kim Barrett wrote: >>> >>>> On Jul 4, 2017, at 10:00 PM, Kim Barrett wrote: >>>> The lock ranking changes look good. >>> I'm going to retract that. >>> >>> How do these new lock rankings interact with various assertions that >>> rank() == or != Mutex::special? 
I'm not sure those places handle >>> these new ranks properly. (I'm not sure those places handle >>> Mutex::event rank properly either.) >> On Jul 7, 2017, at 11:17 AM, Erik Österlund wrote: >> >> [...] >> All in all, I believe that the deadlock detection system has some redundant, and some confusing checks that involve the lock rank Mutex::special. But I do believe that it works and would detect deadlocks, but could do with some reworking to make it more explicit. And that is invariant of the new access rank and applies equally to the event rank. >> >> However, since these access locks play well with the current deadlock detection as they do not do anything illegal, and even if use of these locks did indeed do illegal things, it would still be detected by the deadlock detection system, it is reasonable to say that refactoring the deadlock detection system is a separate RFE? >> >> Specifically, clarifying the deadlock detection system by removing redundant checks, not checking for safepoint-safe state in try_lock as well as explicitly listing special and below locks as illegal when verifying Thread::check_for_valid_safepoint_state(), regardless of whether allow_vm_block() is true or not. Sounds like a separate RFE to me! > Thanks for the additional analysis. I agree that so long as one does > what one is supposed to (e.g. these locks always need to avoid > safepoint checks), there won't be any undesired assertions. And I > also agree there won't be any bad consequences (e.g. incorrect code > possibly slipping through) from misuse, though the indicative failures > might not always be where one might prefer. > > I don't think the redundant checks are necessarily bad, as they make > it more obvious to future readers what the requirements are at various > levels. However, I agree it should be a separate RFE to do some > cleanup in this area, particularly where [non-]equality with > Mutex::special ought to be an ordered comparison. I am glad we agree in this area. 
Thanks for reading through the analysis. /Erik From erik.osterlund at oracle.com Tue Jul 11 12:07:55 2017 From: erik.osterlund at oracle.com (Erik Österlund) Date: Tue, 11 Jul 2017 14:07:55 +0200 Subject: RFR (S): 8182703: Correct G1 barrier queue lock orderings In-Reply-To: <6FFC2106-D260-481D-B8C3-DDA849926F23@oracle.com> References: <59510D5E.10009@oracle.com> <25F423D9-F8D5-4E62-8300-CCE106E70777@oracle.com> <595CBE40.5050603@oracle.com> <6FFC2106-D260-481D-B8C3-DDA849926F23@oracle.com> Message-ID: <5964BF9B.4010309@oracle.com> On 2017-07-11 04:19, Kim Barrett wrote: >> On Jul 5, 2017, at 6:24 AM, Erik Österlund wrote: >> On 2017-07-05 04:00, Kim Barrett wrote: >>>> On Jun 26, 2017, at 9:34 AM, Erik Österlund wrote: >>>> >>>> Hi, >>>> >>>> Webrev: http://cr.openjdk.java.net/~eosterlund/8182703/webrev.02/ >>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8182703 >>> ------------------------------------------------------------------------------ >>> src/share/vm/gc/g1/ptrQueue.cpp >>> Removing unlock / relock around >>> 78 qset()->enqueue_complete_buffer(node); >>> >>> I would prefer that this part of this changeset not be made at this >>> time. >>> >>> This part isn't necessary for the main point of this changeset. It's >>> a cleanup that is enabled by the lock rank changes, where the rank >>> changes are required for other reasons. >> Okay. >> >>> It also at least conflicts with, and probably breaks, a pending change >>> of mine. (I have a largish stack of patches in this area that didn't >>> quite make it into JDK 9 before the original FC date, and which I've >>> been (all too slowly) trying to work my way through and bring into JDK >>> 10.) >> I agree that it would be possible to just correct the ranks while allowing the spaghetti synchronization code to remain in the code base. 
Here are some comments about that to me not so attractive idea: >> 1) I would really like to get rid of that code, because I think it is poor synchronization practice and its stated reason for existence is gone now. >> 2) I have to do *something* about that part in the current change, otherwise the comment motivating its existence will be incorrect after my lock rank change. There is no longer a legitimate motivation for doing that unlock and re-lock. So we have the choice of making a new made up motivation why we do this anyway, or to remove it. For me the choice is easily to remove it. > Or leave it be for now, to avoid knowingly creating more work for > someone else by inflicting merge conflicts or other breakage on them. > (But see below.) If the occasional out of date comment was the worst > of the problems we faced, that would be pretty fabulous. A two line merge conflict after over a year of dormancy though... ;) >> 3) If some new actual motivation for dropping that lock arises later on down the road (which I am dubious about), then I do not see an issue with simply re-adding it then, when/if that becomes necessary again, with a new corresponding motivation added in appropriately. >> >> As far as your new changes go, I am curious what they do to motivate unlocking/re-locking this shared queue lock again. As outlined in my recent email to Thomas, we do not hold either of these queue locks when concurrent refinement helper code is called from GC barriers invoked from JavaThreads, even with my new changes. If it is in this code path that you will perform more work (just speculating), then that should be invariant of this cleanup. > One possibility I was thinking of was the buffer filtering step. I > mis-remembered and thought that wasn't done for the (locked) shared > queues, and that one of my pending changes was to change that. (It's > been over a year since I worked on those changes, and haven't had time > to really page them back in.) 
But I now see that we already do the > filtering of the shared SATB queue (dirty card queues don't presently > have any filtering, but might in the future) while holding its lock. > > This suggests a potential (though seemingly hard to avoid) fragility > resulting from the lowered lock rank. Note that this does not matter for JavaThreads (including compiler threads), for concurrent refinement threads or concurrent marking threads, nor does it matter for any thread when marking is not active. So it seems to me that the worst consequence of this is possibly worse latency for operations coinciding in time with concurrent marking, that have large amounts of mutations or resurrections, and are not performed by JavaThreads (including compiler threads) or GC threads (that are performing the concurrent marking) or concurrent refinement threads (that have nothing to do with SATB), that are running concurrently with each other. That does not seem to be a huge problem in my book. If it was, and an unknown bunch of non-JavaThreads are heavily mutating or resurrecting objects concurrent to marking, such that contention is inflicted on the shared queue lock for the shared SATB queue, then the right solution for that seems to be to give such threads their own local queue, rather than to reduce the time spent under the surprisingly hot shared queue lock. > > The present SATB filtering doesn't seem to acquire any locks, but it's > a non-trivial amount of code spread over multiple files, so would be > easy to miss something or break it in that respect. Reducing the lock > ranks requires being very careful with the SATB filtering code. IMO, adding any lock into the SATB barrier which is used all over hotspot in some very shady places arguably requires being very careful regardless of my changes. So I am going to assume whoever does that for whatever reason is going to be careful. 
> The "mutator" help for dirty card queue processing is not presently > done for the shared queue, but I think could be today. I'm less sure > about that with lowered queue lock ranks; I *think* there aren't any > relevant locks there (other than the very rare shared queue lock in > refine_card_concurrently), but that's a substantially larger and more > complex amount of code than SATB queue filtering. As discussed with Thomas earlier in this thread, there are indeed locks blocking this. The HeapRegionRemSet::_m lock is currently a leaf lock. If collaborative refinement was to be performed on non-Java threads (and non-concurrent refinement threads), then this lock would have to decrease to the access rank first. But we concluded that warrants a new RFE with separate analysis. As with the SATB queues though, I do not know what threads would be causing such trouble? It is not JavaThreads (including compiler threads), concurrent refinement threads, concurrent marking threads. That does not leave us with a whole lot of threads to cause that contention on the shared queue lock. And as with the SATB queues, if there are such threads that cause such contention on the shared queue lock, then the right fix seems to be to give them their own local queue and stop taking the shared queue lock in the first place. > It looks like > something along this line is part of my pending changes. That would > certainly be broken by the proposed removal of the temporary > unlocking. At the time I was working on it, it seemed like having > that little unlocking dance simplified things elsewhere. I can cope > with the merge conflict (especially since it *is* a merge conflict and > not silent breakage that I may have forgotten about by the time I get > back to it), though I would prefer not to have to. (I can also think > of some reasons why this might not be worth doing or even a bad idea, > and don't recall right now what I may have done to address those.) 
This is why I wanted to know if you are certain this is truly going to be a problem or not. Since this all seems rather uncertain, would you say it is reasonable that you take that two-line merge conflict down the road if you eventually require putting the unlock/lock sequence back again? > But while looking at the mutator helper, I realized there may be a > different problem. Lowering these lock ranks may not be sufficient to > allow enqueue in "arbitrary" lock contexts. The difficulty is that in > the mutator help case (only applies for dirty card queue right now, > and currently only if a Java thread dealing with its thread-local > queue), the allocation of the temporary worker_id is done under the > CBL lock (which is ok), but if there isn't a free worker_id, it > *waits* for one, and that's not ok in an arbitrary lock context. > Right now, we should not be able to hit that wait while not holding > "critical" locks, because the present CBL rank is too high to (safely) > be in enqueue in such a context. But lowering the CBL rank is not > sufficient to enqueue while holding critical locks; that potential > wait also needs to be eliminated. (This is assuming there's a place > where a Java thread can need an enqueue while holding a critical lock. > I don't have such a place in mind, but proving it can never happen now > or in the future seems hard, and contrary to the intent of the > proposed lock rank changes.) I agree that in order to be able to freely perform object stores under special locks, we would have to stop waiting on the cbl monitor when claiming worker IDs in the helper part of the post write barrier. That is a good observation. On the same list of requirements for that to happen, the HeapRegionRemSet::_m monitor would have to change rank to "access" as previously mentioned. 
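Stopping the wait on the monitor when claiming worker IDs could look roughly like the following non-blocking claim. This is purely illustrative — `WorkerIdSet` and `try_claim` are hypothetical names, not the DirtyCardQueueSet API — and the caller that gets -1 back would simply skip helping and just enqueue its buffer instead of blocking:

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Sketch: claim a temporary refinement worker id without ever blocking.
// Instead of waiting on a monitor when all ids are taken, the caller gets
// -1 back and skips the "mutator help" entirely.
class WorkerIdSet {
public:
  explicit WorkerIdSet(int num_ids) : _mask(0), _num_ids(num_ids) {}

  // Returns a claimed id in [0, _num_ids), or -1 if none is free.
  int try_claim() {
    uint64_t cur = _mask.load(std::memory_order_relaxed);
    for (;;) {
      int id = first_zero_bit(cur);
      if (id < 0 || id >= _num_ids) return -1;  // all ids busy: do not wait
      uint64_t next = cur | (uint64_t(1) << id);
      // CAS loop: on failure 'cur' is refreshed and we recompute the id.
      if (_mask.compare_exchange_weak(cur, next)) return id;
    }
  }

  void release(int id) {
    _mask.fetch_and(~(uint64_t(1) << id));
  }

private:
  static int first_zero_bit(uint64_t m) {
    for (int i = 0; i < 64; i++) {
      if ((m & (uint64_t(1) << i)) == 0) return i;
    }
    return -1;
  }

  std::atomic<uint64_t> _mask;  // one bit per worker id
  const int _num_ids;
};
```

The design trade-off mentioned below still applies: without any blocking, the mutator loses the natural backpressure that the wait provided.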
> Eliminating that wait doesn't need to be part of this change, but > seems like it might be required before taking advantage of the change > to move some potentially enqueuing operations. Agreed. > It shouldn't be too hard to eliminate the wait, but it's a somewhat > fundamental behavioral change. The present mechanism places a real > choke hold on the mutator when concurrent refinement can't keep up. > Without a blocking operation in there, the mutator could overwhelm > concurrent refinement, leading to longer pauses. Not that said choke > hold is all that pleasant either. Yes this mechanism seems to need some love indeed. Thanks for reviewing! /Erik From stefan.johansson at oracle.com Tue Jul 11 14:17:19 2017 From: stefan.johansson at oracle.com (Stefan Johansson) Date: Tue, 11 Jul 2017 16:17:19 +0200 Subject: RFR: 8179268: Factor out AdaptiveSizePolicy from top-level interfaces CollectorPolicy and CollectedHeap In-Reply-To: References: <50cb4b58-623c-04c2-f6c5-cfb1bd0a3b1f@oracle.com> <073ad956-f475-f3c4-cac8-42bfa1329565@redhat.com> Message-ID: <5cf7afd1-5328-2411-1c23-5e73ae230069@oracle.com> Hi Roman, On 2017-07-11 08:34, Per Liden wrote: > Hi, > > On 2017-07-10 18:35, Roman Kennke wrote: >> Hi Per, >> >> thanks for the review! >> >>> >>>> AdaptiveSizePolicy is not used/called from outside the GCs, and not >>>> all >>>> GCs need them. It makes sense to remove it from the CollectedHeap and >>>> CollectorPolicy interfaces and move them down to the actual subclasses >>>> that used them. >>>> >>>> I moved AdaptiveSizePolicyOutput to parallelScavengeHeap.hpp, it's >>>> only >>>> used/implemented in the parallel GC. Also, I made this class AllStatic >>>> (was StackObj) >>> >>> AdaptiveSizePolicyOutput::print() is actually called from >>> runtime/java.cpp also, so it's used outside of ParallelGC. I'm fine >>> with moving it, but we should have the proper #includes in java.cpp. >>> >>> (Your patch doesn't actually build in its current form. 
I suspect >>> you're using precompiled headers which have a tendency to hide a lot >>> of errors caused by missing includes) >>> >> I added the include. >> >>>> >>>> Tested by running hotspot_gc jtreg tests without regressions. >>>> >>>> http://cr.openjdk.java.net/~rkennke/8179268/webrev.00/ >>> >>> collectorPolicy.hpp: >>> -------------------- >>> 258 void cleared_all_soft_refs(); >>> >>> Please declare this virtual too (that's the best we can do to signal >>> intent until we have C++11/override) >>> >> Ok. >> >>> >>> collectorPolicy.cpp: >>> -------------------- >>> 224 this->CollectorPolicy::cleared_all_soft_refs(); >>> >>> Please remove "this->" to match the super-call style used in other >>> places in this file. >> >> ok. >> >> >>> >>> Btw, I can sponsor the patch if you want. >> >> Find the updated webrev here: >> >> http://cr.openjdk.java.net/~rkennke/8179268/webrev.03/ >> > > > Looks good! > This looks good to me too, Stefan > (Awaiting a second review before I can push) > > cheers, > Per > >> >> Cheers, >> Roman >> >>> >>> cheers, >>> Per >>> >>>> >>>> >>>> Roman >>>> >> From stefan.johansson at oracle.com Tue Jul 11 14:37:22 2017 From: stefan.johansson at oracle.com (Stefan Johansson) Date: Tue, 11 Jul 2017 16:37:22 +0200 Subject: RFR: 8177544: Restructure G1 Full GC code In-Reply-To: <1499691680.2793.29.camel@oracle.com> References: <62d1f02b-1fc0-ffcf-b8e0-e88ebacecebe@oracle.com> <1497346566.2829.33.camel@oracle.com> <1499691680.2793.29.camel@oracle.com> Message-ID: On 2017-07-10 15:01, Thomas Schatzl wrote: > ... >> I see your point and I think it would be good. But as we discussed >> over chat, might be something to look at once everything else in this >> area is done. Will create a RFE for this. > Yes, that's fine. > >>> - g1CollectedHeap.hpp: please try to sort the definitions of the >>> new methods in order of calling them. >> Done. 
>> >> Here are updated webrevs: >> Full: http://cr.openjdk.java.net/~sjohanss/8177544/hotspot.01/ >> Inc: http://cr.openjdk.java.net/~sjohanss/8177544/hotspot.00-01/ >> > Looks good to me. Sorry for the late reply. Thanks for reviewing Thomas! No problem, I might not push this before getting back from vacation anyways. Thanks, Stefan > > Thanks, > Thomas > From kishor.kharbas at intel.com Wed Jul 12 01:40:18 2017 From: kishor.kharbas at intel.com (Kharbas, Kishor) Date: Wed, 12 Jul 2017 01:40:18 +0000 Subject: RFR(M): 8171181: Supporting heap allocation on alternative memory devices In-Reply-To: References: Message-ID: Greetings, I have an updated patch for JEP https://bugs.openjdk.java.net/browse/JDK-8171181 at http://cr.openjdk.java.net/~kkharbas/8171181/webrev.05 This patch fixes the bugs pointed earlier and other suggestions to make the code less intrusive. I have also sent this to 'hotspot-runtime-dev' mailing list (included below). I would appreciate comments and feedback. Thanks Kishor From: Kharbas, Kishor Sent: Monday, July 10, 2017 1:53 PM To: hotspot-runtime-dev at openjdk.java.net Cc: Kharbas, Kishor Subject: RFR(M): 8171181: Supporting heap allocation on alternative memory devices Hello all! I have an updated patch for https://bugs.openjdk.java.net/browse/JDK-8171181 at http://cr.openjdk.java.net/~kkharbas/8171181/webrev.05 I have lost the old email chain so had to start a fresh one. The archived conversation can be found at - http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2017-March/022733.html 1. I have worked on all the comments and fixed the bugs. Mainly bugs fixed are related to sigprocmask() and changed the implementation such that 'fd' is not passed all the way down the call stack. Thus minimizing function signature changes. 2. Patch supports all OS'es. Consolidated all Posix compliant OS's implementation in os_posix.cpp. 3. The patch is tested on Windows and Linux. Working on testing it on other OS'es. 
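For readers unfamiliar with the JEP's mechanism, the core POSIX idea (consistent with the consolidation into os_posix.cpp mentioned above) is to back the heap reservation with a file descriptor from a file system mounted on the target memory device. The sketch below is hypothetical — `reserve_heap_on_device` is a made-up helper, and /tmp stands in for the device's mount point, which in the real patch would come from a JVM flag:

```cpp
#include <cstdlib>
#include <cstring>
#include <sys/mman.h>
#include <unistd.h>

// Sketch of the basic mechanism: create a file on the (hypothetical)
// memory-device mount point and map it MAP_SHARED, so accesses to the
// returned region go to that device instead of ordinary RAM.
static void* reserve_heap_on_device(const char* path_template,
                                    size_t bytes, int* fd_out) {
  char path[256];
  std::strncpy(path, path_template, sizeof(path) - 1);
  path[sizeof(path) - 1] = '\0';
  int fd = mkstemp(path);            // file on the device's file system
  if (fd < 0) return nullptr;
  unlink(path);                      // keep it anonymous once mapped
  if (ftruncate(fd, (off_t)bytes) != 0) { close(fd); return nullptr; }
  void* base = mmap(nullptr, bytes, PROT_READ | PROT_WRITE,
                    MAP_SHARED, fd, 0);
  if (base == MAP_FAILED) { close(fd); return nullptr; }
  *fd_out = fd;                      // kept open for the mapping's lifetime
  return base;
}
```

Returning the fd through an out-parameter mirrors the review point above about not threading the fd through the whole call stack: the caller stores it once and the rest of the code only sees the mapped address range.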
Let me know if this version looks clean and correct. Thanks Kishor From per.liden at oracle.com Wed Jul 12 06:44:43 2017 From: per.liden at oracle.com (Per Liden) Date: Wed, 12 Jul 2017 08:44:43 +0200 Subject: RFR: 8179268: Factor out AdaptiveSizePolicy from top-level interfaces CollectorPolicy and CollectedHeap In-Reply-To: <5cf7afd1-5328-2411-1c23-5e73ae230069@oracle.com> References: <50cb4b58-623c-04c2-f6c5-cfb1bd0a3b1f@oracle.com> <073ad956-f475-f3c4-cac8-42bfa1329565@redhat.com> <5cf7afd1-5328-2411-1c23-5e73ae230069@oracle.com> Message-ID: Hi Roman, On 2017-07-11 16:17, Stefan Johansson wrote: > Hi Roman, > > On 2017-07-11 08:34, Per Liden wrote: >> Hi, >> >> On 2017-07-10 18:35, Roman Kennke wrote: >>> Hi Per, >>> >>> thanks for the review! >>> >>>> >>>>> AdaptiveSizePolicy is not used/called from outside the GCs, and not >>>>> all >>>>> GCs need them. It makes sense to remove it from the CollectedHeap and >>>>> CollectorPolicy interfaces and move them down to the actual subclasses >>>>> that used them. >>>>> >>>>> I moved AdaptiveSizePolicyOutput to parallelScavengeHeap.hpp, it's >>>>> only >>>>> used/implemented in the parallel GC. Also, I made this class AllStatic >>>>> (was StackObj) >>>> >>>> AdaptiveSizePolicyOutput::print() is actually called from >>>> runtime/java.cpp also, so it's used outside of ParallelGC. I'm fine >>>> with moving it, but we should have the proper #includes in java.cpp. I just realized that this doesn't build on linux-i586, which builds a minimal JVM where INCLUDE_ALL_GCS isn't defined (and will thus not include parallelScavangeHeap.hpp). Rather than having some ugly #ifdef INCLUDE_ALL_GCS at the AdaptiveSizePolicyOutput::print() call site I suggest we keep AdaptiveSizePolicyOutput in adaptiveSizePolicy.hpp for now. 
(Use --with-target-bits=32 --with-jvm-variants=minimal when test building for linux-i586) cheers, Per >>>> >>>> (Your patch doesn't actually build in its current form. I suspect >>>> you're using precompiled headers which have a tendency to hide a lot >>>> of errors caused by missing includes) >>>> >>> I added the include. >>> >>>>> >>>>> Tested by running hotspot_gc jtreg tests without regressions. >>>>> >>>>> http://cr.openjdk.java.net/~rkennke/8179268/webrev.00/ >>>> >>>> collectorPolicy.hpp: >>>> -------------------- >>>> 258 void cleared_all_soft_refs(); >>>> >>>> Please declare this virtual too (that's the best we can do to signal >>>> intent until we have C++11/override) >>>> >>> Ok. >>> >>>> >>>> collectorPolicy.cpp: >>>> -------------------- >>>> 224 this->CollectorPolicy::cleared_all_soft_refs(); >>>> >>>> Please remove "this->" to match the super-call style used in other >>>> places in this file. >>> >>> ok. >>> >>> >>>> >>>> Btw, I can sponsor the patch if you want. >>> >>> Find the updated webrev here: >>> >>> http://cr.openjdk.java.net/~rkennke/8179268/webrev.03/ >>> >> >> >> Looks good! 
>> > This looks good to me too, > Stefan >> (Awaiting a second review before I can push) >> >> cheers, >> Per >> >>> >>> Cheers, >>> Roman >>> >>>> >>>> cheers, >>>> Per >>>> >>>>> >>>>> >>>>> Roman >>>>> >>> > From erik.helin at oracle.com Wed Jul 12 10:09:16 2017 From: erik.helin at oracle.com (Erik Helin) Date: Wed, 12 Jul 2017 12:09:16 +0200 Subject: RFR (S): 8183226: Remembered set summarization accesses not fully initialized java thread DCQS In-Reply-To: <1499758074.3483.4.camel@oracle.com> References: <1499083970.2802.33.camel@oracle.com> <0fb5d7cf-49b3-84a1-97b4-cdd53f0173e3@oracle.com> <1499688947.2793.21.camel@oracle.com> <1499758074.3483.4.camel@oracle.com> Message-ID: <362056fe-f621-5e9d-a16b-13b51d9a550b@oracle.com> On 07/11/2017 09:27 AM, Thomas Schatzl wrote: > Webrevs: > http://cr.openjdk.java.net/~tschatzl/8183226/webrev.1_to_2/ (diff) > http://cr.openjdk.java.net/~tschatzl/8183226/webrev.2/ (full) Looks good, Reviewed. Thanks, Erik > Thanks, > Thomas > From rkennke at redhat.com Wed Jul 12 10:47:41 2017 From: rkennke at redhat.com (Roman Kennke) Date: Wed, 12 Jul 2017 12:47:41 +0200 Subject: RFR: 8179268: Factor out AdaptiveSizePolicy from top-level interfaces CollectorPolicy and CollectedHeap In-Reply-To: References: <50cb4b58-623c-04c2-f6c5-cfb1bd0a3b1f@oracle.com> <073ad956-f475-f3c4-cac8-42bfa1329565@redhat.com> <5cf7afd1-5328-2411-1c23-5e73ae230069@oracle.com> Message-ID: <7e05603a-7254-b6d0-bc6d-35fdf498b16f@redhat.com> Am 12.07.2017 um 08:44 schrieb Per Liden: > Hi Roman, > > On 2017-07-11 16:17, Stefan Johansson wrote: >> Hi Roman, >> >> On 2017-07-11 08:34, Per Liden wrote: >>> Hi, >>> >>> On 2017-07-10 18:35, Roman Kennke wrote: >>>> Hi Per, >>>> >>>> thanks for the review! >>>> >>>>> >>>>>> AdaptiveSizePolicy is not used/called from outside the GCs, and not >>>>>> all >>>>>> GCs need them. 
It makes sense to remove it from the CollectedHeap >>>>>> and >>>>>> CollectorPolicy interfaces and move them down to the actual >>>>>> subclasses >>>>>> that used them. >>>>>> >>>>>> I moved AdaptiveSizePolicyOutput to parallelScavengeHeap.hpp, it's >>>>>> only >>>>>> used/implemented in the parallel GC. Also, I made this class >>>>>> AllStatic >>>>>> (was StackObj) >>>>> >>>>> AdaptiveSizePolicyOutput::print() is actually called from >>>>> runtime/java.cpp also, so it's used outside of ParallelGC. I'm fine >>>>> with moving it, but we should have the proper #includes in java.cpp. > > I just realized that this doesn't build on linux-i586, which builds a > minimal JVM where INCLUDE_ALL_GCS isn't defined (and will thus not > include parallelScavangeHeap.hpp). Rather than having some ugly #ifdef > INCLUDE_ALL_GCS at the AdaptiveSizePolicyOutput::print() call site I > suggest we keep AdaptiveSizePolicyOutput in adaptiveSizePolicy.hpp for > now. I tried that. Unfortunately, it also requires #ifdef INCLUDE_ALL_GCS to be able to call ParallelScavengeHeap::heap(), or else defeats the purpose of this patch by requiring CollectedHeap to still carry size_policy().. which we don't want. In addition to that, if I try to include parallelScavengeHeap.hpp in adaptiveSizePolicy.hpp, I am getting freaky circular dependency problems. #ifdef INCLUDE_ALL_GCS in java.cpp around AdaptiveSizePolicyOutput seems like the lesser evil... done so here: http://cr.openjdk.java.net/~rkennke/8179268/webrev.04/ Ok? 
The incremental diff between 03 and 04: diff --git a/src/share/vm/runtime/java.cpp b/src/share/vm/runtime/java.cpp --- a/src/share/vm/runtime/java.cpp +++ b/src/share/vm/runtime/java.cpp @@ -487,7 +487,10 @@ ClassLoaderDataGraph::dump_on(log.trace_stream()); } } + +#if INCLUDE_ALL_GCS AdaptiveSizePolicyOutput::print(); +#endif if (PrintBytecodeHistogram) { BytecodeHistogram::print(); Roman From thomas.schatzl at oracle.com Wed Jul 12 12:13:03 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Wed, 12 Jul 2017 14:13:03 +0200 Subject: RFR (S): 8183121: Add information about scanned and skipped cards during UpdateRS Message-ID: <1499861583.6693.3.camel@oracle.com> Hi all, can I have reviews for this small change that adds some information about how many cards were scanned/skipped during Update RS. This information is much better than just the number of processed buffers, although I kept them for now. This change is based on Erik's changes for JDK-8183539. CR: https://bugs.openjdk.java.net/browse/JDK-8183121 Webrev: http://cr.openjdk.java.net/~tschatzl/8183121/webrev Testing: jprt, test case Thanks, Thomas From thomas.schatzl at oracle.com Wed Jul 12 12:15:47 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Wed, 12 Jul 2017 14:15:47 +0200 Subject: RFR (XS): 8183538: UpdateRS phase should claim cards Message-ID: <1499861747.6693.6.camel@oracle.com> Hi all, please review this small change that adds claiming of cards in the update rs phase so that scan rs does not rescan them. CR: https://bugs.openjdk.java.net/browse/JDK-8183538 Webrev: http://cr.openjdk.java.net/~tschatzl/8183538/webrev/ Testing: jprt Thanks, 
Thomas From thomas.schatzl at oracle.com Wed Jul 12 12:16:13 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Wed, 12 Jul 2017 14:16:13 +0200 Subject: RFR (S): 8183226: Remembered set summarization accesses not fully initialized java thread DCQS In-Reply-To: <362056fe-f621-5e9d-a16b-13b51d9a550b@oracle.com> References: <1499083970.2802.33.camel@oracle.com> <0fb5d7cf-49b3-84a1-97b4-cdd53f0173e3@oracle.com> <1499688947.2793.21.camel@oracle.com> <1499758074.3483.4.camel@oracle.com> <362056fe-f621-5e9d-a16b-13b51d9a550b@oracle.com> Message-ID: <1499861773.6693.7.camel@oracle.com> Hi Erik, Stefan, On Wed, 2017-07-12 at 12:09 +0200, Erik Helin wrote: > On 07/11/2017 09:27 AM, Thomas Schatzl wrote: > > > > Webrevs: > > http://cr.openjdk.java.net/~tschatzl/8183226/webrev.1_to_2/ (diff) > > http://cr.openjdk.java.net/~tschatzl/8183226/webrev.2/ (full) > Looks good, Reviewed. > thanks for your reviews. Thomas From stefan.johansson at oracle.com Wed Jul 12 12:20:08 2017 From: stefan.johansson at oracle.com (Stefan Johansson) Date: Wed, 12 Jul 2017 14:20:08 +0200 Subject: RFR: 8179268: Factor out AdaptiveSizePolicy from top-level interfaces CollectorPolicy and CollectedHeap In-Reply-To: <7e05603a-7254-b6d0-bc6d-35fdf498b16f@redhat.com> References: <50cb4b58-623c-04c2-f6c5-cfb1bd0a3b1f@oracle.com> <073ad956-f475-f3c4-cac8-42bfa1329565@redhat.com> <5cf7afd1-5328-2411-1c23-5e73ae230069@oracle.com> <7e05603a-7254-b6d0-bc6d-35fdf498b16f@redhat.com> Message-ID: <7701ab0e-7a75-baec-434f-62978a90e8e9@oracle.com> Hi Roman, On 2017-07-12 12:47, Roman Kennke wrote: > Am 12.07.2017 um 08:44 schrieb Per Liden: >> Hi Roman, >> >> On 2017-07-11 16:17, Stefan Johansson wrote: >>> Hi Roman, >>> >>> On 2017-07-11 08:34, Per Liden wrote: >>>> Hi, >>>> >>>> On 2017-07-10 18:35, Roman Kennke wrote: >>>>> Hi Per, >>>>> >>>>> thanks for the review! >>>>> >>>>>> >>>>>>> AdaptiveSizePolicy is not used/called from outside the GCs, and not >>>>>>> all >>>>>>> GCs need them. 
It makes sense to remove it from the CollectedHeap >>>>>>> and >>>>>>> CollectorPolicy interfaces and move them down to the actual >>>>>>> subclasses >>>>>>> that used them. >>>>>>> >>>>>>> I moved AdaptiveSizePolicyOutput to parallelScavengeHeap.hpp, it's >>>>>>> only >>>>>>> used/implemented in the parallel GC. Also, I made this class >>>>>>> AllStatic >>>>>>> (was StackObj) >>>>>> AdaptiveSizePolicyOutput::print() is actually called from >>>>>> runtime/java.cpp also, so it's used outside of ParallelGC. I'm fine >>>>>> with moving it, but we should have the proper #includes in java.cpp. >> I just realized that this doesn't build on linux-i586, which builds a >> minimal JVM where INCLUDE_ALL_GCS isn't defined (and will thus not >> include parallelScavangeHeap.hpp). Rather than having some ugly #ifdef >> INCLUDE_ALL_GCS at the AdaptiveSizePolicyOutput::print() call site I >> suggest we keep AdaptiveSizePolicyOutput in adaptiveSizePolicy.hpp for >> now. > I tried that. Unfortunately, it also requires #ifdef INCLUDE_ALL_GCS to > be able to call ParallelScavengeHeap::heap(), or else defeats the > purpose of this patch by requiring CollectedHeap to still carry > size_policy().. which we don't want. In addition to that, if I try to > include parallelScavengeHeap.hpp in adaptiveSizePolicy.hpp, I am getting > freaky circular dependency problems. #ifdef INCLUDE_ALL_GCS in java.cpp > around AdaptiveSizePolicyOutput seems like the lesser evil... done so here: > > http://cr.openjdk.java.net/~rkennke/8179268/webrev.04/ > > > Ok? I'm no big fan of having #if INCLUDE_ALL_GCS if it can be avoided. A few lines below the call to AdaptiveSizePolicyOutput::print(), we call Universe::heap()->print_tracing_info(). I think we could move AdaptiveSizePolicyOutput::print() into ParallelScavengeHeap::print_tracing_info() without running into any problems. What do you think about that solution? 
Thanks, Stefan > > The incremental diff between 03 and 04: > > diff --git a/src/share/vm/runtime/java.cpp b/src/share/vm/runtime/java.cpp > --- a/src/share/vm/runtime/java.cpp > +++ b/src/share/vm/runtime/java.cpp > @@ -487,7 +487,10 @@ > ClassLoaderDataGraph::dump_on(log.trace_stream()); > } > } > + > +#if INCLUDE_ALL_GCS > AdaptiveSizePolicyOutput::print(); > +#endif > > if (PrintBytecodeHistogram) { > BytecodeHistogram::print(); > > Roman From per.liden at oracle.com Wed Jul 12 12:48:03 2017 From: per.liden at oracle.com (Per Liden) Date: Wed, 12 Jul 2017 14:48:03 +0200 Subject: RFR: 8179268: Factor out AdaptiveSizePolicy from top-level interfaces CollectorPolicy and CollectedHeap In-Reply-To: <7701ab0e-7a75-baec-434f-62978a90e8e9@oracle.com> References: <50cb4b58-623c-04c2-f6c5-cfb1bd0a3b1f@oracle.com> <073ad956-f475-f3c4-cac8-42bfa1329565@redhat.com> <5cf7afd1-5328-2411-1c23-5e73ae230069@oracle.com> <7e05603a-7254-b6d0-bc6d-35fdf498b16f@redhat.com> <7701ab0e-7a75-baec-434f-62978a90e8e9@oracle.com> Message-ID: <0481fae0-e5bb-1b51-a37e-b5b40f4cbaec@oracle.com> On 2017-07-12 14:20, Stefan Johansson wrote: > Hi Roman, > > On 2017-07-12 12:47, Roman Kennke wrote: >> Am 12.07.2017 um 08:44 schrieb Per Liden: >>> Hi Roman, >>> >>> On 2017-07-11 16:17, Stefan Johansson wrote: >>>> Hi Roman, >>>> >>>> On 2017-07-11 08:34, Per Liden wrote: >>>>> Hi, >>>>> >>>>> On 2017-07-10 18:35, Roman Kennke wrote: >>>>>> Hi Per, >>>>>> >>>>>> thanks for the review! >>>>>> >>>>>>>> AdaptiveSizePolicy is not used/called from outside the GCs, and not >>>>>>>> all >>>>>>>> GCs need them. It makes sense to remove it from the CollectedHeap >>>>>>>> and >>>>>>>> CollectorPolicy interfaces and move them down to the actual >>>>>>>> subclasses >>>>>>>> that used them. >>>>>>>> >>>>>>>> I moved AdaptiveSizePolicyOutput to parallelScavengeHeap.hpp, it's >>>>>>>> only >>>>>>>> used/implemented in the parallel GC. 
Also, I made this class >>>>>>>> AllStatic >>>>>>>> (was StackObj) >>>>>>> AdaptiveSizePolicyOutput::print() is actually called from >>>>>>> runtime/java.cpp also, so it's used outside of ParallelGC. I'm fine >>>>>>> with moving it, but we should have the proper #includes in java.cpp. >>> I just realized that this doesn't build on linux-i586, which builds a >>> minimal JVM where INCLUDE_ALL_GCS isn't defined (and will thus not >>> include parallelScavangeHeap.hpp). Rather than having some ugly #ifdef >>> INCLUDE_ALL_GCS at the AdaptiveSizePolicyOutput::print() call site I >>> suggest we keep AdaptiveSizePolicyOutput in adaptiveSizePolicy.hpp for >>> now. >> I tried that. Unfortunately, it also requires #ifdef INCLUDE_ALL_GCS to >> be able to call ParallelScavengeHeap::heap(), or else defeats the >> purpose of this patch by requiring CollectedHeap to still carry >> size_policy().. which we don't want. In addition to that, if I try to >> include parallelScavengeHeap.hpp in adaptiveSizePolicy.hpp, I am getting >> freaky circular dependency problems. #ifdef INCLUDE_ALL_GCS in java.cpp >> around AdaptiveSizePolicyOutput seems like the lesser evil... done so >> here: >> >> http://cr.openjdk.java.net/~rkennke/8179268/webrev.04/ >> >> >> Ok? > I'm no big fan of having #if INCLUDE_ALL_GCS if it can be avoided. A few > lines below the call to AdaptiveSizePolicyOutput::print(), we call > Universe::heap()->print_tracing_info(). I think we could move > AdaptiveSizePolicyOutput::print() into > ParallelScavengeHeap::print_tracing_info() without running into any > problems. > > What do you think about that solution? That sounds like a slightly better approach. 
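The suggestion to move AdaptiveSizePolicyOutput::print() into ParallelScavengeHeap::print_tracing_info() amounts to trading a call-site #ifdef for ordinary virtual dispatch. A toy sketch — simplified, with string-returning stand-ins for the real void printing methods:

```cpp
#include <cassert>
#include <string>

// Sketch: instead of guarding the call site in java.cpp with
// #if INCLUDE_ALL_GCS, let the GC-specific heap subclass fold the
// size-policy output into the virtual hook it already implements.
struct CollectedHeap {
  virtual ~CollectedHeap() {}
  virtual std::string print_tracing_info() const { return "heap tracing"; }
};

struct ParallelScavengeHeap : CollectedHeap {
  // Only this subclass knows about the size-policy output, so the shared
  // shutdown path never needs to mention it (or include its header).
  std::string print_tracing_info() const override {
    return "heap tracing + adaptive size policy output";
  }
};

// The shutdown path stays GC-agnostic and builds in a minimal JVM:
std::string before_exit(const CollectedHeap& heap) {
  return heap.print_tracing_info();
}
```

The shared code compiles without any reference to the parallel GC, which is exactly what the minimal (no INCLUDE_ALL_GCS) build needs.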
cheers, Per > > Thanks, > Stefan >> >> The incremental diff between 03 and 04: >> >> diff --git a/src/share/vm/runtime/java.cpp >> b/src/share/vm/runtime/java.cpp >> --- a/src/share/vm/runtime/java.cpp >> +++ b/src/share/vm/runtime/java.cpp >> @@ -487,7 +487,10 @@ >> ClassLoaderDataGraph::dump_on(log.trace_stream()); >> } >> } >> + >> +#if INCLUDE_ALL_GCS >> AdaptiveSizePolicyOutput::print(); >> +#endif >> if (PrintBytecodeHistogram) { >> BytecodeHistogram::print(); >> >> Roman > From rkennke at redhat.com Wed Jul 12 13:32:47 2017 From: rkennke at redhat.com (Roman Kennke) Date: Wed, 12 Jul 2017 15:32:47 +0200 Subject: RFR: Parallelize safepoint cleanup In-Reply-To: <5af9855a-652e-64f0-af83-e8f5962247ca@oracle.com> References: <42872a15-d26c-9798-c6a2-f3f7c945baf7@redhat.com> <5e7c7d00-4acd-bea3-3525-33dbd9159efb@oracle.com> <6f2c6de7-298b-bf14-ab1f-430c4acd43c9@redhat.com> <5cd676de-872d-6d4a-691b-da561173f7d0@oracle.com> <61d80e98-275f-b2b8-4ac7-6d5d03b047de@redhat.com> <0e44cc90-b384-0820-93d6-a70d22c501c3@oracle.com> <20E06CEC-38CA-41AE-99DB-17EF22A3C5CC@oracle.com> <58f2278e-b95c-4ec2-4f7d-9fefa3a281e4@redhat.com> <623c0dbf-9210-7c63-3546-4314c7d47f85@redhat.com> <29521e46-a5e8-5ff0-23a2-22eeee145389@oracle.com> <4445a727-060b-70f9-c8db-e9f70faae3d5@redhat.com> <37755fec-05b9-8d2c-7eb9-8849393c7485@oracle.com> <57cddde0-60e6-366e-489e-f6f9534e3ed9@redhat.com> <6fa761bc-8feb-74e6-9a54-8a65ab81203b@oracle.com> <5af9855a-652e-64f0-af83-e8f5962247ca@oracle.com> Message-ID: Hi Robbin and all, I fixed the 32bit failures by using jlong in all relevant places: http://cr.openjdk.java.net/~rkennke/8180932/webrev.14.diff/ then Robbin found another problem. SafepointCleanupTest started to fail, because "mark nmethods" is no longer printed. This made me think that we're not measuring the conflated (and possibly parallelized) deflate-idle-monitors+mark-nmethods pass. I added a TraceTime with "safepoint cleanup tasks" which measures the total duration of safepoint cleanup. 
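The total-duration measurement added with that TraceTime makes the conflated pass costable by subtraction; in plain form (hypothetical field names, not TraceTime itself):

```cpp
#include <cassert>

// Sketch: when one cleanup pass conflates deflation with nmethod marking
// (and may run in parallel), only the total is directly measurable; the
// conflated part falls out as total minus the separately timed sub-phases.
struct CleanupTimes {
  double total_ms;         // whole safepoint cleanup
  double deopt_ms;         // example sub-phase timers (illustrative names)
  double string_table_ms;
  double symbol_table_ms;

  double conflated_ms() const {
    return total_ms - (deopt_ms + string_table_ms + symbol_table_ms);
  }
};
```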
We can't reasonably measure a possibly parallel and conflated pass standalone, but we can measure all and by subtracting all the other subphases, get an idea how long deflation and nmethod marking take up. http://cr.openjdk.java.net/~rkennke/8180932/webrev.15.diff/ The full webrev is now: http://cr.openjdk.java.net/~rkennke/8180932/webrev.15/ Hope that's all ;-) Roman Am 10.07.2017 um 21:22 schrieb Robbin Ehn: > Hi, unfortunately the push failed on 32-bit. > > (looks like _stack_traversal_mark should be jlong, I feel a bit guilty) > > I do not have any time to look at this, so here is the error. > > /Robbin > > make[3]: Leaving directory '/opt/jprt/T/P1/185117.rehn/s/hotspot/make' > make/Main.gmk:263: recipe for target 'hotspot-client-libs' failed > In file included from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29:0, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28: > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp: In > member function 'long int nmethod::stack_traversal_mark()': > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:399:108: > error: call of overloaded 'load_acquire(volatile long int*)' is ambiguous > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:399:108: > note: candidates are: > In file included from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/typeArrayOop.hpp:30:0, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/constantPool.hpp:32, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/method.hpp:34, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/frame.hpp:28, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/codeBlob.hpp:31, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/compiledMethod.hpp:28, > from > 
/opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:28, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28: > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:57:17: > note: static jint OrderAccess::load_acquire(const volatile jint*) > > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:57:17: > note: no known conversion for argument 1 from 'volatile long int*' > to 'const volatile jint* {aka const volatile int*}' > In file included from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/typeArrayOop.hpp:30:0, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/constantPool.hpp:32, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/method.hpp:34, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/frame.hpp:28, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/codeBlob.hpp:31, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/compiledMethod.hpp:28, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:28, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28: > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:63:17: > note: static juint OrderAccess::load_acquire(const volatile juint*) > > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:63:17: > note: no known conversion for argument 1 from 'volatile long int*' > to 'const volatile juint* {aka const volatile unsigned int*}' > In file included from > 
/opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29:0, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28: > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp: In > member function 'void nmethod::set_stack_traversal_mark(long int)': > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:400:105: > error: call of overloaded 'release_store(volatile long int*, long > int&)' is ambiguous > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:400:105: > note: candidates are: > In file included from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/typeArrayOop.hpp:30:0, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/constantPool.hpp:32, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/method.hpp:34, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/frame.hpp:28, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/codeBlob.hpp:31, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/compiledMethod.hpp:28, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:28, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33, > from > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28: > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:71:17: > note: static void OrderAccess::release_store(volatile jint*, jint) > > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:71:17: > note: no known conversion for argument 1 from 'volatile long int*' > to 'volatile jint* {aka volatile int*}' > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:77:17: > note: 
static void OrderAccess::release_store(volatile juint*, juint) > > /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:77:17: > note: no known conversion for argument 1 from 'volatile long int*' > to 'volatile juint* {aka volatile unsigned int*}' > > On 2017-07-10 20:50, Robbin Ehn wrote: >> I'll start a push now. >> >> /Robbin >> >> On 2017-07-10 12:38, Roman Kennke wrote: >>> Ok, so I guess I need a sponsor for this now: >>> >>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.12/ >>> >>> >>> Roman >>> >>> Am 07.07.2017 um 20:09 schrieb Igor Veresov: >>>> >>>>> On Jul 7, 2017, at 4:23 AM, Robbin Ehn >>>> > wrote: >>>>> >>>>> Hi Roman, >>>>> >>>>> On 07/07/2017 12:51 PM, Roman Kennke wrote: >>>>>> Hi Robbin, >>>>>>> >>>>>>> Far down -> >>>>>>> >>>>>>> On 07/06/2017 08:05 PM, Roman Kennke wrote: >>>>>>>> >>>>>>>>> >>>>>>>>> I'm not happy about this change: >>>>>>>>> >>>>>>>>> + ~ParallelSPCleanupThreadClosure() { >>>>>>>>> + // This is here to be consistent with sweeper.cpp >>>>>>>>> NMethodSweeper::mark_active_nmethods(). >>>>>>>>> + // TODO: Is this really needed? >>>>>>>>> + OrderAccess::storestore(); >>>>>>>>> + } >>>>>>>>> >>>>>>>>> because we're adding an OrderAccess::storestore() to be >>>>>>>>> consistent >>>>>>>>> with an OrderAccess::storestore() that's not properly documented >>>>>>>>> which is only increasing the technical debt. >>>>>>>>> >>>>>>>>> So a couple of things above don't make sense to me: >>>>>>>>> >>>>>>>>>> - sweeper thread runs outside safepoint >>>>>>>>>> - VMThread (which is doing the nmethod marking in the case that >>>>>>>>>> I'm looking at) runs while all other threads (incl. the >>>>>>>>>> sweeper) >>>>>>>>>> is holding still. >>>>>>>>> >>>>>>>>> and: >>>>>>>>> >>>>>>>>>> There should be no need for a storestore() (at least in >>>>>>>>>> sweeper.cpp... >>>>>>>> >>>>>>>> Either one or the other are running. 
Either the VMThread is >>>>>>>> marking >>>>>>>> nmethods (during safepoint) or the sweeper threads are running >>>>>>>> (outside >>>>>>>> safepoint). Between the two phases, there is a guaranteed >>>>>>>> OrderAccess::fence() (see safepoint.cpp). Therefore, no >>>>>>>> storestore() >>>>>>>> should be necessary. >>>>>>>> >>>>>>>> From Igor's comment I can see how it happened though: Apparently >>>>>>>> there >>>>>>>> *is* a race in sweeper's own concurrent processing (concurrent >>>>>>>> with >>>>>>>> compiler threads, as far as I understand). And there's a call to >>>>>>>> nmethod::mark_as_seen_on_stack() after which a storestore() is >>>>>>>> required >>>>>>>> (as per Igor's explanation). So the logic probably was: we have >>>>>>>> mark_as_seen_on_stack() followed by storestore() here, so let's >>>>>>>> also put >>>>>>>> a storestore() in the other places that call >>>>>>>> mark_as_seen_on_stack(), >>>>>>>> one of which happens to be the safepoint cleanup code that we're >>>>>>>> discussing. (why the storestore() hasn't been put right into >>>>>>>> mark_as_seen_on_stack() I don't understand). In short, one >>>>>>>> storestore() >>>>>>>> really was necessary, the other looks like it has been put there >>>>>>>> 'for >>>>>>>> consistency' or just conservatively. But it shouldn't be >>>>>>>> necessary in >>>>>>>> the safepoint cleanup code that we're discussing. >>>>>>>> >>>>>>>> So what should we do? Remove the storestore() for good? >>>>>>>> Refactor the >>>>>>>> code so that both paths at least call the storestore() in the same >>>>>>>> place? (E.g. make mark_active_nmethods() use the closure and call >>>>>>>> storestore() in the dtor as proposed?) >>>>>>> >>>>>>> I took a quick look, maybe I'm missing some stuff but: >>>>>>> >>>>>>> So there is a slight optimization when not running sweeper to skip >>>>>>> compiler barrier/fence in stw. 
>>>>>>> >>>>>>> Don't think that matter, so I propose something like: >>>>>>> - long stack_traversal_mark() { return >>>>>>> _stack_traversal_mark; } >>>>>>> - void set_stack_traversal_mark(long l) { >>>>>>> _stack_traversal_mark = l; } >>>>>>> + long stack_traversal_mark() { return >>>>>>> OrderAccess::load_acquire(&_stack_traversal_mark); } >>>>>>> + void set_stack_traversal_mark(long l) { >>>>>>> OrderAccess::release_store(&_stack_traversal_mark, l); } >>>>>>> >>>>>>> Maybe make _stack_traversal_mark volatile also, just as a marking >>>>>>> that >>>>>>> it is concurrent accessed. >>>>>>> And remove both storestore. >>>>>>> >>>>>>> "Also neither of these state variables are volatile in nmethod, so >>>>>>> even the compiler may reorder the stores" >>>>>>> Fortunately at least _state is volatile now. >>>>>>> >>>>>>> I think _state also should use la/rs semantics instead, but that's >>>>>>> another story. >>>>>> Like this? >>>>>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.12/ >>>>>> >>>>>> >>>>> >>>>> Yes, exactly, I like this! >>>>> Dan? Igor ? Tobias? >>>>> >>>> >>>> That seems correct. >>>> >>>> igor >>>> >>>>> Thanks Roman! >>>>> >>>>> BTW I'm going on vacation (5w) in a few hours, but I will follow this >>>>> thread/changeset to the end! 
>>>>> >>>>> /Robbin >>>>> >>>>>> Roman >>>> >>> From rkennke at redhat.com Wed Jul 12 13:58:12 2017 From: rkennke at redhat.com (Roman Kennke) Date: Wed, 12 Jul 2017 15:58:12 +0200 Subject: RFR: 8179268: Factor out AdaptiveSizePolicy from top-level interfaces CollectorPolicy and CollectedHeap In-Reply-To: <7701ab0e-7a75-baec-434f-62978a90e8e9@oracle.com> References: <50cb4b58-623c-04c2-f6c5-cfb1bd0a3b1f@oracle.com> <073ad956-f475-f3c4-cac8-42bfa1329565@redhat.com> <5cf7afd1-5328-2411-1c23-5e73ae230069@oracle.com> <7e05603a-7254-b6d0-bc6d-35fdf498b16f@redhat.com> <7701ab0e-7a75-baec-434f-62978a90e8e9@oracle.com> Message-ID: <95b50437-3638-2fb7-56a0-349c918b3475@redhat.com> Am 12.07.2017 um 14:20 schrieb Stefan Johansson: > Hi Roman, > > On 2017-07-12 12:47, Roman Kennke wrote: >> Am 12.07.2017 um 08:44 schrieb Per Liden: >>> Hi Roman, >>> >>> On 2017-07-11 16:17, Stefan Johansson wrote: >>>> Hi Roman, >>>> >>>> On 2017-07-11 08:34, Per Liden wrote: >>>>> Hi, >>>>> >>>>> On 2017-07-10 18:35, Roman Kennke wrote: >>>>>> Hi Per, >>>>>> >>>>>> thanks for the review! >>>>>> >>>>>>>> AdaptiveSizePolicy is not used/called from outside the GCs, and >>>>>>>> not >>>>>>>> all >>>>>>>> GCs need them. It makes sense to remove it from the CollectedHeap >>>>>>>> and >>>>>>>> CollectorPolicy interfaces and move them down to the actual >>>>>>>> subclasses >>>>>>>> that used them. >>>>>>>> >>>>>>>> I moved AdaptiveSizePolicyOutput to parallelScavengeHeap.hpp, it's >>>>>>>> only >>>>>>>> used/implemented in the parallel GC. Also, I made this class >>>>>>>> AllStatic >>>>>>>> (was StackObj) >>>>>>> AdaptiveSizePolicyOutput::print() is actually called from >>>>>>> runtime/java.cpp also, so it's used outside of ParallelGC. I'm fine >>>>>>> with moving it, but we should have the proper #includes in >>>>>>> java.cpp. 
>>> I just realized that this doesn't build on linux-i586, which builds a >>> minimal JVM where INCLUDE_ALL_GCS isn't defined (and will thus not >>> include parallelScavangeHeap.hpp). Rather than having some ugly #ifdef >>> INCLUDE_ALL_GCS at the AdaptiveSizePolicyOutput::print() call site I >>> suggest we keep AdaptiveSizePolicyOutput in adaptiveSizePolicy.hpp for >>> now. >> I tried that. Unfortunately, it also requires #ifdef INCLUDE_ALL_GCS to >> be able to call ParallelScavengeHeap::heap(), or else defeats the >> purpose of this patch by requiring CollectedHeap to still carry >> size_policy().. which we don't want. In addition to that, if I try to >> include parallelScavengeHeap.hpp in adaptiveSizePolicy.hpp, I am getting >> freaky circular dependency problems. #ifdef INCLUDE_ALL_GCS in java.cpp >> around AdaptiveSizePolicyOutput seems like the lesser evil... done so >> here: >> >> http://cr.openjdk.java.net/~rkennke/8179268/webrev.04/ >> >> >> Ok? > I'm no big fan of having #if INCLUDE_ALL_GCS if it can be avoided. A > few lines below the call to AdaptiveSizePolicyOutput::print(), we call > Universe::heap()->print_tracing_info(). I think we could move > AdaptiveSizePolicyOutput::print() into > ParallelScavengeHeap::print_tracing_info() without running into any > problems. > > What do you think about that solution? That's a very good idea!! It alters behaviour slightly (will print adaptive size policy stuff in hs_err now) but I think that's for the better. Incremental: http://cr.openjdk.java.net/~rkennke/8179268/webrev.05.diff/ Full: http://cr.openjdk.java.net/~rkennke/8179268/webrev.05 Good now? Thanks, Roman -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From per.liden at oracle.com Wed Jul 12 14:19:44 2017 From: per.liden at oracle.com (Per Liden) Date: Wed, 12 Jul 2017 16:19:44 +0200 Subject: RFR: 8179268: Factor out AdaptiveSizePolicy from top-level interfaces CollectorPolicy and CollectedHeap In-Reply-To: <95b50437-3638-2fb7-56a0-349c918b3475@redhat.com> References: <50cb4b58-623c-04c2-f6c5-cfb1bd0a3b1f@oracle.com> <073ad956-f475-f3c4-cac8-42bfa1329565@redhat.com> <5cf7afd1-5328-2411-1c23-5e73ae230069@oracle.com> <7e05603a-7254-b6d0-bc6d-35fdf498b16f@redhat.com> <7701ab0e-7a75-baec-434f-62978a90e8e9@oracle.com> <95b50437-3638-2fb7-56a0-349c918b3475@redhat.com> Message-ID: Hi, On 2017-07-12 15:58, Roman Kennke wrote: > Am 12.07.2017 um 14:20 schrieb Stefan Johansson: >> Hi Roman, >> >> On 2017-07-12 12:47, Roman Kennke wrote: >>> Am 12.07.2017 um 08:44 schrieb Per Liden: >>>> Hi Roman, >>>> >>>> On 2017-07-11 16:17, Stefan Johansson wrote: >>>>> Hi Roman, >>>>> >>>>> On 2017-07-11 08:34, Per Liden wrote: >>>>>> Hi, >>>>>> >>>>>> On 2017-07-10 18:35, Roman Kennke wrote: >>>>>>> Hi Per, >>>>>>> >>>>>>> thanks for the review! >>>>>>> >>>>>>>>> AdaptiveSizePolicy is not used/called from outside the GCs, and >>>>>>>>> not >>>>>>>>> all >>>>>>>>> GCs need them. It makes sense to remove it from the CollectedHeap >>>>>>>>> and >>>>>>>>> CollectorPolicy interfaces and move them down to the actual >>>>>>>>> subclasses >>>>>>>>> that used them. >>>>>>>>> >>>>>>>>> I moved AdaptiveSizePolicyOutput to parallelScavengeHeap.hpp, it's >>>>>>>>> only >>>>>>>>> used/implemented in the parallel GC. Also, I made this class >>>>>>>>> AllStatic >>>>>>>>> (was StackObj) >>>>>>>> AdaptiveSizePolicyOutput::print() is actually called from >>>>>>>> runtime/java.cpp also, so it's used outside of ParallelGC. I'm fine >>>>>>>> with moving it, but we should have the proper #includes in >>>>>>>> java.cpp. 
>>>> I just realized that this doesn't build on linux-i586, which builds a >>>> minimal JVM where INCLUDE_ALL_GCS isn't defined (and will thus not >>>> include parallelScavangeHeap.hpp). Rather than having some ugly #ifdef >>>> INCLUDE_ALL_GCS at the AdaptiveSizePolicyOutput::print() call site I >>>> suggest we keep AdaptiveSizePolicyOutput in adaptiveSizePolicy.hpp for >>>> now. >>> I tried that. Unfortunately, it also requires #ifdef INCLUDE_ALL_GCS to >>> be able to call ParallelScavengeHeap::heap(), or else defeats the >>> purpose of this patch by requiring CollectedHeap to still carry >>> size_policy().. which we don't want. In addition to that, if I try to >>> include parallelScavengeHeap.hpp in adaptiveSizePolicy.hpp, I am getting >>> freaky circular dependency problems. #ifdef INCLUDE_ALL_GCS in java.cpp >>> around AdaptiveSizePolicyOutput seems like the lesser evil... done so >>> here: >>> >>> http://cr.openjdk.java.net/~rkennke/8179268/webrev.04/ >>> >>> >>> Ok? >> I'm no big fan of having #if INCLUDE_ALL_GCS if it can be avoided. A >> few lines below the call to AdaptiveSizePolicyOutput::print(), we call >> Universe::heap()->print_tracing_info(). I think we could move >> AdaptiveSizePolicyOutput::print() into >> ParallelScavengeHeap::print_tracing_info() without running into any >> problems. >> >> What do you think about that solution? > > That's a very good idea!! It alters behaviour slightly (will print > adaptive size policy stuff in hs_err now) but I think that's for the better. > > Incremental: > http://cr.openjdk.java.net/~rkennke/8179268/webrev.05.diff/ > > Full: > http://cr.openjdk.java.net/~rkennke/8179268/webrev.05 > > > Good now? Looks good. Not sure I follow your comment on hs_err. The adaptive size policy stuff prints to the normal log (with gc+ergo=debug). Before pushing I'll take the liberty of removing the extra space you added to the end of ParallelScavengeHeap::print_tracing_info(). 
586 AdaptiveSizePolicyOutput::print(); 587 588 } Stefan, ok to push? cheers, Per > > Thanks, > Roman > From stefan.johansson at oracle.com Wed Jul 12 14:24:59 2017 From: stefan.johansson at oracle.com (Stefan Johansson) Date: Wed, 12 Jul 2017 16:24:59 +0200 Subject: RFR: 8179268: Factor out AdaptiveSizePolicy from top-level interfaces CollectorPolicy and CollectedHeap In-Reply-To: References: <50cb4b58-623c-04c2-f6c5-cfb1bd0a3b1f@oracle.com> <073ad956-f475-f3c4-cac8-42bfa1329565@redhat.com> <5cf7afd1-5328-2411-1c23-5e73ae230069@oracle.com> <7e05603a-7254-b6d0-bc6d-35fdf498b16f@redhat.com> <7701ab0e-7a75-baec-434f-62978a90e8e9@oracle.com> <95b50437-3638-2fb7-56a0-349c918b3475@redhat.com> Message-ID: <5a3ad9f8-38d6-43a2-e4ec-65b225fd6492@oracle.com> On 2017-07-12 16:19, Per Liden wrote: > Hi, > > On 2017-07-12 15:58, Roman Kennke wrote: >> >> That's a very good idea!! It alters behaviour slightly (will print >> adaptive size policy stuff in hs_err now) but I think that's for the >> better. >> >> Incremental: >> http://cr.openjdk.java.net/~rkennke/8179268/webrev.05.diff/ >> >> Full: >> http://cr.openjdk.java.net/~rkennke/8179268/webrev.05 >> >> >> Good now? > > Looks good. Not sure I follow your comment on hs_err. The adaptive > size policy stuff prints to the normal log (with gc+ergo=debug). > > Before pushing I'll take the liberty of removing the extra space you > added to the end of ParallelScavengeHeap::print_tracing_info(). > > 586 AdaptiveSizePolicyOutput::print(); > 587 > 588 } > > Stefan, ok to push? > Yes, this looks great! 
Thanks for cleaning this up Roman, Stefan > cheers, > Per > >> >> Thanks, >> Roman >> From per.liden at oracle.com Wed Jul 12 20:15:41 2017 From: per.liden at oracle.com (Per Liden) Date: Wed, 12 Jul 2017 22:15:41 +0200 Subject: RFR: 8179268: Factor out AdaptiveSizePolicy from top-level interfaces CollectorPolicy and CollectedHeap In-Reply-To: <5a3ad9f8-38d6-43a2-e4ec-65b225fd6492@oracle.com> References: <50cb4b58-623c-04c2-f6c5-cfb1bd0a3b1f@oracle.com> <073ad956-f475-f3c4-cac8-42bfa1329565@redhat.com> <5cf7afd1-5328-2411-1c23-5e73ae230069@oracle.com> <7e05603a-7254-b6d0-bc6d-35fdf498b16f@redhat.com> <7701ab0e-7a75-baec-434f-62978a90e8e9@oracle.com> <95b50437-3638-2fb7-56a0-349c918b3475@redhat.com> <5a3ad9f8-38d6-43a2-e4ec-65b225fd6492@oracle.com> Message-ID: <9a257b72-2303-4f1b-7db3-9657f32e1e5f@oracle.com> This has now been pushed to jdk10/hs. cheers, Per On 07/12/2017 04:24 PM, Stefan Johansson wrote: > > > On 2017-07-12 16:19, Per Liden wrote: >> Hi, >> >> On 2017-07-12 15:58, Roman Kennke wrote: >>> >>> That's a very good idea!! It alters behaviour slightly (will print >>> adaptive size policy stuff in hs_err now) but I think that's for the >>> better. >>> >>> Incremental: >>> http://cr.openjdk.java.net/~rkennke/8179268/webrev.05.diff/ >>> >>> Full: >>> http://cr.openjdk.java.net/~rkennke/8179268/webrev.05 >>> >>> >>> Good now? >> >> Looks good. Not sure I follow your comment on hs_err. The adaptive >> size policy stuff prints to the normal log (with gc+ergo=debug). >> >> Before pushing I'll take the liberty of removing the extra space you >> added to the end of ParallelScavengeHeap::print_tracing_info(). >> >> 586 AdaptiveSizePolicyOutput::print(); >> 587 >> 588 } >> >> Stefan, ok to push? >> > Yes, this looks great! 
> > Thanks for cleaning this up Roman, > Stefan >> cheers, >> Per >> >>> >>> Thanks, >>> Roman >>> > From robbin.ehn at oracle.com Wed Jul 12 20:39:59 2017 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Wed, 12 Jul 2017 22:39:59 +0200 Subject: RFR: Parallelize safepoint cleanup In-Reply-To: References: <5e7c7d00-4acd-bea3-3525-33dbd9159efb@oracle.com> <6f2c6de7-298b-bf14-ab1f-430c4acd43c9@redhat.com> <5cd676de-872d-6d4a-691b-da561173f7d0@oracle.com> <61d80e98-275f-b2b8-4ac7-6d5d03b047de@redhat.com> <0e44cc90-b384-0820-93d6-a70d22c501c3@oracle.com> <20E06CEC-38CA-41AE-99DB-17EF22A3C5CC@oracle.com> <58f2278e-b95c-4ec2-4f7d-9fefa3a281e4@redhat.com> <623c0dbf-9210-7c63-3546-4314c7d47f85@redhat.com> <29521e46-a5e8-5ff0-23a2-22eeee145389@oracle.com> <4445a727-060b-70f9-c8db-e9f70faae3d5@redhat.com> <37755fec-05b9-8d2c-7eb9-8849393c7485@oracle.com> <57cddde0-60e6-366e-489e-f6f9534e3ed9@redhat.com> <6fa761bc-8feb-74e6-9a54-8a65ab81203b@oracle.com> <5af9855a-652e-64f0-af83-e8f5962247ca@oracle.com> Message-ID: <743ff172-88b6-30dc-8808-a4a97be4a571@oracle.com> On 2017-07-12 15:32, Roman Kennke wrote: > Hi Robbin and all, > > I fixed the 32bit failures by using jlong in all relevant places: > > http://cr.openjdk.java.net/~rkennke/8180932/webrev.14.diff/ > Looks good! > > then Robbin found another problem. SafepointCleanupTest started to fail, > because "mark nmethods" is no longer printed. This made me think that > we're not measuring the conflated (and possibly parallelized) > deflate-idle-monitors+mark-nmethods pass. I added a TraceTime with > "safepoint cleanup tasks" which measures the total duration of safepoint > cleanup. We can't reasonably measure a possibly parallel and conflated > pass standalone, but we can measure all and by subtrating all the other > subphases, get an idea how long deflation and nmethod marking take up. 
> > http://cr.openjdk.java.net/~rkennke/8180932/webrev.15.diff/ > Looks good and thanks for fixing It's time to ship this, can we have a second review please! /Robbin > > The full webrev is now: > > http://cr.openjdk.java.net/~rkennke/8180932/webrev.15/ > > > Hope that's all ;-) > > Roman > > Am 10.07.2017 um 21:22 schrieb Robbin Ehn: >> Hi, unfortunately the push failed on 32-bit. >> >> (looks like _stack_traversal_mark should be jlong, I feel a bit guilty) >> >> I do not have anytime to look at this, so here is the error. >> >> /Robbin >> >> make[3]: Leaving directory '/opt/jprt/T/P1/185117.rehn/s/hotspot/make' >> make/Main.gmk:263: recipe for target 'hotspot-client-libs' failed >> In file included from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29:0, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28: >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp: In >> member function 'long int nmethod::stack_traversal_mark()': >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:399:108: >> error: call of overloaded 'load_acquire(volatile long int*)' is ambiguous >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:399:108: >> note: candidates are: >> In file included from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/typeArrayOop.hpp:30:0, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/constantPool.hpp:32, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/method.hpp:34, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/frame.hpp:28, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/codeBlob.hpp:31, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/compiledMethod.hpp:28, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:28, >> from 
>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28: >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:57:17: >> note: static jint OrderAccess::load_acquire(const volatile jint*) >> >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:57:17: >> note: no known conversion for argument 1 from 'volatile long int*' >> to 'const volatile jint* {aka const volatile int*}' >> In file included from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/typeArrayOop.hpp:30:0, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/constantPool.hpp:32, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/method.hpp:34, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/frame.hpp:28, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/codeBlob.hpp:31, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/compiledMethod.hpp:28, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:28, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28: >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:63:17: >> note: static juint OrderAccess::load_acquire(const volatile juint*) >> >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:63:17: >> note: no known conversion for argument 1 from 'volatile long int*' >> to 'const volatile juint* {aka const volatile unsigned int*}' >> In file included from >> 
/opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29:0, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28: >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp: In >> member function 'void nmethod::set_stack_traversal_mark(long int)': >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:400:105: >> error: call of overloaded 'release_store(volatile long int*, long >> int&)' is ambiguous >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:400:105: >> note: candidates are: >> In file included from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/typeArrayOop.hpp:30:0, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/constantPool.hpp:32, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/method.hpp:34, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/frame.hpp:28, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/codeBlob.hpp:31, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/compiledMethod.hpp:28, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:28, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28: >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:71:17: >> note: static void OrderAccess::release_store(volatile jint*, jint) >> >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:71:17: >> note: no known conversion for argument 1 from 'volatile long int*' >> to 'volatile jint* {aka volatile int*}' >> 
/opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:77:17: >> note: static void OrderAccess::release_store(volatile juint*, juint) >> >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:77:17: >> note: no known conversion for argument 1 from 'volatile long int*' >> to 'volatile juint* {aka volatile unsigned int*}' >> >> On 2017-07-10 20:50, Robbin Ehn wrote: >>> I'll start a push now. >>> >>> /Robbin >>> >>> On 2017-07-10 12:38, Roman Kennke wrote: >>>> Ok, so I guess I need a sponsor for this now: >>>> >>>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.12/ >>>> >>>> >>>> Roman >>>> >>>> Am 07.07.2017 um 20:09 schrieb Igor Veresov: >>>>> >>>>>> On Jul 7, 2017, at 4:23 AM, Robbin Ehn >>>>> > wrote: >>>>>> >>>>>> Hi Roman, >>>>>> >>>>>> On 07/07/2017 12:51 PM, Roman Kennke wrote: >>>>>>> Hi Robbin, >>>>>>>> >>>>>>>> Far down -> >>>>>>>> >>>>>>>> On 07/06/2017 08:05 PM, Roman Kennke wrote: >>>>>>>>> >>>>>>>>>> >>>>>>>>>> I'm not happy about this change: >>>>>>>>>> >>>>>>>>>> + ~ParallelSPCleanupThreadClosure() { >>>>>>>>>> + // This is here to be consistent with sweeper.cpp >>>>>>>>>> NMethodSweeper::mark_active_nmethods(). >>>>>>>>>> + // TODO: Is this really needed? >>>>>>>>>> + OrderAccess::storestore(); >>>>>>>>>> + } >>>>>>>>>> >>>>>>>>>> because we're adding an OrderAccess::storestore() to be >>>>>>>>>> consistent >>>>>>>>>> with an OrderAccess::storestore() that's not properly documented >>>>>>>>>> which is only increasing the technical debt. >>>>>>>>>> >>>>>>>>>> So a couple of things above don't make sense to me: >>>>>>>>>> >>>>>>>>>>> - sweeper thread runs outside safepoint >>>>>>>>>>> - VMThread (which is doing the nmethod marking in the case that >>>>>>>>>>> I'm looking at) runs while all other threads (incl. the >>>>>>>>>>> sweeper) >>>>>>>>>>> is holding still. 
>>>>>>>>>> >>>>>>>>>> and: >>>>>>>>>> >>>>>>>>>>> There should be no need for a storestore() (at least in >>>>>>>>>>> sweeper.cpp... >>>>>>>>> >>>>>>>>> Either one or the other are running. Either the VMThread is >>>>>>>>> marking >>>>>>>>> nmethods (during safepoint) or the sweeper threads are running >>>>>>>>> (outside >>>>>>>>> safepoint). Between the two phases, there is a guaranteed >>>>>>>>> OrderAccess::fence() (see safepoint.cpp). Therefore, no >>>>>>>>> storestore() >>>>>>>>> should be necessary. >>>>>>>>> >>>>>>>>> From Igor's comment I can see how it happened though: Apparently >>>>>>>>> there >>>>>>>>> *is* a race in sweeper's own concurrent processing (concurrent >>>>>>>>> with >>>>>>>>> compiler threads, as far as I understand). And there's a call to >>>>>>>>> nmethod::mark_as_seen_on_stack() after which a storestore() is >>>>>>>>> required >>>>>>>>> (as per Igor's explanation). So the logic probably was: we have >>>>>>>>> mark_as_seen_on_stack() followed by storestore() here, so let's >>>>>>>>> also put >>>>>>>>> a storestore() in the other places that call >>>>>>>>> mark_as_seen_on_stack(), >>>>>>>>> one of which happens to be the safepoint cleanup code that we're >>>>>>>>> discussing. (why the storestore() hasn't been put right into >>>>>>>>> mark_as_seen_on_stack() I don't understand). In short, one >>>>>>>>> storestore() >>>>>>>>> really was necessary, the other looks like it has been put there >>>>>>>>> 'for >>>>>>>>> consistency' or just conservatively. But it shouldn't be >>>>>>>>> necessary in >>>>>>>>> the safepoint cleanup code that we're discussing. >>>>>>>>> >>>>>>>>> So what should we do? Remove the storestore() for good? >>>>>>>>> Refactor the >>>>>>>>> code so that both paths at least call the storestore() in the same >>>>>>>>> place? (E.g. make mark_active_nmethods() use the closure and call >>>>>>>>> storestore() in the dtor as proposed?) 
>>>>>>>> >>>>>>>> I took a quick look, maybe I'm missing some stuff but: >>>>>>>> >>>>>>>> So there is a slight optimization when not running sweeper to skip >>>>>>>> compiler barrier/fence in stw. >>>>>>>> >>>>>>>> Don't think that matter, so I propose something like: >>>>>>>> - long stack_traversal_mark() { return >>>>>>>> _stack_traversal_mark; } >>>>>>>> - void set_stack_traversal_mark(long l) { >>>>>>>> _stack_traversal_mark = l; } >>>>>>>> + long stack_traversal_mark() { return >>>>>>>> OrderAccess::load_acquire(&_stack_traversal_mark); } >>>>>>>> + void set_stack_traversal_mark(long l) { >>>>>>>> OrderAccess::release_store(&_stack_traversal_mark, l); } >>>>>>>> >>>>>>>> Maybe make _stack_traversal_mark volatile also, just as a marking >>>>>>>> that >>>>>>>> it is concurrent accessed. >>>>>>>> And remove both storestore. >>>>>>>> >>>>>>>> "Also neither of these state variables are volatile in nmethod, so >>>>>>>> even the compiler may reorder the stores" >>>>>>>> Fortunately at least _state is volatile now. >>>>>>>> >>>>>>>> I think _state also should use la/rs semantics instead, but that's >>>>>>>> another story. >>>>>>> Like this? >>>>>>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.12/ >>>>>>> >>>>>>> >>>>>> >>>>>> Yes, exactly, I like this! >>>>>> Dan? Igor ? Tobias? >>>>>> >>>>> >>>>> That seems correct. >>>>> >>>>> igor >>>>> >>>>>> Thanks Roman! >>>>>> >>>>>> BTW I'm going on vacation (5w) in a few hours, but I will follow this >>>>>> thread/changeset to the end! 
>>>>>> >>>>>> /Robbin >>>>>>> Roman >>>>> >>>> > From email.sundarms at gmail.com Thu Jul 13 02:11:41 2017 From: email.sundarms at gmail.com (Sundara Mohan M) Date: Wed, 12 Jul 2017 19:11:41 -0700 Subject: High Reference Processing/Object Copy time Message-ID: Hi, I am observing an odd behaviour (very high ref proc time once) with G1GC. gc log snippet, flags used: Java HotSpot(TM) 64-Bit Server VM (25.112-b15) for linux-amd64 JRE (1.8.0_112-b15), built on Sep 22 2016 21:10:53 by "java_re" with gcc 4.3.0 20080428 (Red Hat 4.3.0-8) Memory: 4k page, physical 132290100k(1065596k free), swap 132120572k(131992000k free) CommandLine flags: -XX:G1MaxNewSizePercent=30 -XX:G1OldCSetRegionThresholdPercent=20 -XX:GCLogFileSize=20971520 -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=out-of-memory-heap-dump -XX:InitialHeapSize=33285996544 -XX:MaxGCPauseMillis=500 -XX:MaxHeapSize=33285996544 -XX:MetaspaceSize=536870912 -XX:NumberOfGCLogFiles=20 -XX:+ParallelRefProcEnabled -XX:+PrintAdaptiveSizePolicy -XX:+PrintGC -XX:+PrintGCApplicationStoppedTime -XX:+PrintGCDateStamps -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintTenuringDistribution -XX:+UnlockExperimentalVMOptions -XX:+UseCompressedClassPointers -XX:+UseCompressedOops -XX:+UseG1GC -XX:+UseGCLogFileRotation -XX:+UseStringDeduplication .... 
2017-07-12T17:02:40.227+0000: 77743.943: [GC pause (G1 Evacuation Pause) (young) Desired survivor size 104857600 bytes, new threshold 2 (max 15) - age 1: 38456192 bytes, 38456192 total - age 2: 86746408 bytes, 125202600 total 77743.943: [G1Ergonomics (CSet Construction) start choosing CSet, _pending_cards: 149039, predicted base time: 374.57 ms, remaining time: 125.43 ms, target pause time: 50 0.00 ms] 77743.943: [G1Ergonomics (CSet Construction) add young regions to CSet, eden: 174 regions, survivors: 24 regions, predicted young region time: 1277.98 ms] 77743.943: [G1Ergonomics (CSet Construction) finish choosing CSet, eden: 174 regions, survivors: 24 regions, old: 0 regions, predicted pause time: 1652.55 ms, target paus e time: 500.00 ms] 77751.132: [G1Ergonomics (Concurrent Cycles) request concurrent cycle initiation, reason: occupancy higher than threshold, occupancy: 21147680768 bytes, allocation reques t: 0 bytes, threshold: 14978698425 bytes (45.00 %), source: end of GC] , 7.1891696 secs] [Parallel Time: 2253.1 ms, GC Workers: 13] [GC Worker Start (ms): Min: 77743943.2, Avg: 77743943.3, Max: 77743943.4, Diff: 0.2] [Ext Root Scanning (ms): Min: 1.7, Avg: 3.5, Max: 6.5, Diff: 4.8, Sum: 44.9] [Update RS (ms): Min: 39.2, Avg: 42.4, Max: 45.1, Diff: 5.9, Sum: 551.8] [Processed Buffers: Min: 26, Avg: 57.4, Max: 78, Diff: 52, Sum: 746] [Scan RS (ms): Min: 1.8, Avg: 3.7, Max: 4.5, Diff: 2.7, Sum: 47.5] [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0] * [Object Copy (ms): Min: 2198.1, Avg: 2198.7, Max: 2202.7, Diff: 4.6, Sum: 28583.3]* [Termination (ms): Min: 0.0, Avg: 4.5, Max: 4.9, Diff: 4.9, Sum: 58.4] [Termination Attempts: Min: 1, Avg: 16.7, Max: 28, Diff: 27, Sum: 217] [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.4] [GC Worker Total (ms): Min: 2252.7, Avg: 2252.8, Max: 2252.9, Diff: 0.2, Sum: 29286.3] [GC Worker End (ms): Min: 77746196.1, Avg: 77746196.1, Max: 77746196.1, Diff: 0.0] [Code Root Fixup: 0.1 ms] 
[Code Root Purge: 0.0 ms] [String Dedup Fixup: 167.7 ms, GC Workers: 13] [Queue Fixup (ms): Min: 0.0, Avg: 0.4, Max: 1.2, Diff: 1.2, Sum: 5.1] [Table Fixup (ms): Min: 165.5, Avg: 165.9, Max: 166.3, Diff: 0.9, Sum: 2156.9] [Clear CT: 1.5 ms] [Other: 4766.8 ms] [Choose CSet: 0.0 ms] * [Ref Proc: 4763.9 ms]* [Ref Enq: 0.8 ms] [Redirty Cards: 0.7 ms] [Humongous Register: 0.2 ms] [Humongous Reclaim: 0.1 ms] [Free CSet: 0.4 ms] * [Eden: 1392.0M(1392.0M)->0.0B(1440.0M) Survivors: 192.0M->144.0M Heap: 20.8G(31.0G)->19.6G(31.0G)]* * [Times: user=22.82 sys=13.83, real=7.19 secs]* *Question* 1. Is there a way to find out why Ref Proc took 4.7 s at this instance only? All other instances it was less than a second. 2. Why did object copy take 2.1 s even though the young gen size is only 1.3 G in this case, and not much garbage was collected? 3. Why is this happening occasionally, and is there a way to enable more logs when it happens? Thanks, Sundar -------------- next part -------------- An HTML attachment was scrubbed... URL: From ecki at zusammenkunft.net Thu Jul 13 04:58:51 2017 From: ecki at zusammenkunft.net (Bernd Eckenfels) Date: Thu, 13 Jul 2017 04:58:51 +0000 Subject: High Reference Processing/Object Copy time In-Reply-To: References: Message-ID: The sys time is very high in this snippet; how do the other snippets compare? Did you turn off transparent huge pages (THP) in your OS, and is there no swapping happening? BTW: this is more a discussion for the user mailing list. 
Gruss Bernd -- http://bernd.eckenfels.net ________________________________ From: hotspot-gc-dev on behalf of Sundara Mohan M Sent: Thursday, July 13, 2017 4:11:41 AM To: hotspot-gc-dev at openjdk.java.net Subject: High Reference Processing/Object Copy time Hi, I am observing a odd behaviour (very high ref proc time once) with G1GC gc log snippet, flags used Java HotSpot(TM) 64-Bit Server VM (25.112-b15) for linux-amd64 JRE (1.8.0_112-b15), built on Sep 22 2016 21:10:53 by "java_re" with gcc 4.3.0 20080428 (Red Hat 4.3.0-8) Memory: 4k page, physical 132290100k(1065596k free), swap 132120572k(131992000k free) CommandLine flags: -XX:G1MaxNewSizePercent=30 -XX:G1OldCSetRegionThresholdPercent=20 -XX:GCLogFileSize=20971520 -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=out-of-memory-heap-dump -XX:InitialHeapSize=33285996544 -XX:MaxGCPauseMillis=500 -XX:MaxHeapSize=33285996544 -XX:MetaspaceSize=536870912 -XX:NumberOfGCLogFiles=20 -XX:+ParallelRefProcEnabled -XX:+PrintAdaptiveSizePolicy -XX:+PrintGC -XX:+PrintGCApplicationStoppedTime -XX:+PrintGCDateStamps -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintTenuringDistribution -XX:+UnlockExperimentalVMOptions -XX:+UseCompressedClassPointers -XX:+UseCompressedOops -XX:+UseG1GC -XX:+UseGCLogFileRotation -XX:+UseStringDeduplication .... 
2017-07-12T17:02:40.227+0000: 77743.943: [GC pause (G1 Evacuation Pause) (young)
Desired survivor size 104857600 bytes, new threshold 2 (max 15)
- age   1:   38456192 bytes,   38456192 total
- age   2:   86746408 bytes,  125202600 total
 77743.943: [G1Ergonomics (CSet Construction) start choosing CSet, _pending_cards: 149039, predicted base time: 374.57 ms, remaining time: 125.43 ms, target pause time: 500.00 ms]
 77743.943: [G1Ergonomics (CSet Construction) add young regions to CSet, eden: 174 regions, survivors: 24 regions, predicted young region time: 1277.98 ms]
 77743.943: [G1Ergonomics (CSet Construction) finish choosing CSet, eden: 174 regions, survivors: 24 regions, old: 0 regions, predicted pause time: 1652.55 ms, target pause time: 500.00 ms]
 77751.132: [G1Ergonomics (Concurrent Cycles) request concurrent cycle initiation, reason: occupancy higher than threshold, occupancy: 21147680768 bytes, allocation request: 0 bytes, threshold: 14978698425 bytes (45.00 %), source: end of GC]
, 7.1891696 secs]
   [Parallel Time: 2253.1 ms, GC Workers: 13]
      [GC Worker Start (ms): Min: 77743943.2, Avg: 77743943.3, Max: 77743943.4, Diff: 0.2]
      [Ext Root Scanning (ms): Min: 1.7, Avg: 3.5, Max: 6.5, Diff: 4.8, Sum: 44.9]
      [Update RS (ms): Min: 39.2, Avg: 42.4, Max: 45.1, Diff: 5.9, Sum: 551.8]
         [Processed Buffers: Min: 26, Avg: 57.4, Max: 78, Diff: 52, Sum: 746]
      [Scan RS (ms): Min: 1.8, Avg: 3.7, Max: 4.5, Diff: 2.7, Sum: 47.5]
      [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
      [Object Copy (ms): Min: 2198.1, Avg: 2198.7, Max: 2202.7, Diff: 4.6, Sum: 28583.3]
      [Termination (ms): Min: 0.0, Avg: 4.5, Max: 4.9, Diff: 4.9, Sum: 58.4]
         [Termination Attempts: Min: 1, Avg: 16.7, Max: 28, Diff: 27, Sum: 217]
      [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.4]
      [GC Worker Total (ms): Min: 2252.7, Avg: 2252.8, Max: 2252.9, Diff: 0.2, Sum: 29286.3]
      [GC Worker End (ms): Min: 77746196.1, Avg: 77746196.1, Max: 77746196.1, Diff: 0.0]
   [Code Root Fixup: 0.1 ms]
[Code Root Purge: 0.0 ms]
   [String Dedup Fixup: 167.7 ms, GC Workers: 13]
      [Queue Fixup (ms): Min: 0.0, Avg: 0.4, Max: 1.2, Diff: 1.2, Sum: 5.1]
      [Table Fixup (ms): Min: 165.5, Avg: 165.9, Max: 166.3, Diff: 0.9, Sum: 2156.9]
   [Clear CT: 1.5 ms]
   [Other: 4766.8 ms]
      [Choose CSet: 0.0 ms]
      [Ref Proc: 4763.9 ms]
      [Ref Enq: 0.8 ms]
      [Redirty Cards: 0.7 ms]
      [Humongous Register: 0.2 ms]
      [Humongous Reclaim: 0.1 ms]
      [Free CSet: 0.4 ms]
   [Eden: 1392.0M(1392.0M)->0.0B(1440.0M) Survivors: 192.0M->144.0M Heap: 20.8G(31.0G)->19.6G(31.0G)]
 [Times: user=22.82 sys=13.83, real=7.19 secs]

Questions
1. Is there a way to find out why Ref Proc took 4.7 s at this instance only? In all other instances it was less than a second.
2. Why did object copy take 2.1 s even though the young gen region size is 1.3G in this case and there was not much garbage collected?
3. Why is this happening occasionally, and is there a way to enable more logs when this happens?

Thanks,
Sundar
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From thomas.schatzl at oracle.com  Thu Jul 13 08:06:34 2017
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Thu, 13 Jul 2017 10:06:34 +0200
Subject: High Reference Processing/Object Copy time
In-Reply-To: 
References: 
Message-ID: <1499933194.2815.11.camel@oracle.com>

Hi,

On Thu, 2017-07-13 at 04:58 +0000, Bernd Eckenfels wrote:
> The sys time is very high in this snippet; how do the other snippets
> compare? Did you turn off transparent huge pages (THP) in your OS and
> is there no swapping happening?

The documentation offers some more potential issues when there is high system time: https://docs.oracle.com/javase/9/gctuning/garbage-first-garbage-collector-tuning.htm#GUID-8D9B2530-E370-4B8B-8ADD-A43674FC6658 (The section is applicable to both JDK8 and 9).

The VM/garbage collector is a user-level program.
High system time (at least as high as in your snippet) strongly indicates a problem in the environment (or in the interaction with your environment, i.e. memory or I/O related).

> BTW: this is more a discussion for the user mailing list.

Agree, please move to the hotspot-gc-use list, which is more appropriate.

Thanks,
  Thomas

From erik.helin at oracle.com  Thu Jul 13 11:09:24 2017
From: erik.helin at oracle.com (Erik Helin)
Date: Thu, 13 Jul 2017 13:09:24 +0200
Subject: RFR (XS): 8183538: UpdateRS phase should claim cards
In-Reply-To: <1499861747.6693.6.camel@oracle.com>
References: <1499861747.6693.6.camel@oracle.com>
Message-ID: <83250fb8-84aa-5764-bd52-5a5dccfd2e49@oracle.com>

Hi Thomas,

On 07/12/2017 02:15 PM, Thomas Schatzl wrote:
> Hi all,
>
> please review this small change that adds claiming of cards in the
> update rs phase so that scan rs does not rescan them.
>
> CR:
> https://bugs.openjdk.java.net/browse/JDK-8183538
> Webrev:
> http://cr.openjdk.java.net/~tschatzl/8183538/webrev/

looks good, Reviewed.

I was trying to find a way where we could utilize the claim_card function, but could not come up with a good approach. Push this and then we can see if we can reduce the slight code/logic duplication later.

Thanks,
Erik

> Testing:
> jprt
>
> Thanks,
> Thomas

From thomas.schatzl at oracle.com  Thu Jul 13 11:35:12 2017
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Thu, 13 Jul 2017 13:35:12 +0200
Subject: RFR (XS): 8183538: UpdateRS phase should claim cards
In-Reply-To: <83250fb8-84aa-5764-bd52-5a5dccfd2e49@oracle.com>
References: <1499861747.6693.6.camel@oracle.com> <83250fb8-84aa-5764-bd52-5a5dccfd2e49@oracle.com>
Message-ID: <1499945712.2756.2.camel@oracle.com>

Hi,

On Thu, 2017-07-13 at 13:09 +0200, Erik Helin wrote:
> Hi Thomas,
>
> On 07/12/2017 02:15 PM, Thomas Schatzl wrote:
> >
> > Hi all,
> >
> > please review this small change that adds claiming of cards in the
> > update rs phase so that scan rs does not rescan them.
> > CR:
> > https://bugs.openjdk.java.net/browse/JDK-8183538
> > Webrev:
> > http://cr.openjdk.java.net/~tschatzl/8183538/webrev/
>
> looks good, Reviewed.
>
> I was trying to find a way where we could utilize the claim_card
> function, but could not come up with a good approach. Push this and
> then we can see if we can reduce the slight code/logic duplication
> later.

Yes, me too :) All variants I could think of would penalize one or the other phase.

Thanks for your review.

Thanks,
  Thomas

From erik.helin at oracle.com  Thu Jul 13 14:53:12 2017
From: erik.helin at oracle.com (Erik Helin)
Date: Thu, 13 Jul 2017 16:53:12 +0200
Subject: RFR: 8183539: Remove G1RemSet::_into_cset_dirty_card_queue_set
In-Reply-To: 
References: <1390aea1-d25a-a465-b0bf-c66490cf682a@oracle.com>
Message-ID: 

On 07/04/2017 02:17 PM, Mikael Gerdin wrote:
> Hi Erik,
>
> Do you know if any of the tests actually would have failed if rem set
> reconstruction after evacuation failure didn't work properly?
>
> I'd feel safer with this change if you ran with some verification code
> to ensure that the into_cset queue was always useless when evac failure
> occurs.

Good point, I have now run GCBasher for a very long time with:
-XX:+G1EvacuationFailALot -XX:G1EvacuationFailureALotCount=5 -XX:+VerifyBeforeGC -XX:+VerifyAfterGC

This means that GCBasher encounters a (forced) evacuation failure every fifth GC and also runs full verification for every GC. So far it has been working fine.

I have also run all tests in the JTReg group hotspot_gc with G1EvacuationFailALot set to true (in g1_globals.hpp) and G1EvacuationFailureALotCount set to 5 (also in g1_globals.hpp). This means that all GC tests (including the stress tests) encountered an evacuation failure every fifth GC. This also worked fine.

I also wrote a new patch against tip (where _into_cset_dcqs is still present) to do some custom verification.
The contents of G1RemSet::_into_cset_dcqs and G1CollectedHeap::_dirty_card_queue_set should be identical after a collection. This sort-of worked :) The queues are *very* similar (often around 98% of the cards in G1RemSet::_into_cset_dcqs are found in G1CollectedHeap::_dirty_card_queue_set). The reason for the "missing cards" is that cards in G1RemSet::_into_cset_dcqs comes from the post-write barrier, and the post-write barrier dirties the card that contains the object header (except for arrays, where it dirties the field/slot). The cards in G1CollectedHeap::_dirty_card_queue_set comes from G1ParScanThreadState::update_rs, and update_rs always dirties the card that contains the field (*not* the header). Hence, if an object crosses card boundaries, then the post-write barrier and update_rs will dirty different cards. This has no impact on correctness, it is like this for performance reasons (dirtying the card that contains the object header leads to fewer dirty cards, but we don't have quick access to the object header in update_rs). So, with the above, I'm fairly confident (famous last words) that this patch is working :) I also rebased this patch on top of all the latest changes: - http://cr.openjdk.java.net/~ehelin/8183539/01/ (it is the same patch, just rebased) Thanks, Erik > Thanks > /Mikael > >> >> Thanks, >> Erik From thomas.schatzl at oracle.com Thu Jul 13 15:06:57 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Thu, 13 Jul 2017 17:06:57 +0200 Subject: RFR: 8183539: Remove G1RemSet::_into_cset_dirty_card_queue_set In-Reply-To: References: <1390aea1-d25a-a465-b0bf-c66490cf682a@oracle.com> Message-ID: <1499958417.2756.4.camel@oracle.com> Hi Erik, On Thu, 2017-07-13 at 16:53 +0200, Erik Helin wrote: > On 07/04/2017 02:17 PM, Mikael Gerdin wrote: > > > > Hi Erik, > > > > Do you know if any of the tests actually would have failed if rem > > set > > reconstruction after evacuation failure didn't work properly? 
> > > > I'd feel safer with this change if you ran with some verification > > code to ensure that the into_cset queue was always useless when > > evac failure occurs. > > Good point, I have now run GCBasher for a very long time with: > -XX:+G1EvacuationFailALot -XX:G1EvacuationFailureALotCount=5? > -XX:+VerifyBeforeGC -XX:+VerifyAfterGC > > This mean that GCBasher encounters a (forced) evacuation failure > every fifth GC and also runs full verification for every GC. So far > it has been working fine. > > I have also run all tests in the JTReg group hotspot_gc with? > G1EvacuationFailALot set to true (in g1_globals.hpp) and? > G1EvacuationFailureALotCount set to 5 (also in g1_globals.hpp). This? > mean that all GC tests (including the stress tests) encountered an? > evacuation failure every fifth GC. This also worked fine. > > I also wrote a new patch against tip (where _into_cset_dcqs is still? > present) to do some custom verification. The contents of? > G1RemSet::_into_cset_dcqs and G1CollectedHeap::_dirty_card_queue_set? > should be identical after a collection. This sort-of worked :) > > The queues are *very* similar (often around 98% of the cards in? > G1RemSet::_into_cset_dcqs are found in > G1CollectedHeap::_dirty_card_queue_set). The reason for the "missing > cards" is that cards in G1RemSet::_into_cset_dcqs comes from the? > post-write barrier, and the post-write barrier dirties the card that? > contains the object header (except for arrays, where it dirties the? > field/slot). The cards in G1CollectedHeap::_dirty_card_queue_set > comes from G1ParScanThreadState::update_rs, and update_rs always > dirties the card that contains the field (*not* the header). Hence, > if an object crosses card boundaries, then the post-write barrier and > update_rs will dirty different cards. 
> This has no impact on correctness, it is like this for performance
> reasons (dirtying the card that contains the object header leads to
> fewer dirty cards, but we don't have quick access to the object header
> in update_rs).
>
> So, with the above, I'm fairly confident (famous last words) that
> this patch is working :)

Thanks for this thorough investigation, sounds good.

Ship it.

Thomas

From thomas.schatzl at oracle.com  Fri Jul 14 09:35:04 2017
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Fri, 14 Jul 2017 11:35:04 +0200
Subject: RFR (M): 8184346: Clean up G1CMBitmap
Message-ID: <1500024904.3458.8.camel@oracle.com>

Hi all,

  can I have reviews for this change that tries to clean up (and only clean up) the G1CMBitMap class (and the surrounding helper classes) a bit?

What has been done:
- fix naming
- improve visibility of methods
- remove superfluous code
- make G1CMBitMapClosure pass a HeapWord* instead of a bitmap index,
  avoiding that the user code is cluttered with conversions from bitmap
  indices to HeapWords
- remove inheritance between G1CMBitMap and G1CMBitMapRO, similar to
  the BitMap class make G1CMBitMapRO a "view" of G1CMBitMap.
- remove unused code in G1CMBitMapRO
- move method implementations into .inline.hpp file

The next CR JDK-8184347 will deal with moving G1CMBitmap* into separate files.

CR:
https://bugs.openjdk.java.net/browse/JDK-8184346
Webrev:
http://cr.openjdk.java.net/~tschatzl/8184346/webrev/
Testing:
jprt

Thanks,
  Thomas

From rkennke at redhat.com  Fri Jul 14 09:53:16 2017
From: rkennke at redhat.com (Roman Kennke)
Date: Fri, 14 Jul 2017 11:53:16 +0200
Subject: RFR (M): 8184346: Clean up G1CMBitmap
In-Reply-To: <1500024904.3458.8.camel@oracle.com>
References: <1500024904.3458.8.camel@oracle.com>
Message-ID: <6c0d1cca-6c08-0bda-f980-d3fe20e663ff@redhat.com>

Hi Thomas,

> Hi all,
>
> can I have reviews for this change that tries to clean up (and only
> clean up) the G1CMBitMap class (and the surrounding helper classes) a
> bit?
>
> What has been done:
> - fix naming
> - improve visibility of methods
> - remove superfluous code
> - make G1CMBitMapClosure pass a HeapWord* instead of a bitmap index,
>   avoiding that the user code is cluttered with conversions from bitmap
>   indices to HeapWords
> - remove inheritance between G1CMBitMap and G1CMBitMapRO, similar to
>   the BitMap class make G1CMBitMapRO a "view" of G1CMBitMap.
> - remove unused code in G1CMBitMapRO
> - move method implementations into .inline.hpp file

The changes look good to me.

+ return _cm->nextMarkBitMap()->is_marked((HeapWord *)obj);

I'd write that as

(HeapWord*) obj

but I'm never quite sure what style is preferable in Hotspot ;-)

Are changes in g1FromCardCache.cpp/.hpp unrelated?

> The next CR JDK-8184347 will deal with moving G1CMBitmap* into separate
> files.

And while you're at it, you may want to rename it to something like MarkBitmap and move it to gc/shared?
https://bugs.openjdk.java.net/browse/JDK-8180193

Best regards,
Roman (not official reviewer)

-------------- next part --------------
An HTML attachment was scrubbed...
URL: From erik.helin at oracle.com Fri Jul 14 10:21:54 2017 From: erik.helin at oracle.com (Erik Helin) Date: Fri, 14 Jul 2017 12:21:54 +0200 Subject: RFR: 8179387: Factor out CMS specific code from GenCollectedHeap into its own subclass In-Reply-To: <666af7f2-27e9-48c6-91e4-eaefa5289e18@redhat.com> References: <3521009f-6fab-4f8e-2375-b9d665a4c70b@redhat.com> <3d8b55a2-a787-3051-b351-ab9b0a24f5e0@redhat.com> <47e22e86-7d7c-606f-1936-346229f39ca2@oracle.com> <9a846161-c8ac-dedf-5952-f457d546fd9a@redhat.com> <4d5e6af8-d975-7803-64c5-7295e0d56154@redhat.com> <13358626-e399-e352-1711-587416621aac@redhat.com> <27af0ad2-fe78-3536-2143-996dd42583ab@oracle.com> <4bc53aaa-b98a-8a61-73bf-d30ac3f402b8@redhat.com> <666af7f2-27e9-48c6-91e4-eaefa5289e18@redhat.com> Message-ID: <3ec8a6a3-5a4b-a910-f6ec-ed1c0dad4cad@oracle.com> On 07/10/2017 04:10 PM, Roman Kennke wrote: > Am 10.07.2017 um 15:13 schrieb Erik Helin: >> On 07/07/2017 03:21 PM, Roman Kennke wrote: >>> Am 07.07.2017 um 14:35 schrieb Erik Helin: >>>> On 07/06/2017 06:23 PM, Roman Kennke wrote: >>>>>>>> Ok to push this? >>>>>> >>>>>> I just realized that your change doesn't build on Windows since you >>>>>> didn't #include "precompiled.hpp" in cmsHeap.cpp. MSVC is really >>>>>> picky >>>>>> about that. >>>>>> /Mikael >>>>> >>>>> Uhhh. >>>>> Ok, here's revision #3 with precompiled added in: >>>>> >>>>> http://cr.openjdk.java.net/~rkennke/8179387/webrev.03/ >>>>> >>>> >>>> Hi Roman, >>>> >>>> I just started looking :) I think GenCollectedHeap::gc_prologue and >>>> GenCollectedHeap::gc_epilogue should be virtual, and >>>> always_do_update_barrier = UseConcMarkSweepGC moved down >>>> CMSHeap::gc_epilogue. >>>> >>>> What do you think? >>> >>> Yes, I have seen that. My original plan was to leave it as is because I >>> know that Erik ?. is working on a big barrier set refactoring that would >>> remove this code anyway. 
However, it doesn't really matter, here's the >>> cleaned up patch: >>> >>> http://cr.openjdk.java.net/~rkennke/8179387/webrev.04/ >>> >> >> A few comments: >> >> cmsHeap.hpp: >> - you are missing quite a few #includes, but it works since >> genCollectedHeap.hpp #includes a whole lot of stuff. Not necessary to >> fix now, because the "missing #include" will start to pop up when >> someone tries to break apart GenCollectedHeap into smaller pieces. > Right. > I always try to minimize includes, especially in header files (they are > bound to proliferate later anyway). In addition to that, if a class is > only referenced as pointer, I avoid includes and use forward class > definition instead. I think that we in general try to include what is needed, not what only what makes the code compile (header guards will of course ensure that the header files are only parsed once). So in cmsHeap.hpp, at least I would have added: #include "gc/cms/concurrentMarkSweepGeneration.hpp" #include "gc/shared/collectedHeap.hpp" #include "gc/shared/gcCause.hpp" and forward declared: class CLDClosure; class OopsInGenClosure; class outputStream; class StrongRootsScope; class ThreadClosure; >> >> - why are gc_prologue and gc_epilogue protected in CMSHeap? Can't they >> be private in CMSHeap? > They are virtual and protected in GenCollectedHeap and called by > GenCollectedHeap. Makes sense to also make them protected in CMSHeap? Or > am I missing something? > >> - there are two `private:` blocks, please use only one `private:` >> block. >> > Fixed. And now there is two `protected:` blocks, immediately after each other: 86 protected: 87 void gc_prologue(bool full); 88 void gc_epilogue(bool full); 89 90 protected: 91 // Accessor for memory state verification support 92 NOT_PRODUCT( 93 virtual size_t skip_header_HeapWords() { return CMSCollector::skip_header_HeapWords(); } 94 ) IMO, I would just make the three functions above private. 
I know they are protected in GenCollectedHeap, but it should be fine to have them private in CMSHeap. Having them protected signals, at least to me, that this class could be considered as a base class (protected to me reads "this can be accessed by classes inheriting from this class), and we don't want any class to inherit from CMSHeap. >> - one extra newline here: >> 32 class CMSHeap : public GenCollectedHeap { >> 33 >> >> - one extra newline here: >> 46 >> 47 >> >> cmsHeap.cpp: >> - one extra newline here: >> 36 CMSHeap::CMSHeap(GenCollectorPolicy *policy) : >> GenCollectedHeap(policy) { >> 37 >> >> - one extra newline here: >> 65 >> 66 >> > Removed all of them. > >> - do you need to use `this` here? >> 87 this->GenCollectedHeap::print_on_error(st); >> >> Isn't it enough to just GenCollectedHeap::print_on_error(st)? > Yes, it is. Just a habit of mine to make it more readable (to me). Fixed it. >> - one extra newline here: >> 92 bool CMSHeap::create_cms_collector() { >> 93 > Fixed. >> - this is pre-existing, but since we are copying code, do we want to >> clean it up? >> 104 if (collector == NULL || >> !collector->completed_initialization()) { >> 105 if (collector) { >> 106 delete collector; // Be nice in embedded situation >> 107 } >> 108 vm_shutdown_during_initialization("Could not create CMS >> collector"); >> 109 return false; >> 110 } >> >> The collector == NULL check is not needed here. CMSCollector derives >> from CHeapObj and CHeapObj::operator new will by default do >> vm_exit_out_of_memory if the returned memory is NULL. The check can >> just be: >> >> if (!collector->completed_initialization()) { >> vm_shutdown_during_initialization("Could not create CMS collector"); >> return false; >> } >> return true; >> > Ok, good point. Fixed. Sorry, reading the code again it is obvious that create_cms_collector never can return false. It either returns true or calls vm_shutdown_during_initialization (which will not return). 
So, I would just make create_cms_collector void, the if branch below is dead code:

51 if (!success) return JNI_ENOMEM;

Btw, this code looks really fishy :) The CMSCollector is created with new but the pointer (collector) is never stored anywhere. It works, because the constructor for CMSCollector sets a static variable in ConcurrentMarkSweepGeneration, but it isn't exactly beautiful :) Don't change this now, I just wanted to point it out, since the code looks a bit mysterious.

>> - maybe skip the // success comment here:
>> 111 return true; // success
> That was probably pre-existing too. Should be thankful that it did not
> say return true; // return true :-P

>> - is it possible to end up in CMSHeap::should_do_concurrent_full_gc()
>> if we are not using CMS? As in:
>> 123 bool CMSHeap::should_do_concurrent_full_gc(GCCause::Cause cause) {
>> 124   if (!UseConcMarkSweepGC) {
>> 125     return false;
>> 126   }
> Duh. Fixed.

>> - one extra newline here:
>> 135
>> 136

>> genCollectedHeap.hpp:
>> - I don't think you have to make _skip_header_HeapWords protected.
>>   Instead I think we can make skip_header_HeapWords() virtual, make it
>>   return 0 in GenCollectedHeap and return
>>   CMSCollector::skip_header_HeapWords in CMSHeap and just remove the
>>   _skip_header_HeapWords variable.
> Great catch! I love it when refactoring leads to simplifications...
> Fixed.

>> - do you really need #ifdef ASSERT around check_gen_kinds?
> No, not really.

>> - can you make GCH_strong_roots_tasks a protected enum in
>> GenCollectedHeap? As in
>> class GenCollectedHeap : public CollectedHeap {
>>  protected:
>>   enum StrongRootTasks {
>>     GCH_PS_Universe_oops_do,
>>   };
>> };
> Good idea. Done.

>> Have you thought about vmStructs.cpp, does it need any changes?
> No. I don't really know what needs to go in there. I added:
>
> declare_constant(CollectedHeap::CMSHeap) \
>
> just so that it's there next to the other heap types. Not sure what else
> may be needed, if anything?
This is for the serviceability agent. You will have to poke around in hotspot/src/jdk.hotspot.agent and see how GenCollectedHeap is used. Unfortunately I'm not that familiar with the agent, perhaps someone else can chime in here? Thanks, Erik > http://cr.openjdk.java.net/~rkennke/8179387/webrev.05/ > > > Better now? > > Roman > From thomas.schatzl at oracle.com Fri Jul 14 10:58:32 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Fri, 14 Jul 2017 12:58:32 +0200 Subject: RFR (M): 8184346: Clean up G1CMBitmap In-Reply-To: <6c0d1cca-6c08-0bda-f980-d3fe20e663ff@redhat.com> References: <1500024904.3458.8.camel@oracle.com> <6c0d1cca-6c08-0bda-f980-d3fe20e663ff@redhat.com> Message-ID: <1500029912.3458.26.camel@oracle.com> Hi Roman, On Fri, 2017-07-14 at 11:53 +0200, Roman Kennke wrote: > Hi Thomas, > > > Hi all, > > > > ? can I have reviews for this change that tries to clean up (and > > only clean up) the G1CMBitMap class (and the surrounding helper > > classes) a bit? > > > > What has been done: > > - fix naming > > - improve visibility of methods > > - remove superfluous code > > - make G1CMBitMapClosure pass a HeapWord* instead of a bitmap > > index, avoiding that the user code is cluttered with conversions > > from bitmap indices to HeapWords > > - remove inheritance between G1CMBitMap and G1CMBitMapRO, similar > > to the BitMap class make G1CMBitMapRO a "view" of G1CMBitMap. > > - remove unused code in G1CMBitMapRO > > - move method implementations into .inline.hpp file > ?The changes look good to me. Thanks for your review. > + return _cm->nextMarkBitMap()->is_marked((HeapWord *)obj); > > I'd write that as > > (HeapWord*) obj > > but I'm never quite sure what style is preferable in Hotspot ;-) I do not know either :) I would kind of prefer no space between cast and the variable, as casts to me are something like unary operators where we do not add a space between operator and variable either. I removed the space between the type and the star at least. 
> Are changes in g1FromCardCache.cpp/.hpp unrelated?

Yes, sorry. I will remove those and send out an extra RFR. I forgot to split them out.

> > The next CR JDK-8184347 will deal with moving G1CMBitmap* into
> > separate files.
> And while you're at it, you may want to move it to gc/shared and
> rename it to something like MarkBitmap?
> https://bugs.openjdk.java.net/browse/JDK-8180193

Not particularly against this change, but I think we should do the move and renaming separately when the change is actually required, i.e. just before there is another dependency on it.

Also, G1CMBitMap has hard dependencies on several other G1 specific classes, so I think it is too early to move it to the shared directory from that POV too.

New webrevs:
http://cr.openjdk.java.net/~tschatzl/8184346/webrev.1/ (full)
http://cr.openjdk.java.net/~tschatzl/8184346/webrev.0_to_1/ (diff)

Thanks,
  Thomas

From thomas.schatzl at oracle.com  Fri Jul 14 11:04:57 2017
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Fri, 14 Jul 2017 13:04:57 +0200
Subject: RFR (XS): 8184452: Add bounds checking for FromCardCache
Message-ID: <1500030297.3458.29.camel@oracle.com>

Hi all,

  can I have reviews for this change that adds asserts/bounds checking to the FromCardCache methods?

This helped me a lot to find crashes in some upcoming change, and I think it is useful to have. If you think it is not worth the trouble, feel free to tell me and I will retract the change.

CR:
https://bugs.openjdk.java.net/browse/JDK-8184452
Webrev:
http://cr.openjdk.java.net/~tschatzl/8184452/webrev/
Testing:
jprt

Thanks,
  Thomas

From shade at redhat.com  Fri Jul 14 11:12:18 2017
From: shade at redhat.com (Aleksey Shipilev)
Date: Fri, 14 Jul 2017 13:12:18 +0200
Subject: RFR (M): 8184346: Clean up G1CMBitmap
In-Reply-To: <1500029912.3458.26.camel@oracle.com>
References: <1500024904.3458.8.camel@oracle.com> <6c0d1cca-6c08-0bda-f980-d3fe20e663ff@redhat.com> <1500029912.3458.26.camel@oracle.com>
Message-ID: 

Hi Thomas,

On 07/14/2017 12:58 PM, Thomas Schatzl wrote:
>>> The next CR JDK-8184347 will deal with moving G1CMBitmap* into
>>> separate files.
>> And while you're at it, you may want to move it to gc/shared and
>> rename it to something like MarkBitmap?
>> https://bugs.openjdk.java.net/browse/JDK-8180193
>
> Not particularly against this change, but I think we should do the move
> and renaming separately when the change is actually required, i.e. just
> before there is another dependency on it.

I think this would be inconvenient, because when "another dependency" would come in a large webrev, it would have to include the CMBitmap move too, complicating reviews. It seems pulling the actual non-G1-specific parts to shared is good to minimize those changes.

Would you like us to take the CMBitmap rename and move to shared/ then, after you do the G1-local move?

Thanks,
-Aleksey

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: OpenPGP digital signature URL: From rkennke at redhat.com Fri Jul 14 11:12:37 2017 From: rkennke at redhat.com (Roman Kennke) Date: Fri, 14 Jul 2017 13:12:37 +0200 Subject: RFR (M): 8184346: Clean up G1CMBitmap In-Reply-To: <1500029912.3458.26.camel@oracle.com> References: <1500024904.3458.8.camel@oracle.com> <6c0d1cca-6c08-0bda-f980-d3fe20e663ff@redhat.com> <1500029912.3458.26.camel@oracle.com> Message-ID: <5bdc1a11-1159-7e81-ca31-fd96719f89c0@redhat.com> Hi Thomas, > Hi Roman, > > On Fri, 2017-07-14 at 11:53 +0200, Roman Kennke wrote: >> Hi Thomas, >> >>> Hi all, >>> >>> can I have reviews for this change that tries to clean up (and >>> only clean up) the G1CMBitMap class (and the surrounding helper >>> classes) a bit? >>> >>> What has been done: >>> - fix naming >>> - improve visibility of methods >>> - remove superfluous code >>> - make G1CMBitMapClosure pass a HeapWord* instead of a bitmap >>> index, avoiding that the user code is cluttered with conversions >>> from bitmap indices to HeapWords >>> - remove inheritance between G1CMBitMap and G1CMBitMapRO, similar >>> to the BitMap class make G1CMBitMapRO a "view" of G1CMBitMap. >>> - remove unused code in G1CMBitMapRO >>> - move method implementations into .inline.hpp file >> The changes look good to me. > Thanks for your review. > >> + return _cm->nextMarkBitMap()->is_marked((HeapWord *)obj); >> >> I'd write that as >> >> (HeapWord*) obj >> >> but I'm never quite sure what style is preferable in Hotspot ;-) > I do not know either :) I would kind of prefer no space between cast > and the variable, as casts to me are something like unary operators > where we do not add a space between operator and variable either. > > I removed the space between the type and the star at least. Fine for me. >> Are changes in g1FromCardCache.cpp/.hpp unrelated? > Yes, sorry. I will remove those and send out an extra RFR. I forgot to > split them out. Ok. 
>>> The next CR JDK-8184347 will deal with moving G1CMBitmap* into
>>> separate files.
>> And while you're at it, you may want to move it to gc/shared and
>> rename it to something like MarkBitmap?
>> https://bugs.openjdk.java.net/browse/JDK-8180193
>
> Not particularly against this change, but I think we should do the move
> and renaming separately when the change is actually required, i.e. just
> before there is another dependency on it.

That's fine for me. Just wanted to point out that this is going to come :-)

> New webrevs:
> http://cr.openjdk.java.net/~tschatzl/8184346/webrev.1/ (full)
> http://cr.openjdk.java.net/~tschatzl/8184346/webrev.0_to_1/ (diff)

Good for me.

Roman

From rkennke at redhat.com  Fri Jul 14 11:15:09 2017
From: rkennke at redhat.com (Roman Kennke)
Date: Fri, 14 Jul 2017 13:15:09 +0200
Subject: RFR (XS): 8184452: Add bounds checking for FromCardCache
In-Reply-To: <1500030297.3458.29.camel@oracle.com>
References: <1500030297.3458.29.camel@oracle.com>
Message-ID: 

Am 14.07.2017 um 13:04 schrieb Thomas Schatzl:
> Hi all,
>
> can I have reviews for this change that adds asserts/bounds checking
> to the FromCardCache methods?
>
> This helped me a lot to find crashes in some upcoming change, and I
> think it is useful to have. If you think it is not worth the trouble,
> feel free to tell me and I will retract the change.
>
> CR:
> https://bugs.openjdk.java.net/browse/JDK-8184452
> Webrev:
> http://cr.openjdk.java.net/~tschatzl/8184452/webrev/
> Testing:
> jprt
>
> Thanks,
> Thomas

I'm all for more asserts if it helps to figure out bugs, so yes. Change looks good too.

Roman (not official reviewer)

From thomas.schatzl at oracle.com  Fri Jul 14 11:19:18 2017
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Fri, 14 Jul 2017 13:19:18 +0200
Subject: RFR (S/M): 8184348: Merge G1ConcurrentMark::par_mark() and G1ConcurrentMark::grayRoot()
Message-ID: <1500031158.3458.41.camel@oracle.com>

Hi all,
  could I get reviews for this refactoring change that merges G1ConcurrentMark::par_mark() and G1ConcurrentMark::grayRoot() into one single method G1ConcurrentMark::mark_in_next_bitmap() that factors out all code that is otherwise done multiple times separately. I.e. checking that the given address is smaller than nTAMS, asserts, dirty card mark check and the actual card mark into a single file that is then called everywhere.

I also removed some superfluous asserts that are subsumed in previous asserts or methods used.

It can also be seen as start of cleaning up G1ConcurrentMark.

I intentionally left both G1ParCopyHelper::mark_object() and G1ParCopyHelper::mark_forwarded_object(), although they look like they could be merged. First, it does not seem worthwhile because their semantics and asserts seem to be separate enough, second, some probably not-so-distant future change will need them separate again :P If you really want I could do that nevertheless.

Note that this change depends on the recent G1CMBitMap cleanup in JDK-8184346 (but not on the move of G1CMBitMap into separate files that is referenced in the webrev - the file touched just happens to be in the mq stack).

CR:
https://bugs.openjdk.java.net/browse/JDK-8184348
Webrev:
http://cr.openjdk.java.net/~tschatzl/8184348/webrev/
Testing:
jprt, some additional local hotspot test runs

Thanks,
  Thomas

From shade at redhat.com  Fri Jul 14 11:20:44 2017
From: shade at redhat.com (Aleksey Shipilev)
Date: Fri, 14 Jul 2017 13:20:44 +0200
Subject: RFR (XS): 8184452: Add bounds checking for FromCardCache
In-Reply-To: <1500030297.3458.29.camel@oracle.com>
References: <1500030297.3458.29.camel@oracle.com>
Message-ID: 

On 07/14/2017 01:04 PM, Thomas Schatzl wrote:
> Webrev:
> http://cr.openjdk.java.net/~tschatzl/8184452/webrev/

Looks good.

-Aleksey

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: OpenPGP digital signature URL: From rkennke at redhat.com Fri Jul 14 11:22:08 2017 From: rkennke at redhat.com (Roman Kennke) Date: Fri, 14 Jul 2017 13:22:08 +0200 Subject: RFR: 8179387: Factor out CMS specific code from GenCollectedHeap into its own subclass In-Reply-To: <3ec8a6a3-5a4b-a910-f6ec-ed1c0dad4cad@oracle.com> References: <3521009f-6fab-4f8e-2375-b9d665a4c70b@redhat.com> <3d8b55a2-a787-3051-b351-ab9b0a24f5e0@redhat.com> <47e22e86-7d7c-606f-1936-346229f39ca2@oracle.com> <9a846161-c8ac-dedf-5952-f457d546fd9a@redhat.com> <4d5e6af8-d975-7803-64c5-7295e0d56154@redhat.com> <13358626-e399-e352-1711-587416621aac@redhat.com> <27af0ad2-fe78-3536-2143-996dd42583ab@oracle.com> <4bc53aaa-b98a-8a61-73bf-d30ac3f402b8@redhat.com> <666af7f2-27e9-48c6-91e4-eaefa5289e18@redhat.com> <3ec8a6a3-5a4b-a910-f6ec-ed1c0dad4cad@oracle.com> Message-ID: <5417889c-5289-37cd-eb31-a2b55f70e85e@redhat.com> Hi Erik, > On 07/10/2017 04:10 PM, Roman Kennke wrote: >> Am 10.07.2017 um 15:13 schrieb Erik Helin: >>> On 07/07/2017 03:21 PM, Roman Kennke wrote: >>>> Am 07.07.2017 um 14:35 schrieb Erik Helin: >>>>> On 07/06/2017 06:23 PM, Roman Kennke wrote: >>>>>>>>> Ok to push this? >>>>>>> >>>>>>> I just realized that your change doesn't build on Windows since you >>>>>>> didn't #include "precompiled.hpp" in cmsHeap.cpp. MSVC is really >>>>>>> picky >>>>>>> about that. >>>>>>> /Mikael >>>>>> >>>>>> Uhhh. >>>>>> Ok, here's revision #3 with precompiled added in: >>>>>> >>>>>> http://cr.openjdk.java.net/~rkennke/8179387/webrev.03/ >>>>>> >>>>> >>>>> Hi Roman, >>>>> >>>>> I just started looking :) I think GenCollectedHeap::gc_prologue and >>>>> GenCollectedHeap::gc_epilogue should be virtual, and >>>>> always_do_update_barrier = UseConcMarkSweepGC moved down >>>>> CMSHeap::gc_epilogue. >>>>> >>>>> What do you think? >>>> >>>> Yes, I have seen that. My original plan was to leave it as is >>>> because I >>>> know that Erik ?. 
is working on a big barrier set refactoring that >>>> would >>>> remove this code anyway. However, it doesn't really matter, here's the >>>> cleaned up patch: >>>> >>>> http://cr.openjdk.java.net/~rkennke/8179387/webrev.04/ >>>> >>> >>> A few comments: >>> >>> cmsHeap.hpp: >>> - you are missing quite a few #includes, but it works since >>> genCollectedHeap.hpp #includes a whole lot of stuff. Not >>> necessary to >>> fix now, because the "missing #include" will start to pop up when >>> someone tries to break apart GenCollectedHeap into smaller pieces. >> Right. >> I always try to minimize includes, especially in header files (they are >> bound to proliferate later anyway). In addition to that, if a class is >> only referenced as pointer, I avoid includes and use forward class >> definition instead. > > I think that we in general try to include what is needed, not what > only what makes the code compile (header guards will of course ensure > that the header files are only parsed once). So in cmsHeap.hpp, at > least I would have added: > > #include "gc/cms/concurrentMarkSweepGeneration.hpp" > #include "gc/shared/collectedHeap.hpp" > #include "gc/shared/gcCause.hpp" > > and forward declared: > > class CLDClosure; > class OopsInGenClosure; > class outputStream; > class StrongRootsScope; > class ThreadClosure; Ok, added those and some more that I found. Not sure why we'd need #include "gc/cms/concurrentMarkSweepGeneration.hpp" ? Left that out for now. >>> >>> - why are gc_prologue and gc_epilogue protected in CMSHeap? Can't they >>> be private in CMSHeap? >> They are virtual and protected in GenCollectedHeap and called by >> GenCollectedHeap. Makes sense to also make them protected in CMSHeap? Or >> am I missing something? >> >>> - there are two `private:` blocks, please use only one `private:` >>> block. >>> >> Fixed. > > And now there is two `protected:` blocks, immediately after each other: > Duh. Fixed. > IMO, I would just make the three functions above private. 
I know they > are protected in GenCollectedHeap, but it should be fine to have them > private in CMSHeap. Having them protected signals, at least to me, > that this class could be considered as a base class (protected to me > reads "this can be accessed by classes inheriting from this class), > and we don't want any class to inherit from CMSHeap. How can they be called from the superclass if they are private in the subclass? Would that work in C++? protected (to me) means visibility between super and subclasses. If I'd want to signal that I intend that to be overridden, I'd say 'virtual'. > Sorry, reading the code again it is obvious that create_cms_collector > never can return false. It either returns true or calls > vm_shutdown_during_initialization (which will not return). So, I would > just make create_cms_collctor void, the if branch below is dead code: > > 51 if (!success) return JNI_ENOMEM; > Right! Very good catch! Changed that. > Btw, this code looks really fishy :) Err, yep. I'll make a note somewhere (in bugs.o.j.n) to fix that later. > This is for the serviceability agent. You will have to poke around in > hotspot/src/jdk.hotspot.agent and see how GenCollectedHeap is used. > Unfortunately I'm not that familiar with the agent, perhaps someone > else can chime in here? Considering that the remaining references to GenCollectedHeap in vmStructs.cpp don't look like related to CMSHeap, I'd argue that what I did is all that's needed for now. Do you agree? http://cr.openjdk.java.net/~rkennke/8179387/webrev.06.diff/ http://cr.openjdk.java.net/~rkennke/8179387/webrev.06/ Thanks for reviewing! 
Roman From rkennke at redhat.com Fri Jul 14 11:24:47 2017 From: rkennke at redhat.com (Roman Kennke) Date: Fri, 14 Jul 2017 13:24:47 +0200 Subject: RFR (M): 8184346: Clean up G1CMBitmap In-Reply-To: References: <1500024904.3458.8.camel@oracle.com> <6c0d1cca-6c08-0bda-f980-d3fe20e663ff@redhat.com> <1500029912.3458.26.camel@oracle.com> Message-ID: <990f6578-14d4-322d-7f51-9b93d92f8b20@redhat.com> Am 14.07.2017 um 13:12 schrieb Aleksey Shipilev: > Hi Thomas, > > On 07/14/2017 12:58 PM, Thomas Schatzl wrote: >>>> The next CR JDK-8184347 will deal with moving G1CMBitmap* into >>>> separate >>>> files. >>> And while you're at it, you may want to move it to gc/shared and >>> renamed it to something like MarkBitmap? >>> https://bugs.openjdk.java.net/browse/JDK-8180193 >>> >> Not particularly against this change, but I think we should do the move >> and renaming separately when the change is actually required, i.e. just >> before there is another dependency on it. > I think this would be inconvenient, because when "another dependency" would come > in a large webrev, it would have to include the CMBitmap move too, complicating > reviews. I understood it such that we would post the moving around of gc/g1 files to gc/shared right before we'd post Shenandoah (in the not-so-distant future, hopefully). That would work for me. I wouldn't like to include everything in a giant webrev :-P Roman From shade at redhat.com Fri Jul 14 11:25:53 2017 From: shade at redhat.com (Aleksey Shipilev) Date: Fri, 14 Jul 2017 13:25:53 +0200 Subject: RFR (S/M): 8184348: Merge G1ConcurrentMark::par_mark() and G1ConcurrentMark::grayRoot() In-Reply-To: <1500031158.3458.41.camel@oracle.com> References: <1500031158.3458.41.camel@oracle.com> Message-ID: On 07/14/2017 01:19 PM, Thomas Schatzl wrote: > Webrev: > http://cr.openjdk.java.net/~tschatzl/8184348/webrev/ *) I'd probably split the assert with newlines. Makes webrevs tidier! 
*) This is not needed, because par_mark already has the optimistic check, down below in Bitmap::par_set_bit? 54 // Dirty read to avoid CAS. 55 if (_nextMarkBitMap->is_marked(obj_addr)) { 56 return false; 57 } *) So, mark_reference_grey used to be called from G1CMSATBBufferClosure on objects below TAMS, but now it would get called on objects past TAMS too? Doesn't G1 verify there are no bits set in bitmap past TAMS (G1HeapVerifier::verify_no_bits_over_tams)? Thanks, -Aleksey -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: OpenPGP digital signature URL: From erik.helin at oracle.com Fri Jul 14 12:00:14 2017 From: erik.helin at oracle.com (Erik Helin) Date: Fri, 14 Jul 2017 14:00:14 +0200 Subject: RFR: 8179387: Factor out CMS specific code from GenCollectedHeap into its own subclass In-Reply-To: <5417889c-5289-37cd-eb31-a2b55f70e85e@redhat.com> References: <3521009f-6fab-4f8e-2375-b9d665a4c70b@redhat.com> <3d8b55a2-a787-3051-b351-ab9b0a24f5e0@redhat.com> <47e22e86-7d7c-606f-1936-346229f39ca2@oracle.com> <9a846161-c8ac-dedf-5952-f457d546fd9a@redhat.com> <4d5e6af8-d975-7803-64c5-7295e0d56154@redhat.com> <13358626-e399-e352-1711-587416621aac@redhat.com> <27af0ad2-fe78-3536-2143-996dd42583ab@oracle.com> <4bc53aaa-b98a-8a61-73bf-d30ac3f402b8@redhat.com> <666af7f2-27e9-48c6-91e4-eaefa5289e18@redhat.com> <3ec8a6a3-5a4b-a910-f6ec-ed1c0dad4cad@oracle.com> <5417889c-5289-37cd-eb31-a2b55f70e85e@redhat.com> Message-ID: On 07/14/2017 01:22 PM, Roman Kennke wrote: > Hi Erik, > >> On 07/10/2017 04:10 PM, Roman Kennke wrote: >>> Am 10.07.2017 um 15:13 schrieb Erik Helin: >>>> On 07/07/2017 03:21 PM, Roman Kennke wrote: >>>>> Am 07.07.2017 um 14:35 schrieb Erik Helin: >>>>>> On 07/06/2017 06:23 PM, Roman Kennke wrote: >>>>>>>>>> Ok to push this? 
>>>>>>>> >>>>>>>> I just realized that your change doesn't build on Windows since you >>>>>>>> didn't #include "precompiled.hpp" in cmsHeap.cpp. MSVC is really >>>>>>>> picky >>>>>>>> about that. >>>>>>>> /Mikael >>>>>>> >>>>>>> Uhhh. >>>>>>> Ok, here's revision #3 with precompiled added in: >>>>>>> >>>>>>> http://cr.openjdk.java.net/~rkennke/8179387/webrev.03/ >>>>>>> >>>>>> >>>>>> Hi Roman, >>>>>> >>>>>> I just started looking :) I think GenCollectedHeap::gc_prologue and >>>>>> GenCollectedHeap::gc_epilogue should be virtual, and >>>>>> always_do_update_barrier = UseConcMarkSweepGC moved down >>>>>> CMSHeap::gc_epilogue. >>>>>> >>>>>> What do you think? >>>>> >>>>> Yes, I have seen that. My original plan was to leave it as is >>>>> because I >>>>> know that Erik ?. is working on a big barrier set refactoring that >>>>> would >>>>> remove this code anyway. However, it doesn't really matter, here's the >>>>> cleaned up patch: >>>>> >>>>> http://cr.openjdk.java.net/~rkennke/8179387/webrev.04/ >>>>> >>>> >>>> A few comments: >>>> >>>> cmsHeap.hpp: >>>> - you are missing quite a few #includes, but it works since >>>> genCollectedHeap.hpp #includes a whole lot of stuff. Not >>>> necessary to >>>> fix now, because the "missing #include" will start to pop up when >>>> someone tries to break apart GenCollectedHeap into smaller pieces. >>> Right. >>> I always try to minimize includes, especially in header files (they are >>> bound to proliferate later anyway). In addition to that, if a class is >>> only referenced as pointer, I avoid includes and use forward class >>> definition instead. >> >> I think that we in general try to include what is needed, not what >> only what makes the code compile (header guards will of course ensure >> that the header files are only parsed once). 
So in cmsHeap.hpp, at >> least I would have added: >> >> #include "gc/cms/concurrentMarkSweepGeneration.hpp" >> #include "gc/shared/collectedHeap.hpp" >> #include "gc/shared/gcCause.hpp" >> >> and forward declared: >> >> class CLDClosure; >> class OopsInGenClosure; >> class outputStream; >> class StrongRootsScope; >> class ThreadClosure; > Ok, added those and some more that I found. Not sure why we'd need > #include "gc/cms/concurrentMarkSweepGeneration.hpp" ? Left that out for now. Because you are accessing CMSCollector in: 99 NOT_PRODUCT( 100 virtual size_t skip_header_HeapWords() { return CMSCollector::skip_header_HeapWords(); } 101 ) and CMSCollector is declared in concurrentMarkSweepGeneration.hpp. An alternative would of course be to just declare skip_header_HeapWords() in cmsHeap.hpp and define skip_header_HeapWords in cmsHeap.cpp, then you only need to include concurrentMarkSweepGeneration.hpp in cmsHeap.cpp. >>>> >>>> - why are gc_prologue and gc_epilogue protected in CMSHeap? Can't they >>>> be private in CMSHeap? >>> They are virtual and protected in GenCollectedHeap and called by >>> GenCollectedHeap. Makes sense to also make them protected in CMSHeap? Or >>> am I missing something? >>> >>>> - there are two `private:` blocks, please use only one `private:` >>>> block. >>>> >>> Fixed. >> >> And now there are two `protected:` blocks, immediately after each other: >> > Duh. Fixed. > >> IMO, I would just make the three functions above private. I know they >> are protected in GenCollectedHeap, but it should be fine to have them >> private in CMSHeap. Having them protected signals, at least to me, >> that this class could be considered as a base class (protected to me >> reads "this can be accessed by classes inheriting from this class), >> and we don't want any class to inherit from CMSHeap. > > How can they be called from the superclass if they are private in the > subclass? Would that work in C++? 
> > protected (to me) means visibility between super and subclasses. If I'd > want to signal that I intend that to be overridden, I'd say 'virtual'. It is perfectly fine to have private virtual methods in C++ (see for example https://stackoverflow.com/questions/2170688/private-virtual-method-in-c). A virtual function only needs to be protected if a "child class" needs to access the function in the "parent class". For both gc_prologue and gc_epilogue, this is the case, which is why they have to be 'protected' in GenCollectedHeap. But, no class is going to derive from CMSHeap, so they can be private in CMSHeap. skip_header_HeapWords needs to be virtual, since classes inheriting from GenCollectedHeap might want to change its behavior. However, no class inheriting from GenCollectedHeap (only CMSHeap so far) needs to call GenCollectedHeap::skip_header_HeapWords, so it can actually be private virtual in GenCollectedHeap. But, in order to not confuse readers, it might better to keep it protected virtual in GenCollectedHeap. There is no reason to have skip_header_HeapWords protected in CMSHeap though, there it should be declared private (and potentially virtual, since override comes first in C++11). >> Sorry, reading the code again it is obvious that create_cms_collector >> never can return false. It either returns true or calls >> vm_shutdown_during_initialization (which will not return). So, I would >> just make create_cms_collctor void, the if branch below is dead code: >> >> 51 if (!success) return JNI_ENOMEM; >> > Right! Very good catch! Changed that. > >> Btw, this code looks really fishy :) > Err, yep. I'll make a note somewhere (in bugs.o.j.n) to fix that later. >> This is for the serviceability agent. You will have to poke around in >> hotspot/src/jdk.hotspot.agent and see how GenCollectedHeap is used. >> Unfortunately I'm not that familiar with the agent, perhaps someone >> else can chime in here? 
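The dispatch rule described above - a private override in the subclass is still reached by virtual calls made from the superclass, because access control is checked against the static type at the call site - can be shown with a minimal standalone sketch (BaseHeap/DerivedHeap and the return values are made up for illustration, not the actual HotSpot classes):

```cpp
#include <cassert>

class BaseHeap {
public:
  // Template-method style: the base class drives the sequence and calls
  // the virtual hooks; derived classes only customize behavior.
  int collect() { return gc_prologue() + gc_epilogue(); }
protected:
  // Protected here because subclasses of BaseHeap may want to call the
  // base implementations.
  virtual int gc_prologue() { return 1; }
  virtual int gc_epilogue() { return 2; }
};

class DerivedHeap : public BaseHeap {
private:
  // Private overrides are legal: the call site in BaseHeap::collect()
  // is access-checked against BaseHeap, where the methods are visible,
  // yet virtual dispatch still lands here at run time.
  virtual int gc_prologue() { return 10; }
  virtual int gc_epilogue() { return 20; }
};

int run() {
  DerivedHeap h;
  BaseHeap* base = &h;
  return base->collect();  // dispatches to DerivedHeap's private overrides
}
```

Making the overrides private in the leaf class then simply documents that nothing is expected to inherit from it.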
> > Considering that the remaining references to GenCollectedHeap in > vmStructs.cpp don't look like related to CMSHeap, I'd argue that what I > did is all that's needed for now. Do you agree? Honestly, I don't know, that is why I asked if someone else with more knowledge in this area can comment. Have you tried building and using the SA agent with your change? You can also ask around on hotspot-rt-dev and or serviceability-dev. Thanks, Erik > http://cr.openjdk.java.net/~rkennke/8179387/webrev.06.diff/ > > http://cr.openjdk.java.net/~rkennke/8179387/webrev.06/ > > > Thanks for reviewing! > Roman > From thomas.schatzl at oracle.com Fri Jul 14 12:09:40 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Fri, 14 Jul 2017 14:09:40 +0200 Subject: RFR (M): 8184346: Clean up G1CMBitmap In-Reply-To: <990f6578-14d4-322d-7f51-9b93d92f8b20@redhat.com> References: <1500024904.3458.8.camel@oracle.com> <6c0d1cca-6c08-0bda-f980-d3fe20e663ff@redhat.com> <1500029912.3458.26.camel@oracle.com> <990f6578-14d4-322d-7f51-9b93d92f8b20@redhat.com> Message-ID: <1500034180.3458.67.camel@oracle.com> Hi Roman, On Fri, 2017-07-14 at 13:24 +0200, Roman Kennke wrote: > Am 14.07.2017 um 13:12 schrieb Aleksey Shipilev: > > > > Hi Thomas, > > > > On 07/14/2017 12:58 PM, Thomas Schatzl wrote: > > > > > > > > > > > > > > > > > The next CR JDK-8184347 will deal with moving G1CMBitmap* > > > > > into separate files. > > > > ?And while you're at it, you may want to move it to gc/shared > > > > and renamed it to something like MarkBitmap? > > > > https://bugs.openjdk.java.net/browse/JDK-8180193 > > > > > > > Not particularly against this change, but I think we should do > > > the move and renaming separately when the change is actually > > > required, i.e. just before there is another dependency on it. > > I think this would be inconvenient, because when "another > > dependency" would come in a large webrev, it would have to include > > the CMBitmap move too, complicating reviews. 
> I understood it such that we would post the moving around of gc/g1 > files to gc/shared right before we'd post Shenandoah (in the not-so- > distant future, hopefully). That would work for me. I wouldn't like > to include everything in a giant webrev :-P that is exactly what I meant - thanks for your understanding. Thomas From thomas.schatzl at oracle.com Fri Jul 14 12:20:00 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Fri, 14 Jul 2017 14:20:00 +0200 Subject: RFR (S/M): 8184348: Merge G1ConcurrentMark::par_mark() and G1ConcurrentMark::grayRoot() In-Reply-To: References: <1500031158.3458.41.camel@oracle.com> Message-ID: <1500034800.3458.75.camel@oracle.com> Hi Aleksey, thanks for looking into this. On Fri, 2017-07-14 at 13:25 +0200, Aleksey Shipilev wrote: > On 07/14/2017 01:19 PM, Thomas Schatzl wrote: > > > > Webrev: > > http://cr.openjdk.java.net/~tschatzl/8184348/webrev/ > *) I'd probably split the assert with newlines. Makes webrevs tidier! Not completely sure what you are referring to, but I split some very long asserts across lines. As for the asserts themselves, I tend to group them together in blocks separate from actual code with newlines. But there are (often) no newlines between subsequent asserts. > *) This is not needed, because par_mark already has the optimistic > check, down > below in Bitmap::par_set_bit?
>
>   54   // Dirty read to avoid CAS.
>   55   if (_nextMarkBitMap->is_marked(obj_addr)) {
>   56     return false;
>   57   }
Thanks for catching this, I simply copied this check from the former grayRoot() method... :) > *) So, mark_reference_grey used to be called from > G1CMSATBBufferClosure on > objects below TAMS, but now it would get called on objects past TAMS > too? 
CMTask::make_reference_grey() now calls G1ConcurrentMark::mark_in_next_bitmap(), not ConcurrentMark::par_mark() which does not exist any more: G1ConcurrentMark::mark_in_next_bitmap() in the first check filters out marking attempts above nTAMS (g1ConcurrentMark.inline.hpp:47 now), returning false, which makes make_reference_grey() exit immediately in that case. This seems to achieve the same effect. See the comment in g1ConcurrentMark.inline.hpp:51 too, which refers to that issue. (The documentation of G1ConcurrentMark::mark_in_next_bitmap() also mentions that: "Mark the given object on the next bitmap if it is below nTAMS") Indeed, I tripped over this when trying to refactor this, and I did do runs of some gc stress applications with verification on (actually that issue is also caught during check-in by jprt tests). :) If you are worried whether there is a performance difference because maybe now we do more work in some cases, all paths previously leading to the former G1ConcurrentMark::par_mark() did the nTAMS check in one way or another already (of course in inconsistent fashion) so there should be no change here. There may be some further optimizations to be done here (like for marking during initial mark pause, as e.g. survivor region nTAMS == bottom so we will never put a mark for them), but those I would prefer to do in an extra CR, unless they are dead simple like the duplicated marking check. But please feel free to mention them, I may pick them up immediately afterwards :) > Doesn't G1 verify there are no bits set in bitmap past TAMS > (G1HeapVerifier::verify_no_bits_over_tams)? It does, but as mentioned above, these mark attempts past nTAMS should be filtered out as expected. New webrevs: http://cr.openjdk.java.net/~tschatzl/8184348/webrev.1/ (full) http://cr.openjdk.java.net/~tschatzl/8184348/webrev.0_to_1/ (diff) Thanks a lot, 
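The control flow described above (filter above nTAMS first, then let par_set_bit do the cheap racy read before the CAS) can be modeled with a small standalone sketch; the types below are simplified stand-ins invented for illustration, not the real G1 classes:

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Simplified stand-in: a real G1CMBitMap covers the whole heap, one bit
// per possible object start; a 64-bit word suffices to show the idea.
struct MarkBitMap {
  std::atomic<uint64_t> _bits{0};

  // The "dirty read" lives inside par_set_bit, so callers do not need to
  // duplicate it (the duplicated check was what the review removed).
  bool par_set_bit(unsigned bit) {
    uint64_t old_bits = _bits.load(std::memory_order_relaxed);
    while (!(old_bits & (1ull << bit))) {  // racy check: skip CAS if set
      if (_bits.compare_exchange_weak(old_bits, old_bits | (1ull << bit))) {
        return true;                       // this thread set the bit
      }                                    // on failure, old_bits reloads
    }
    return false;                          // some other thread set it
  }
};

struct Region {
  unsigned _ntams;  // "next top-at-mark-start": addresses at/above it
                    // are allocated during marking and implicitly live
};

// Model of mark_in_next_bitmap(): returns true only if *this* caller
// actually marked the object.
bool mark_in_next_bitmap(MarkBitMap& bm, const Region& r, unsigned obj) {
  if (obj >= r._ntams) {
    return false;  // filtered: never set bits past nTAMS
  }
  return bm.par_set_bit(obj);
}
```

Returning false for addresses past nTAMS is what lets the caller bail out early, and it also keeps a no-bits-over-TAMS verification pass happy, matching the discussion above.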
Thomas From shade at redhat.com Fri Jul 14 13:18:43 2017 From: shade at redhat.com (Aleksey Shipilev) Date: Fri, 14 Jul 2017 15:18:43 +0200 Subject: RFR (S/M): 8184348: Merge G1ConcurrentMark::par_mark() and G1ConcurrentMark::grayRoot() In-Reply-To: <1500034800.3458.75.camel@oracle.com> References: <1500031158.3458.41.camel@oracle.com> <1500034800.3458.75.camel@oracle.com> Message-ID: On 07/14/2017 02:20 PM, Thomas Schatzl wrote: > Not completely sure what you are referring to, but I split some very > long asserts across lines. Yes, I meant that, sorry for not being clear. Any webrev that requires me to scroll horizontally on 2560-pixel wide screen triggers me! >> *) So, mark_reference_grey used to be called from >> G1CMSATBBufferClosure on >> objects below TAMS, but now it would get called on objects past TAMS >> too? > > CMTask::make_reference_grey() now calls > G1ConcurrentMark::mark_in_next_bitmap(), not ConcurrentMark::par_mark() > which does not exist any more: G1ConcurrentMark::mark_in_next_bitmap() > in the first check filters out marking attempts above nTAMS > (g1ConcurrentMark.inline.hpp:47 now), returning false, which makes > make_reference_grey() exit immediately in that case. This seems to > achieve the same effect. Ah, I missed that part! I agree this part is fine then. > If you are worried whether there is a performance difference because > maybe now we do more work in some cases, all paths previously leading > to the former G1ConcurrentMark::par_mark() did the nTAMS check in one > way or another already (of course in inconsistent fashion) so there > should be no change here. No, I am not worried. SATB-heavy workloads have problems way beyond bitmap marking :) > New webrevs: > http://cr.openjdk.java.net/~tschatzl/8184348/webrev.1/ (full) Looks good to me. -Aleksey -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: OpenPGP digital signature URL: From thomas.schatzl at oracle.com Fri Jul 14 14:34:30 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Fri, 14 Jul 2017 16:34:30 +0200 Subject: RFR (S/M): 8184348: Merge G1ConcurrentMark::par_mark() and G1ConcurrentMark::grayRoot() In-Reply-To: References: <1500031158.3458.41.camel@oracle.com> <1500034800.3458.75.camel@oracle.com> Message-ID: <1500042870.3458.84.camel@oracle.com> Hi again, On Fri, 2017-07-14 at 15:18 +0200, Aleksey Shipilev wrote: > On 07/14/2017 02:20 PM, Thomas Schatzl wrote: > > > > Not completely sure what you are referring to, but I split some > > very > > long asserts across lines. > Yes, I meant that, sorry for not being clear. Any webrev that > requires me to scroll horizontally on 2560-pixel wide screen triggers > me! I noticed that too :) > > > > > > *) So, mark_reference_grey used to be called from > > > G1CMSATBBufferClosure on > > > objects below TAMS, but now it would get called on objects past > > > TAMS > > > too? > > CMTask::make_reference_grey() now calls > > G1ConcurrentMark::mark_in_next_bitmap(), not > > ConcurrentMark::par_mark() > > which does not exist any more: > > G1ConcurrentMark::mark_in_next_bitmap() > > in the first check filters out marking attempts above nTAMS > > (g1ConcurrentMark.inline.hpp:47 now), returning false, which makes > > make_reference_grey() exit immediately in that case. This seems to > > achieve the same effect. > Ah, I missed that part! I agree this part is fine then. > > > > > If you are worried whether there is a performance difference > > because maybe now we do more work in some cases, all paths > > previously leading to the former G1ConcurrentMark::par_mark() did > > the nTAMS check in one way or another already (of course in > > inconsistent fashion) so there should be no change here. > No, I am not worried. 
SATB-heavy workloads have problems way beyond > bitmap marking :) > > > > New webrevs: > > http://cr.openjdk.java.net/~tschatzl/8184348/webrev.1/ (full) > Looks good to me. Thanks. Unfortunately, after re-applying and fixing other changes based on this one I noticed that I missed one opportunity to refactor in G1CMTask::deal_with_reference(). I would like to add this to this changeset still... sorry. There is some note about some perf optimization that mentions that it is advantageous to do the nTAMS check before determining the heap region; however I do not think this is an issue. Quickly comparing runs of a fairly large and reference-intensive workload (BigRAMTester with 20g heap, e.g. attached to JDK-8152438), marking cycles with the latest webrev.2 are at least as fast as without any of this RFR's changes. New webrevs: http://cr.openjdk.java.net/~tschatzl/8184348/webrev.1_to_2 (diff) http://cr.openjdk.java.net/~tschatzl/8184348/webrev.2 (full) Thanks, Thomas From daniel.daugherty at oracle.com Fri Jul 14 22:48:34 2017 From: daniel.daugherty at oracle.com (Daniel D. 
Daugherty) Date: Fri, 14 Jul 2017 16:48:34 -0600 Subject: RFR: Parallelize safepoint cleanup In-Reply-To: <743ff172-88b6-30dc-8808-a4a97be4a571@oracle.com> References: <6f2c6de7-298b-bf14-ab1f-430c4acd43c9@redhat.com> <5cd676de-872d-6d4a-691b-da561173f7d0@oracle.com> <61d80e98-275f-b2b8-4ac7-6d5d03b047de@redhat.com> <0e44cc90-b384-0820-93d6-a70d22c501c3@oracle.com> <20E06CEC-38CA-41AE-99DB-17EF22A3C5CC@oracle.com> <58f2278e-b95c-4ec2-4f7d-9fefa3a281e4@redhat.com> <623c0dbf-9210-7c63-3546-4314c7d47f85@redhat.com> <29521e46-a5e8-5ff0-23a2-22eeee145389@oracle.com> <4445a727-060b-70f9-c8db-e9f70faae3d5@redhat.com> <37755fec-05b9-8d2c-7eb9-8849393c7485@oracle.com> <57cddde0-60e6-366e-489e-f6f9534e3ed9@redhat.com> <6fa761bc-8feb-74e6-9a54-8a65ab81203b@oracle.com> <5af9855a-652e-64f0-af83-e8f5962247ca@oracle.com> <743ff172-88b6-30dc-8808-a4a97be4a571@oracle.com> Message-ID: On 7/12/17 2:39 PM, Robbin Ehn wrote: > On 2017-07-12 15:32, Roman Kennke wrote: >> Hi Robbin and all, >> >> I fixed the 32bit failures by using jlong in all relevant places: >> >> http://cr.openjdk.java.net/~rkennke/8180932/webrev.14.diff/ >> > > Looks good! > >> >> then Robbin found another problem. SafepointCleanupTest started to fail, >> because "mark nmethods" is no longer printed. This made me think that >> we're not measuring the conflated (and possibly parallelized) >> deflate-idle-monitors+mark-nmethods pass. I added a TraceTime with >> "safepoint cleanup tasks" which measures the total duration of safepoint >> cleanup. We can't reasonably measure a possibly parallel and conflated >> pass standalone, but we can measure all and by subtrating all the other >> subphases, get an idea how long deflation and nmethod marking take up. >> >> http://cr.openjdk.java.net/~rkennke/8180932/webrev.15.diff/ >> > > Looks good and thanks for fixing > > It's time to ship this, can we have a second review please! 
> http://cr.openjdk.java.net/~rkennke/8180932/webrev.15/ src/share/vm/code/nmethod.hpp b/src/share/vm/code/nmethod.hpp No comments. src/share/vm/runtime/safepoint.cpp No comments. src/share/vm/runtime/safepoint.cpp No comments. test/runtime/logging/SafepointCleanupTest.java No comments. Thumbs up. Only looked at the files that changed relative to the last version that I reviewed (webrev.12, I think)... Dan > > /Robbin > >> >> The full webrev is now: >> >> http://cr.openjdk.java.net/~rkennke/8180932/webrev.15/ >> >> >> Hope that's all ;-) >> >> Roman >> >> Am 10.07.2017 um 21:22 schrieb Robbin Ehn: >>> Hi, unfortunately the push failed on 32-bit. >>> >>> (looks like _stack_traversal_mark should be jlong, I feel a bit guilty) >>> >>> I do not have anytime to look at this, so here is the error. >>> >>> /Robbin >>> >>> make[3]: Leaving directory '/opt/jprt/T/P1/185117.rehn/s/hotspot/make' >>> make/Main.gmk:263: recipe for target 'hotspot-client-libs' failed >>> In file included from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29:0, >>> >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33, >>> >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28: >>> >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp: In >>> member function 'long int nmethod::stack_traversal_mark()': >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:399:108: >>> >>> error: call of overloaded 'load_acquire(volatile long int*)' is >>> ambiguous >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:399:108: >>> >>> note: candidates are: >>> In file included from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/typeArrayOop.hpp:30:0, >>> >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/constantPool.hpp:32, >>> >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/method.hpp:34, >>> from >>> 
/opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/frame.hpp:28, >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/codeBlob.hpp:31, >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/compiledMethod.hpp:28, >>> >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:28, >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29, >>> >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33, >>> >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28: >>> >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:57:17: >>> >>> note: static jint OrderAccess::load_acquire(const volatile jint*) >>> >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:57:17: >>> >>> note: no known conversion for argument 1 from 'volatile long int*' >>> to 'const volatile jint* {aka const volatile int*}' >>> In file included from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/typeArrayOop.hpp:30:0, >>> >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/constantPool.hpp:32, >>> >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/method.hpp:34, >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/frame.hpp:28, >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/codeBlob.hpp:31, >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/compiledMethod.hpp:28, >>> >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:28, >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29, >>> >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33, >>> >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28: >>> >>> 
/opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:63:17: >>> >>> note: static juint OrderAccess::load_acquire(const volatile juint*) >>> >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:63:17: >>> >>> note: no known conversion for argument 1 from 'volatile long int*' >>> to 'const volatile juint* {aka const volatile unsigned int*}' >>> In file included from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29:0, >>> >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33, >>> >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28: >>> >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp: In >>> member function 'void nmethod::set_stack_traversal_mark(long int)': >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:400:105: >>> >>> error: call of overloaded 'release_store(volatile long int*, long >>> int&)' is ambiguous >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:400:105: >>> >>> note: candidates are: >>> In file included from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/typeArrayOop.hpp:30:0, >>> >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/constantPool.hpp:32, >>> >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/method.hpp:34, >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/frame.hpp:28, >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/codeBlob.hpp:31, >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/compiledMethod.hpp:28, >>> >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:28, >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29, >>> >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33, >>> >>> from >>> 
/opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28: >>> >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:71:17: >>> >>> note: static void OrderAccess::release_store(volatile jint*, jint) >>> >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:71:17: >>> >>> note: no known conversion for argument 1 from 'volatile long int*' >>> to 'volatile jint* {aka volatile int*}' >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:77:17: >>> >>> note: static void OrderAccess::release_store(volatile juint*, juint) >>> >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:77:17: >>> >>> note: no known conversion for argument 1 from 'volatile long int*' >>> to 'volatile juint* {aka volatile unsigned int*}' >>> >>> On 2017-07-10 20:50, Robbin Ehn wrote: >>>> I'll start a push now. >>>> >>>> /Robbin >>>> >>>> On 2017-07-10 12:38, Roman Kennke wrote: >>>>> Ok, so I guess I need a sponsor for this now: >>>>> >>>>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.12/ >>>>> >>>>> >>>>> Roman >>>>> >>>>> Am 07.07.2017 um 20:09 schrieb Igor Veresov: >>>>>> >>>>>>> On Jul 7, 2017, at 4:23 AM, Robbin Ehn >>>>>> > wrote: >>>>>>> >>>>>>> Hi Roman, >>>>>>> >>>>>>> On 07/07/2017 12:51 PM, Roman Kennke wrote: >>>>>>>> Hi Robbin, >>>>>>>>> >>>>>>>>> Far down -> >>>>>>>>> >>>>>>>>> On 07/06/2017 08:05 PM, Roman Kennke wrote: >>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> I'm not happy about this change: >>>>>>>>>>> >>>>>>>>>>> + ~ParallelSPCleanupThreadClosure() { >>>>>>>>>>> + // This is here to be consistent with sweeper.cpp >>>>>>>>>>> NMethodSweeper::mark_active_nmethods(). >>>>>>>>>>> + // TODO: Is this really needed? 
>>>>>>>>>>> + OrderAccess::storestore(); >>>>>>>>>>> + } >>>>>>>>>>> >>>>>>>>>>> because we're adding an OrderAccess::storestore() to be >>>>>>>>>>> consistent >>>>>>>>>>> with an OrderAccess::storestore() that's not properly >>>>>>>>>>> documented >>>>>>>>>>> which is only increasing the technical debt. >>>>>>>>>>> >>>>>>>>>>> So a couple of things above don't make sense to me: >>>>>>>>>>> >>>>>>>>>>>> - sweeper thread runs outside safepoint >>>>>>>>>>>> - VMThread (which is doing the nmethod marking in the case >>>>>>>>>>>> that >>>>>>>>>>>> I'm looking at) runs while all other threads (incl. the >>>>>>>>>>>> sweeper) >>>>>>>>>>>> is holding still. >>>>>>>>>>> >>>>>>>>>>> and: >>>>>>>>>>> >>>>>>>>>>>> There should be no need for a storestore() (at least in >>>>>>>>>>>> sweeper.cpp... >>>>>>>>>> >>>>>>>>>> Either one or the other are running. Either the VMThread is >>>>>>>>>> marking >>>>>>>>>> nmethods (during safepoint) or the sweeper threads are running >>>>>>>>>> (outside >>>>>>>>>> safepoint). Between the two phases, there is a guaranteed >>>>>>>>>> OrderAccess::fence() (see safepoint.cpp). Therefore, no >>>>>>>>>> storestore() >>>>>>>>>> should be necessary. >>>>>>>>>> >>>>>>>>>> From Igor's comment I can see how it happened though: >>>>>>>>>> Apparently >>>>>>>>>> there >>>>>>>>>> *is* a race in sweeper's own concurrent processing (concurrent >>>>>>>>>> with >>>>>>>>>> compiler threads, as far as I understand). And there's a call to >>>>>>>>>> nmethod::mark_as_seen_on_stack() after which a storestore() is >>>>>>>>>> required >>>>>>>>>> (as per Igor's explanation). So the logic probably was: we have >>>>>>>>>> mark_as_seen_on_stack() followed by storestore() here, so let's >>>>>>>>>> also put >>>>>>>>>> a storestore() in the other places that call >>>>>>>>>> mark_as_seen_on_stack(), >>>>>>>>>> one of which happens to be the safepoint cleanup code that we're >>>>>>>>>> discussing. 
(why the storestore() hasn't been put right into >>>>>>>>>> mark_as_seen_on_stack() I don't understand). In short, one >>>>>>>>>> storestore() >>>>>>>>>> really was necessary, the other looks like it has been put there >>>>>>>>>> 'for >>>>>>>>>> consistency' or just conservatively. But it shouldn't be >>>>>>>>>> necessary in >>>>>>>>>> the safepoint cleanup code that we're discussing. >>>>>>>>>> >>>>>>>>>> So what should we do? Remove the storestore() for good? >>>>>>>>>> Refactor the >>>>>>>>>> code so that both paths at least call the storestore() in the >>>>>>>>>> same >>>>>>>>>> place? (E.g. make mark_active_nmethods() use the closure and >>>>>>>>>> call >>>>>>>>>> storestore() in the dtor as proposed?) >>>>>>>>> >>>>>>>>> I took a quick look, maybe I'm missing some stuff but: >>>>>>>>> >>>>>>>>> So there is a slight optimization when not running sweeper to >>>>>>>>> skip >>>>>>>>> compiler barrier/fence in stw. >>>>>>>>> >>>>>>>>> Don't think that matter, so I propose something like: >>>>>>>>> - long stack_traversal_mark() { return >>>>>>>>> _stack_traversal_mark; } >>>>>>>>> - void set_stack_traversal_mark(long l) { >>>>>>>>> _stack_traversal_mark = l; } >>>>>>>>> + long stack_traversal_mark() { return >>>>>>>>> OrderAccess::load_acquire(&_stack_traversal_mark); } >>>>>>>>> + void set_stack_traversal_mark(long l) { >>>>>>>>> OrderAccess::release_store(&_stack_traversal_mark, l); } >>>>>>>>> >>>>>>>>> Maybe make _stack_traversal_mark volatile also, just as a marking >>>>>>>>> that >>>>>>>>> it is concurrent accessed. >>>>>>>>> And remove both storestore. >>>>>>>>> >>>>>>>>> "Also neither of these state variables are volatile in >>>>>>>>> nmethod, so >>>>>>>>> even the compiler may reorder the stores" >>>>>>>>> Fortunately at least _state is volatile now. >>>>>>>>> >>>>>>>>> I think _state also should use la/rs semantics instead, but >>>>>>>>> that's >>>>>>>>> another story. >>>>>>>> Like this? 
>>>>>>>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.12/ >>>>>>>> >>>>>>>> >>>>>>> >>>>>>> Yes, exactly, I like this! >>>>>>> Dan? Igor ? Tobias? >>>>>>> >>>>>> >>>>>> That seems correct. >>>>>> >>>>>> igor >>>>>> >>>>>>> Thanks Roman! >>>>>>> >>>>>>> BTW I'm going on vacation (5w) in a few hours, but I will follow >>>>>>> this >>>>>>> thread/changeset to the end! >>>>>>> >>>>>>> /Robbin >>>>>>> >>>>>>>> Roman >>>>>> >>>>> >> From robbin.ehn at oracle.com Sun Jul 16 08:25:14 2017 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Sun, 16 Jul 2017 10:25:14 +0200 Subject: RFR: Parallelize safepoint cleanup In-Reply-To: References: <5e7c7d00-4acd-bea3-3525-33dbd9159efb@oracle.com> <6f2c6de7-298b-bf14-ab1f-430c4acd43c9@redhat.com> <5cd676de-872d-6d4a-691b-da561173f7d0@oracle.com> <61d80e98-275f-b2b8-4ac7-6d5d03b047de@redhat.com> <0e44cc90-b384-0820-93d6-a70d22c501c3@oracle.com> <20E06CEC-38CA-41AE-99DB-17EF22A3C5CC@oracle.com> <58f2278e-b95c-4ec2-4f7d-9fefa3a281e4@redhat.com> <623c0dbf-9210-7c63-3546-4314c7d47f85@redhat.com> <29521e46-a5e8-5ff0-23a2-22eeee145389@oracle.com> <4445a727-060b-70f9-c8db-e9f70faae3d5@redhat.com> <37755fec-05b9-8d2c-7eb9-8849393c7485@oracle.com> <57cddde0-60e6-366e-489e-f6f9534e3ed9@redhat.com> <6fa761bc-8feb-74e6-9a54-8a65ab81203b@oracle.com> <5af9855a-652e-64f0-af83-e8f5962247ca@oracle.com> Message-ID: <07a5bf0a-02fa-7a8c-35be-813f5207cb0c@oracle.com> Hi Roman, On 2017-07-12 15:32, Roman Kennke wrote: > Hi Robbin and all, > > I fixed the 32bit failures by using jlong in all relevant places: > > http://cr.openjdk.java.net/~rkennke/8180932/webrev.14.diff/ > > > then Robbin found another problem. SafepointCleanupTest started to fail, > because "mark nmethods" is no longer printed. This made me think that > we're not measuring the conflated (and possibly parallelized) > deflate-idle-monitors+mark-nmethods pass. I added a TraceTime with > "safepoint cleanup tasks" which measures the total duration of safepoint > cleanup. 
We can't reasonably measure a possibly parallel and conflated
> pass standalone, but we can measure all and by subtracting all the other
> subphases, get an idea how long deflation and nmethod marking take up.
>
> http://cr.openjdk.java.net/~rkennke/8180932/webrev.15.diff/
>
>
> The full webrev is now:
>
> http://cr.openjdk.java.net/~rkennke/8180932/webrev.15/
>
>
> Hope that's all ;-)

With this changeset something always pops up.

Failure reason: Targets failed. Target macosx_x64_10.9-fastdebug FAILED.

/opt/jprt/jib-data/install/jpg/infra/builddeps/devkit-macosx_x64/Xcode6.3-MacOSX10.9+1.0/devkit-macosx_x64-Xcode6.3-MacOSX10.9+1.0.tar.gz/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/clang++ -m64 -fPIC -D_GNU_SOURCE -flimit-debug-info -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -D__STDC_CONSTANT_MACROS -D_ALLBSD_SOURCE -D_DARWIN_C_SOURCE -D_XOPEN_SOURCE -fno-rtti -fno-exceptions -fvisibility=hidden -mno-omit-leaf-frame-pointer -mstack-alignment=16 -pipe -fno-strict-aliasing -DMAC_OS_X_VERSION_MAX_ALLOWED=1070 -mmacosx-version-min=10.7.0 -fno-omit-frame-pointer -DVM_LITTLE_ENDIAN -D_LP64=1 -Wno-deprecated -Wpointer-arith -Wsign-compare -Wundef -Wunused-function -Wformat=2 -DASSERT -DCHECK_UNHANDLED_OOPS -DTARGET_ARCH_x86 -DINCLUDE_SUFFIX_OS=_bsd -DINCLUDE_SUFFIX_CPU=_x86 -DINCLUDE_SUFFIX_COMPILER=_gcc -DTARGET_COMPILER_gcc -DAMD64 -DHOTSPOT_LIB_ARCH='"amd64"' -DCOMPILER1 -DCOMPILER2 -DDTRACE_ENABLED -DINCLUDE_AOT -I/opt/jprt/T/P1/193338.rehn/s/hotspot/src/closed/share/vm -I/opt/j/opt/jprt/T/P1/193338.rehn/s/hotspot/src/share/vm/runtime/safepoint.cpp:654:22: error: variable has incomplete type 'StrongRootsScope'
StrongRootsScope srs(num_cleanup_workers);
^
/opt/jprt/T/P1/193338.rehn/s/hotspot/src/share/vm/gc/shared/genCollectedHeap.hpp:33:7: note: forward declaration of 'StrongRootsScope'
class StrongRootsScope;
^
/opt/jprt/T/P1/193338.rehn/s/hotspot/src/share/vm/runtime/safepoint.cpp:659:22: error: variable has incomplete type
'StrongRootsScope' StrongRootsScope srs(1); ^ /opt/jprt/T/P1/193338.rehn/s/hotspot/src/share/vm/gc/shared/genCollectedHeap.hpp:33:7: note: forward declaration of 'StrongRootsScope' class StrongRootsScope; ^ 2 errors generated. make[3]: *** [/opt/jprt/T/P1/193338.rehn/s/build/macosx-x64-debug/hotspot/variant-server/libjvm/objs/safepoint.o] Error 1 make[3]: *** Waiting for unfinished jobs.... make[2]: *** [hotspot-server-libs] Error 2 Send me the new webrev and I'll test it before the 16th round of review :) /Robbin > > Roman > > Am 10.07.2017 um 21:22 schrieb Robbin Ehn: >> Hi, unfortunately the push failed on 32-bit. >> >> (looks like _stack_traversal_mark should be jlong, I feel a bit guilty) >> >> I do not have anytime to look at this, so here is the error. >> >> /Robbin >> >> make[3]: Leaving directory '/opt/jprt/T/P1/185117.rehn/s/hotspot/make' >> make/Main.gmk:263: recipe for target 'hotspot-client-libs' failed >> In file included from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29:0, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28: >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp: In >> member function 'long int nmethod::stack_traversal_mark()': >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:399:108: >> error: call of overloaded 'load_acquire(volatile long int*)' is ambiguous >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:399:108: >> note: candidates are: >> In file included from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/typeArrayOop.hpp:30:0, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/constantPool.hpp:32, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/method.hpp:34, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/frame.hpp:28, >> from >> 
/opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/codeBlob.hpp:31, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/compiledMethod.hpp:28, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:28, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28: >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:57:17: >> note: static jint OrderAccess::load_acquire(const volatile jint*) >> >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:57:17: >> note: no known conversion for argument 1 from 'volatile long int*' >> to 'const volatile jint* {aka const volatile int*}' >> In file included from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/typeArrayOop.hpp:30:0, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/constantPool.hpp:32, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/method.hpp:34, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/frame.hpp:28, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/codeBlob.hpp:31, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/compiledMethod.hpp:28, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:28, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28: >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:63:17: >> note: static juint OrderAccess::load_acquire(const volatile juint*) >> >> 
/opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:63:17: >> note: no known conversion for argument 1 from 'volatile long int*' >> to 'const volatile juint* {aka const volatile unsigned int*}' >> In file included from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29:0, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28: >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp: In >> member function 'void nmethod::set_stack_traversal_mark(long int)': >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:400:105: >> error: call of overloaded 'release_store(volatile long int*, long >> int&)' is ambiguous >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:400:105: >> note: candidates are: >> In file included from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/typeArrayOop.hpp:30:0, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/constantPool.hpp:32, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/method.hpp:34, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/frame.hpp:28, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/codeBlob.hpp:31, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/compiledMethod.hpp:28, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:28, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33, >> from >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28: >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:71:17: >> note: static void OrderAccess::release_store(volatile jint*, jint) >> >> 
/opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:71:17: >> note: no known conversion for argument 1 from 'volatile long int*' >> to 'volatile jint* {aka volatile int*}' >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:77:17: >> note: static void OrderAccess::release_store(volatile juint*, juint) >> >> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:77:17: >> note: no known conversion for argument 1 from 'volatile long int*' >> to 'volatile juint* {aka volatile unsigned int*}' >> >> On 2017-07-10 20:50, Robbin Ehn wrote: >>> I'll start a push now. >>> >>> /Robbin >>> >>> On 2017-07-10 12:38, Roman Kennke wrote: >>>> Ok, so I guess I need a sponsor for this now: >>>> >>>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.12/ >>>> >>>> >>>> Roman >>>> >>>> Am 07.07.2017 um 20:09 schrieb Igor Veresov: >>>>> >>>>>> On Jul 7, 2017, at 4:23 AM, Robbin Ehn >>>>> > wrote: >>>>>> >>>>>> Hi Roman, >>>>>> >>>>>> On 07/07/2017 12:51 PM, Roman Kennke wrote: >>>>>>> Hi Robbin, >>>>>>>> >>>>>>>> Far down -> >>>>>>>> >>>>>>>> On 07/06/2017 08:05 PM, Roman Kennke wrote: >>>>>>>>> >>>>>>>>>> >>>>>>>>>> I'm not happy about this change: >>>>>>>>>> >>>>>>>>>> + ~ParallelSPCleanupThreadClosure() { >>>>>>>>>> + // This is here to be consistent with sweeper.cpp >>>>>>>>>> NMethodSweeper::mark_active_nmethods(). >>>>>>>>>> + // TODO: Is this really needed? >>>>>>>>>> + OrderAccess::storestore(); >>>>>>>>>> + } >>>>>>>>>> >>>>>>>>>> because we're adding an OrderAccess::storestore() to be >>>>>>>>>> consistent >>>>>>>>>> with an OrderAccess::storestore() that's not properly documented >>>>>>>>>> which is only increasing the technical debt. 
>>>>>>>>>> >>>>>>>>>> So a couple of things above don't make sense to me: >>>>>>>>>> >>>>>>>>>>> - sweeper thread runs outside safepoint >>>>>>>>>>> - VMThread (which is doing the nmethod marking in the case that >>>>>>>>>>> I'm looking at) runs while all other threads (incl. the >>>>>>>>>>> sweeper) >>>>>>>>>>> is holding still. >>>>>>>>>> >>>>>>>>>> and: >>>>>>>>>> >>>>>>>>>>> There should be no need for a storestore() (at least in >>>>>>>>>>> sweeper.cpp... >>>>>>>>> >>>>>>>>> Either one or the other are running. Either the VMThread is >>>>>>>>> marking >>>>>>>>> nmethods (during safepoint) or the sweeper threads are running >>>>>>>>> (outside >>>>>>>>> safepoint). Between the two phases, there is a guaranteed >>>>>>>>> OrderAccess::fence() (see safepoint.cpp). Therefore, no >>>>>>>>> storestore() >>>>>>>>> should be necessary. >>>>>>>>> >>>>>>>>> From Igor's comment I can see how it happened though: Apparently >>>>>>>>> there >>>>>>>>> *is* a race in sweeper's own concurrent processing (concurrent >>>>>>>>> with >>>>>>>>> compiler threads, as far as I understand). And there's a call to >>>>>>>>> nmethod::mark_as_seen_on_stack() after which a storestore() is >>>>>>>>> required >>>>>>>>> (as per Igor's explanation). So the logic probably was: we have >>>>>>>>> mark_as_seen_on_stack() followed by storestore() here, so let's >>>>>>>>> also put >>>>>>>>> a storestore() in the other places that call >>>>>>>>> mark_as_seen_on_stack(), >>>>>>>>> one of which happens to be the safepoint cleanup code that we're >>>>>>>>> discussing. (why the storestore() hasn't been put right into >>>>>>>>> mark_as_seen_on_stack() I don't understand). In short, one >>>>>>>>> storestore() >>>>>>>>> really was necessary, the other looks like it has been put there >>>>>>>>> 'for >>>>>>>>> consistency' or just conservatively. But it shouldn't be >>>>>>>>> necessary in >>>>>>>>> the safepoint cleanup code that we're discussing. >>>>>>>>> >>>>>>>>> So what should we do? 
Remove the storestore() for good? >>>>>>>>> Refactor the >>>>>>>>> code so that both paths at least call the storestore() in the same >>>>>>>>> place? (E.g. make mark_active_nmethods() use the closure and call >>>>>>>>> storestore() in the dtor as proposed?) >>>>>>>> >>>>>>>> I took a quick look, maybe I'm missing some stuff but: >>>>>>>> >>>>>>>> So there is a slight optimization when not running sweeper to skip >>>>>>>> compiler barrier/fence in stw. >>>>>>>> >>>>>>>> Don't think that matter, so I propose something like: >>>>>>>> - long stack_traversal_mark() { return >>>>>>>> _stack_traversal_mark; } >>>>>>>> - void set_stack_traversal_mark(long l) { >>>>>>>> _stack_traversal_mark = l; } >>>>>>>> + long stack_traversal_mark() { return >>>>>>>> OrderAccess::load_acquire(&_stack_traversal_mark); } >>>>>>>> + void set_stack_traversal_mark(long l) { >>>>>>>> OrderAccess::release_store(&_stack_traversal_mark, l); } >>>>>>>> >>>>>>>> Maybe make _stack_traversal_mark volatile also, just as a marking >>>>>>>> that >>>>>>>> it is concurrent accessed. >>>>>>>> And remove both storestore. >>>>>>>> >>>>>>>> "Also neither of these state variables are volatile in nmethod, so >>>>>>>> even the compiler may reorder the stores" >>>>>>>> Fortunately at least _state is volatile now. >>>>>>>> >>>>>>>> I think _state also should use la/rs semantics instead, but that's >>>>>>>> another story. >>>>>>> Like this? >>>>>>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.12/ >>>>>>> >>>>>>> >>>>>> >>>>>> Yes, exactly, I like this! >>>>>> Dan? Igor ? Tobias? >>>>>> >>>>> >>>>> That seems correct. >>>>> >>>>> igor >>>>> >>>>>> Thanks Roman! >>>>>> >>>>>> BTW I'm going on vacation (5w) in a few hours, but I will follow this >>>>>> thread/changeset to the end! 
>>>>>> >>>>>> /Robbin >>>>>> >>>>>>> Roman >>>>> >>>> >
From kim.barrett at oracle.com  Mon Jul 17 00:33:38 2017
From: kim.barrett at oracle.com (Kim Barrett)
Date: Sun, 16 Jul 2017 20:33:38 -0400
Subject: RFR (S): 8182703: Correct G1 barrier queue lock orderings
In-Reply-To: <5964BF9B.4010309@oracle.com>
References: <59510D5E.10009@oracle.com> <25F423D9-F8D5-4E62-8300-CCE106E70777@oracle.com> <595CBE40.5050603@oracle.com> <6FFC2106-D260-481D-B8C3-DDA849926F23@oracle.com> <5964BF9B.4010309@oracle.com>
Message-ID: 

> On Jul 11, 2017, at 8:07 AM, Erik Österlund wrote:
>> This suggests a potential (though seemingly hard to avoid) fragility
>> resulting from the lowered lock rank.
>
> Note that this does not matter for JavaThreads (including compiler threads), for concurrent refinement threads or concurrent marking threads, nor does it matter for any thread when marking is not active.
>
> So it seems to me that the worst consequence of this is possibly worse latency for operations coinciding in time with concurrent marking, that have large amounts of mutations or resurrections, and are not performed by JavaThreads (including compiler threads) or GC threads (that are performing the concurrent marking) or concurrent refinement threads (that have nothing to do with SATB), that are running concurrently with each other.
>
> That does not seem to be a huge problem in my book. If it was, and an unknown bunch of non-JavaThreads are heavily mutating or resurrecting objects concurrent to marking, such that contention is inflicted on the shared queue lock for the shared SATB queue, then the right solution for that seems to be to give such threads their own local queue, rather than to reduce the time spent under the surprisingly hot shared queue lock.

I think this part of the reply misses my point, though later discussion is on the right track.
The rank for any locks in the filtering or mutator assist code can be anything not higher than the CBL lock ranks, since filtering and mutator assist are invoked in related contexts. Any locks in the filtering code must be lower than the shared queue lock ranks.

Reducing the CBL and shared queue ranks to allow them to be locked in more contexts implicitly imposes additional requirements on the filtering and mutator assist code, especially the latter, which is not presently invoked while holding the shared queue lock. Code which would have been "easily" safe before this change may now be not so easy, or may even be broken. In this discussion we've already identified two places that require further repair before we can start taking advantage of these reduced lock ranks. And future changes in those areas may be more difficult than with the old lock ranks.

But since I agree with the rationale for reducing the ranks of these locks, it seems we need to accept these additional costs (some known additional work needed, and restrictions on future changes). But we should remember these costs exist (RFEs for the additional work, maybe some comments on the filtering and mutator assist API functions discussing the issue).

>> The present SATB filtering doesn't seem to acquire any locks, but it's
>> a non-trivial amount of code spread over multiple files, so would be
>> easy to miss something or break it in that respect. Reducing the lock
>> ranks requires being very careful with the SATB filtering code.
>
> IMO, adding any lock into the SATB barrier which is used all over hotspot in some very shady places arguably requires being very careful regardless of my changes. So I am going to assume whoever does that for whatever reason is going to be careful.
>
>> The "mutator" help for dirty card queue processing is not presently
>> done for the shared queue, but I think it could be today.
I'm less sure >> about that with lowered queue lock ranks; I *think* there aren't any >> relevant locks there (other than the very rare shared queue lock in >> refine_card_concurrently), but that's a substantially larger and more >> complex amount of code than SATB queue filtering. > > As discussed with Thomas earlier in this thread, there are indeed locks blocking this. The HeapRegionRemSet::_m lock is currently a leaf lock. If collaborative refinement was to be performed on non-Java threads (and non-concurrent refinement threads), then this lock would have to decrease to the access rank first. But we concluded that warrants a new RFE with separate analysis. > > As with the SATB queues though, I do not know what threads would be causing such trouble? It is not JavaThreads (including compiler threads), concurrent refinement threads, concurrent marking threads. That does not leave us with a whole lot of threads to cause that contention on the shared queue lock. And as with the SATB queues, if there are such threads that cause such contention on the shared queue lock, then the right fix seems to be to give them their own local queue and stop taking the shared queue lock in the first place. A native thread copying a jweak to a (strong) jobject uses the shared queue. I don't think we're going to fix that by giving native threads their own queues. A Java thread calls into C++, takes a low-rank lock, and while holding that lock touches a queue. Everything in the queue touching needs to be ranked lower than that lock, including filter and mutator assist code. That this isn't permitted today is beside the point; this seems to me to be exactly the sort of situation this change is intended to permit. Since I think the rank reductions are a necessary (though not sufficient) step, call it Reviewed. 
From shade at redhat.com  Mon Jul 17 07:23:04 2017
From: shade at redhat.com (Aleksey Shipilev)
Date: Mon, 17 Jul 2017 09:23:04 +0200
Subject: RFR (S/M): 8184348: Merge G1ConcurrentMark::par_mark() and G1ConcurrentMark::grayRoot()
In-Reply-To: <1500042870.3458.84.camel@oracle.com>
References: <1500031158.3458.41.camel@oracle.com> <1500034800.3458.75.camel@oracle.com> <1500042870.3458.84.camel@oracle.com>
Message-ID: 

On 07/14/2017 04:34 PM, Thomas Schatzl wrote:
> Thanks. Unfortunately, after re-applying and fixing other changes based
> on this one I noticed that I missed one opportunity to refactor in
> G1CMTask::deal_with_reference(). I would like to add this to this
> changeset still... sorry.
>
> There is some note about some perf optimization that mentions that it
> is advantageous to do the nTAMS check before determining the heap
> region; however I do not think this is an issue.
>
> Quickly comparing runs of a fairly large and reference-intensive
> workload (BigRAMTester with 20g heap e.g. attached to JDK-8152438),
> marking cycles with the latest webrev.2 are at least as fast as without
> any of this RFR's changes.
>
> New webrevs:
> http://cr.openjdk.java.net/~tschatzl/8184348/webrev.1_to_2 (diff)
> http://cr.openjdk.java.net/~tschatzl/8184348/webrev.2 (full)

Looks good.

I wonder what this was about in the old code:

 187 if (_g1h->is_in_g1_reserved(objAddr)) {

New code properly asserts the object is in reserved. Did we ever have oops stored
outside of reserved? That would be surprising!

Thanks,
-Aleksey

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: OpenPGP digital signature URL: From mikael.gerdin at oracle.com Mon Jul 17 08:07:37 2017 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Mon, 17 Jul 2017 10:07:37 +0200 Subject: RFR (XS): 8183538: UpdateRS phase should claim cards In-Reply-To: <1499945712.2756.2.camel@oracle.com> References: <1499861747.6693.6.camel@oracle.com> <83250fb8-84aa-5764-bd52-5a5dccfd2e49@oracle.com> <1499945712.2756.2.camel@oracle.com> Message-ID: Hi Thomas, On 2017-07-13 13:35, Thomas Schatzl wrote: > Hi, > > On Thu, 2017-07-13 at 13:09 +0200, Erik Helin wrote: >> Hi Thomas, >> >> On 07/12/2017 02:15 PM, Thomas Schatzl wrote: >>> >>> Hi all, >>> >>> please review this small change that adds claiming of cards in >>> the >>> update rs phase so that scan rs does not rescan them. >>> >>> CR: >>> https://bugs.openjdk.java.net/browse/JDK-8183538 >>> Webrev: >>> http://cr.openjdk.java.net/~tschatzl/8183538/webrev/ >> looks good, Reviewed. Looks good to me as well. /Mikael >> >> I was trying to find a way where we could utilize the claim_card >> function, but could not come up with a good approach. Push this and >> then we can see if we can reduce the slight code/logic duplication >> later. > > yes, me too :) All variants I could think of would penalize one or > the other phase. > > Thanks for your review. 
> > Thanks,
> > Thomas
>
From thomas.schatzl at oracle.com  Mon Jul 17 08:23:37 2017
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Mon, 17 Jul 2017 10:23:37 +0200
Subject: RFR (XS): 8183538: UpdateRS phase should claim cards
In-Reply-To: 
References: <1499861747.6693.6.camel@oracle.com> <83250fb8-84aa-5764-bd52-5a5dccfd2e49@oracle.com> <1499945712.2756.2.camel@oracle.com>
Message-ID: <1500279817.2845.7.camel@oracle.com>

Hi Mikael,

On Mon, 2017-07-17 at 10:07 +0200, Mikael Gerdin wrote:
> Hi Thomas,
>
> On 2017-07-13 13:35, Thomas Schatzl wrote:
> >
> > Hi,
> >
> > On Thu, 2017-07-13 at 13:09 +0200, Erik Helin wrote:
> > >
> > > Hi Thomas,
> > >
> > > On 07/12/2017 02:15 PM, Thomas Schatzl wrote:
> > > >
> > > >
> > > > Hi all,
> > > >
> > > > please review this small change that adds claiming of cards
> > > > in the update rs phase so that scan rs does not rescan them.
> > > >
> > > > CR:
> > > > https://bugs.openjdk.java.net/browse/JDK-8183538
> > > > Webrev:
> > > > http://cr.openjdk.java.net/~tschatzl/8183538/webrev/
> > > looks good, Reviewed.
> Looks good to me as well.
> /Mikael

thanks for your review.

Thomas

From thomas.schatzl at oracle.com  Mon Jul 17 08:25:25 2017
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Mon, 17 Jul 2017 10:25:25 +0200
Subject: RFR (S/M): 8184348: Merge G1ConcurrentMark::par_mark() and G1ConcurrentMark::grayRoot()
In-Reply-To: 
References: <1500031158.3458.41.camel@oracle.com> <1500034800.3458.75.camel@oracle.com> <1500042870.3458.84.camel@oracle.com>
Message-ID: <1500279925.2845.8.camel@oracle.com>

Hi Aleksey,

On Mon, 2017-07-17 at 09:23 +0200, Aleksey Shipilev wrote:
> On 07/14/2017 04:34 PM, Thomas Schatzl wrote:
> >
> > Thanks. Unfortunately, after re-applying and fixing other changes
> > based
> > on this one I noticed that I missed one opportunity to refactor in
> > G1CMTask::deal_with_reference(). I would like to add this to this
> > changeset still... sorry.
> > There is some note about some perf optimization that mentions that
> > it
> > is advantageous to do the nTAMS check before determining the heap
> > region; however I do not think this is an issue.
> >
> > Quickly comparing runs of a fairly large and reference-intensive
> > workload (BigRAMTester with 20g heap e.g. attached to JDK-8152438),
> > marking cycles with the latest webrev.2 are at least as fast as
> > without
> > any of this RFR's changes.
> >
> > New webrevs:
> > http://cr.openjdk.java.net/~tschatzl/8184348/webrev.1_to_2 (diff)
> > http://cr.openjdk.java.net/~tschatzl/8184348/webrev.2 (full)
> Looks good.
>
> I wonder what this was about in the old code:
>
> 187 if (_g1h->is_in_g1_reserved(objAddr)) {
>
> New code properly asserts the object is in reserved. Did we ever have
> oops stored
> outside of reserved? That would be surprising!

the reference can be NULL here. The is_in_g1_reserved() check also
filters those, in a bit of a crude way. So I changed this to an
explicit NULL check, and let it run into the assert (in
ConcurrentMark::mark_in_next_bitmap()) in other cases.

I have not seen any issues in my testing of the changes I extracted
these from. There should obviously be no oops referencing anything
outside of the heap.

Thanks for your review.

Thanks,
Thomas From shade at redhat.com Mon Jul 17 08:29:32 2017 From: shade at redhat.com (Aleksey Shipilev) Date: Mon, 17 Jul 2017 10:29:32 +0200 Subject: RFR (S/M): 8184348: Merge G1ConcurrentMark::par_mark() and G1ConcurrentMark::grayRoot() In-Reply-To: <1500279925.2845.8.camel@oracle.com> References: <1500031158.3458.41.camel@oracle.com> <1500034800.3458.75.camel@oracle.com> <1500042870.3458.84.camel@oracle.com> <1500279925.2845.8.camel@oracle.com> Message-ID: <3c8fd54d-bce5-229b-38d9-f9ede82e2c54@redhat.com> On 07/17/2017 10:25 AM, Thomas Schatzl wrote: >>> New webrevs: >>> http://cr.openjdk.java.net/~tschatzl/8184348/webrev.1_to_2 (diff) >>> http://cr.openjdk.java.net/~tschatzl/8184348/webrev.2 (full) >> Looks good. >> >> I wonder what this was about in the old code: >> >> 187 if (_g1h->is_in_g1_reserved(objAddr)) { >> >> New code properly asserts the object is in reserved. Did we ever had >> oops stored >> outside of reserved? That would be surprising! > > the reference can be NULL here. The is_in_g1_reserved() check also > filters those, in a bit of a crude way. So I changed this to an > explicit NULL check, and let it run into the assert (in > ConcurrentMark::mark_in_next_bitmap()) in other cases. That explains it, thanks. Go! -Aleksey -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: OpenPGP digital signature URL: From thomas.schatzl at oracle.com Mon Jul 17 08:33:10 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Mon, 17 Jul 2017 10:33:10 +0200 Subject: RFR (XS): 8184452: Add bounds checking for FromCardCache In-Reply-To: References: <1500030297.3458.29.camel@oracle.com> Message-ID: <1500280390.2845.11.camel@oracle.com> Hi Roman, Aleksey, On Fri, 2017-07-14 at 13:20 +0200, Aleksey Shipilev wrote: > On 07/14/2017 01:04 PM, Thomas Schatzl wrote: > > > > Webrev: > > http://cr.openjdk.java.net/~tschatzl/8184452/webrev/ > Looks good. 
On Fri, 2017-07-14 at 13:15 +0200, Roman Kennke wrote: > Am 14.07.2017 um 13:04 schrieb Thomas Schatzl: > >? > > Hi all, > >? > >???can I have reviews for this change that adds asserts/bounds > > checking to the FromCardCache methods? > > [...] > > Webrev: > > http://cr.openjdk.java.net/~tschatzl/8184452/webrev/ > > Testing: > > jprt > >? > > Thanks, > >???Thomas >? > I'm all for more asserts if it helps to figure out bugs, so yes. > Change looks good too. >? > Roman (not official reviewer) ? thanks for your reviews! Thanks, ? Thomas From erik.osterlund at oracle.com Mon Jul 17 08:49:45 2017 From: erik.osterlund at oracle.com (=?UTF-8?Q?Erik_=c3=96sterlund?=) Date: Mon, 17 Jul 2017 10:49:45 +0200 Subject: RFR (S): 8182703: Correct G1 barrier queue lock orderings In-Reply-To: References: <59510D5E.10009@oracle.com> <25F423D9-F8D5-4E62-8300-CCE106E70777@oracle.com> <595CBE40.5050603@oracle.com> <6FFC2106-D260-481D-B8C3-DDA849926F23@oracle.com> <5964BF9B.4010309@oracle.com> Message-ID: <596C7A29.3000602@oracle.com> Hi Kim, Thank you for the review! I have some comments though... On 2017-07-17 02:33, Kim Barrett wrote: >> On Jul 11, 2017, at 8:07 AM, Erik ?sterlund wrote: >>> This suggests a potential (though seemingly hard to avoid) fragility >>> resulting from the lowered lock rank. >> Note that this does not matter for JavaThreads (including compiler threads), for concurrent refinement threads or concurrent marking threads, nor does it matter for any thread when marking is not active. >> >> So it seems to me that the worst consequence of this is possibly worse latency for operations coinciding in time with concurrent marking, that have large amounts of mutations or resurrections, and are not performed by JavaThreads (including compiler threads) or GC threads (that are performing the concurrent marking) or concurrent refinement threads (that have nothing to do with SATB), that are running concurrently with each other. 
>> >> That does not seem to be a huge problem in my book. If it was, and an unknown bunch of non-JavaThreads are heavily mutating or resurrecting objects concurrent to marking, such that contention is inflicted on the shared queue lock for the shared SATB queue, then the right solution for that seems to be to give such threads their own local queue, rather than to reduce the time spent under the surprisingly hot shared queue lock. > I think this part of the reply misses my point, though later > discussion is on the right track. > > The rank for any locks in the filtering or mutator assist code can be > anything not higher than the CBL lock ranks, since filtering and > mutator assist are invoked in related contexts. Any locks in the > filtering code must be lower than the shared queue lock ranks. > > Reducing the CBL and shared queue ranks to allow them to be locked in > more contexts implicitly imposes additional requirements on the > filtering and mutator assist code, especially the latter, which is not > presently invoked while holding the shared queue lock. Code which > would have been "easily" safe before this change may now be not so > easy, or may even be broken. In this discussion we've already > identified two places that require further repair before we can start > taking advantage of these reduced lock ranks. And future changes in > those areas may be more difficult than with the old lock ranks. 1) I agree - more work is needed to free the unethically caged heap oop store. Some constraints have been removed, but there are a few more. 2) I disagree that we can not already take immediate advantage of this. My main problem is the SATB queues required for the weak oop load barriers in hotspot. They are now free, and therefore I can take immediate advantage of these changes. 3) I think that to the greatest extent possible, lock ranks should follow the way we intend to lock, rather than letting the ranks affect the way we lock. 
The deadlock detection system was designed to have false positives. Therefore we should first figure out if we have a true possible deadlock or a false positive. In case of false positives, I think we should try pretty hard not to compromise solid locking schemes in order to fight false positives of the deadlock detection system. I understand this is sometimes difficult, but I think it is a good idea in general. > But since I agree with the rationale for reducing the ranks of these > locks, it seems we need to accept these additional costs (some known > additional work needed, and restrictions on future changes). But we > should remember these costs exist (RFEs for the additional work, maybe > some comments on the filtering and mutator assist API functions > discussing the issue). I am glad we agree here. I will file RFEs. >>> The present SATB filtering doesn't seem to acquire any locks, but it's >>> a non-trivial amount of code spread over multiple files, so would be >>> easy to miss something or break it in that respect. Reducing the lock >>> ranks requires being very careful with the SATB filtering code. >> IMO, adding any lock into the SATB barrier which is used all over hotspot in some very shady places arguably requires being very careful regardless of my changes. So I am going to assume whoever does that for whatever reason is going to be careful. >> >>> The "mutator" help for dirty card queue processing is not presently >>> done for the shared queue, but I think could be today. I'm less sure >>> about that with lowered queue lock ranks; I *think* there aren't any >>> relevant locks there (other than the very rare shared queue lock in >>> refine_card_concurrently), but that's a substantially larger and more >>> complex amount of code than SATB queue filtering. >> As discussed with Thomas earlier in this thread, there are indeed locks blocking this. The HeapRegionRemSet::_m lock is currently a leaf lock. 
If collaborative refinement was to be performed on non-Java threads (and non-concurrent refinement threads), then this lock would have to decrease to the access rank first. But we concluded that warrants a new RFE with separate analysis. >> >> As with the SATB queues though, I do not know what threads would be causing such trouble? It is not JavaThreads (including compiler threads), concurrent refinement threads, concurrent marking threads. That does not leave us with a whole lot of threads to cause that contention on the shared queue lock. And as with the SATB queues, if there are such threads that cause such contention on the shared queue lock, then the right fix seems to be to give them their own local queue and stop taking the shared queue lock in the first place. > A native thread copying a jweak to a (strong) jobject uses the shared > queue. I don't think we're going to fix that by giving native threads > their own queues. I am not sure what threads you are referring to here. But I guess that is okay. > A Java thread calls into C++, takes a low-rank lock, and while holding > that lock touches a queue. Everything in the queue touching needs to > be ranked lower than that lock, including filter and mutator assist > code. That this isn't permitted today is beside the point; this seems > to me to be exactly the sort of situation this change is intended to > permit. As mentioned earlier, I specifically need the SATB enqueue barriers to be free. I want the heap oop store to be free too, but that is not blocking me. > Since I think the rank reductions are a necessary (though not sufficient) > step, call it Reviewed. Thank you for the review. 
/Erik From mikael.gerdin at oracle.com Mon Jul 17 08:57:21 2017 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Mon, 17 Jul 2017 10:57:21 +0200 Subject: RFR: 8183539: Remove G1RemSet::_into_cset_dirty_card_queue_set In-Reply-To: <1499958417.2756.4.camel@oracle.com> References: <1390aea1-d25a-a465-b0bf-c66490cf682a@oracle.com> <1499958417.2756.4.camel@oracle.com> Message-ID: <8cb3ed9f-e520-4f10-6d6e-fdbb7560859e@oracle.com> Hi Erik, On 2017-07-13 17:06, Thomas Schatzl wrote: > Hi Erik, > > On Thu, 2017-07-13 at 16:53 +0200, Erik Helin wrote: >> On 07/04/2017 02:17 PM, Mikael Gerdin wrote: >>> >>> Hi Erik, >>> >>> Do you know if any of the tests actually would have failed if rem >>> set >>> reconstruction after evacuation failure didn't work properly? >>> >>> I'd feel safer with this change if you ran with some verification >>> code to ensure that the into_cset queue was always useless when >>> evac failure occurs. >> >> Good point, I have now run GCBasher for a very long time with: >> -XX:+G1EvacuationFailALot -XX:G1EvacuationFailureALotCount=5 >> -XX:+VerifyBeforeGC -XX:+VerifyAfterGC >> >> This mean that GCBasher encounters a (forced) evacuation failure >> every fifth GC and also runs full verification for every GC. So far >> it has been working fine. >> >> I have also run all tests in the JTReg group hotspot_gc with >> G1EvacuationFailALot set to true (in g1_globals.hpp) and >> G1EvacuationFailureALotCount set to 5 (also in g1_globals.hpp). This >> mean that all GC tests (including the stress tests) encountered an >> evacuation failure every fifth GC. This also worked fine. >> >> I also wrote a new patch against tip (where _into_cset_dcqs is still >> present) to do some custom verification. The contents of >> G1RemSet::_into_cset_dcqs and G1CollectedHeap::_dirty_card_queue_set >> should be identical after a collection. 
This sort-of worked :) >> >> The queues are *very* similar (often around 98% of the cards in >> G1RemSet::_into_cset_dcqs are found in >> G1CollectedHeap::_dirty_card_queue_set). The reason for the "missing >> cards" is that cards in G1RemSet::_into_cset_dcqs come from the >> post-write barrier, and the post-write barrier dirties the card that >> contains the object header (except for arrays, where it dirties the >> field/slot). The cards in G1CollectedHeap::_dirty_card_queue_set >> come from G1ParScanThreadState::update_rs, and update_rs always >> dirties the card that contains the field (*not* the header). Hence, >> if an object crosses card boundaries, then the post-write barrier and >> update_rs will dirty different cards. This has no impact on >> correctness, it is like this for performance reasons (dirtying the >> card that contains the object header leads to fewer dirty cards, but >> we don't have quick access to the object header in update_rs). >> >> So, with the above, I'm fairly confident (famous last words) that >> this patch is working :) > > Thanks for this thorough investigation, sounds good. > > Ship it.
> > +1 Thanks Thomas and Mikael for reviewing! Erik > /Mikael >> >> Thomas >> From rkennke at redhat.com Mon Jul 17 12:07:21 2017 From: rkennke at redhat.com (Roman Kennke) Date: Mon, 17 Jul 2017 14:07:21 +0200 Subject: RFR: 8179387: Factor out CMS specific code from GenCollectedHeap into its own subclass In-Reply-To: References: <3521009f-6fab-4f8e-2375-b9d665a4c70b@redhat.com> <3d8b55a2-a787-3051-b351-ab9b0a24f5e0@redhat.com> <47e22e86-7d7c-606f-1936-346229f39ca2@oracle.com> <9a846161-c8ac-dedf-5952-f457d546fd9a@redhat.com> <4d5e6af8-d975-7803-64c5-7295e0d56154@redhat.com> <13358626-e399-e352-1711-587416621aac@redhat.com> <27af0ad2-fe78-3536-2143-996dd42583ab@oracle.com> <4bc53aaa-b98a-8a61-73bf-d30ac3f402b8@redhat.com> <666af7f2-27e9-48c6-91e4-eaefa5289e18@redhat.com> <3ec8a6a3-5a4b-a910-f6ec-ed1c0dad4cad@oracle.com> <5417889c-5289-37cd-eb31-a2b55f70e85e@redhat.com> Message-ID: <088d467c-8038-60bc-1eab-b34061ad20d9@redhat.com> (I included hotspot-runtime-dev and serviceability-dev to review vmStructs.cpp changes. see below) Hi Erik, >> Ok, added those and some more that I found. Not sure why we'd need >> #include "gc/cms/concurrentMarkSweepGeneration.hpp" ? Left that out >> for now. > > Because you are accessing CMSCollcetor in: > > 99 NOT_PRODUCT( > 100 virtual size_t skip_header_HeapWords() { return > CMSCollector::skip_header_HeapWords(); } > 101 ) > > and CMSCollector is declared in concurrentMarkSweepGeneration.hpp. An > alternative would of course be to just declare skip_header_HeapWords() > in cmsHeap.hpp and define skip_header_HeapWords in cmsHeap.cpp, then > you only need to include concurrentMarkSweeoGeneration.hpp in > cmsHeap.cpp. Ah ok, I've missed that one. Added it now. >>> IMO, I would just make the three functions above private. I know they >>> are protected in GenCollectedHeap, but it should be fine to have them >>> private in CMSHeap. 
Having them protected signals, at least to me, >>> that this class could be considered as a base class (protected to me >>> reads "this can be accessed by classes inheriting from this class), >>> and we don't want any class to inherit from CMSHeap. >> >> How can they be called from the superclass if they are private in the >> subclass? Would that work in C++? >> >> protected (to me) means visibility between super and subclasses. If I'd >> want to signal that I intend that to be overridden, I'd say 'virtual'. > > It is perfectly fine to have private virtual methods in C++ (see for > example > https://stackoverflow.com/questions/2170688/private-virtual-method-in-c). > A virtual function only needs to be protected if a "child class" needs > to access the function in the "parent class". For both gc_prologue and > gc_epilogue, this is the case, which is why they have to be > 'protected' in GenCollectedHeap. But, no class is going to derive from > CMSHeap, so they can be private in CMSHeap. Cool. Learned something new :-) It actually makes sense. I've moved all 3 methods into the private block in CMSHeap. I left them virtual (because of missing override), and I also left them in protected in GenCollectedHeap (prologue/epilogue because we need to, skip_header_HeapWords() to not confuse readers.) >>> This is for the serviceability agent. You will have to poke around in >>> hotspot/src/jdk.hotspot.agent and see how GenCollectedHeap is used. >>> Unfortunately I'm not that familiar with the agent, perhaps someone >>> else can chime in here? >> >> Considering that the remaining references to GenCollectedHeap in >> vmStructs.cpp don't look like related to CMSHeap, I'd argue that what I >> did is all that's needed for now. Do you agree? > > Honestly, I don't know, that is why I asked if someone else with more > knowledge in this area can comment. Have you tried building and using > the SA agent with your change? 
You can also ask around on > hotspot-rt-dev and/or serviceability-dev. I haven't tried building SA. I poked around hotspot/src/jdk.hotspot.agent and I think it should be ok. Can somebody who knows about it confirm this? Differential webrev: http://cr.openjdk.java.net/~rkennke/8179387/webrev.07.diff/ Full webrev: http://cr.openjdk.java.net/~rkennke/8179387/webrev.07/ Roman From mikael.gerdin at oracle.com Mon Jul 17 14:12:28 2017 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Mon, 17 Jul 2017 16:12:28 +0200 Subject: RFR(XS) 8183935: G1GCPhaseTimes and G1RootProcessor do not include macros.hpp Message-ID: Hi, Please review this trivial change to add includes of macros.hpp to G1GCPhaseTimes and G1RootProcessor. They both check the value of INCLUDE_AOT and as such should explicitly include the proper header to ensure that it is set to the correct value. Bug: https://bugs.openjdk.java.net/browse/JDK-8183935 Webrev: http://cr.openjdk.java.net/~mgerdin/8183935/webrev.0/ Testing: JPRT build-only Thanks /Mikael From thomas.schatzl at oracle.com Mon Jul 17 14:21:37 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Mon, 17 Jul 2017 16:21:37 +0200 Subject: RFR(XS) 8183935: G1GCPhaseTimes and G1RootProcessor do not include macros.hpp In-Reply-To: References: Message-ID: <1500301297.2845.22.camel@oracle.com> Hi Mikael, On Mon, 2017-07-17 at 16:12 +0200, Mikael Gerdin wrote: > Hi, > > Please review this trivial change to add includes of macros.hpp to > G1GCPhaseTimes and G1RootProcessor. They both check the value of > INCLUDE_AOT > and as such should explicitly include the proper header to ensure > that > it is set to the correct value. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8183935 > Webrev: http://cr.openjdk.java.net/~mgerdin/8183935/webrev.0/ > Testing: JPRT build-only ship it :) Although I do not think it is necessary to include macros.hpp both in the hpp and cpp file, but not sure. It won't hurt. Thanks,
Thomas From mikael.gerdin at oracle.com Mon Jul 17 14:22:47 2017 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Mon, 17 Jul 2017 16:22:47 +0200 Subject: RFR(XS) 8183935: G1GCPhaseTimes and G1RootProcessor do not include macros.hpp In-Reply-To: <1500301297.2845.22.camel@oracle.com> References: <1500301297.2845.22.camel@oracle.com> Message-ID: <7e3a87b0-e00b-3726-1835-346f902ab336@oracle.com> Hi Thomas, On 2017-07-17 16:21, Thomas Schatzl wrote: > Hi Mikael, > > On Mon, 2017-07-17 at 16:12 +0200, Mikael Gerdin wrote: >> Hi, >> >> Please review this trivial change to add includes of macros.hpp to >> G1GCPhaseTimes and G1RootProcessor. They both the value of >> INCLUDE_AOT >> and as such should explicitly include the proper header to ensure >> that >> it is set to the correct value. >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8183935 >> Webrev: http://cr.openjdk.java.net/~mgerdin/8183935/webrev.0/ >> Testing: JPRT build-only > > ship it :) Although I do not think it is necessary to include > macros.hpp both in the hpp and cpp file, but not sure. It won't hurt. I sort of agree but I think it's a nice convention to always #include macros in files which look at the INCLUDE_* macros. Thanks for the review! /Mikael > > Thanks, > Thomas > From erik.helin at oracle.com Mon Jul 17 14:28:47 2017 From: erik.helin at oracle.com (Erik Helin) Date: Mon, 17 Jul 2017 16:28:47 +0200 Subject: RFR(XS) 8183935: G1GCPhaseTimes and G1RootProcessor do not include macros.hpp In-Reply-To: References: Message-ID: Reviewed. Thanks, Erik On 07/17/2017 04:12 PM, Mikael Gerdin wrote: > Hi, > > Please review this trivial change to add includes of macros.hpp to > G1GCPhaseTimes and G1RootProcessor. They both the value of INCLUDE_AOT > and as such should explicitly include the proper header to ensure that > it is set to the correct value. 
> > Bug: https://bugs.openjdk.java.net/browse/JDK-8183935 > Webrev: http://cr.openjdk.java.net/~mgerdin/8183935/webrev.0/ > Testing: JPRT build-only > > Thanks > /Mikael From shade at redhat.com Tue Jul 18 08:55:26 2017 From: shade at redhat.com (Aleksey Shipilev) Date: Tue, 18 Jul 2017 10:55:26 +0200 Subject: RFC: Epsilon GC JEP In-Reply-To: <67f6d4a2-d129-1491-4906-473586dc6680@redhat.com> References: <67f6d4a2-d129-1491-4906-473586dc6680@redhat.com> Message-ID: No comments? I'll ask OpenJDK Lead to move this JEP to Candidate soon then. Thanks, -Aleksey On 07/10/2017 10:14 PM, Aleksey Shipilev wrote: > Hi, > > I would like to solicit feedback on Epsilon GC JEP: > https://bugs.openjdk.java.net/browse/JDK-8174901 > http://openjdk.java.net/jeps/8174901 > > The JEP text should be pretty self-contained, but we can certainly add more > points after the discussion happens. > > For the last few months, there were quite a few instances where Epsilon proved a > good vehicle to do GC performance research, especially on object locality and > code generation fronts. I think it also serves as the trivial target for > Erik's/Roman's GC interface work. > > The implementation and tests are there in the Sandbox, for those who are curious. > > Thanks, > -Aleksey > -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: OpenPGP digital signature URL: From erik.helin at oracle.com Tue Jul 18 10:09:47 2017 From: erik.helin at oracle.com (Erik Helin) Date: Tue, 18 Jul 2017 12:09:47 +0200 Subject: RFC: Epsilon GC JEP In-Reply-To: References: <67f6d4a2-d129-1491-4906-473586dc6680@redhat.com> Message-ID: <621d6f35-617c-d603-3159-cd537831e66e@oracle.com> Hi Aleksey, first of all, thanks for trying this out and starting a discussion. Regarding the JEP, I have a few questions/comments: - the JEP specifies "last-drop performance improvements" as a motivation. 
However, I think you also know that taking a pause and compacting a heap that is mostly filled with garbage most likely results in higher throughput*. So are you thinking in terms of pauses here when you say performance? - why do you think Epsilon GC is a good baseline? IMHO, no barriers is not the perfect baseline, since it is just a theoretical exercise. Just cranking up the heap and using Serial is more realistic baseline, but even using that as a baseline is questionable. - the JEP specifies this as an experimental feature, meaning that you intend non-JVM developers to be able to run this. Have you considered the cost of supporting this option? You say "New jtreg tests under hotspot/gc/epsilon would be enough to assert correctness". For which platforms? How often should these tests be run, every night? Whenever we want to do large changes, like updating logging, tracing, etc, will we have to take Epsilon GC into account? Will there be serviceability support for Epsilon GC, like jstat, MXBeans, perf counters etc? - You quote "The experience, however, tells that many players in the Java ecosystem already did this exercise with expunging GC from their custom-built JVMs". So it seems that those users that want something like Epsilon GC are fine with building OpenJDK themselves? Having -XX:+UseEpsilonGC as a developer flag is much different compared to exposing it (and supporting, even if in experimental mode) to users. Please recall that even removing/changing an experimental flag requires a CSR request and careful motivation as why you want to remove it. I guess most of my question can be summarized as: this seems like it perhaps could be useful tool for JVM GC developers, why do you want to expose the flag to non-JVM developers (given all the work/support/maintenance that comes with that)? It is _great_ that you are experimenting and trying out new ideas in the VM, please continue doing that! 
Please don't interpret my questions/comments as to grumpy, this is just my experience from maintaining 5-6 different GC algorithms for more than five years that is speaking. There is _always_ a maintenance cost :) Thanks, Erik * almost always. There will of course be scenarios where the throughput could be higher without compacting. On 07/18/2017 10:55 AM, Aleksey Shipilev wrote: > No comments? I'll ask OpenJDK Lead to move this JEP to Candidate soon then. > > Thanks, > -Aleksey > > On 07/10/2017 10:14 PM, Aleksey Shipilev wrote: >> Hi, >> >> I would like to solicit feedback on Epsilon GC JEP: >> https://bugs.openjdk.java.net/browse/JDK-8174901 >> http://openjdk.java.net/jeps/8174901 >> >> The JEP text should be pretty self-contained, but we can certainly add more >> points after the discussion happens. >> >> For the last few months, there were quite a few instances where Epsilon proved a >> good vehicle to do GC performance research, especially on object locality and >> code generation fronts. I think it also serves as the trivial target for >> Erik's/Roman's GC interface work. >> >> The implementation and tests are there in the Sandbox, for those who are curious. >> >> Thanks, >> -Aleksey >> > > From shade at redhat.com Tue Jul 18 11:23:46 2017 From: shade at redhat.com (Aleksey Shipilev) Date: Tue, 18 Jul 2017 13:23:46 +0200 Subject: RFC: Epsilon GC JEP In-Reply-To: <621d6f35-617c-d603-3159-cd537831e66e@oracle.com> References: <67f6d4a2-d129-1491-4906-473586dc6680@redhat.com> <621d6f35-617c-d603-3159-cd537831e66e@oracle.com> Message-ID: <858737aa-b8b1-dfdf-a099-1e0decb706ab@redhat.com> Hi Erik, Thanks for looking into this! On 07/18/2017 12:09 PM, Erik Helin wrote: > first of all, thanks for trying this out and starting a discussion. Regarding > the JEP, I have a few questions/comments: > - the JEP specifies "last-drop performance improvements" as a > motivation. 
However, I think you also know that taking a pause and > compacting a heap that is mostly filled with garbage most likely > results in higher throughput*. So are you thinking in terms of pauses > here when you say performance? This cuts both ways: while it is true that moving GC improves locality [1], it is also true that the runtime overhead from barriers can be quite high [2, 3, 4]. So, "performance" in that section is tied to both throughput (no barriers) and pauses (no pauses). [1] https://shipilev.net/jvm-anatomy-park/11-moving-gc-locality [2] https://shipilev.net/jvm-anatomy-park/13-intergenerational-barriers [3] Also, remember the reason for UseCondCardMark [4] Also, remember the whole thing about G1 barriers > - why do you think Epsilon GC is a good baseline? IMHO, no barriers is > not the perfect baseline, since it is just a theoretical exercise. > Just cranking up the heap and using Serial is more realistic > baseline, but even using that as a baseline is questionable. It sometimes is. Non-generational GC is a good baseline for some workloads. Even Serial does not cut it, because even if you crank up old and trim down young, there is no way to disable reference write barrier store that maintains card tables. > - the JEP specifies this as an experimental feature, meaning that you > intend non-JVM developers to be able to run this. Have you considered > the cost of supporting this option? You say "New jtreg tests under > hotspot/gc/epsilon would be enough to assert correctness". For which > platforms? How often should these tests be run, every night? I think for all platforms, somewhere in hs-tier3? IMO, current test set in hotspot/gc/epsilon is fairly complete, and it takes less than a minute on my 4-core i7. > Whenever we want to do large changes, like updating logging, tracing, etc, > will we have to take Epsilon GC into account? Will there be serviceability > support for Epsilon GC, like jstat, MXBeans, perf counters etc? 
I tried to address the maintenance costs in the JEP? It is unlikely to cause trouble, since it mostly calls into the shared code. And GC interface work would hopefully make BarrierSet into more shareable chunk of interface, which makes the whole thing even more self-contained. There is some new code in MemoryPools that handles the minimal diagnostics. MXBeans still work, at least ThreadMXBean that reports allocation pressure, although I'd need to add a test to assert that. To me, if the no-op GC requires much maintenance whenever something in JVM is changing, that points to the insanity of GC interface. No-op GC is a good canary in the coalmine for this. This is why one of the motivations is seeing what exactly a minimal GC should support to be functional. > - You quote "The experience, however, tells that many players in the > Java ecosystem already did this exercise with expunging GC from their > custom-built JVMs". So it seems that those users that want something > like Epsilon GC are fine with building OpenJDK themselves? Having > -XX:+UseEpsilonGC as a developer flag is much different compared to > exposing it (and supporting, even if in experimental mode) to users. There is a fair share of survivorship bias: we know about people who succeeded, do we know how many failed or given up? I think developers who do day-to-day Hotspot development grossly underestimate the effort required to even build a custom JVM. Most power users I know have did this exercise with great pains. I used to sing the same song to them: just build OpenJDK yourself, but then pesky details pour in. Like: oh, Windows, oh, Cygwin, oh MacOS, oh XCode, oh FreeType, oh new compilers that build OpenJDK with warnings and build does treat warnings as errors, oh actual API mismatches against msvcrt, glibc, whatever, etc. etc. etc. 
As much as OpenJDK build improved over the years, I am not audacious enough to claim it would ever be a completely smooth experience :) Now I am just willingly hand them binary builds. So I think having the experimental feature available in the actual product build extends the feature exposure. For example, suppose you are the academic writing a paper on GC, would you accept custom-build JVM into your results, or would you rather pick up the "gold" binary build from a standard distribution and run with it? > I guess most of my question can be summarized as: this seems like it perhaps > could be useful tool for JVM GC developers, why do you want to expose the flag > to non-JVM developers (given all the work/support/maintenance that comes with > that)? My initial thought was that the discussion about the costs should involve discussing the actual code. This is why there is a complete implementation in the Sandbox, and also the webrev posted. In the months following my initial (crazy) experiments, I had multiple people coming to me and asking when Epsilon is going to be in JDK, because they want to use it. And those were the ultra-power-users who actually know what they are doing with their garbage-free applications. So the short answer about why Epsilon is good to have in product is because the cost seems low, the benefits are present, and so cost/benefit is still low. > It is _great_ that you are experimenting and trying out new ideas in the VM, > please continue doing that! Please don't interpret my questions/comments as > to grumpy, this is just my experience from maintaining 5-6 different GC > algorithms for more than five years that is speaking. There is _always_ a > maintenance cost :) Yeah, I know how that feels. Look at the actual Epsilon changes, do they look scary to you, given your experience maintaining the related code? Thanks, -Aleksey -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: OpenPGP digital signature URL: From erik.helin at oracle.com Tue Jul 18 12:37:19 2017 From: erik.helin at oracle.com (Erik Helin) Date: Tue, 18 Jul 2017 14:37:19 +0200 Subject: RFC: Epsilon GC JEP In-Reply-To: <858737aa-b8b1-dfdf-a099-1e0decb706ab@redhat.com> References: <67f6d4a2-d129-1491-4906-473586dc6680@redhat.com> <621d6f35-617c-d603-3159-cd537831e66e@oracle.com> <858737aa-b8b1-dfdf-a099-1e0decb706ab@redhat.com> Message-ID: <8f9b4995-f687-47c8-30e0-5cae513b8947@oracle.com> On 07/18/2017 01:23 PM, Aleksey Shipilev wrote: > Hi Erik, > > Thanks for looking into this! > > On 07/18/2017 12:09 PM, Erik Helin wrote: >> first of all, thanks for trying this out and starting a discussion. Regarding >> the JEP, I have a few questions/comments: >> - the JEP specifies "last-drop performance improvements" as a >> motivation. However, I think you also know that taking a pause and >> compacting a heap that is mostly filled with garbage most likely >> results in higher throughput*. So are you thinking in terms of pauses >> here when you say performance? > > This cuts both ways: while it is true that moving GC improves locality [1], it > is also true that the runtime overhead from barriers can be quite high [2, 3, > 4]. So, "performance" in that section is tied to both throughput (no barriers) > and pauses (no pauses). > > [1] https://shipilev.net/jvm-anatomy-park/11-moving-gc-locality > [2] https://shipilev.net/jvm-anatomy-park/13-intergenerational-barriers > [3] Also, remember the reason for UseCondCardMark > [4] Also, remember the whole thing about G1 barriers Absolutely, barriers can come with an overhead. But a barrier that consists of dirtying a card does not come with a quite high overhead. In fact, it comes with a very low overhead :) >> - why do you think Epsilon GC is a good baseline? IMHO, no barriers is >> not the perfect baseline, since it is just a theoretical exercise. 
>> Just cranking up the heap and using Serial is more realistic >> baseline, but even using that as a baseline is questionable. > > It sometimes is. Non-generational GC is a good baseline for some workloads. Even > Serial does not cut it, because even if you crank up old and trim down young, > there is no way to disable reference write barrier store that maintains card tables. I will still point out, though, that a GC without barriers is just a theoretical baseline. One could imagine a single-gen mark-compact GC for OpenJDK (that would require no barriers), but AFAIK almost all users prefer the slight overhead of dirtying a card (and in return get a generational GC) for the use cases where a single-gen mark-compact algorithm would be applicable. >> - the JEP specifies this as an experimental feature, meaning that you >> intend non-JVM developers to be able to run this. Have you considered >> the cost of supporting this option? You say "New jtreg tests under >> hotspot/gc/epsilon would be enough to assert correctness". For which >> platforms? How often should these tests be run, every night? > I think for all platforms, somewhere in hs-tier3? IMO, current test set in > hotspot/gc/epsilon is fairly complete, and it takes less than a minute on my > 4-core i7. > >> Whenever we want to do large changes, like updating logging, tracing, etc, >> will we have to take Epsilon GC into account? Will there be serviceability >> support for Epsilon GC, like jstat, MXBeans, perf counters etc? > I tried to address the maintenance costs in the JEP? It is unlikely to cause > trouble, since it mostly calls into the shared code. And GC interface work would > hopefully make BarrierSet into more shareable chunk of interface, which makes > the whole thing even more self-contained. There is some new code in MemoryPools > that handles the minimal diagnostics. MXBeans still work, at least ThreadMXBean > that reports allocation pressure, although I'd need to add a test to assert that. 
> > To me, if the no-op GC requires much maintenance whenever something in JVM is > changing, that points to the insanity of GC interface. No-op GC is a good canary > in the coalmine for this. This is why one of the motivations is seeing what > exactly a minimal GC should support to be functional. Again, our opinions differ on this. Am I all for changing the GC interface? Yes, I have expressed nothing but full support of the great work that Roman is doing. Do I think we need something like a canary in the coalmine for JVM internal, GC internal, code? No. If you want anything resembling a canary, write a unit test using googletest that exercises the interface. However, again, this might be useful for someone who wants to try out some changes to the JVM GC code. But that, to me, is not enough to expose it to non-JVM developers. It could be useful to have in the source code though, maybe like a --with-jvm-feature kind of thing? >> - You quote "The experience, however, tells that many players in the >> Java ecosystem already did this exercise with expunging GC from their >> custom-built JVMs". So it seems that those users that want something >> like Epsilon GC are fine with building OpenJDK themselves? Having >> -XX:+UseEpsilonGC as a developer flag is much different compared to >> exposing it (and supporting, even if in experimental mode) to users. > There is a fair share of survivorship bias: we know about people who succeeded, > do we know how many failed or given up? I think developers who do day-to-day > Hotspot development grossly underestimate the effort required to even build a > custom JVM. Most power users I know have did this exercise with great pains. I > used to sing the same song to them: just build OpenJDK yourself, but then pesky > details pour in. 
Like: oh, Windows, oh, Cygwin, oh MacOS, oh XCode, oh FreeType, > oh new compilers that build OpenJDK with warnings and build does treat warnings > as errors, oh actual API mismatches against msvcrt, glibc, whatever, etc. etc. > etc. As much as OpenJDK build improved over the years, I am not audacious enough > to claim it would ever be a completely smooth experience :) Now I am just > willingly hand them binary builds. Such users will still be able to get binary builds if someone is willing to produce them with Epsilon GC. There are plenty of OpenJDK binary builds available from various organizations/companies. > So I think having the experimental feature available in the actual product build > extends the feature exposure. For example, suppose you are the academic writing > a paper on GC, would you accept custom-build JVM into your results, or would you > rather pick up the "gold" binary build from a standard distribution and run with it? I guess such a researcher would be producing a build from the same source as the one they made changes to? How could they otherwise do any kind of reasonable comparison? >> I guess most of my question can be summarized as: this seems like it perhaps >> could be useful tool for JVM GC developers, why do you want to expose the flag >> to non-JVM developers (given all the work/support/maintenance that comes with >> that)? > My initial thought was that the discussion about the costs should involve > discussing the actual code. This is why there is a complete implementation in > the Sandbox, and also the webrev posted. > > In the months following my initial (crazy) experiments, I had multiple people > coming to me and asking when Epsilon is going to be in JDK, because they want to > use it. And those were the ultra-power-users who actually know what they are > doing with their garbage-free applications. 
> > So the short answer about why Epsilon is good to have in product is because the > cost seems low, the benefits are present, and so cost/benefit is still low. And it is here that our opinions differ :) For you the maintenance cost is low, whereas for me, having yet another command-line flag, yet another code path, gets in the way. You have to respect that we have different backgrounds and experiences here. >> It is _great_ that you are experimenting and trying out new ideas in the VM, >> please continue doing that! Please don't interpret my questions/comments as >> to grumpy, this is just my experience from maintaining 5-6 different GC >> algorithms for more than five years that is speaking. There is _always_ a >> maintenance cost :) > Yeah, I know how that feels. Look at the actual Epsilon changes, do they look > scary to you, given your experience maintaining the related code? I don't like taking the role of the grumpy open source maintainer :) No, the code is not scary, code is rarely scary IMO, it is just code. Running tests, fixing that a test with -Xmx1g isn't run on a RPi, having additional code paths, more cases to take into consideration when refactoring, is burdensome. And the benefits of benchmarking against Epsilon vs benchmarking against Serial/Parallel aren't that high to me. But, I can understand that it is useful when trying to evaluate for example the cost of stores into a HashMap. Which is why I'm not against the code, but I'm not keen on exposing this to non-JVM developers. 
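For concreteness, the HashMap-store comparison mentioned above could be sketched as a tiny standalone kernel: run unchanged under different collectors, the timing delta isolates the GC-induced cost of the stores. All class/method names and sizes below are invented for illustration, and a real measurement would use a proper harness such as JMH rather than System.nanoTime():

```java
import java.util.HashMap;

// Hypothetical kernel for the HashMap-store example. Under a generational GC
// every reference store may dirty a card; under a no-op GC such as the
// proposed Epsilon, the same loop runs with no barrier at all.
public class MapStoreKernel {

    // Performs keys * iterations map stores and returns the store count.
    public static long run(int keys, int iterations) {
        HashMap<Integer, Object> map = new HashMap<>(keys * 2);
        Object payload = new Object();
        long stores = 0;
        for (int it = 0; it < iterations; it++) {
            for (int k = 0; k < keys; k++) {
                map.put(k, payload); // reference store; any barrier cost lands here
                stores++;
            }
        }
        return stores;
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        long stores = run(100_000, 100);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        // Compare the same run under e.g. -XX:+UseSerialGC versus the JEP's
        // proposed -XX:+UnlockExperimentalVMOptions -XX:+UseEpsilonGC.
        System.out.println(stores + " stores in " + elapsedMs + " ms");
    }
}
```

Boxing of the Integer keys still allocates, so this only narrows, not eliminates, the non-barrier differences between collectors.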
Thanks, Erik > Thanks, > -Aleksey > From rkennke at redhat.com Tue Jul 18 12:45:25 2017 From: rkennke at redhat.com (Roman Kennke) Date: Tue, 18 Jul 2017 14:45:25 +0200 Subject: RFC: Epsilon GC JEP In-Reply-To: <858737aa-b8b1-dfdf-a099-1e0decb706ab@redhat.com> References: <67f6d4a2-d129-1491-4906-473586dc6680@redhat.com> <621d6f35-617c-d603-3159-cd537831e66e@oracle.com> <858737aa-b8b1-dfdf-a099-1e0decb706ab@redhat.com> Message-ID: <03eb1ee9-d022-18b7-4f91-c9ead4922c60@redhat.com> Hi Aleksey, what speaks against doing full GCs when memory runs out? I can imagine scenarios when it could be useful to allow full-GCs: 1. Allow full-GCs only on System.gc()... for testing? Or for control fanatics? 2. Allow full-GCs only on OOM... for containerized apps or as replacement for letting the process die and respawn (i.e. don't care at all about pauses, but care about throughput and absolutely-no-barriers) 3. Allow full-GCs in both cases I can see this enabled/disabled selectively by flags. Yes, I know, complexity, maintenance, etc blah blah ;-) But it should be very simple to do. Reusing markSweep.cpp should do it. Basically serial GC without the generational barriers. What do you think? Roman On 18.07.2017 at 13:23, Aleksey Shipilev wrote: > Hi Erik, > > Thanks for looking into this! > > On 07/18/2017 12:09 PM, Erik Helin wrote: >> first of all, thanks for trying this out and starting a discussion. Regarding >> the JEP, I have a few questions/comments: >> - the JEP specifies "last-drop performance improvements" as a >> motivation. However, I think you also know that taking a pause and >> compacting a heap that is mostly filled with garbage most likely >> results in higher throughput*. So are you thinking in terms of pauses >> here when you say performance? > This cuts both ways: while it is true that moving GC improves locality [1], it > is also true that the runtime overhead from barriers can be quite high [2, 3, > 4]. 
So, "performance" in that section is tied to both throughput (no barriers) > and pauses (no pauses). > > [1] https://shipilev.net/jvm-anatomy-park/11-moving-gc-locality > [2] https://shipilev.net/jvm-anatomy-park/13-intergenerational-barriers > [3] Also, remember the reason for UseCondCardMark > [4] Also, remember the whole thing about G1 barriers > >> - why do you think Epsilon GC is a good baseline? IMHO, no barriers is >> not the perfect baseline, since it is just a theoretical exercise. >> Just cranking up the heap and using Serial is more realistic >> baseline, but even using that as a baseline is questionable. > It sometimes is. Non-generational GC is a good baseline for some workloads. Even > Serial does not cut it, because even if you crank up old and trim down young, > there is no way to disable reference write barrier store that maintains card tables. > >> - the JEP specifies this as an experimental feature, meaning that you >> intend non-JVM developers to be able to run this. Have you considered >> the cost of supporting this option? You say "New jtreg tests under >> hotspot/gc/epsilon would be enough to assert correctness". For which >> platforms? How often should these tests be run, every night? > I think for all platforms, somewhere in hs-tier3? IMO, current test set in > hotspot/gc/epsilon is fairly complete, and it takes less than a minute on my > 4-core i7. > >> Whenever we want to do large changes, like updating logging, tracing, etc, >> will we have to take Epsilon GC into account? Will there be serviceability >> support for Epsilon GC, like jstat, MXBeans, perf counters etc? > I tried to address the maintenance costs in the JEP? It is unlikely to cause > trouble, since it mostly calls into the shared code. And GC interface work would > hopefully make BarrierSet into more shareable chunk of interface, which makes > the whole thing even more self-contained. There is some new code in MemoryPools > that handles the minimal diagnostics. 
MXBeans still work, at least ThreadMXBean > that reports allocation pressure, although I'd need to add a test to assert that. > > To me, if the no-op GC requires much maintenance whenever something in JVM is > changing, that points to the insanity of GC interface. No-op GC is a good canary > in the coalmine for this. This is why one of the motivations is seeing what > exactly a minimal GC should support to be functional. > > >> - You quote "The experience, however, tells that many players in the >> Java ecosystem already did this exercise with expunging GC from their >> custom-built JVMs". So it seems that those users that want something >> like Epsilon GC are fine with building OpenJDK themselves? Having >> -XX:+UseEpsilonGC as a developer flag is much different compared to >> exposing it (and supporting, even if in experimental mode) to users. > There is a fair share of survivorship bias: we know about people who succeeded, > do we know how many failed or given up? I think developers who do day-to-day > Hotspot development grossly underestimate the effort required to even build a > custom JVM. Most power users I know have did this exercise with great pains. I > used to sing the same song to them: just build OpenJDK yourself, but then pesky > details pour in. Like: oh, Windows, oh, Cygwin, oh MacOS, oh XCode, oh FreeType, > oh new compilers that build OpenJDK with warnings and build does treat warnings > as errors, oh actual API mismatches against msvcrt, glibc, whatever, etc. etc. > etc. As much as OpenJDK build improved over the years, I am not audacious enough > to claim it would ever be a completely smooth experience :) Now I am just > willingly hand them binary builds. > > So I think having the experimental feature available in the actual product build > extends the feature exposure. 
For example, suppose you are the academic writing > a paper on GC, would you accept custom-build JVM into your results, or would you > rather pick up the "gold" binary build from a standard distribution and run with it? > > >> I guess most of my question can be summarized as: this seems like it perhaps >> could be useful tool for JVM GC developers, why do you want to expose the flag >> to non-JVM developers (given all the work/support/maintenance that comes with >> that)? > My initial thought was that the discussion about the costs should involve > discussing the actual code. This is why there is a complete implementation in > the Sandbox, and also the webrev posted. > > In the months following my initial (crazy) experiments, I had multiple people > coming to me and asking when Epsilon is going to be in JDK, because they want to > use it. And those were the ultra-power-users who actually know what they are > doing with their garbage-free applications. > > So the short answer about why Epsilon is good to have in product is because the > cost seems low, the benefits are present, and so cost/benefit is still low. > > >> It is _great_ that you are experimenting and trying out new ideas in the VM, >> please continue doing that! Please don't interpret my questions/comments as >> to grumpy, this is just my experience from maintaining 5-6 different GC >> algorithms for more than five years that is speaking. There is _always_ a >> maintenance cost :) > Yeah, I know how that feels. Look at the actual Epsilon changes, do they look > scary to you, given your experience maintaining the related code? 
> > Thanks, > -Aleksey > From erik.osterlund at oracle.com Tue Jul 18 13:20:04 2017 From: erik.osterlund at oracle.com (Erik Österlund) Date: Tue, 18 Jul 2017 15:20:04 +0200 Subject: RFC: Epsilon GC JEP In-Reply-To: <858737aa-b8b1-dfdf-a099-1e0decb706ab@redhat.com> References: <67f6d4a2-d129-1491-4906-473586dc6680@redhat.com> <621d6f35-617c-d603-3159-cd537831e66e@oracle.com> <858737aa-b8b1-dfdf-a099-1e0decb706ab@redhat.com> Message-ID: <596E0B04.8030407@oracle.com> Hi Aleksey, If I understand this correctly, the motivation for EpsilonGC is to be able to measure the overheads due to GC pauses and GC barriers and measure only the application throughput without GC jitter, and then use that as a baseline for measuring performance of an actual GC implementation compared to EpsilonGC. However, automatic memory management is quite complicated when you think about it. Will EpsilonGC allocate all memory up-front, or expand the heap? In the case where it expands on demand until it runs out of memory, what consequences does that potential expansion have on throughput? In the case it is allocated upfront, will pages be pre-touched? If so, what NUMA nodes will the pre-mapped memory map into? Will mutators try to allocate NUMA-local memory? What consequences will the larger heap footprint have on the throughput because of decreased memory locality and as a result increased last level cache misses and suddenly having to spread to more NUMA nodes? Does the larger footprint change the requirements on compressed oops and what encoding/decoding of oop compression is required? In case of an expanding heap - can it even use compressed oops? In case of a not expanding heap allocated up-front, does a comparison of a GC using compressed oops with a baseline that can inherently not use it make sense? Will lack of compaction and resulting possibly worse object locality of memory accesses affect performance? 
I am not convinced that we can just remove GC-induced overheads from the picture and measure the application throughput without the GC by using an EpsilonGC as proposed. At least I do not think I would use it to draw conclusions about GC-induced throughput loss. It seems like an apples to oranges comparison to me. Or perhaps I have missed something? Thanks, /Erik On 2017-07-18 13:23, Aleksey Shipilev wrote: > Hi Erik, > > Thanks for looking into this! > > On 07/18/2017 12:09 PM, Erik Helin wrote: >> first of all, thanks for trying this out and starting a discussion. Regarding >> the JEP, I have a few questions/comments: >> - the JEP specifies "last-drop performance improvements" as a >> motivation. However, I think you also know that taking a pause and >> compacting a heap that is mostly filled with garbage most likely >> results in higher throughput*. So are you thinking in terms of pauses >> here when you say performance? > This cuts both ways: while it is true that moving GC improves locality [1], it > is also true that the runtime overhead from barriers can be quite high [2, 3, > 4]. So, "performance" in that section is tied to both throughput (no barriers) > and pauses (no pauses). > > [1] https://shipilev.net/jvm-anatomy-park/11-moving-gc-locality > [2] https://shipilev.net/jvm-anatomy-park/13-intergenerational-barriers > [3] Also, remember the reason for UseCondCardMark > [4] Also, remember the whole thing about G1 barriers > >> - why do you think Epsilon GC is a good baseline? IMHO, no barriers is >> not the perfect baseline, since it is just a theoretical exercise. >> Just cranking up the heap and using Serial is more realistic >> baseline, but even using that as a baseline is questionable. > It sometimes is. Non-generational GC is a good baseline for some workloads. Even > Serial does not cut it, because even if you crank up old and trim down young, > there is no way to disable reference write barrier store that maintains card tables. 
> >> - the JEP specifies this as an experimental feature, meaning that you >> intend non-JVM developers to be able to run this. Have you considered >> the cost of supporting this option? You say "New jtreg tests under >> hotspot/gc/epsilon would be enough to assert correctness". For which >> platforms? How often should these tests be run, every night? > I think for all platforms, somewhere in hs-tier3? IMO, current test set in > hotspot/gc/epsilon is fairly complete, and it takes less than a minute on my > 4-core i7. > >> Whenever we want to do large changes, like updating logging, tracing, etc, >> will we have to take Epsilon GC into account? Will there be serviceability >> support for Epsilon GC, like jstat, MXBeans, perf counters etc? > I tried to address the maintenance costs in the JEP? It is unlikely to cause > trouble, since it mostly calls into the shared code. And GC interface work would > hopefully make BarrierSet into more shareable chunk of interface, which makes > the whole thing even more self-contained. There is some new code in MemoryPools > that handles the minimal diagnostics. MXBeans still work, at least ThreadMXBean > that reports allocation pressure, although I'd need to add a test to assert that. > > To me, if the no-op GC requires much maintenance whenever something in JVM is > changing, that points to the insanity of GC interface. No-op GC is a good canary > in the coalmine for this. This is why one of the motivations is seeing what > exactly a minimal GC should support to be functional. > > >> - You quote "The experience, however, tells that many players in the >> Java ecosystem already did this exercise with expunging GC from their >> custom-built JVMs". So it seems that those users that want something >> like Epsilon GC are fine with building OpenJDK themselves? Having >> -XX:+UseEpsilonGC as a developer flag is much different compared to >> exposing it (and supporting, even if in experimental mode) to users. 
> There is a fair share of survivorship bias: we know about people who succeeded, > do we know how many failed or given up? I think developers who do day-to-day > Hotspot development grossly underestimate the effort required to even build a > custom JVM. Most power users I know have did this exercise with great pains. I > used to sing the same song to them: just build OpenJDK yourself, but then pesky > details pour in. Like: oh, Windows, oh, Cygwin, oh MacOS, oh XCode, oh FreeType, > oh new compilers that build OpenJDK with warnings and build does treat warnings > as errors, oh actual API mismatches against msvcrt, glibc, whatever, etc. etc. > etc. As much as OpenJDK build improved over the years, I am not audacious enough > to claim it would ever be a completely smooth experience :) Now I am just > willingly hand them binary builds. > > So I think having the experimental feature available in the actual product build > extends the feature exposure. For example, suppose you are the academic writing > a paper on GC, would you accept custom-build JVM into your results, or would you > rather pick up the "gold" binary build from a standard distribution and run with it? > > >> I guess most of my question can be summarized as: this seems like it perhaps >> could be useful tool for JVM GC developers, why do you want to expose the flag >> to non-JVM developers (given all the work/support/maintenance that comes with >> that)? > My initial thought was that the discussion about the costs should involve > discussing the actual code. This is why there is a complete implementation in > the Sandbox, and also the webrev posted. > > In the months following my initial (crazy) experiments, I had multiple people > coming to me and asking when Epsilon is going to be in JDK, because they want to > use it. And those were the ultra-power-users who actually know what they are > doing with their garbage-free applications. 
> > So the short answer about why Epsilon is good to have in product is because the > cost seems low, the benefits are present, and so cost/benefit is still low. > > >> It is _great_ that you are experimenting and trying out new ideas in the VM, >> please continue doing that! Please don't interpret my questions/comments as >> to grumpy, this is just my experience from maintaining 5-6 different GC >> algorithms for more than five years that is speaking. There is _always_ a >> maintenance cost :) > Yeah, I know how that feels. Look at the actual Epsilon changes, do they look > scary to you, given your experience maintaining the related code? > > Thanks, > -Aleksey > From shade at redhat.com Tue Jul 18 13:26:03 2017 From: shade at redhat.com (Aleksey Shipilev) Date: Tue, 18 Jul 2017 15:26:03 +0200 Subject: RFC: Epsilon GC JEP In-Reply-To: <8f9b4995-f687-47c8-30e0-5cae513b8947@oracle.com> References: <67f6d4a2-d129-1491-4906-473586dc6680@redhat.com> <621d6f35-617c-d603-3159-cd537831e66e@oracle.com> <858737aa-b8b1-dfdf-a099-1e0decb706ab@redhat.com> <8f9b4995-f687-47c8-30e0-5cae513b8947@oracle.com> Message-ID: On 07/18/2017 02:37 PM, Erik Helin wrote: >> [1] https://shipilev.net/jvm-anatomy-park/11-moving-gc-locality >> [2] https://shipilev.net/jvm-anatomy-park/13-intergenerational-barriers >> [3] Also, remember the reason for UseCondCardMark >> [4] Also, remember the whole thing about G1 barriers > > Absolutely, barriers can come with an overhead. But a barrier that consists of > dirtying a card does not come with a quite high overhead. In fact, it comes with > a very low overhead :) Mhm! "Low" is in the eye of the beholder. You can't beat zero overhead. And there are people who literally count instructions on their hot paths, while still developing in Java. Let me ask you a trick question: how do you *know* the card mark overhead is small, if you don't have a no-barrier GC to compare against? >>> - why do you think Epsilon GC is a good baseline? 
IMHO, no barriers is >>> not the perfect baseline, since it is just a theoretical exercise. >>> Just cranking up the heap and using Serial is more realistic >>> baseline, but even using that as a baseline is questionable. >> >> It sometimes is. Non-generational GC is a good baseline for some workloads. Even >> Serial does not cut it, because even if you crank up old and trim down young, >> there is no way to disable reference write barrier store that maintains card >> tables. > > I will still point out though that a GC without a barrier is still just a > theoretical baseline. One could imagine a single-gen mark-compact GC for OpenJDK > (that would require no barriers), but AFAIK almost all users prefer the slight > overhead of dirtying a card (and in return get a generational GC) for the use > cases where a single-gen mark-compact algorithm would be applicable. Mark-compact, maybe. But there are plenty of single-gen mark-sweep algorithms; see e.g. the Go runtime. I have a hard time seeing how that is theoretical. > However, again, this might be useful for someone who wants try to do some > changes to the JVM GC code. But that, to me, is not enough to expose it to > non-JVM developers. It could be useful to have in the source code though, maybe > like a --with-jvm-feature kind of thing? That would go against the maintainability argument, no? Because you will still have to maintain the code, *and* it will require building a special JVM flavor. So it is a lose-lose: neither users get it, nor maintainers have simpler lives. > [snip] Such users will still be able to get binary builds if someone is willing to > produce them with Epsilon GC. There are plenty of OpenJDK binary builds > available from various organizations/companies. Well, yes. I actually happen to know the company which can distribute this in the downstream OpenJDK builds, and reap the ultra-power-users' loyalty. 
But, I am maintaining that having the code upstream is beneficial, even if that company is going to do maintenance work either way. >> So the short answer about why Epsilon is good to have in product is because the >> cost seems low, the benefits are present, and so cost/benefit is still low. > > And it is here that our opinions differ :) For you the maintenance cost is low, > whereas for me, having yet another command-line flag, yet another code path, > gets in the way. You have to respect that we have different background and > experiences here. I am not trying to challenge your background or experience here, I am challenging the cost estimates though. Because ad absurdum, we can shoot down any feature change coming into the JVM, just because it introduces yet another flag, yet another code path, etc. I cannot see where the Epsilon maintenance would be a burden: it comes with automated tests that run fast, its implementation seems trivial, its exposure to VM code seems trivial too (apart from the BarrierSet thing that would be trimmed down with GC interface work). >> Yeah, I know how that feels. Look at the actual Epsilon changes, do they look >> scary to you, given your experience maintaining the related code? > > I don't like taking the role of the grumpy open source maintainer :) No, the > code is not scary, code is rarely scary IMO, it is just code. Running tests, > fixing that a test -Xmx1g isn't run on a RPi, having additional code paths, more > cases to take into consideration when refactoring, is burdensome. And to me, the > benefits of benchmarking against Epsilon vs benchmarking against Serial/Parallel > isn't that high to me. > > But, I can understand that it is useful when trying to evaluate for example the > cost of stores into a HashMap. Which is why I'm not against the code, but I'm > not keen on exposing this to non-JVM developers. I hear you, but the thing is, Epsilon does not seem like a coding exercise anymore. 
Epsilon is useful for GC performance work, especially when readily available, and there are users willing to adopt it. Just as we respect the maintainers' burden in the product, we also have to see what benefits users, especially the ones who are championing our project performance even by cutting corners with e.g. no-op GCs. Thanks, -Aleksey -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: OpenPGP digital signature URL: From rkennke at redhat.com Tue Jul 18 13:28:26 2017 From: rkennke at redhat.com (Roman Kennke) Date: Tue, 18 Jul 2017 15:28:26 +0200 Subject: RFC: Epsilon GC JEP In-Reply-To: <596E0B04.8030407@oracle.com> References: <67f6d4a2-d129-1491-4906-473586dc6680@redhat.com> <621d6f35-617c-d603-3159-cd537831e66e@oracle.com> <858737aa-b8b1-dfdf-a099-1e0decb706ab@redhat.com> <596E0B04.8030407@oracle.com> Message-ID: At the very least, Epsilon's a great tool for measuring the cost of barriers. How many times have we heard the question: 'but what is the overhead of the additional barriers of Shenandoah?' And we couldn't really answer it. Compared to what? G1? Serial? Parallel? CMS? Each of which has their own peculiarities when it comes to barriers. With Epsilon it is possible to construct a benchmark that does certain heap accesses (primitive/object reads/writes, special stuff like CASes, etc.) and does no more allocations (thus locality spread doesn't really matter) and give an answer to those questions and say: no-barriers throughput is this, and with that GC's barriers, we have this, etc. I realize that such results are a bit theoretical, but they give a much better idea than having no way to measure this in an isolated way at all. 
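For illustration, the kind of no-allocation kernel described here might look like the following minimal sketch. The class name, array size, and round count are invented; the Epsilon flag mentioned in the comments is the one proposed by the JEP and is assumed to be experimental:

```java
// Hypothetical barrier-cost kernel: allocate everything up front, then
// perform only reference stores, so allocation rate and object locality
// are identical across collectors and the measured difference is
// (mostly) the reference write barrier.
public class RefStoreKernel {

    // Writes the same reference into every slot, `rounds` times over,
    // and returns the total number of stores performed.
    public static long churn(Object[] slots, Object value, int rounds) {
        long stores = 0;
        for (int r = 0; r < rounds; r++) {
            for (int i = 0; i < slots.length; i++) {
                slots[i] = value; // aastore: card mark / G1 barrier would fire here
                stores++;
            }
        }
        return stores;
    }

    public static void main(String[] args) {
        Object[] slots = new Object[1 << 16]; // allocated once, up front
        Object value = new Object();
        long start = System.nanoTime();
        long stores = churn(slots, value, 1_000);
        double nsPerStore = (double) (System.nanoTime() - start) / stores;
        // Run with -XX:+UnlockExperimentalVMOptions -XX:+UseEpsilonGC for the
        // no-barrier baseline, then with G1/Serial/CMS to see the delta.
        System.out.println(stores + " stores, ~" + nsPerStore + " ns/store");
    }
}
```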
Roman On 18.07.2017 at 15:20, Erik Österlund wrote: > Hi Aleksey, > > If I understand this correctly, the motivation for EpsilonGC is to be > able to measure the overheads due to GC pauses and GC barriers and > measure only the application throughput without GC jitter, and then > use that as a baseline for measuring performance of an actual GC > implementation compared to EpsilonGC. > > Howerver, automatic memory management is quite complicated when you > think about it. Will EpsilonGC allocate all memory up-front, or expand > the heap? In the case where it expanded on-demand until it runs out of > memory, what consequences does that potential expansion have on > throughput? In the case it is allocated upfront, will pages be > pre-touched? If so, what NUMA nodes will the pre-mapped memory map in > to? Will mutators try to allocate NUMA-local memory? What consequences > will the larger heap footprint have on the throughput because of > decreased memory locality and as a result increased last level cache > misses and suddenly having to spread to more NUMA nodes? Does the > larger footprint change the requirements on compressed oops and what > encoding/decoding of oop compression is required? In case of an > expanding heap - can it even use compressed oops? In case of a not > expanding heap allocated up-front, does a comparison of a GC using > compressed oops with a baseline that can inherently not use it make > sense? Will lack of compaction and resulting possibly worse object > locality of memory accesses affect performance? 
> > Thanks, > /Erik > > On 2017-07-18 13:23, Aleksey Shipilev wrote: >> Hi Erik, >> >> Thanks for looking into this! >> >> On 07/18/2017 12:09 PM, Erik Helin wrote: >>> first of all, thanks for trying this out and starting a discussion. >>> Regarding >>> the JEP, I have a few questions/comments: >>> - the JEP specifies "last-drop performance improvements" as a >>> motivation. However, I think you also know that taking a pause and >>> compacting a heap that is mostly filled with garbage most likely >>> results in higher throughput*. So are you thinking in terms of >>> pauses >>> here when you say performance? >> This cuts both ways: while it is true that moving GC improves >> locality [1], it >> is also true that the runtime overhead from barriers can be quite >> high [2, 3, >> 4]. So, "performance" in that section is tied to both throughput (no >> barriers) >> and pauses (no pauses). >> >> [1] https://shipilev.net/jvm-anatomy-park/11-moving-gc-locality >> [2] https://shipilev.net/jvm-anatomy-park/13-intergenerational-barriers >> [3] Also, remember the reason for UseCondCardMark >> [4] Also, remember the whole thing about G1 barriers >> >>> - why do you think Epsilon GC is a good baseline? IMHO, no barriers is >>> not the perfect baseline, since it is just a theoretical exercise. >>> Just cranking up the heap and using Serial is more realistic >>> baseline, but even using that as a baseline is questionable. >> It sometimes is. Non-generational GC is a good baseline for some >> workloads. Even >> Serial does not cut it, because even if you crank up old and trim >> down young, >> there is no way to disable reference write barrier store that >> maintains card tables. >> >>> - the JEP specifies this as an experimental feature, meaning that you >>> intend non-JVM developers to be able to run this. Have you >>> considered >>> the cost of supporting this option? You say "New jtreg tests under >>> hotspot/gc/epsilon would be enough to assert correctness". 
For which >>> platforms? How often should these tests be run, every night? >> I think for all platforms, somewhere in hs-tier3? IMO, current test >> set in >> hotspot/gc/epsilon is fairly complete, and it takes less than a >> minute on my >> 4-core i7. >> >>> Whenever we want to do large changes, like updating logging, >>> tracing, etc, >>> will we have to take Epsilon GC into account? Will there be >>> serviceability >>> support for Epsilon GC, like jstat, MXBeans, perf counters etc? >> I tried to address the maintenance costs in the JEP? It is unlikely >> to cause >> trouble, since it mostly calls into the shared code. And GC interface >> work would >> hopefully make BarrierSet into more shareable chunk of interface, >> which makes >> the whole thing even more self-contained. There is some new code in >> MemoryPools >> that handles the minimal diagnostics. MXBeans still work, at least >> ThreadMXBean >> that reports allocation pressure, although I'd need to add a test to >> assert that. >> >> To me, if the no-op GC requires much maintenance whenever something >> in JVM is >> changing, that points to the insanity of GC interface. No-op GC is a >> good canary >> in the coalmine for this. This is why one of the motivations is >> seeing what >> exactly a minimal GC should support to be functional. >> >> >>> - You quote "The experience, however, tells that many players in the >>> Java ecosystem already did this exercise with expunging GC from >>> their >>> custom-built JVMs". So it seems that those users that want something >>> like Epsilon GC are fine with building OpenJDK themselves? Having >>> -XX:+UseEpsilonGC as a developer flag is much different compared to >>> exposing it (and supporting, even if in experimental mode) to users. >> There is a fair share of survivorship bias: we know about people who >> succeeded, >> do we know how many failed or given up? 
I think developers who do >> day-to-day >> Hotspot development grossly underestimate the effort required to even >> build a >> custom JVM. Most power users I know did this exercise with great >> pains. I >> used to sing the same song to them: just build OpenJDK yourself, but >> then pesky >> details pour in. Like: oh, Windows, oh, Cygwin, oh MacOS, oh XCode, >> oh FreeType, >> oh new compilers that build OpenJDK with warnings and the build does >> treat warnings >> as errors, oh actual API mismatches against msvcrt, glibc, whatever, >> etc. etc. >> etc. As much as the OpenJDK build improved over the years, I am not >> audacious enough >> to claim it would ever be a completely smooth experience :) Now I just >> willingly hand them binary builds. >> >> So I think having the experimental feature available in the actual >> product build >> extends the feature exposure. For example, suppose you are an >> academic writing >> a paper on GC, would you accept a custom-built JVM into your results, >> or would you >> rather pick up the "gold" binary build from a standard distribution >> and run with it? >> >> >>> I guess most of my question can be summarized as: this seems like it >>> perhaps >>> could be a useful tool for JVM GC developers, why do you want to >>> expose the flag >>> to non-JVM developers (given all the work/support/maintenance that >>> comes with >>> that)? >> My initial thought was that the discussion about the costs should >> involve >> discussing the actual code. This is why there is a complete >> implementation in >> the Sandbox, and also the webrev posted. >> >> In the months following my initial (crazy) experiments, I had >> multiple people >> coming to me and asking when Epsilon is going to be in the JDK, because >> they want to >> use it. And those were the ultra-power-users who actually know what >> they are >> doing with their garbage-free applications. 
>> >> So the short answer about why Epsilon is good to have in product is >> because the >> cost seems low, the benefits are present, and so cost/benefit is >> still low. >> >> >>> It is _great_ that you are experimenting and trying out new ideas in >>> the VM, >>> please continue doing that! Please don't interpret my >>> questions/comments as >>> too grumpy, this is just my experience from maintaining 5-6 different GC >>> algorithms for more than five years that is speaking. There is >>> _always_ a >>> maintenance cost :) >> Yeah, I know how that feels. Look at the actual Epsilon changes, do >> they look >> scary to you, given your experience maintaining the related code? >> >> Thanks, >> -Aleksey >> > From thomas.schatzl at oracle.com Tue Jul 18 13:34:41 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Tue, 18 Jul 2017 15:34:41 +0200 Subject: RFC: Epsilon GC JEP In-Reply-To: <858737aa-b8b1-dfdf-a099-1e0decb706ab@redhat.com> References: <67f6d4a2-d129-1491-4906-473586dc6680@redhat.com> <621d6f35-617c-d603-3159-cd537831e66e@oracle.com> <858737aa-b8b1-dfdf-a099-1e0decb706ab@redhat.com> Message-ID: <1500384881.2815.79.camel@oracle.com> Hi Aleksey, I would like to expand this cost/benefit analysis a bit; I think the most contentious point brought up by Erik has been the develop vs. experimental flag issue. For that, let me present my understanding of the size and costs of making this an experimental (actually product) vs. develop flag for the intended target group as presented here. On Tue, 2017-07-18 at 13:23 +0200, Aleksey Shipilev wrote: > Hi Erik, > > Thanks for looking into this! > > On 07/18/2017 12:09 PM, Erik Helin wrote: > > > > first of all, thanks for trying this out and starting a discussion. > > Regarding the JEP, I have a few questions/comments: [...] > > > - why do you think Epsilon GC is a good baseline? IMHO, no barriers > > is not the perfect baseline, since it is just a theoretical > > exercise. 
Just cranking up the heap and using Serial is more > > realistic baseline, but even using that as a baseline is > > questionable. > It sometimes is. Non-generational GC is a good baseline for some > workloads. Even Serial does not cut it, because even if you crank up > old and trim down young, there is no way to disable reference write > barrier store that maintains card tables. Not prevented by making it a develop option. > > - the JEP specifies this as an experimental feature, meaning that > > you intend non-JVM developers to be able to run this. Have you > > considered the cost of supporting this option? You say "New jtreg > > tests under hotspot/gc/epsilon would be enough to assert > > correctness". For which platforms? How often should these tests be > > run, every night? > I think for all platforms, somewhere in hs-tier3? IMO, current test > set in hotspot/gc/epsilon is fairly complete, and it takes less than > a minute on my 4-core i7. Running it daily, on X platforms on Y OSes for Z releases adds up quickly. Could run something else instead. And there is always something else to run on these machines, trust me. :) > > > > Whenever we want to do large changes, like updating logging, > > tracing, etc, will we have to take Epsilon GC into account? Will > > there be serviceability support for Epsilon GC, like jstat, > > MXBeans, perf counters etc? > I tried to address the maintenance costs in the JEP? It is unlikely > to cause trouble, since it mostly calls into the shared code. And GC > interface work would hopefully make BarrierSet into more shareable > chunk of interface, which makes the whole thing even more self- > contained. There is some new code in MemoryPools that handles the > minimal diagnostics. MXBeans still work, at least ThreadMXBean > that reports allocation pressure, although I'd need to add a test to > assert that. 
> > To me, if the no-op GC requires much maintenance whenever something > in JVM is changing, that points to the insanity of GC interface. No- > op GC is a good canary in the coalmine for this. This is why one of > the motivations is seeing what exactly a minimal GC should support to > be functional. Sanity checking of the interfaces is not prevented by a develop option. > > > > - You quote "The experience, however, tells that many players in > > the Java ecosystem already did this exercise with expunging GC from > > their custom-built JVMs". So it seems that those users that want > > something like Epsilon GC are fine with building OpenJDK > > themselves? Having -XX:+UseEpsilonGC as a developer flag is much > > different compared to exposing it (and supporting, even if in > > experimental mode) to users. > > There is a fair share of survivorship bias: we know about people who > succeeded, do we know how many failed or given up? I think developers > who do day-to-day Hotspot development grossly underestimate the > effort required to even build a custom JVM. Most power users I know > did this exercise with great pains. I used to sing the same song > to them: just build OpenJDK yourself, but then pesky details pour in. > Like: oh, Windows, oh, Cygwin, oh MacOS, oh XCode, oh FreeType, oh > new compilers that build OpenJDK with warnings and the build does treat > warnings as errors, oh actual API mismatches against msvcrt, glibc, > whatever, etc. etc. etc. As much as the OpenJDK build improved over the > years, I am not audacious enough to claim it would ever be a > completely smooth experience :) Now I just willingly hand them > binary builds. > > So I think having the experimental feature available in the actual > product build extends the feature exposure. I agree here. The question is, by how much. 
So academics (and I am not trying to hit on academics here, you brought them up ;)) that write a paper on GC but never need to rebuild the VM (including the JDK here) because they don't do any changes would be inconvenienced. Let me ask, how many do you expect these to be? From my understanding there seems to be a very manageable yearly total GC paper output at the usual conferences. Not sure how putting Epsilon GC in product would improve that. So, even after all these target group concerns, how much time do you think these persons writing that paper (that do not need to recompile the VM and need to show their numbers in Epsilon GC) are going to spend on getting numbers compared to the hypothetical time for compiling the VM? [My personal experience is that when developing any changes by far most of the time is spent on waiting for the machine(s) to complete testing, not writing any actual changes or building. When writing a paper, my experience is that a very large part of the time is spent on running and re-running tests over and over again to be able to understand and explain results, or tweaking changes, or simply fixing bugs for some results] > For example, suppose you are an academic writing a paper on GC, > would you accept a custom-built JVM into your results, or would you > rather pick up the "gold" binary build from a standard distribution > and run with it? Not sure what you meant with this latter argument, if it is actually an argument. If I wanted to effect a change in the VM and measure it, I would already need to change and recompile the VM. So it is not a big stretch to imagine that baselines could come from something recompiled. I have seen quite a few papers using modified baselines for one or the other reason (like adding necessary instrumentation, maybe fixing obvious bugs). 
From experience I know that for many reasons it is already often extremely hard for somebody else to reproduce particular results (without extreme effort), if not impossible. Even understanding some baseline results may require some imagination as to how they were obtained. Not even talking about reproducing them. There seems to be a very small step from trusting results from a "gold" official binary to trusting a slightly modified one. As for the amount of inconvenience, I think the users that already need to recompile for their changes are not very much inconvenienced. I.e. changing a single "develop" to "product" seems to be a very small effort. > > I guess most of my question can be summarized as: this seems like > > it perhaps could be a useful tool for JVM GC developers, why do you > > want to expose the flag to non-JVM developers (given all the > > work/support/maintenance that comes with that)? > My initial thought was that the discussion about the costs should > involve discussing the actual code. This is why there is a complete > implementation in the Sandbox, and also the webrev posted. > > In the months following my initial (crazy) experiments, I had > multiple people coming to me and asking when Epsilon is going to be > in the JDK, because they want to use it. And those were the > ultra-power-users who actually know what they are doing with their garbage-free > applications. Aren't ultra-power-users able to rebuild the VM? What is their cost vs. the effort spent on making their applications garbage-free or implementing the necessary workarounds to be able to use that GC (mentioned load-balancer trickery etc)? > So the short answer about why Epsilon is good to have in product is > because the cost seems low, the benefits are present, and so > cost/benefit is still low. The number of people benefitting from having this available in a product build seems to be extremely small. And so, it seems, are their relative costs to fix that. 
Increased exposure seems to be a real recurring cost for maintenance in the product, although it seems relatively small compared to other features. Still somebody has to do it. > > It is _great_ that you are experimenting and trying out new ideas > > in the VM, please continue doing that! Please don't interpret my > > questions/comments as too grumpy, this is just my experience from > > maintaining 5-6 different GC algorithms for more than five years > > that is speaking. There is _always_ a maintenance cost :) > Yeah, I know how that feels. Look at the actual Epsilon changes, do > they look scary to you, given your experience maintaining the related > code? Well, 1500 LOC (well, ~800 without the tests) of changes do look scary to me, whatever they do :) Overall, on the question of develop vs. experimental option, I would tend to prefer a develop option. In this area there simply seem to be too many downsides compared to the upsides for an extremely limited user group. Thanks, Thomas From shade at redhat.com Tue Jul 18 13:44:24 2017 From: shade at redhat.com (Aleksey Shipilev) Date: Tue, 18 Jul 2017 15:44:24 +0200 Subject: RFC: Epsilon GC JEP In-Reply-To: <596E0B04.8030407@oracle.com> References: <67f6d4a2-d129-1491-4906-473586dc6680@redhat.com> <621d6f35-617c-d603-3159-cd537831e66e@oracle.com> <858737aa-b8b1-dfdf-a099-1e0decb706ab@redhat.com> <596E0B04.8030407@oracle.com> Message-ID: On 07/18/2017 03:20 PM, Erik Österlund wrote: > If I understand this correctly, the motivation for EpsilonGC is to be able to > measure the overheads due to GC pauses and GC barriers and measure only the > application throughput without GC jitter, and then use that as a baseline for > measuring performance of an actual GC implementation compared to EpsilonGC. > However, automatic memory management is quite complicated when you think about > it. 
Yes, and lots of those are handled by the shared code that Epsilon calls into, just like any other GC. > Will EpsilonGC allocate all memory up-front, or expand the heap? In the case > where it expanded on-demand until it runs out of memory, what consequences does > that potential expansion have on throughput? It does have consequences, the same kind of consequences it has with allocating TLABs. You can trim them down with larger TLABs, larger pages, pre-touching, all of which are handled outside of Epsilon, by shared code. > In the case it is allocated upfront, will pages be pre-touched? Oh yes, there are two lines of code that also handle AlwaysPreTouch. But otherwise it is handled by shared heap space allocation code. I would like to see AlwaysPreTouch handled more consistently across GCs though. This is my point from another mail: if Epsilon has to do something on its own, it is a good sign shared GC utilities are not much of use. > If so, what NUMA nodes will the pre-mapped memory map in to? Will mutators > try to allocate NUMA-local memory? I think this is handled by shared code, at least for NUMA interleaving. I would hope that NUMA-aware allocation could be granular to TLABs, in which case it goes into shared code too, instead of pushing to reimplement this for every GC. If not, then Epsilon is not fully NUMA-aware. > What consequences will the larger heap footprint have on the throughput > because of decreased memory locality and as a result increased last level > cache misses and suddenly having to spread to more NUMA nodes? Yes, it would. See two paragraphs below: > Does the larger footprint change the requirements on compressed oops and > what encoding/decoding of oop compression is required? In case of an > expanding heap - can it even use compressed oops? In case of a not expanding > heap allocated up-front, does a comparison of a GC using compressed oops with > a baseline that can inherently not use it make sense? 
I guess the only relevant point here is, what happens if you need more heap than 32 GB, and then you have to disable compressed oops? In which case, of course, you will lose. But, you have to keep in mind that the target applications that are supposed to benefit from Epsilon are low-heap, quite probably zero-garbage. In this case, the question about heap size is moot: you allocate enough heap to hold your live data, whether with Epsilon or not. > Will lack of compaction and resulting possibly worse object locality of > memory accesses affect performance? Yes, it would. But it cuts both ways: having more throughput *if* you code with locality in mind. I am not against GCs that compact, but I do understand there are cases where I don't want them either. > I am not convinced that we can just remove GC-induced overheads from the picture > and measure the application throughput without the GC by using an EpsilonGC as > proposed. At least I do not think I would use it to draw conclusions about > GC-induced throughput loss. It seems like an apples to oranges comparison to me. > Or perhaps I have missed something? I think this uses a strawman pointing out all other things that could go wrong, to claim that the only thing the actual no-op GC implementation has to do (e.g. empty BarrierSet, allocation, and responding to heap exhaustion) is not needed either :) Thanks, -Aleksey -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: OpenPGP digital signature URL: From shade at redhat.com Tue Jul 18 13:46:40 2017 From: shade at redhat.com (Aleksey Shipilev) Date: Tue, 18 Jul 2017 15:46:40 +0200 Subject: RFC: Epsilon GC JEP In-Reply-To: <1500384881.2815.79.camel@oracle.com> References: <67f6d4a2-d129-1491-4906-473586dc6680@redhat.com> <621d6f35-617c-d603-3159-cd537831e66e@oracle.com> <858737aa-b8b1-dfdf-a099-1e0decb706ab@redhat.com> <1500384881.2815.79.camel@oracle.com> Message-ID: <0cf664ca-1532-7fd3-6644-ef6b910663dd@redhat.com> Hi Thomas, (reading the rest a bit later) On 07/18/2017 03:34 PM, Thomas Schatzl wrote: > I would like to expand this cost/benefit analysis a bit; I think the > most contentious point brought up by Erik has been the develop vs. > experimental flag issue. > For that, let me present you my understanding of the size and costs of > making this an experimental (actually product) vs. develop flag for the > intended target group as presented here. > Overall, on the question of develop vs. experimental option, I would tend to > prefer a develop option. In this area there simply seem to be too many > downsides compared to the upsides for an extremely limited user group. Ok, suppose we want to hide it from most users. Now we need an option that is available in release builds (because you want to test native GC performance), but not openly available in release builds. Which option type is that? I thought "experimental" is closest to that. Thanks, -Aleksey -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: OpenPGP digital signature URL: From shade at redhat.com Tue Jul 18 14:04:56 2017 From: shade at redhat.com (Aleksey Shipilev) Date: Tue, 18 Jul 2017 16:04:56 +0200 Subject: RFC: Epsilon GC JEP In-Reply-To: <1500384881.2815.79.camel@oracle.com> References: <67f6d4a2-d129-1491-4906-473586dc6680@redhat.com> <621d6f35-617c-d603-3159-cd537831e66e@oracle.com> <858737aa-b8b1-dfdf-a099-1e0decb706ab@redhat.com> <1500384881.2815.79.camel@oracle.com> Message-ID: <7dc40654-6045-19a4-5610-51c460b38bdb@redhat.com> (I have read the rest) Okay, you have convinced me, maintainers do not want to have it exposed as experimental option. Would you be willing to accept it as develop then? Other random ramblings: On 07/18/2017 03:34 PM, Thomas Schatzl wrote: > Running it daily, on X platforms on Y OSes for Z releases adds up > quickly. Could run something else instead. And there is always > something else to run on these machines, trust me. :) Right. Well, I have recently authored a few changes [1,2] that made Shenandoah GC tests run around 20% faster in fastdebug. I suppose some of that improvement is applicable to other GCs too. My question is, can I please have 1 minute of that machine time per build back as payment? :D [1] http://hg.openjdk.java.net/jdk10/hs/hotspot/rev/f922d99ce776 [2] http://hg.openjdk.java.net/jdk10/hs/hotspot/rev/9fe3d41b0e51 > The question is, by how much. So academics (and I am not trying to hit > on academics here, you brought them up ;)) that write a paper on GC but > never need to rebuild the VM (including the JDK here) because they > don't do any changes would be inconvenienced. > > Let me ask, how many do you expect these to be? From my understanding there > seems to be a very manageable yearly total GC paper output at the usual > conferences. Not sure how putting Epsilon GC in product would improve that. "Build it and they will come" works here. 
"develop" is seen as unstable and untouchable by most. > As for the amount of inconvenience, I think the users that already need > to recompile for their changes are not very much inconvenienced. I.e. > changing a single "develop" to "product" seems to be a very small > effort. Okay, we can do this downstream. > Aren't ultra-power-users able to rebuild the VM? What is their cost vs. > the effort spent on making their applications garbage-free or > implementing the necessary workarounds to be able to use that GC > (mentioned load-balancer trickery etc)? I am pretty sure they would be much, much, much happier to download an Oracle/RedHat/Azul binary build and run with it in production, thus capitalizing on all the testing those companies did for their JDK binaries. Native compilers and native toolchains are bottomless sources of bugs too, right? Thanks, -Aleksey From erik.helin at oracle.com Tue Jul 18 15:22:50 2017 From: erik.helin at oracle.com (Erik Helin) Date: Tue, 18 Jul 2017 17:22:50 +0200 Subject: RFC: Epsilon GC JEP In-Reply-To: References: <67f6d4a2-d129-1491-4906-473586dc6680@redhat.com> <621d6f35-617c-d603-3159-cd537831e66e@oracle.com> <858737aa-b8b1-dfdf-a099-1e0decb706ab@redhat.com> <8f9b4995-f687-47c8-30e0-5cae513b8947@oracle.com> Message-ID: On 07/18/2017 03:26 PM, Aleksey Shipilev wrote: > On 07/18/2017 02:37 PM, Erik Helin wrote: >>> [1] https://shipilev.net/jvm-anatomy-park/11-moving-gc-locality >>> [2] https://shipilev.net/jvm-anatomy-park/13-intergenerational-barriers >>> [3] Also, remember the reason for UseCondCardMark >>> [4] Also, remember the whole thing about G1 barriers >> >> Absolutely, barriers can come with an overhead. But a barrier that consists of >> dirtying a card does not come with a particularly high overhead. 
In fact, it comes with >> a very low overhead :) > Mhm! "Low" is in the eye of the beholder. You can't beat zero overhead. And there > are people who literally count instructions on their hot paths, while still > developing in Java. > > Let me ask you a trick question: how do you *know* the card mark overhead is > small, if you don't have a no-barrier GC to compare against? There is no need for trick questions. Aleksey, we are working towards the same goal: making OpenJDK's GCs better. That doesn't mean we can't have different opinions on a few topics. You of course know the cost of a GC barrier by measuring it. You measure it by constructing a build where you do not emit the barriers and compare it to a build where you do. Again, I have already said that I can see your work being useful for other JVM developers. >>>> - why do you think Epsilon GC is a good baseline? IMHO, no barriers is >>>> not the perfect baseline, since it is just a theoretical exercise. >>>> Just cranking up the heap and using Serial is more realistic >>>> baseline, but even using that as a baseline is questionable. >>> >>> It sometimes is. Non-generational GC is a good baseline for some workloads. Even >>> Serial does not cut it, because even if you crank up old and trim down young, >>> there is no way to disable reference write barrier store that maintains card >>> tables. >> >> I will still point out though that a GC without a barrier is still just a >> theoretical baseline. One could imagine a single-gen mark-compact GC for OpenJDK >> (that would require no barriers), but AFAIK almost all users prefer the slight >> overhead of dirtying a card (and in return get a generational GC) for the use >> cases where a single-gen mark-compact algorithm would be applicable. > Mark-compact, maybe. But single-gen mark-sweep algorithms are plenty, see e.g. > Go runtime. I have a hard time seeing how that is theoretical. That is not what I said. 
As I wrote above: > >> but AFAIK almost all users prefer the slight >> overhead of dirtying a card (and in return get a generational GC) for >> the use cases where a single-gen mark-compact algorithm would be >> applicable. > > There are of course use cases for single-gen mark-sweep algorithms, and as I > write above, for single-gen mark-compact algorithms as well. But for Java, and > OpenJDK, at least it is my understanding that most users prefer a generational > algorithm like Serial compared to a single-gen mark-compact algorithm (at least > I have not seen a lot of users asking for that). But maybe I'm missing something > here? This is why I wrote, and still think, that a GC without a barrier for Java seems more like a theoretical baseline. There are of course single generational GC algorithms that use a barrier that it would be very interesting to see implemented in OpenJDK (including the great work that you and others are doing with Shenandoah). >> However, again, this might be useful for someone who wants to try to do some >> changes to the JVM GC code. But that, to me, is not enough to expose it to >> non-JVM developers. It could be useful to have in the source code though, maybe >> like a --with-jvm-feature kind of thing? > That would go against the maintainability argument, no? Because you will still > have to maintain the code, *and* it will require building a special JVM flavor. > So it is a lose-lose: neither users get it, nor maintainers have simpler lives. No, I don't view it that way. Having the code in the upstream repository and having it exposed in binary builds are two very different things to me, and come with very different requirements in terms of maintenance. If the code is in the upstream repository, then it is a tool for developers working in OpenJDK and for integrators building OpenJDK. We have a much easier time changing such code compared to code that users have come to rely on (and expect certain behavior from). 
>> [snip] Such users will still be able to get binary builds if someone is willing to >> produce them with Epsilon GC. There are plenty of OpenJDK binary builds >> available from various organizations/companies. > Well, yes. I actually happen to know the company which can distribute this in > the downstream OpenJDK builds, and reap the ultra-power-users' loyalty. But, I am > maintaining that having the code upstream is beneficial, even if that company is > going to do maintenance work either way. > > >>> So the short answer about why Epsilon is good to have in product is because the >>> cost seems low, the benefits are present, and so cost/benefit is still low. >> >> And it is here that our opinions differ :) For you the maintenance cost is low, >> whereas for me, having yet another command-line flag, yet another code path, >> gets in the way. You have to respect that we have different backgrounds and >> experiences here. > I am not trying to challenge your background or experience here, I am > challenging the cost estimates though. Because ad absurdum, we can shoot down > any feature change coming into the JVM, just because it introduces yet another flag, > yet another code path, etc. Do you see me doing that? I at least hope I am welcoming to everyone that wants to contribute a patch to OpenJDK, big or small (please let me know otherwise). > I cannot see where the Epsilon maintenance would be a burden: it comes with > automated tests that run fast, its implementation seems trivial, its exposure > to VM code seems trivial too (apart from the BarrierSet thing that would be > trimmed down with GC interface work). And from my experience there is always maintenance work (documentation, support, testing matrix increase, etc) with supporting a new kind of collector. You and I just do a different cost/benefit analysis on exposing this behavior to non-JVM developers. >>> Yeah, I know how that feels. 
Look at the actual Epsilon changes, do they look >>> scary to you, given your experience maintaining the related code? >> I don't like taking the role of the grumpy open source maintainer :) No, the >> code is not scary, code is rarely scary IMO, it is just code. Running tests, >> fixing that a test -Xmx1g isn't run on a RPi, having additional code paths, more >> cases to take into consideration when refactoring, is burdensome. And to me, the >> benefits of benchmarking against Epsilon vs benchmarking against Serial/Parallel >> isn't that high to me. >> >> But, I can understand that it is useful when trying to evaluate for example the >> cost of stores into a HashMap. Which is why I'm not against the code, but I'm >> not keen on exposing this to non-JVM developers. > I hear you, but thing is, Epsilon does not seem a coding exercise anymore. > Epsilon is useful for GC performance work especially when readily available, and > there are willing users to adopt it. Similarly how we respect maintainers' > burden in the product, we have to also see what benefits users, especially the > ones who are championing our project performance even by cutting corners with > e.g. no-op GCs. Yes, you always have to weigh the benefits against the costs, and in this case, exposing Epsilon GC to non-JVM developers seems, at least for now and to me, that the benefits do not outweigh the costs. Who knows, maybe this will change and we redo the cost/benefit analysis? It is very easy to go from developer flag to experimental flag, it is way, way harder to go from experimental flag to developer flag.
Thanks, Erik > Thanks, > -Aleksey > From shade at redhat.com Tue Jul 18 15:41:21 2017 From: shade at redhat.com (Aleksey Shipilev) Date: Tue, 18 Jul 2017 17:41:21 +0200 Subject: RFC: Epsilon GC JEP In-Reply-To: References: <67f6d4a2-d129-1491-4906-473586dc6680@redhat.com> <621d6f35-617c-d603-3159-cd537831e66e@oracle.com> <858737aa-b8b1-dfdf-a099-1e0decb706ab@redhat.com> <8f9b4995-f687-47c8-30e0-5cae513b8947@oracle.com> Message-ID: <3fd29a77-7f0e-070b-8abd-a4f7ea29ecc5@redhat.com> Hi Erik, I think we are coming to a consensus here. Piece-wise: On 07/18/2017 05:22 PM, Erik Helin wrote: > That is not what I said. As I wrote above: > >> but AFAIK almost all users prefer the slight >> overhead of dirtying a card (and in return get a generational GC) for >> the use cases where a single-gen mark-compact algorithm would be >> applicable. > > There are of course use cases for single-gen mark-sweep algorithms, and as I > write above, for single-gen mark-compact algorithms as well. But for Java, and > OpenJDK, at least it is my understanding that most users prefer a generational > algorithm like Serial compared to a single-gen mark-compact algorithm (at least > I have not seen a lot of users asking for that). But maybe I'm missing something > here? Mmm, "prefer" is not the same as "have no other option than trust JVM developers that generational is better for their workloads, and having no energy to try to build the collector proving otherwise". Because there is no collector in OpenJDK that avoids generational barriers. Saying "prefer" here is very, very odd. > No, I don't view it that way. Having the code in the upstream repository and > having it exposed in binary builds are two very different things to me, and > comes with very different requirements in terms of maintenance. If the code is > in the upstream repository, then it is a tool for developers working in OpenJDK > and for integrators building OpenJDK.
We have a much easier time changing such > code compared to code that users have come to rely on (and expect certain > behavior from). Okay. I am still quite a bit puzzled why "experimental" comes with any notion of supportability, compatibility, testing coverage, etc. I don't think most of current experimental options declared in globals.hpp come with that in mind. In fact, many are even marked with "(Unsafe) (Unstable)"... >> I hear you, but thing is, Epsilon does not seem a coding exercise anymore. >> Epsilon is useful for GC performance work especially when readily available, and >> there are willing users to adopt it. Similarly how we respect maintainers' >> burden in the product, we have to also see what benefits users, especially the >> ones who are championing our project performance even by cutting corners with >> e.g. no-op GCs. > > Yes, you always have to weigh the benefits against the costs, and in this case, > exposing Epsilon GC to non-JVM developers seems, at least for now and to me, > that the benefits do not outweigh the costs. Who knows, maybe this will change > and we redo the cost/benefit analysis? It is very easy to go from developer flag > to experimental flag, it is way, way harder to go from experimental flag to > developer flag. Okay, that sounds like a compromise to me: push Epsilon under "develop" flag, and then ask users or downstreams to switch it to "product" if they want. This is not ideal, but it works. Does that resolve your concerns? Thanks, -Aleksey -------------- next part -------------- A non-text attachment was scrubbed...
Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: OpenPGP digital signature URL: From erik.helin at oracle.com Wed Jul 19 09:17:41 2017 From: erik.helin at oracle.com (Erik Helin) Date: Wed, 19 Jul 2017 11:17:41 +0200 Subject: RFC: Epsilon GC JEP In-Reply-To: <3fd29a77-7f0e-070b-8abd-a4f7ea29ecc5@redhat.com> References: <67f6d4a2-d129-1491-4906-473586dc6680@redhat.com> <621d6f35-617c-d603-3159-cd537831e66e@oracle.com> <858737aa-b8b1-dfdf-a099-1e0decb706ab@redhat.com> <8f9b4995-f687-47c8-30e0-5cae513b8947@oracle.com> <3fd29a77-7f0e-070b-8abd-a4f7ea29ecc5@redhat.com> Message-ID: <34ae4e22-c1aa-ff99-1a0d-9fec183280b9@oracle.com> On 07/18/2017 05:41 PM, Aleksey Shipilev wrote: >> Yes, you always have to weigh the benefits against the costs, and in this case, >> exposing Epsilon GC to non-JVM developers seems, at least for now and to me, >> that the benefits do not outweigh the costs. Who knows, maybe this will change >> and we redo the cost/benefit analysis? It is very easy to go from developer flag >> to experimental flag, it is way, way harder to go from experimental flag to >> developer flag. > > Okay, that sounds like a compromise to me: push Epsilon under "develop" flag, > and then ask users or downstreams to switch it to "product" if they want. This > is not ideal, but it works. Does that resolve your concerns? Yep, I would prefer it to be a develop flag. Will you update the JEP to reflect this?
Thanks, Erik > Thanks, > -Aleksey > From thomas.schatzl at oracle.com Wed Jul 19 09:27:05 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Wed, 19 Jul 2017 11:27:05 +0200 Subject: RFC: Epsilon GC JEP In-Reply-To: <3fd29a77-7f0e-070b-8abd-a4f7ea29ecc5@redhat.com> References: <67f6d4a2-d129-1491-4906-473586dc6680@redhat.com> <621d6f35-617c-d603-3159-cd537831e66e@oracle.com> <858737aa-b8b1-dfdf-a099-1e0decb706ab@redhat.com> <8f9b4995-f687-47c8-30e0-5cae513b8947@oracle.com> <3fd29a77-7f0e-070b-8abd-a4f7ea29ecc5@redhat.com> Message-ID: <1500456425.2870.36.camel@oracle.com> Hi Aleksey, On Tue, 2017-07-18 at 17:41 +0200, Aleksey Shipilev wrote: > Hi Erik, > > I think we are coming to a consensus here. > > Piece-wise: > > On 07/18/2017 05:22 PM, Erik Helin wrote: > > > > No, I don't view it that way. Having the code in the upstream > > repository and having it exposed in binary builds are two very > > different things to me, and comes with very different requirements > > in terms of maintenance. If the code is in the upstream repository, > > then it is a tool for developers working in OpenJDK and for > > integrators building OpenJDK. We have a much easier time changing > > such code compared to code that users have come to rely on (and > > expect certain behavior from). > > Okay. I am still quite a bit puzzled why "experimental" comes with > any notion of supportability, compatibility, testing coverage, etc. Every option that is exposed to the user in the product build is part of the public API, and so must be supported similarly to other options. An experimental option is just another "official" interface to the user as described by the CSR wiki page [1]. Just consider this: a security issue in an experimental option is just as much a security issue in the product as any other. Since we do not want to wait for that to happen, it needs the same support and testing as any other.
Experimental options are (at least in the GC group) more obscure options that help you shoot yourself in the foot, performance-wise, if you fiddle too much with them :) So the use of -XX:+UseExperimentalVMOptions is more an acknowledgment that you are really sure you want to do that. They may still be required for some users for application (what we think are) corner cases that are not (yet?) handled well automatically by the VM. Or as alternatives for other product options that only apply to e.g. a single collector. Or just mislabelled as such. > I don't think most of current experimental options declared in > globals.hpp come with that in mind. In fact, many are even marked > with "(Unsafe) (Unstable)"... The VM is a very old project, from before when terms like "unit testing", "code coverage" and related were a thing. Around 28 of those remaining out of 1729 in globals.hpp does not sound too bad. Could be better of course (also the actual number of switches ;)). Also I am not sure whether they are actually unsafe and unstable any more. Thanks, Thomas [1] https://wiki.openjdk.java.net/display/csr/ ; there is a more detailed, likely provisional guide [2] covering options a bit more. [2] http://cr.openjdk.java.net/~darcy/OpenJdkDevGuide/OpenJdkDevelopersGuide.v0.777.html#kinds_of_interfaces From shade at redhat.com Wed Jul 19 10:32:08 2017 From: shade at redhat.com (Aleksey Shipilev) Date: Wed, 19 Jul 2017 12:32:08 +0200 Subject: RFC: Epsilon GC JEP In-Reply-To: <1500456425.2870.36.camel@oracle.com> References: <67f6d4a2-d129-1491-4906-473586dc6680@redhat.com> <621d6f35-617c-d603-3159-cd537831e66e@oracle.com> <858737aa-b8b1-dfdf-a099-1e0decb706ab@redhat.com> <8f9b4995-f687-47c8-30e0-5cae513b8947@oracle.com> <3fd29a77-7f0e-070b-8abd-a4f7ea29ecc5@redhat.com> <1500456425.2870.36.camel@oracle.com> Message-ID: <384f94c6-96c3-20f0-2ea2-a9fafd29d99c@redhat.com> On 07/19/2017 11:27 AM, Thomas Schatzl wrote: >> Okay.
I am still quite a bit puzzled why "experimental" comes with >> any notion of supportability, compatibility, testing coverage, etc. > > Every option that is exposed to the user in the product build is part > of the public API, and so must be supported similar to other options. > An experimental option is just another "official" interface to the user > as described by the CSR wiki page [1]. > > Just consider this: a security issue in an experimental option is just > as much a security issue in the product as any other. Since we do not > want to wait that to happen, it needs the same support and testing as > any other. But, but... the definition in globals.hpp: // experimental flags are in support of features that ***are not // part of the officially supported product***, but are available // for experimenting with. They could, for example, be performance // features that ***may not have undergone full or rigorous QA***, but which may // help performance in some cases and released for experimentation // by the community of users and developers. This flag also allows one to // be able to build a fully supported product that nonetheless also // ships with some ***unsupported, lightly tested***, experimental features. // Like the UnlockDiagnosticVMOptions flag above, there is a corresponding // UnlockExperimentalVMOptions flag, which allows the control and // modification of the experimental flags. (emphasis mine) Are you saying that GC group makes that definition stronger by saying experimental flags are like product functional-stability-wise, but not performance-wise? So, that means GC group runs the functional testing with every combination of experimental options? Thanks, -Aleksey -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: OpenPGP digital signature URL: From shade at redhat.com Wed Jul 19 12:12:28 2017 From: shade at redhat.com (Aleksey Shipilev) Date: Wed, 19 Jul 2017 14:12:28 +0200 Subject: RFC: Epsilon GC JEP In-Reply-To: <34ae4e22-c1aa-ff99-1a0d-9fec183280b9@oracle.com> References: <67f6d4a2-d129-1491-4906-473586dc6680@redhat.com> <621d6f35-617c-d603-3159-cd537831e66e@oracle.com> <858737aa-b8b1-dfdf-a099-1e0decb706ab@redhat.com> <8f9b4995-f687-47c8-30e0-5cae513b8947@oracle.com> <3fd29a77-7f0e-070b-8abd-a4f7ea29ecc5@redhat.com> <34ae4e22-c1aa-ff99-1a0d-9fec183280b9@oracle.com> Message-ID: <3438f311-80e4-8e12-3e58-a8a0f7750858@redhat.com> On 07/19/2017 11:17 AM, Erik Helin wrote: > On 07/18/2017 05:41 PM, Aleksey Shipilev wrote: >>> Yes, you always have to weigh the benefits against the costs, and in this case, >>> exposing Epsilon GC to non-JVM developers seems, at least for now and to me, >>> that the benefits do not outweigh the costs. Who knows, maybe this will change >>> and we redo the cost/benefit analysis? It is very easy to go from developer flag >>> to experimental flag, it is way, way harder to go from experimental flag to >>> developer flag. >> >> Okay, that sounds like a compromise to me: push Epsilon under "develop" flag, >> and then ask users or downstreams to switch it to "product" if they want. This >> is not ideal, but it works. Does that resolve your concerns? > > Yep, I would prefer it to be a develop flag. Will you update the JEP to reflect > this? Updated. Better yet, the implementation is updated to make Epsilon 'develop'. Which required some trickery to make the tests pass with release builds, and survive changing the flag back to 'product' or 'experimental' without omitting the tests. Also, my build servers now patch Epsilon builds back to 'experimental'. Cheers, -Aleksey -------------- next part -------------- A non-text attachment was scrubbed...
Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: OpenPGP digital signature URL: From mikael.gerdin at oracle.com Wed Jul 19 14:52:51 2017 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Wed, 19 Jul 2017 16:52:51 +0200 Subject: RFR (S/M): 8184348: Merge G1ConcurrentMark::par_mark() and G1ConcurrentMark::grayRoot() In-Reply-To: <1500042870.3458.84.camel@oracle.com> References: <1500031158.3458.41.camel@oracle.com> <1500034800.3458.75.camel@oracle.com> <1500042870.3458.84.camel@oracle.com> Message-ID: <18f703d2-fb15-6ae8-affa-9d4ea11c85e1@oracle.com> Hi Thomas, On 2017-07-14 16:34, Thomas Schatzl wrote: > Hi again, > > On Fri, 2017-07-14 at 15:18 +0200, Aleksey Shipilev wrote: >> On 07/14/2017 02:20 PM, Thomas Schatzl wrote: >>> >>> Not completely sure what you are referring to, but I split some >>> very >>> long asserts across lines. >> Yes, I meant that, sorry for not being clear. Any webrev that >> requires me to scroll horizontally on 2560-pixel wide screen triggers >> me! > > I noticed that too :) > >>>> >>>> *) So, mark_reference_grey used to be called from >>>> G1CMSATBBufferClosure on >>>> objects below TAMS, but now it would get called on objects past >>>> TAMS >>>> too? >>> CMTask::make_reference_grey() now calls >>> G1ConcurrentMark::mark_in_next_bitmap(), not >>> ConcurrentMark::par_mark() >>> which does not exist any more: >>> G1ConcurrentMark::mark_in_next_bitmap() >>> in the first check filters out marking attempts above nTAMS >>> (g1ConcurrentMark.inline.hpp:47 now), returning false, which makes >>> make_reference_grey() exit immediately in that case. This seems to >>> achieve the same effect. >> Ah, I missed that part! I agree this part is fine then. 
>> >>> If you are worried whether there is a performance difference >>> because maybe now we do more work in some cases, all paths >>> previously leading to the former G1ConcurrentMark::par_mark() did >>> the nTAMS check in one way or another already (of course in >>> inconsistent fashion) so there should be no change here. >> No, I am not worried. SATB-heavy workloads have problems way beyond >> bitmap marking :) >> >>> >>> New webrevs: >>> http://cr.openjdk.java.net/~tschatzl/8184348/webrev.1/ (full) >> Looks good to me. > > Thanks. Unfortunately, after re-applying and fixing other changes based > on this one I noticed that I missed one opportunity to refactor in > G1CMTask::deal_with_reference(). I would like to add this to this > changeset still... sorry. > > There is some note about some perf optimization that mentions that it > is advantageous to do the nTAMS check before determining the heap > region; however I do not think this is an issue. > > Quickly comparing runs of a fairly large and reference-intensive > workload (BigRAMTester with 20g heap e.g. attached to JDK-8152438), > marking cycles with the latest webrev.2 are at least as fast as without > any of this RFR's changes. > > New webrevs: > http://cr.openjdk.java.net/~tschatzl/8184348/webrev.1_to_2 (diff) > http://cr.openjdk.java.net/~tschatzl/8184348/webrev.2 (full) Looks good to me.
/Mikael > > Thanks, > Thomas > From thomas.schatzl at oracle.com Wed Jul 19 14:58:13 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Wed, 19 Jul 2017 16:58:13 +0200 Subject: RFR (S/M): 8184348: Merge G1ConcurrentMark::par_mark() and G1ConcurrentMark::grayRoot() In-Reply-To: <18f703d2-fb15-6ae8-affa-9d4ea11c85e1@oracle.com> References: <1500031158.3458.41.camel@oracle.com> <1500034800.3458.75.camel@oracle.com> <1500042870.3458.84.camel@oracle.com> <18f703d2-fb15-6ae8-affa-9d4ea11c85e1@oracle.com> Message-ID: <1500476293.2568.0.camel@oracle.com> Hi Mikael, On Wed, 2017-07-19 at 16:52 +0200, Mikael Gerdin wrote: > Hi Thomas, > > On 2017-07-14 16:34, Thomas Schatzl wrote: > > > > New webrevs: > > http://cr.openjdk.java.net/~tschatzl/8184348/webrev.1_to_2 (diff) > > http://cr.openjdk.java.net/~tschatzl/8184348/webrev.2 (full) > Looks good to me. > /Mikael thanks for your review. Thomas From milan.mimica at gmail.com Wed Jul 19 17:05:28 2017 From: milan.mimica at gmail.com (Milan Mimica) Date: Wed, 19 Jul 2017 17:05:28 +0000 Subject: [PATCH] JDK-8176571: Fine bitmaps should be allocated as belonging to mtGC Message-ID: Hello I'm resending the two patches (JDK-8176571, JDK-8182169) from my new email address which I will be using from now on in this ML. I was notified my OCA has been approved. The patches have previously been discussed and generally approved. I recreated them against the recent tip, and also removed overloaded constructors from CHeapBitmap by using default parameters, as discussed. -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: heapBitMap_nmt.diff Type: text/x-patch Size: 16567 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed...
Name: refactor_array_allocator.diff Type: text/x-patch Size: 11799 bytes Desc: not available URL: From email.sundarms at gmail.com Wed Jul 19 20:24:50 2017 From: email.sundarms at gmail.com (Sundara Mohan M) Date: Wed, 19 Jul 2017 13:24:50 -0700 Subject: G1MonitoringSupport unused generation counter Message-ID: Hi, Was trying to understand why old generation mx bean was notified in case of G1GC and saw following code G1MonitoringSupport.hpp // young collection set counters. The _eden_counters, // _from_counters, and _to_counters are associated with // this "generational" counter. GenerationCounters* _young_collection_counters; // old collection set counters. The _old_space_counters // below are associated with this "generational" counter. GenerationCounters* _old_collection_counters; I don't see these counters updated anywhere. What is the use of these counters in G1GC. only following is updated in g1CollectedHeap.cpp // incremental collections both young and mixed CollectorCounters* _incremental_collection_counters; // full stop-the-world collections CollectorCounters* _full_collection_counters; Is there any mail thread/doc which explains more about this. Thanks, Sundar -------------- next part -------------- An HTML attachment was scrubbed... URL: From thomas.schatzl at oracle.com Thu Jul 20 07:37:14 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Thu, 20 Jul 2017 09:37:14 +0200 Subject: [PATCH] JDK-8176571: Fine bitmaps should be allocated as belonging to mtGC In-Reply-To: References: Message-ID: <1500536234.2924.0.camel@oracle.com> Hi Milan, On Wed, 2017-07-19 at 17:05 +0000, Milan Mimica wrote: > Hello > > I'm resending the two patches (JDK-8176571, JDK-8182169) from my new > email address which I will be using from now on in this ML. I was > notified my OCA has been approved. > > The patches have previously been discussed and generally approved. 
I > recreated them against the recent tip, and also removed overloaded > constructors from CHeapBitmap by using default parameters, as > discussed. Great! Looks good. I can sponsor as soon as Kim or anybody else gives his okay. Thanks, Thomas From erik.helin at oracle.com Thu Jul 20 08:44:38 2017 From: erik.helin at oracle.com (Erik Helin) Date: Thu, 20 Jul 2017 10:44:38 +0200 Subject: RFR: 8179387: Factor out CMS specific code from GenCollectedHeap into its own subclass In-Reply-To: <088d467c-8038-60bc-1eab-b34061ad20d9@redhat.com> References: <3521009f-6fab-4f8e-2375-b9d665a4c70b@redhat.com> <3d8b55a2-a787-3051-b351-ab9b0a24f5e0@redhat.com> <47e22e86-7d7c-606f-1936-346229f39ca2@oracle.com> <9a846161-c8ac-dedf-5952-f457d546fd9a@redhat.com> <4d5e6af8-d975-7803-64c5-7295e0d56154@redhat.com> <13358626-e399-e352-1711-587416621aac@redhat.com> <27af0ad2-fe78-3536-2143-996dd42583ab@oracle.com> <4bc53aaa-b98a-8a61-73bf-d30ac3f402b8@redhat.com> <666af7f2-27e9-48c6-91e4-eaefa5289e18@redhat.com> <3ec8a6a3-5a4b-a910-f6ec-ed1c0dad4cad@oracle.com> <5417889c-5289-37cd-eb31-a2b55f70e85e@redhat.com> <088d467c-8038-60bc-1eab-b34061ad20d9@redhat.com> Message-ID: <30452c37-794f-33f8-b9e5-1aba185c1a3d@oracle.com> On 07/17/2017 02:07 PM, Roman Kennke wrote: >>> Ok, added those and some more that I found. Not sure why we'd need >>> #include "gc/cms/concurrentMarkSweepGeneration.hpp" ? Left that out >>> for now. >> >> Because you are accessing CMSCollector in: >> >> 99 NOT_PRODUCT( >> 100 virtual size_t skip_header_HeapWords() { return >> CMSCollector::skip_header_HeapWords(); } >> 101 ) >> >> and CMSCollector is declared in concurrentMarkSweepGeneration.hpp. An >> alternative would of course be to just declare skip_header_HeapWords() >> in cmsHeap.hpp and define skip_header_HeapWords in cmsHeap.cpp, then >> you only need to include concurrentMarkSweepGeneration.hpp in >> cmsHeap.cpp. > Ah ok, I've missed that one. Where did you add it?
I don't see any include of "gc/cms/concurrentMarkSweepGeneration.hpp" in cmsHeap.hpp? Thanks, Erik From mikael.gerdin at oracle.com Thu Jul 20 09:55:35 2017 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Thu, 20 Jul 2017 11:55:35 +0200 Subject: RFR (S): 8183121: Add information about scanned and skipped cards during UpdateRS In-Reply-To: <1499861583.6693.3.camel@oracle.com> References: <1499861583.6693.3.camel@oracle.com> Message-ID: <0f0689fe-b532-9708-dca9-70d4ed01415b@oracle.com> Hi Thomas, On 2017-07-12 14:13, Thomas Schatzl wrote: > Hi all, > > can I have reviews for this small change that adds some information > about how many cards were scanned/skipped during Update RS. > > This information is much better than just the number of processed > buffers, although I kept them for now. > > This change is based on Erik's changes for JDK-8183539. > > CR: > https://bugs.openjdk.java.net/browse/JDK-8183121 > Webrev: > http://cr.openjdk.java.net/~tschatzl/8183121/webrev Looks fine to me. /Mikael > Testing: > jprt, test case > > Thanks, > Thomas > From thomas.schatzl at oracle.com Thu Jul 20 10:13:59 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Thu, 20 Jul 2017 12:13:59 +0200 Subject: RFR (S): 8183121: Add information about scanned and skipped cards during UpdateRS In-Reply-To: <0f0689fe-b532-9708-dca9-70d4ed01415b@oracle.com> References: <1499861583.6693.3.camel@oracle.com> <0f0689fe-b532-9708-dca9-70d4ed01415b@oracle.com> Message-ID: <1500545639.2924.2.camel@oracle.com> Hi, On Thu, 2017-07-20 at 11:55 +0200, Mikael Gerdin wrote: > Hi Thomas, > > On 2017-07-12 14:13, Thomas Schatzl wrote: > > > > Hi all, > > > > can I have reviews for this small change that adds some > > information > > about how many cards were scanned/skipped during Update RS. > > > > This information is much better than just the number of processed > > buffers, although I kept them for now. > > > > This change is based on Erik's changes for JDK-8183539.
> > CR: > > https://bugs.openjdk.java.net/browse/JDK-8183121 > > Webrev: > > http://cr.openjdk.java.net/~tschatzl/8183121/webrev > Looks fine to me. > /Mikael thanks for your review. Thomas From rkennke at redhat.com Thu Jul 20 10:46:34 2017 From: rkennke at redhat.com (Roman Kennke) Date: Thu, 20 Jul 2017 12:46:34 +0200 Subject: RFR: 8179387: Factor out CMS specific code from GenCollectedHeap into its own subclass In-Reply-To: <30452c37-794f-33f8-b9e5-1aba185c1a3d@oracle.com> References: <3521009f-6fab-4f8e-2375-b9d665a4c70b@redhat.com> <3d8b55a2-a787-3051-b351-ab9b0a24f5e0@redhat.com> <47e22e86-7d7c-606f-1936-346229f39ca2@oracle.com> <9a846161-c8ac-dedf-5952-f457d546fd9a@redhat.com> <4d5e6af8-d975-7803-64c5-7295e0d56154@redhat.com> <13358626-e399-e352-1711-587416621aac@redhat.com> <27af0ad2-fe78-3536-2143-996dd42583ab@oracle.com> <4bc53aaa-b98a-8a61-73bf-d30ac3f402b8@redhat.com> <666af7f2-27e9-48c6-91e4-eaefa5289e18@redhat.com> <3ec8a6a3-5a4b-a910-f6ec-ed1c0dad4cad@oracle.com> <5417889c-5289-37cd-eb31-a2b55f70e85e@redhat.com> <088d467c-8038-60bc-1eab-b34061ad20d9@redhat.com> <30452c37-794f-33f8-b9e5-1aba185c1a3d@oracle.com> Message-ID: <5bed5268-d690-9cdd-c2aa-e9b822687378@redhat.com> Am 20.07.2017 um 10:44 schrieb Erik Helin: > On 07/17/2017 02:07 PM, Roman Kennke wrote: >>>> Ok, added those and some more that I found. Not sure why we'd need >>>> #include "gc/cms/concurrentMarkSweepGeneration.hpp" ? Left that out >>>> for now. >>> >>> Because you are accessing CMSCollector in: >>> >>> 99 NOT_PRODUCT( >>> 100 virtual size_t skip_header_HeapWords() { return >>> CMSCollector::skip_header_HeapWords(); } >>> 101 ) >>> >>> and CMSCollector is declared in concurrentMarkSweepGeneration.hpp. An >>> alternative would of course be to just declare skip_header_HeapWords() >>> in cmsHeap.hpp and define skip_header_HeapWords in cmsHeap.cpp, then >>> you only need to include concurrentMarkSweepGeneration.hpp in >>> cmsHeap.cpp. >> Ah ok, I've missed that one.
Added it now. > > Where did you add it? I don't see any include of > "gc/cms/concurrentMarkSweepGeneration.hpp" in cmsHeap.hpp? Hmm. I honestly don't know how that disappeared :-) Differential: http://cr.openjdk.java.net/~rkennke/8179387/webrev.09.diff/ Full: http://cr.openjdk.java.net/~rkennke/8179387/webrev.09/ I hope it's ok now. Cheers, Roman From rkennke at redhat.com Thu Jul 20 10:53:18 2017 From: rkennke at redhat.com (Roman Kennke) Date: Thu, 20 Jul 2017 12:53:18 +0200 Subject: RFR: Parallelize safepoint cleanup In-Reply-To: <07a5bf0a-02fa-7a8c-35be-813f5207cb0c@oracle.com> References: <6f2c6de7-298b-bf14-ab1f-430c4acd43c9@redhat.com> <5cd676de-872d-6d4a-691b-da561173f7d0@oracle.com> <61d80e98-275f-b2b8-4ac7-6d5d03b047de@redhat.com> <0e44cc90-b384-0820-93d6-a70d22c501c3@oracle.com> <20E06CEC-38CA-41AE-99DB-17EF22A3C5CC@oracle.com> <58f2278e-b95c-4ec2-4f7d-9fefa3a281e4@redhat.com> <623c0dbf-9210-7c63-3546-4314c7d47f85@redhat.com> <29521e46-a5e8-5ff0-23a2-22eeee145389@oracle.com> <4445a727-060b-70f9-c8db-e9f70faae3d5@redhat.com> <37755fec-05b9-8d2c-7eb9-8849393c7485@oracle.com> <57cddde0-60e6-366e-489e-f6f9534e3ed9@redhat.com> <6fa761bc-8feb-74e6-9a54-8a65ab81203b@oracle.com> <5af9855a-652e-64f0-af83-e8f5962247ca@oracle.com> <07a5bf0a-02fa-7a8c-35be-813f5207cb0c@oracle.com> Message-ID: Hi all, Robbin found some more missing includes in jprt testing (thanks!!) Differential: http://cr.openjdk.java.net/~rkennke/8180932/webrev.18.diff/ Full: http://cr.openjdk.java.net/~rkennke/8180932/webrev.18/ Am I breaking the record for most webrev revisions? :-P According to Robbin, builds are now all clean. Can I get final reviews and then a sponsor? Thanks, Roman Am 16.07.2017 um 10:25 schrieb Robbin Ehn: > Hi Roman, > > On 2017-07-12 15:32, Roman Kennke wrote: >> Hi Robbin and all, >> >> I fixed the 32bit failures by using jlong in all relevant places: >> >> http://cr.openjdk.java.net/~rkennke/8180932/webrev.14.diff/ >> >> >> then Robbin found another problem.
SafepointCleanupTest started to fail, >> because "mark nmethods" is no longer printed. This made me think that >> we're not measuring the conflated (and possibly parallelized) >> deflate-idle-monitors+mark-nmethods pass. I added a TraceTime with >> "safepoint cleanup tasks" which measures the total duration of safepoint >> cleanup. We can't reasonably measure a possibly parallel and conflated >> pass standalone, but we can measure all and by subtrating all the other >> subphases, get an idea how long deflation and nmethod marking take up. >> >> http://cr.openjdk.java.net/~rkennke/8180932/webrev.15.diff/ >> >> >> The full webrev is now: >> >> http://cr.openjdk.java.net/~rkennke/8180932/webrev.15/ >> >> >> Hope that's all ;-) > > With this changeset something always pop-ups. > > Failure reason: Targets failed. Target macosx_x64_10.9-fastdebug FAILED. > > /opt/jprt/jib-data/install/jpg/infra/builddeps/devkit-macosx_x64/Xcode6.3-MacOSX10.9+1.0/devkit-macosx_x64-Xcode6.3-MacOSX10.9+1.0.tar.gz/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/clang++ > -m64 -fPIC -D_GNU_SOURCE -flimit-debug-info -D__STDC_FORMAT_MACROS > -D__STDC_LIMIT_MACROS -D__STDC_CONSTANT_MACROS -D_ALLBSD_SOURCE > -D_DARWIN_C_SOURCE -D_XOPEN_SOURCE -fno-rtti -fno-exceptions > -fvisibility=hidden -mno-omit-leaf-frame-pointer -mstack-alignment=16 > -pipe -fno-strict-aliasing -DMAC_OS_X_VERSION_MAX_ALLOWED=1070 > -mmacosx-version-min=10.7.0 -fno-omit-frame-pointer -DVM_LITTLE_ENDIAN > -D_LP64=1 -Wno-deprecated -Wpointer-arith -Wsign-compare -Wundef > -Wunused-function -Wformat=2 -DASSERT -DCHECK_UNHANDLED_OOPS > -DTARGET_ARCH_x86 -DINCLUDE_SUFFIX_OS=_bsd -DINCLUDE_SUFFIX_CPU=_x86 > -DINCLUDE_SUFFIX_COMPILER=_gcc -DTARGET_COMPILER_gcc -DAMD64 > -DHOTSPOT_LIB_ARCH='"amd64"' -DCOMPILER1 -DCOMPILER2 -DDTRACE_ENABLED > -DINCLUDE_AOT > -I/opt/jprt/T/P1/193338.rehn/s/hotspot/src/closed/share/vm > -I/opt/j/opt/jprt/T/P1/193338.rehn/s/hotspot/src/share/vm/runtime/safepoint.cpp:654:22: > 
error: variable has incomplete type 'StrongRootsScope' > StrongRootsScope srs(num_cleanup_workers); > ^ > /opt/jprt/T/P1/193338.rehn/s/hotspot/src/share/vm/gc/shared/genCollectedHeap.hpp:33:7: > note: forward declaration of 'StrongRootsScope' > class StrongRootsScope; > ^ > /opt/jprt/T/P1/193338.rehn/s/hotspot/src/share/vm/runtime/safepoint.cpp:659:22: > error: variable has incomplete type 'StrongRootsScope' > StrongRootsScope srs(1); > ^ > /opt/jprt/T/P1/193338.rehn/s/hotspot/src/share/vm/gc/shared/genCollectedHeap.hpp:33:7: > note: forward declaration of 'StrongRootsScope' > class StrongRootsScope; > ^ > 2 errors generated. > make[3]: *** > [/opt/jprt/T/P1/193338.rehn/s/build/macosx-x64-debug/hotspot/variant-server/libjvm/objs/safepoint.o] > Error 1 > make[3]: *** Waiting for unfinished jobs.... > make[2]: *** [hotspot-server-libs] Error 2 > > Send me the new webrev and I'll test it before the 16th round of > review :) > > /Robbin > >> >> Roman >> >> Am 10.07.2017 um 21:22 schrieb Robbin Ehn: >>> Hi, unfortunately the push failed on 32-bit. >>> >>> (looks like _stack_traversal_mark should be jlong, I feel a bit guilty) >>> >>> I do not have anytime to look at this, so here is the error. 
>>> >>> /Robbin >>> >>> make[3]: Leaving directory '/opt/jprt/T/P1/185117.rehn/s/hotspot/make' >>> make/Main.gmk:263: recipe for target 'hotspot-client-libs' failed >>> In file included from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29:0, >>> >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33, >>> >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28: >>> >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp: In >>> member function 'long int nmethod::stack_traversal_mark()': >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:399:108: >>> >>> error: call of overloaded 'load_acquire(volatile long int*)' is >>> ambiguous >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:399:108: >>> >>> note: candidates are: >>> In file included from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/typeArrayOop.hpp:30:0, >>> >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/constantPool.hpp:32, >>> >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/method.hpp:34, >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/frame.hpp:28, >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/codeBlob.hpp:31, >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/compiledMethod.hpp:28, >>> >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:28, >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29, >>> >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33, >>> >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28: >>> >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:57:17: >>> >>> note: static jint OrderAccess::load_acquire(const volatile jint*) >>> >>> 
/opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:57:17: >>> >>> note: no known conversion for argument 1 from 'volatile long int*' >>> to 'const volatile jint* {aka const volatile int*}' >>> In file included from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/typeArrayOop.hpp:30:0, >>> >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/constantPool.hpp:32, >>> >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/method.hpp:34, >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/frame.hpp:28, >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/codeBlob.hpp:31, >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/compiledMethod.hpp:28, >>> >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:28, >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29, >>> >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33, >>> >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28: >>> >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:63:17: >>> >>> note: static juint OrderAccess::load_acquire(const volatile juint*) >>> >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:63:17: >>> >>> note: no known conversion for argument 1 from 'volatile long int*' >>> to 'const volatile juint* {aka const volatile unsigned int*}' >>> In file included from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29:0, >>> >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33, >>> >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28: >>> >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp: In >>> member function 'void 
nmethod::set_stack_traversal_mark(long int)': >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:400:105: >>> >>> error: call of overloaded 'release_store(volatile long int*, long >>> int&)' is ambiguous >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:400:105: >>> >>> note: candidates are: >>> In file included from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/typeArrayOop.hpp:30:0, >>> >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/constantPool.hpp:32, >>> >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/method.hpp:34, >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/frame.hpp:28, >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/codeBlob.hpp:31, >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/compiledMethod.hpp:28, >>> >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:28, >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29, >>> >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33, >>> >>> from >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28: >>> >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:71:17: >>> >>> note: static void OrderAccess::release_store(volatile jint*, jint) >>> >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:71:17: >>> >>> note: no known conversion for argument 1 from 'volatile long int*' >>> to 'volatile jint* {aka volatile int*}' >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:77:17: >>> >>> note: static void OrderAccess::release_store(volatile juint*, juint) >>> >>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:77:17: >>> >>> note: no known conversion for argument 1 from 'volatile long int*' >>> to 
'volatile juint* {aka volatile unsigned int*}' >>> >>> On 2017-07-10 20:50, Robbin Ehn wrote: >>>> I'll start a push now. >>>> >>>> /Robbin >>>> >>>> On 2017-07-10 12:38, Roman Kennke wrote: >>>>> Ok, so I guess I need a sponsor for this now: >>>>> >>>>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.12/ >>>>> >>>>> >>>>> Roman >>>>> >>>>> Am 07.07.2017 um 20:09 schrieb Igor Veresov: >>>>>> >>>>>>> On Jul 7, 2017, at 4:23 AM, Robbin Ehn >>>>>> > wrote: >>>>>>> >>>>>>> Hi Roman, >>>>>>> >>>>>>> On 07/07/2017 12:51 PM, Roman Kennke wrote: >>>>>>>> Hi Robbin, >>>>>>>>> >>>>>>>>> Far down -> >>>>>>>>> >>>>>>>>> On 07/06/2017 08:05 PM, Roman Kennke wrote: >>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> I'm not happy about this change: >>>>>>>>>>> >>>>>>>>>>> + ~ParallelSPCleanupThreadClosure() { >>>>>>>>>>> + // This is here to be consistent with sweeper.cpp >>>>>>>>>>> NMethodSweeper::mark_active_nmethods(). >>>>>>>>>>> + // TODO: Is this really needed? >>>>>>>>>>> + OrderAccess::storestore(); >>>>>>>>>>> + } >>>>>>>>>>> >>>>>>>>>>> because we're adding an OrderAccess::storestore() to be >>>>>>>>>>> consistent >>>>>>>>>>> with an OrderAccess::storestore() that's not properly >>>>>>>>>>> documented >>>>>>>>>>> which is only increasing the technical debt. >>>>>>>>>>> >>>>>>>>>>> So a couple of things above don't make sense to me: >>>>>>>>>>> >>>>>>>>>>>> - sweeper thread runs outside safepoint >>>>>>>>>>>> - VMThread (which is doing the nmethod marking in the case >>>>>>>>>>>> that >>>>>>>>>>>> I'm looking at) runs while all other threads (incl. the >>>>>>>>>>>> sweeper) >>>>>>>>>>>> is holding still. >>>>>>>>>>> >>>>>>>>>>> and: >>>>>>>>>>> >>>>>>>>>>>> There should be no need for a storestore() (at least in >>>>>>>>>>>> sweeper.cpp... >>>>>>>>>> >>>>>>>>>> Either one or the other are running. Either the VMThread is >>>>>>>>>> marking >>>>>>>>>> nmethods (during safepoint) or the sweeper threads are running >>>>>>>>>> (outside >>>>>>>>>> safepoint). 
Between the two phases, there is a guaranteed >>>>>>>>>> OrderAccess::fence() (see safepoint.cpp). Therefore, no >>>>>>>>>> storestore() >>>>>>>>>> should be necessary. >>>>>>>>>> >>>>>>>>>> From Igor's comment I can see how it happened though: >>>>>>>>>> Apparently >>>>>>>>>> there >>>>>>>>>> *is* a race in sweeper's own concurrent processing (concurrent >>>>>>>>>> with >>>>>>>>>> compiler threads, as far as I understand). And there's a call to >>>>>>>>>> nmethod::mark_as_seen_on_stack() after which a storestore() is >>>>>>>>>> required >>>>>>>>>> (as per Igor's explanation). So the logic probably was: we have >>>>>>>>>> mark_as_seen_on_stack() followed by storestore() here, so let's >>>>>>>>>> also put >>>>>>>>>> a storestore() in the other places that call >>>>>>>>>> mark_as_seen_on_stack(), >>>>>>>>>> one of which happens to be the safepoint cleanup code that we're >>>>>>>>>> discussing. (why the storestore() hasn't been put right into >>>>>>>>>> mark_as_seen_on_stack() I don't understand). In short, one >>>>>>>>>> storestore() >>>>>>>>>> really was necessary, the other looks like it has been put there >>>>>>>>>> 'for >>>>>>>>>> consistency' or just conservatively. But it shouldn't be >>>>>>>>>> necessary in >>>>>>>>>> the safepoint cleanup code that we're discussing. >>>>>>>>>> >>>>>>>>>> So what should we do? Remove the storestore() for good? >>>>>>>>>> Refactor the >>>>>>>>>> code so that both paths at least call the storestore() in the >>>>>>>>>> same >>>>>>>>>> place? (E.g. make mark_active_nmethods() use the closure and >>>>>>>>>> call >>>>>>>>>> storestore() in the dtor as proposed?) >>>>>>>>> >>>>>>>>> I took a quick look, maybe I'm missing some stuff but: >>>>>>>>> >>>>>>>>> So there is a slight optimization when not running sweeper to >>>>>>>>> skip >>>>>>>>> compiler barrier/fence in stw. 
>>>>>>>>> >>>>>>>>> Don't think that matter, so I propose something like: >>>>>>>>> - long stack_traversal_mark() { return >>>>>>>>> _stack_traversal_mark; } >>>>>>>>> - void set_stack_traversal_mark(long l) { >>>>>>>>> _stack_traversal_mark = l; } >>>>>>>>> + long stack_traversal_mark() { return >>>>>>>>> OrderAccess::load_acquire(&_stack_traversal_mark); } >>>>>>>>> + void set_stack_traversal_mark(long l) { >>>>>>>>> OrderAccess::release_store(&_stack_traversal_mark, l); } >>>>>>>>> >>>>>>>>> Maybe make _stack_traversal_mark volatile also, just as a marking >>>>>>>>> that >>>>>>>>> it is concurrent accessed. >>>>>>>>> And remove both storestore. >>>>>>>>> >>>>>>>>> "Also neither of these state variables are volatile in >>>>>>>>> nmethod, so >>>>>>>>> even the compiler may reorder the stores" >>>>>>>>> Fortunately at least _state is volatile now. >>>>>>>>> >>>>>>>>> I think _state also should use la/rs semantics instead, but >>>>>>>>> that's >>>>>>>>> another story. >>>>>>>> Like this? >>>>>>>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.12/ >>>>>>>> >>>>>>>> >>>>>>> >>>>>>> Yes, exactly, I like this! >>>>>>> Dan? Igor ? Tobias? >>>>>>> >>>>>> >>>>>> That seems correct. >>>>>> >>>>>> igor >>>>>> >>>>>>> Thanks Roman! >>>>>>> >>>>>>> BTW I'm going on vacation (5w) in a few hours, but I will follow >>>>>>> this >>>>>>> thread/changeset to the end! >>>>>>> >>>>>>> /Robbin >>>>>>> >>>>>>>> Roman >>>>>> >>>>> >> From thomas.schatzl at oracle.com Thu Jul 20 11:06:31 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Thu, 20 Jul 2017 13:06:31 +0200 Subject: RFR (M): 8184346: Clean up G1CMBitmap In-Reply-To: <5bdc1a11-1159-7e81-ca31-fd96719f89c0@redhat.com> References: <1500024904.3458.8.camel@oracle.com> <6c0d1cca-6c08-0bda-f980-d3fe20e663ff@redhat.com> <1500029912.3458.26.camel@oracle.com> <5bdc1a11-1159-7e81-ca31-fd96719f89c0@redhat.com> Message-ID: <1500548791.2924.6.camel@oracle.com> Hi all, ? 
Erik and Mikael had a look at it and suggested several further cleanups,
removing about 40 LOC. These included:

- instead of G1CMBitMapRO use properly const'ified G1CMBitmaps
- change the _start and _word_size members into an equivalent MemRegion
- minor cleanups, removing obsolete asserts.

Webrevs:
http://cr.openjdk.java.net/~tschatzl/8184346/webrev.1_to_2/ (diff)
http://cr.openjdk.java.net/~tschatzl/8184346/webrev.2/ (full)

Testing: jprt

Thanks,
  Thomas

From mikael.gerdin at oracle.com  Thu Jul 20 13:04:11 2017
From: mikael.gerdin at oracle.com (Mikael Gerdin)
Date: Thu, 20 Jul 2017 15:04:11 +0200
Subject: Request for Comments: 8184734: Rework G1 root scanning to avoid
	multiple CLD passes
Message-ID: 

Hi all,

Please review this preliminary change to clean up G1 root processing a
bit. I've not run this through a lot of testing but this will give you a
general idea about where I think we should be going.

The basic idea is explained in the bug text but I'll reproduce it here
as well:

> After JDK-8154580 we no longer need the multi-pass CLD scanning in G1.
> The reason for this is that classes which are strongly reachable from interpreter frames are kept alive by marking the mirror in the initial mark pause.
>
> The current solution to this was to first ensure that in an initial step all CLDs which were strongly reachable had to be scanned and claimed before any weakly reachable CLDs could be scanned and claimed. This code can now be simplified and we can walk all the CLDs in one go, only doing strong marking on the ones which are strong as per always_strong_cld_do.
> This cleanup also allows us to remove the claimed marks clearing since CLD scanning is now completely single threaded.
>
> Waiting for strong classes to be discovered is still needed for the case where an nmethod on the stack is the single root to a class.
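The single-pass scheme described above can be sketched in miniature. This is an illustrative model only, not the actual HotSpot code: `ClassLoaderData` here is a simplified stand-in, the effect of `always_strong_cld_do` is reduced to a boolean flag, and `single_pass_cld_do` is a made-up name.

```cpp
#include <cassert>
#include <vector>

// Stand-in for HotSpot's CLD: "strong" models what always_strong_cld_do
// would select, "claimed" models the per-CLD claim mark.
struct ClassLoaderData {
    bool strong;
    bool claimed;
};

struct ScanCounts {
    int strong_scans;
    int weak_scans;
};

// One walk over all CLDs: every CLD is scanned exactly once, and only the
// strong ones additionally get the strong treatment (marking). This stands
// in for the old two-pass scheme (claim and scan all strong CLDs first,
// then the weakly reachable ones), which also required clearing the
// claimed marks between passes.
ScanCounts single_pass_cld_do(std::vector<ClassLoaderData>& clds) {
    ScanCounts counts = {0, 0};
    for (ClassLoaderData& cld : clds) {
        if (cld.claimed) {
            continue;              // already visited in this cycle
        }
        cld.claimed = true;
        if (cld.strong) {
            counts.strong_scans++; // scan + mark (strong closure)
        } else {
            counts.weak_scans++;   // scan only (weak closure)
        }
    }
    return counts;
}
```

Because the strong/weak decision is made per CLD inside a single traversal, there is no ordering between passes to enforce, which is why the claimed-marks clearing step can go away once the scanning is single threaded.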
Webrev: http://cr.openjdk.java.net/~mgerdin/8184734/webrev.0/ Bug: https://bugs.openjdk.java.net/browse/JDK-8184734 Testing: jprt, some local tonga tests, kitchensink and runThese Suggestions on further testing would be much appreciated! Thanks /Mikael From rkennke at redhat.com Thu Jul 20 14:58:51 2017 From: rkennke at redhat.com (Roman Kennke) Date: Thu, 20 Jul 2017 16:58:51 +0200 Subject: RFR (M): 8184346: Clean up G1CMBitmap In-Reply-To: <1500034180.3458.67.camel@oracle.com> References: <1500024904.3458.8.camel@oracle.com> <6c0d1cca-6c08-0bda-f980-d3fe20e663ff@redhat.com> <1500029912.3458.26.camel@oracle.com> <990f6578-14d4-322d-7f51-9b93d92f8b20@redhat.com> <1500034180.3458.67.camel@oracle.com> Message-ID: <1e6c2b24-63fe-cf5e-1635-990852c63a65@redhat.com> Am 14.07.2017 um 14:09 schrieb Thomas Schatzl: > Hi Roman, > > On Fri, 2017-07-14 at 13:24 +0200, Roman Kennke wrote: >> Am 14.07.2017 um 13:12 schrieb Aleksey Shipilev: >>> Hi Thomas, >>> >>> On 07/14/2017 12:58 PM, Thomas Schatzl wrote: >>>>>> The next CR JDK-8184347 will deal with moving G1CMBitmap* >>>>>> into separate files. >>>>> And while you're at it, you may want to move it to gc/shared >>>>> and renamed it to something like MarkBitmap? >>>>> https://bugs.openjdk.java.net/browse/JDK-8180193 >>>>> >>>> Not particularly against this change, but I think we should do >>>> the move and renaming separately when the change is actually >>>> required, i.e. just before there is another dependency on it. >>> I think this would be inconvenient, because when "another >>> dependency" would come in a large webrev, it would have to include >>> the CMBitmap move too, complicating reviews. >> I understood it such that we would post the moving around of gc/g1 >> files to gc/shared right before we'd post Shenandoah (in the not-so- >> distant future, hopefully). That would work for me. I wouldn't like >> to include everything in a giant webrev :-P >> > that is exactly what I meant - thanks for your understanding. 
> > Thomas > I just found out that CMS has its own bitmap class too, and it looks mostly like a copy of the G1 bitmap class :-) So that would be another user of a gc/shared bitmap class in the future. Roman From daniel.daugherty at oracle.com Thu Jul 20 16:43:34 2017 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Thu, 20 Jul 2017 10:43:34 -0600 Subject: RFR: Parallelize safepoint cleanup In-Reply-To: References: <5cd676de-872d-6d4a-691b-da561173f7d0@oracle.com> <61d80e98-275f-b2b8-4ac7-6d5d03b047de@redhat.com> <0e44cc90-b384-0820-93d6-a70d22c501c3@oracle.com> <20E06CEC-38CA-41AE-99DB-17EF22A3C5CC@oracle.com> <58f2278e-b95c-4ec2-4f7d-9fefa3a281e4@redhat.com> <623c0dbf-9210-7c63-3546-4314c7d47f85@redhat.com> <29521e46-a5e8-5ff0-23a2-22eeee145389@oracle.com> <4445a727-060b-70f9-c8db-e9f70faae3d5@redhat.com> <37755fec-05b9-8d2c-7eb9-8849393c7485@oracle.com> <57cddde0-60e6-366e-489e-f6f9534e3ed9@redhat.com> <6fa761bc-8feb-74e6-9a54-8a65ab81203b@oracle.com> <5af9855a-652e-64f0-af83-e8f5962247ca@oracle.com> <07a5bf0a-02fa-7a8c-35be-813f5207cb0c@oracle.com> Message-ID: <0c200752-6e45-6132-d937-4e9429ed9f95@oracle.com> On 7/20/17 4:53 AM, Roman Kennke wrote: > Hi all, > > Robbin found some more missing includes in jprt testing (thanks!!) > > Differential: > http://cr.openjdk.java.net/~rkennke/8180932/webrev.18.diff/ > > Full: > http://cr.openjdk.java.net/~rkennke/8180932/webrev.18/ > > > Am I breaking the record for most webrev revisions? :-P > > According the Robbin, builds are now all clean. > > Can I get final reviews and then a sponsor? src/share/vm/runtime/safepoint.cpp No comments. Only reviewed the one file that changed since webrev.15. Thumbs up! 
Dan

>
> Thanks,
> Roman
>
> Am 16.07.2017 um 10:25 schrieb Robbin Ehn:
>> Hi Roman,
>>
>> On 2017-07-12 15:32, Roman Kennke wrote:
>>> Hi Robbin and all,
>>>
>>> I fixed the 32bit failures by using jlong in all relevant places:
>>>
>>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.14.diff/
>>>
>>>
>>> then Robbin found another problem. SafepointCleanupTest started to fail,
>>> because "mark nmethods" is no longer printed. This made me think that
>>> we're not measuring the conflated (and possibly parallelized)
>>> deflate-idle-monitors+mark-nmethods pass. I added a TraceTime with
>>> "safepoint cleanup tasks" which measures the total duration of safepoint
>>> cleanup. We can't reasonably measure a possibly parallel and conflated
>>> pass standalone, but we can measure all and by subtracting all the other
>>> subphases, get an idea how long deflation and nmethod marking take up.
>>>
>>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.15.diff/
>>>
>>>
>>> The full webrev is now:
>>>
>>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.15/
>>>
>>>
>>> Hope that's all ;-)
>> With this changeset something always pops up.
>>
>> Failure reason: Targets failed. Target macosx_x64_10.9-fastdebug FAILED.
>> >> /opt/jprt/jib-data/install/jpg/infra/builddeps/devkit-macosx_x64/Xcode6.3-MacOSX10.9+1.0/devkit-macosx_x64-Xcode6.3-MacOSX10.9+1.0.tar.gz/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/clang++ >> -m64 -fPIC -D_GNU_SOURCE -flimit-debug-info -D__STDC_FORMAT_MACROS >> -D__STDC_LIMIT_MACROS -D__STDC_CONSTANT_MACROS -D_ALLBSD_SOURCE >> -D_DARWIN_C_SOURCE -D_XOPEN_SOURCE -fno-rtti -fno-exceptions >> -fvisibility=hidden -mno-omit-leaf-frame-pointer -mstack-alignment=16 >> -pipe -fno-strict-aliasing -DMAC_OS_X_VERSION_MAX_ALLOWED=1070 >> -mmacosx-version-min=10.7.0 -fno-omit-frame-pointer -DVM_LITTLE_ENDIAN >> -D_LP64=1 -Wno-deprecated -Wpointer-arith -Wsign-compare -Wundef >> -Wunused-function -Wformat=2 -DASSERT -DCHECK_UNHANDLED_OOPS >> -DTARGET_ARCH_x86 -DINCLUDE_SUFFIX_OS=_bsd -DINCLUDE_SUFFIX_CPU=_x86 >> -DINCLUDE_SUFFIX_COMPILER=_gcc -DTARGET_COMPILER_gcc -DAMD64 >> -DHOTSPOT_LIB_ARCH='"amd64"' -DCOMPILER1 -DCOMPILER2 -DDTRACE_ENABLED >> -DINCLUDE_AOT >> -I/opt/jprt/T/P1/193338.rehn/s/hotspot/src/closed/share/vm >> -I/opt/j/opt/jprt/T/P1/193338.rehn/s/hotspot/src/share/vm/runtime/safepoint.cpp:654:22: >> error: variable has incomplete type 'StrongRootsScope' >> StrongRootsScope srs(num_cleanup_workers); >> ^ >> /opt/jprt/T/P1/193338.rehn/s/hotspot/src/share/vm/gc/shared/genCollectedHeap.hpp:33:7: >> note: forward declaration of 'StrongRootsScope' >> class StrongRootsScope; >> ^ >> /opt/jprt/T/P1/193338.rehn/s/hotspot/src/share/vm/runtime/safepoint.cpp:659:22: >> error: variable has incomplete type 'StrongRootsScope' >> StrongRootsScope srs(1); >> ^ >> /opt/jprt/T/P1/193338.rehn/s/hotspot/src/share/vm/gc/shared/genCollectedHeap.hpp:33:7: >> note: forward declaration of 'StrongRootsScope' >> class StrongRootsScope; >> ^ >> 2 errors generated. >> make[3]: *** >> [/opt/jprt/T/P1/193338.rehn/s/build/macosx-x64-debug/hotspot/variant-server/libjvm/objs/safepoint.o] >> Error 1 >> make[3]: *** Waiting for unfinished jobs.... 
>> make[2]: *** [hotspot-server-libs] Error 2 >> >> Send me the new webrev and I'll test it before the 16th round of >> review :) >> >> /Robbin >> >>> Roman >>> >>> Am 10.07.2017 um 21:22 schrieb Robbin Ehn: >>>> Hi, unfortunately the push failed on 32-bit. >>>> >>>> (looks like _stack_traversal_mark should be jlong, I feel a bit guilty) >>>> >>>> I do not have anytime to look at this, so here is the error. >>>> >>>> /Robbin >>>> >>>> make[3]: Leaving directory '/opt/jprt/T/P1/185117.rehn/s/hotspot/make' >>>> make/Main.gmk:263: recipe for target 'hotspot-client-libs' failed >>>> In file included from >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29:0, >>>> >>>> from >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33, >>>> >>>> from >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28: >>>> >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp: In >>>> member function 'long int nmethod::stack_traversal_mark()': >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:399:108: >>>> >>>> error: call of overloaded 'load_acquire(volatile long int*)' is >>>> ambiguous >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:399:108: >>>> >>>> note: candidates are: >>>> In file included from >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/typeArrayOop.hpp:30:0, >>>> >>>> from >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/constantPool.hpp:32, >>>> >>>> from >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/method.hpp:34, >>>> from >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/frame.hpp:28, >>>> from >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/codeBlob.hpp:31, >>>> from >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/compiledMethod.hpp:28, >>>> >>>> from >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:28, >>>> from >>>> 
/opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29, >>>> >>>> from >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33, >>>> >>>> from >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28: >>>> >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:57:17: >>>> >>>> note: static jint OrderAccess::load_acquire(const volatile jint*) >>>> >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:57:17: >>>> >>>> note: no known conversion for argument 1 from 'volatile long int*' >>>> to 'const volatile jint* {aka const volatile int*}' >>>> In file included from >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/typeArrayOop.hpp:30:0, >>>> >>>> from >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/constantPool.hpp:32, >>>> >>>> from >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/method.hpp:34, >>>> from >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/frame.hpp:28, >>>> from >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/codeBlob.hpp:31, >>>> from >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/compiledMethod.hpp:28, >>>> >>>> from >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:28, >>>> from >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29, >>>> >>>> from >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33, >>>> >>>> from >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28: >>>> >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:63:17: >>>> >>>> note: static juint OrderAccess::load_acquire(const volatile juint*) >>>> >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:63:17: >>>> >>>> note: no known conversion for argument 1 from 'volatile long int*' >>>> to 
'const volatile juint* {aka const volatile unsigned int*}' >>>> In file included from >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29:0, >>>> >>>> from >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33, >>>> >>>> from >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28: >>>> >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp: In >>>> member function 'void nmethod::set_stack_traversal_mark(long int)': >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:400:105: >>>> >>>> error: call of overloaded 'release_store(volatile long int*, long >>>> int&)' is ambiguous >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:400:105: >>>> >>>> note: candidates are: >>>> In file included from >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/typeArrayOop.hpp:30:0, >>>> >>>> from >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/constantPool.hpp:32, >>>> >>>> from >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/method.hpp:34, >>>> from >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/frame.hpp:28, >>>> from >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/codeBlob.hpp:31, >>>> from >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/compiledMethod.hpp:28, >>>> >>>> from >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:28, >>>> from >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29, >>>> >>>> from >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33, >>>> >>>> from >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28: >>>> >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:71:17: >>>> >>>> note: static void OrderAccess::release_store(volatile jint*, jint) >>>> >>>> 
/opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:71:17: >>>> >>>> note: no known conversion for argument 1 from 'volatile long int*' >>>> to 'volatile jint* {aka volatile int*}' >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:77:17: >>>> >>>> note: static void OrderAccess::release_store(volatile juint*, juint) >>>> >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:77:17: >>>> >>>> note: no known conversion for argument 1 from 'volatile long int*' >>>> to 'volatile juint* {aka volatile unsigned int*}' >>>> >>>> On 2017-07-10 20:50, Robbin Ehn wrote: >>>>> I'll start a push now. >>>>> >>>>> /Robbin >>>>> >>>>> On 2017-07-10 12:38, Roman Kennke wrote: >>>>>> Ok, so I guess I need a sponsor for this now: >>>>>> >>>>>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.12/ >>>>>> >>>>>> >>>>>> Roman >>>>>> >>>>>> Am 07.07.2017 um 20:09 schrieb Igor Veresov: >>>>>>>> On Jul 7, 2017, at 4:23 AM, Robbin Ehn >>>>>>> > wrote: >>>>>>>> >>>>>>>> Hi Roman, >>>>>>>> >>>>>>>> On 07/07/2017 12:51 PM, Roman Kennke wrote: >>>>>>>>> Hi Robbin, >>>>>>>>>> Far down -> >>>>>>>>>> >>>>>>>>>> On 07/06/2017 08:05 PM, Roman Kennke wrote: >>>>>>>>>>>> I'm not happy about this change: >>>>>>>>>>>> >>>>>>>>>>>> + ~ParallelSPCleanupThreadClosure() { >>>>>>>>>>>> + // This is here to be consistent with sweeper.cpp >>>>>>>>>>>> NMethodSweeper::mark_active_nmethods(). >>>>>>>>>>>> + // TODO: Is this really needed? >>>>>>>>>>>> + OrderAccess::storestore(); >>>>>>>>>>>> + } >>>>>>>>>>>> >>>>>>>>>>>> because we're adding an OrderAccess::storestore() to be >>>>>>>>>>>> consistent >>>>>>>>>>>> with an OrderAccess::storestore() that's not properly >>>>>>>>>>>> documented >>>>>>>>>>>> which is only increasing the technical debt. 
>>>>>>>>>>>> >>>>>>>>>>>> So a couple of things above don't make sense to me: >>>>>>>>>>>> >>>>>>>>>>>>> - sweeper thread runs outside safepoint >>>>>>>>>>>>> - VMThread (which is doing the nmethod marking in the case >>>>>>>>>>>>> that >>>>>>>>>>>>> I'm looking at) runs while all other threads (incl. the >>>>>>>>>>>>> sweeper) >>>>>>>>>>>>> is holding still. >>>>>>>>>>>> and: >>>>>>>>>>>> >>>>>>>>>>>>> There should be no need for a storestore() (at least in >>>>>>>>>>>>> sweeper.cpp... >>>>>>>>>>> Either one or the other are running. Either the VMThread is >>>>>>>>>>> marking >>>>>>>>>>> nmethods (during safepoint) or the sweeper threads are running >>>>>>>>>>> (outside >>>>>>>>>>> safepoint). Between the two phases, there is a guaranteed >>>>>>>>>>> OrderAccess::fence() (see safepoint.cpp). Therefore, no >>>>>>>>>>> storestore() >>>>>>>>>>> should be necessary. >>>>>>>>>>> >>>>>>>>>>> From Igor's comment I can see how it happened though: >>>>>>>>>>> Apparently >>>>>>>>>>> there >>>>>>>>>>> *is* a race in sweeper's own concurrent processing (concurrent >>>>>>>>>>> with >>>>>>>>>>> compiler threads, as far as I understand). And there's a call to >>>>>>>>>>> nmethod::mark_as_seen_on_stack() after which a storestore() is >>>>>>>>>>> required >>>>>>>>>>> (as per Igor's explanation). So the logic probably was: we have >>>>>>>>>>> mark_as_seen_on_stack() followed by storestore() here, so let's >>>>>>>>>>> also put >>>>>>>>>>> a storestore() in the other places that call >>>>>>>>>>> mark_as_seen_on_stack(), >>>>>>>>>>> one of which happens to be the safepoint cleanup code that we're >>>>>>>>>>> discussing. (why the storestore() hasn't been put right into >>>>>>>>>>> mark_as_seen_on_stack() I don't understand). In short, one >>>>>>>>>>> storestore() >>>>>>>>>>> really was necessary, the other looks like it has been put there >>>>>>>>>>> 'for >>>>>>>>>>> consistency' or just conservatively. 
But it shouldn't be >>>>>>>>>>> necessary in >>>>>>>>>>> the safepoint cleanup code that we're discussing. >>>>>>>>>>> >>>>>>>>>>> So what should we do? Remove the storestore() for good? >>>>>>>>>>> Refactor the >>>>>>>>>>> code so that both paths at least call the storestore() in the >>>>>>>>>>> same >>>>>>>>>>> place? (E.g. make mark_active_nmethods() use the closure and >>>>>>>>>>> call >>>>>>>>>>> storestore() in the dtor as proposed?) >>>>>>>>>> I took a quick look, maybe I'm missing some stuff but: >>>>>>>>>> >>>>>>>>>> So there is a slight optimization when not running sweeper to >>>>>>>>>> skip >>>>>>>>>> compiler barrier/fence in stw. >>>>>>>>>> >>>>>>>>>> Don't think that matter, so I propose something like: >>>>>>>>>> - long stack_traversal_mark() { return >>>>>>>>>> _stack_traversal_mark; } >>>>>>>>>> - void set_stack_traversal_mark(long l) { >>>>>>>>>> _stack_traversal_mark = l; } >>>>>>>>>> + long stack_traversal_mark() { return >>>>>>>>>> OrderAccess::load_acquire(&_stack_traversal_mark); } >>>>>>>>>> + void set_stack_traversal_mark(long l) { >>>>>>>>>> OrderAccess::release_store(&_stack_traversal_mark, l); } >>>>>>>>>> >>>>>>>>>> Maybe make _stack_traversal_mark volatile also, just as a marking >>>>>>>>>> that >>>>>>>>>> it is concurrent accessed. >>>>>>>>>> And remove both storestore. >>>>>>>>>> >>>>>>>>>> "Also neither of these state variables are volatile in >>>>>>>>>> nmethod, so >>>>>>>>>> even the compiler may reorder the stores" >>>>>>>>>> Fortunately at least _state is volatile now. >>>>>>>>>> >>>>>>>>>> I think _state also should use la/rs semantics instead, but >>>>>>>>>> that's >>>>>>>>>> another story. >>>>>>>>> Like this? >>>>>>>>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.12/ >>>>>>>>> >>>>>>>>> >>>>>>>> Yes, exactly, I like this! >>>>>>>> Dan? Igor ? Tobias? >>>>>>>> >>>>>>> That seems correct. >>>>>>> >>>>>>> igor >>>>>>> >>>>>>>> Thanks Roman! 
>>>>>>>> >>>>>>>> BTW I'm going on vacation (5w) in a few hours, but I will follow >>>>>>>> this >>>>>>>> thread/changeset to the end! >>>>>>>> >>>>>>>> /Robbin >>>>>>>> >>>>>>>>> Roman From rkennke at redhat.com Thu Jul 20 16:50:58 2017 From: rkennke at redhat.com (Roman Kennke) Date: Thu, 20 Jul 2017 18:50:58 +0200 Subject: RFR: 8179387: Factor out CMS specific code from GenCollectedHeap into its own subclass In-Reply-To: <30452c37-794f-33f8-b9e5-1aba185c1a3d@oracle.com> References: <3521009f-6fab-4f8e-2375-b9d665a4c70b@redhat.com> <3d8b55a2-a787-3051-b351-ab9b0a24f5e0@redhat.com> <47e22e86-7d7c-606f-1936-346229f39ca2@oracle.com> <9a846161-c8ac-dedf-5952-f457d546fd9a@redhat.com> <4d5e6af8-d975-7803-64c5-7295e0d56154@redhat.com> <13358626-e399-e352-1711-587416621aac@redhat.com> <27af0ad2-fe78-3536-2143-996dd42583ab@oracle.com> <4bc53aaa-b98a-8a61-73bf-d30ac3f402b8@redhat.com> <666af7f2-27e9-48c6-91e4-eaefa5289e18@redhat.com> <3ec8a6a3-5a4b-a910-f6ec-ed1c0dad4cad@oracle.com> <5417889c-5289-37cd-eb31-a2b55f70e85e@redhat.com> <088d467c-8038-60bc-1eab-b34061ad20d9@redhat.com> <30452c37-794f-33f8-b9e5-1aba185c1a3d@oracle.com> Message-ID: Hi Erik, as discussed on IRC, I also changed references to GenCollectedHeap inside gc/cms to use CMSHeap instead, where applicable. Differential: http://cr.openjdk.java.net/~rkennke/8179387/webrev.10.diff/ Full: http://cr.openjdk.java.net/~rkennke/8179387/webrev.10/ I also need a 2nd reviewer. Roman Am 20.07.2017 um 10:44 schrieb Erik Helin: > On 07/17/2017 02:07 PM, Roman Kennke wrote: >>>> Ok, added those and some more that I found. Not sure why we'd need >>>> #include "gc/cms/concurrentMarkSweepGeneration.hpp" ? Left that out >>>> for now. >>> >>> Because you are accessing CMSCollcetor in: >>> >>> 99 NOT_PRODUCT( >>> 100 virtual size_t skip_header_HeapWords() { return >>> CMSCollector::skip_header_HeapWords(); } >>> 101 ) >>> >>> and CMSCollector is declared in concurrentMarkSweepGeneration.hpp. 
An >>> alternative would of course be to just declare skip_header_HeapWords() >>> in cmsHeap.hpp and define skip_header_HeapWords in cmsHeap.cpp, then >>> you only need to include concurrentMarkSweeoGeneration.hpp in >>> cmsHeap.cpp. >> Ah ok, I've missed that one. Added it now. > > Where did you add it? I don't see any include of > "gc/cms/concurrentMarkSweepGeneration.hpp" in cmsHeap.hpp? > > Thanks, > Erik From kim.barrett at oracle.com Thu Jul 20 17:34:13 2017 From: kim.barrett at oracle.com (Kim Barrett) Date: Thu, 20 Jul 2017 13:34:13 -0400 Subject: [PATCH] JDK-8176571: Fine bitmaps should be allocated as belonging to mtGC In-Reply-To: <1500536234.2924.0.camel@oracle.com> References: <1500536234.2924.0.camel@oracle.com> Message-ID: > On Jul 20, 2017, at 3:37 AM, Thomas Schatzl wrote: > > Hi Milan, > > On Wed, 2017-07-19 at 17:05 +0000, Milan Mimica wrote: >> Hello >> >> I'm resending the two patches (JDK-8176571, JDK-8182169) from my new >> email address which I will be using from now on in this ML. I was >> notified my OCA has been approved. >> >> The patches have previously been discussed and generally approved. I >> recreated them against the recent tip, and also removed overloaded >> constructors from CHeapBitmap by using default parameters, as >> discussed. > > great! > > Looks good. I can sponsor as soon as Kim or anybody else gives his > okay. > > Thanks, > Thomas Looks good. From thomas.schatzl at oracle.com Thu Jul 20 18:40:42 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Thu, 20 Jul 2017 20:40:42 +0200 Subject: RFR (M): 8184346: Clean up G1CMBitmap In-Reply-To: <1500548791.2924.6.camel@oracle.com> References: <1500024904.3458.8.camel@oracle.com> <6c0d1cca-6c08-0bda-f980-d3fe20e663ff@redhat.com> <1500029912.3458.26.camel@oracle.com> <5bdc1a11-1159-7e81-ca31-fd96719f89c0@redhat.com> <1500548791.2924.6.camel@oracle.com> Message-ID: <1500576042.2688.10.camel@oracle.com> Hi again, ? 
a few more cleanups could be found that were worth picking up here. On Thu, 2017-07-20 at 13:06 +0200, Thomas Schatzl wrote: > Hi all, > > Erik and Mikael had a look at it and suggested several further > cleanups, removing about 40 LOC more. These included: > > - instead of G1CMBitMapR0 use properly const'ified G1CMBitmaps > - change the _start and _word_size members into an equivalent > MemRegion > - minor cleanups, removing obsolete asserts, simplify code. > > Webrevs: > http://cr.openjdk.java.net/~tschatzl/8184346/webrev.1_to_2/ (diff) > http://cr.openjdk.java.net/~tschatzl/8184346/webrev.2/ (full) > Testing: > jprt Webrevs: http://cr.openjdk.java.net/~tschatzl/8184346/webrev.2_to_3/ (diff) http://cr.openjdk.java.net/~tschatzl/8184346/webrev.3/ (full) Testing: jprt Thanks, Thomas From thomas.schatzl at oracle.com Thu Jul 20 18:41:00 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Thu, 20 Jul 2017 20:41:00 +0200 Subject: RFR (S): 8184347: Move G1CMBitMap and support classes into their own files Message-ID: <1500576060.2688.11.camel@oracle.com> Hi all, can I have reviews for this wrap-up of the G1CMBitmap cleanup? It simply moves all G1CMBitmap related code into their own files. Although it's a large change, it's really only moving code. Depends on JDK-8184346, based on webrev.3. CR: https://bugs.openjdk.java.net/browse/JDK-8184347 Webrev: http://cr.openjdk.java.net/~tschatzl/8184347/webrev/ Testing: jprt Thomas From shade at redhat.com Thu Jul 20 18:46:38 2017 From: shade at redhat.com (Aleksey Shipilev) Date: Thu, 20 Jul 2017 20:46:38 +0200 Subject: RFR (S): 8184347: Move G1CMBitMap and support classes into their own files In-Reply-To: <1500576060.2688.11.camel@oracle.com> References: <1500576060.2688.11.camel@oracle.com> Message-ID: <248b03ed-07eb-ae48-8cf3-b215e687fc35@redhat.com> On 07/20/2017 08:41 PM, Thomas Schatzl wrote: > Webrev: > http://cr.openjdk.java.net/~tschatzl/8184347/webrev/ Looks good to me.
Would you like us to RFE moving this to gc/shared some time later? This would quite probably need to decouple listeners from the otherwise GC agnostic code. Thanks, -Aleksey -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: OpenPGP digital signature URL: From shade at redhat.com Thu Jul 20 18:50:50 2017 From: shade at redhat.com (Aleksey Shipilev) Date: Thu, 20 Jul 2017 20:50:50 +0200 Subject: RFR (M): 8184346: Clean up G1CMBitmap In-Reply-To: <1500576042.2688.10.camel@oracle.com> References: <1500024904.3458.8.camel@oracle.com> <6c0d1cca-6c08-0bda-f980-d3fe20e663ff@redhat.com> <1500029912.3458.26.camel@oracle.com> <5bdc1a11-1159-7e81-ca31-fd96719f89c0@redhat.com> <1500548791.2924.6.camel@oracle.com> <1500576042.2688.10.camel@oracle.com> Message-ID: On 07/20/2017 08:40 PM, Thomas Schatzl wrote: > Webrevs: > http://cr.openjdk.java.net/~tschatzl/8184346/webrev.2_to_3/ (diff) > http://cr.openjdk.java.net/~tschatzl/8184346/webrev.3/ (full) Generally good, comments: *) Long log_debug, log_warning, assert lines in g1CollectedHeap.cpp, g1ConcurrentMark.cpp, g1ConcurrentMark.hpp *) It seems the field and method names are camel-cased and thus style-inconsistent with the rest of the code? 625 const G1CMBitMap* const prevMarkBitMap() const { return _prevMarkBitMap; } 626 G1CMBitMap* nextMarkBitMap() const { return _nextMarkBitMap; } Thanks, -Aleksey -------------- next part -------------- A non-text attachment was scrubbed...
Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: OpenPGP digital signature URL: From robbin.ehn at oracle.com Thu Jul 20 21:52:28 2017 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Thu, 20 Jul 2017 23:52:28 +0200 Subject: RFR: Parallelize safepoint cleanup In-Reply-To: <0c200752-6e45-6132-d937-4e9429ed9f95@oracle.com> References: <61d80e98-275f-b2b8-4ac7-6d5d03b047de@redhat.com> <0e44cc90-b384-0820-93d6-a70d22c501c3@oracle.com> <20E06CEC-38CA-41AE-99DB-17EF22A3C5CC@oracle.com> <58f2278e-b95c-4ec2-4f7d-9fefa3a281e4@redhat.com> <623c0dbf-9210-7c63-3546-4314c7d47f85@redhat.com> <29521e46-a5e8-5ff0-23a2-22eeee145389@oracle.com> <4445a727-060b-70f9-c8db-e9f70faae3d5@redhat.com> <37755fec-05b9-8d2c-7eb9-8849393c7485@oracle.com> <57cddde0-60e6-366e-489e-f6f9534e3ed9@redhat.com> <6fa761bc-8feb-74e6-9a54-8a65ab81203b@oracle.com> <5af9855a-652e-64f0-af83-e8f5962247ca@oracle.com> <07a5bf0a-02fa-7a8c-35be-813f5207cb0c@oracle.com> <0c200752-6e45-6132-d937-4e9429ed9f95@oracle.com> Message-ID: <4715671c-82bd-914c-edf0-0ad616723a16@oracle.com> On 07/20/2017 06:43 PM, Daniel D. Daugherty wrote: > On 7/20/17 4:53 AM, Roman Kennke wrote: >> Hi all, >> >> Robbin found some more missing includes in jprt testing (thanks!!) >> >> Differential: >> http://cr.openjdk.java.net/~rkennke/8180932/webrev.18.diff/ >> >> Full: >> http://cr.openjdk.java.net/~rkennke/8180932/webrev.18/ >> >> >> Am I breaking the record for most webrev revisions? :-P >> >> According the Robbin, builds are now all clean. >> >> Can I get final reviews and then a sponsor? > > src/share/vm/runtime/safepoint.cpp > No comments. > > Only reviewed the one file that changed since webrev.15. > > Thumbs up! +1, since the incremental changes are trivial I'll sponsor the push now. We seem to have an issue with: gc/arguments/TestAggressiveHeap.java (8183910) So push might need a couple of reruns. 
/Robbin > > Dan > > > >> >> Thanks, >> Roman >> >> Am 16.07.2017 um 10:25 schrieb Robbin Ehn: >>> Hi Roman, >>> >>> On 2017-07-12 15:32, Roman Kennke wrote: >>>> Hi Robbin and all, >>>> >>>> I fixed the 32bit failures by using jlong in all relevant places: >>>> >>>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.14.diff/ >>>> >>>> >>>> then Robbin found another problem. SafepointCleanupTest started to fail, >>>> because "mark nmethods" is no longer printed. This made me think that >>>> we're not measuring the conflated (and possibly parallelized) >>>> deflate-idle-monitors+mark-nmethods pass. I added a TraceTime with >>>> "safepoint cleanup tasks" which measures the total duration of safepoint >>>> cleanup. We can't reasonably measure a possibly parallel and conflated >>>> pass standalone, but we can measure all and by subtrating all the other >>>> subphases, get an idea how long deflation and nmethod marking take up. >>>> >>>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.15.diff/ >>>> >>>> >>>> The full webrev is now: >>>> >>>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.15/ >>>> >>>> >>>> Hope that's all ;-) >>> With this changeset something always pop-ups. >>> >>> Failure reason: Targets failed. Target macosx_x64_10.9-fastdebug FAILED. 
>>> >>> /opt/jprt/jib-data/install/jpg/infra/builddeps/devkit-macosx_x64/Xcode6.3-MacOSX10.9+1.0/devkit-macosx_x64-Xcode6.3-MacOSX10.9+1.0.tar.gz/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/clang++ >>> -m64 -fPIC -D_GNU_SOURCE -flimit-debug-info -D__STDC_FORMAT_MACROS >>> -D__STDC_LIMIT_MACROS -D__STDC_CONSTANT_MACROS -D_ALLBSD_SOURCE >>> -D_DARWIN_C_SOURCE -D_XOPEN_SOURCE -fno-rtti -fno-exceptions >>> -fvisibility=hidden -mno-omit-leaf-frame-pointer -mstack-alignment=16 >>> -pipe -fno-strict-aliasing -DMAC_OS_X_VERSION_MAX_ALLOWED=1070 >>> -mmacosx-version-min=10.7.0 -fno-omit-frame-pointer -DVM_LITTLE_ENDIAN >>> -D_LP64=1 -Wno-deprecated -Wpointer-arith -Wsign-compare -Wundef >>> -Wunused-function -Wformat=2 -DASSERT -DCHECK_UNHANDLED_OOPS >>> -DTARGET_ARCH_x86 -DINCLUDE_SUFFIX_OS=_bsd -DINCLUDE_SUFFIX_CPU=_x86 >>> -DINCLUDE_SUFFIX_COMPILER=_gcc -DTARGET_COMPILER_gcc -DAMD64 >>> -DHOTSPOT_LIB_ARCH='"amd64"' -DCOMPILER1 -DCOMPILER2 -DDTRACE_ENABLED >>> -DINCLUDE_AOT >>> -I/opt/jprt/T/P1/193338.rehn/s/hotspot/src/closed/share/vm >>> -I/opt/j/opt/jprt/T/P1/193338.rehn/s/hotspot/src/share/vm/runtime/safepoint.cpp:654:22: >>> error: variable has incomplete type 'StrongRootsScope' >>> StrongRootsScope srs(num_cleanup_workers); >>> ^ >>> /opt/jprt/T/P1/193338.rehn/s/hotspot/src/share/vm/gc/shared/genCollectedHeap.hpp:33:7: >>> note: forward declaration of 'StrongRootsScope' >>> class StrongRootsScope; >>> ^ >>> /opt/jprt/T/P1/193338.rehn/s/hotspot/src/share/vm/runtime/safepoint.cpp:659:22: >>> error: variable has incomplete type 'StrongRootsScope' >>> StrongRootsScope srs(1); >>> ^ >>> /opt/jprt/T/P1/193338.rehn/s/hotspot/src/share/vm/gc/shared/genCollectedHeap.hpp:33:7: >>> note: forward declaration of 'StrongRootsScope' >>> class StrongRootsScope; >>> ^ >>> 2 errors generated. 
>>> make[3]: *** >>> [/opt/jprt/T/P1/193338.rehn/s/build/macosx-x64-debug/hotspot/variant-server/libjvm/objs/safepoint.o] >>> Error 1 >>> make[3]: *** Waiting for unfinished jobs.... >>> make[2]: *** [hotspot-server-libs] Error 2 >>> >>> Send me the new webrev and I'll test it before the 16th round of >>> review :) >>> >>> /Robbin >>> >>>> Roman >>>> >>>> Am 10.07.2017 um 21:22 schrieb Robbin Ehn: >>>>> Hi, unfortunately the push failed on 32-bit. >>>>> >>>>> (looks like _stack_traversal_mark should be jlong, I feel a bit guilty) >>>>> >>>>> I do not have anytime to look at this, so here is the error. >>>>> >>>>> /Robbin >>>>> >>>>> make[3]: Leaving directory '/opt/jprt/T/P1/185117.rehn/s/hotspot/make' >>>>> make/Main.gmk:263: recipe for target 'hotspot-client-libs' failed >>>>> In file included from >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29:0, >>>>> >>>>> from >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33, >>>>> >>>>> from >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28: >>>>> >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp: In >>>>> member function 'long int nmethod::stack_traversal_mark()': >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:399:108: >>>>> >>>>> error: call of overloaded 'load_acquire(volatile long int*)' is >>>>> ambiguous >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:399:108: >>>>> >>>>> note: candidates are: >>>>> In file included from >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/typeArrayOop.hpp:30:0, >>>>> >>>>> from >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/constantPool.hpp:32, >>>>> >>>>> from >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/method.hpp:34, >>>>> from >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/frame.hpp:28, >>>>> from >>>>> 
/opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/codeBlob.hpp:31, >>>>> from >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/compiledMethod.hpp:28, >>>>> >>>>> from >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:28, >>>>> from >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29, >>>>> >>>>> from >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33, >>>>> >>>>> from >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28: >>>>> >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:57:17: >>>>> >>>>> note: static jint OrderAccess::load_acquire(const volatile jint*) >>>>> >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:57:17: >>>>> >>>>> note: no known conversion for argument 1 from 'volatile long int*' >>>>> to 'const volatile jint* {aka const volatile int*}' >>>>> In file included from >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/typeArrayOop.hpp:30:0, >>>>> >>>>> from >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/constantPool.hpp:32, >>>>> >>>>> from >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/method.hpp:34, >>>>> from >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/frame.hpp:28, >>>>> from >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/codeBlob.hpp:31, >>>>> from >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/compiledMethod.hpp:28, >>>>> >>>>> from >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:28, >>>>> from >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29, >>>>> >>>>> from >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33, >>>>> >>>>> from >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28: >>>>> >>>>> 
/opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:63:17: >>>>> >>>>> note: static juint OrderAccess::load_acquire(const volatile juint*) >>>>> >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:63:17: >>>>> >>>>> note: no known conversion for argument 1 from 'volatile long int*' >>>>> to 'const volatile juint* {aka const volatile unsigned int*}' >>>>> In file included from >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29:0, >>>>> >>>>> from >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33, >>>>> >>>>> from >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28: >>>>> >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp: In >>>>> member function 'void nmethod::set_stack_traversal_mark(long int)': >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:400:105: >>>>> >>>>> error: call of overloaded 'release_store(volatile long int*, long >>>>> int&)' is ambiguous >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:400:105: >>>>> >>>>> note: candidates are: >>>>> In file included from >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/typeArrayOop.hpp:30:0, >>>>> >>>>> from >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/constantPool.hpp:32, >>>>> >>>>> from >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/method.hpp:34, >>>>> from >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/frame.hpp:28, >>>>> from >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/codeBlob.hpp:31, >>>>> from >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/compiledMethod.hpp:28, >>>>> >>>>> from >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:28, >>>>> from >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29, >>>>> >>>>> from >>>>> 
/opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33, >>>>> >>>>> from >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28: >>>>> >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:71:17: >>>>> >>>>> note: static void OrderAccess::release_store(volatile jint*, jint) >>>>> >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:71:17: >>>>> >>>>> note: no known conversion for argument 1 from 'volatile long int*' >>>>> to 'volatile jint* {aka volatile int*}' >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:77:17: >>>>> >>>>> note: static void OrderAccess::release_store(volatile juint*, juint) >>>>> >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:77:17: >>>>> >>>>> note: no known conversion for argument 1 from 'volatile long int*' >>>>> to 'volatile juint* {aka volatile unsigned int*}' >>>>> >>>>> On 2017-07-10 20:50, Robbin Ehn wrote: >>>>>> I'll start a push now. >>>>>> >>>>>> /Robbin >>>>>> >>>>>> On 2017-07-10 12:38, Roman Kennke wrote: >>>>>>> Ok, so I guess I need a sponsor for this now: >>>>>>> >>>>>>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.12/ >>>>>>> >>>>>>> >>>>>>> Roman >>>>>>> >>>>>>> Am 07.07.2017 um 20:09 schrieb Igor Veresov: >>>>>>>>> On Jul 7, 2017, at 4:23 AM, Robbin Ehn >>>>>>>> > wrote: >>>>>>>>> >>>>>>>>> Hi Roman, >>>>>>>>> >>>>>>>>> On 07/07/2017 12:51 PM, Roman Kennke wrote: >>>>>>>>>> Hi Robbin, >>>>>>>>>>> Far down -> >>>>>>>>>>> >>>>>>>>>>> On 07/06/2017 08:05 PM, Roman Kennke wrote: >>>>>>>>>>>>> I'm not happy about this change: >>>>>>>>>>>>> >>>>>>>>>>>>> + ~ParallelSPCleanupThreadClosure() { >>>>>>>>>>>>> + // This is here to be consistent with sweeper.cpp >>>>>>>>>>>>> NMethodSweeper::mark_active_nmethods(). >>>>>>>>>>>>> + // TODO: Is this really needed? 
>>>>>>>>>>>>> + OrderAccess::storestore(); >>>>>>>>>>>>> + } >>>>>>>>>>>>> >>>>>>>>>>>>> because we're adding an OrderAccess::storestore() to be >>>>>>>>>>>>> consistent >>>>>>>>>>>>> with an OrderAccess::storestore() that's not properly >>>>>>>>>>>>> documented >>>>>>>>>>>>> which is only increasing the technical debt. >>>>>>>>>>>>> >>>>>>>>>>>>> So a couple of things above don't make sense to me: >>>>>>>>>>>>> >>>>>>>>>>>>>> - sweeper thread runs outside safepoint >>>>>>>>>>>>>> - VMThread (which is doing the nmethod marking in the case >>>>>>>>>>>>>> that >>>>>>>>>>>>>> I'm looking at) runs while all other threads (incl. the >>>>>>>>>>>>>> sweeper) >>>>>>>>>>>>>> is holding still. >>>>>>>>>>>>> and: >>>>>>>>>>>>> >>>>>>>>>>>>>> There should be no need for a storestore() (at least in >>>>>>>>>>>>>> sweeper.cpp... >>>>>>>>>>>> Either one or the other are running. Either the VMThread is >>>>>>>>>>>> marking >>>>>>>>>>>> nmethods (during safepoint) or the sweeper threads are running >>>>>>>>>>>> (outside >>>>>>>>>>>> safepoint). Between the two phases, there is a guaranteed >>>>>>>>>>>> OrderAccess::fence() (see safepoint.cpp). Therefore, no >>>>>>>>>>>> storestore() >>>>>>>>>>>> should be necessary. >>>>>>>>>>>> >>>>>>>>>>>> From Igor's comment I can see how it happened though: >>>>>>>>>>>> Apparently >>>>>>>>>>>> there >>>>>>>>>>>> *is* a race in sweeper's own concurrent processing (concurrent >>>>>>>>>>>> with >>>>>>>>>>>> compiler threads, as far as I understand). And there's a call to >>>>>>>>>>>> nmethod::mark_as_seen_on_stack() after which a storestore() is >>>>>>>>>>>> required >>>>>>>>>>>> (as per Igor's explanation). So the logic probably was: we have >>>>>>>>>>>> mark_as_seen_on_stack() followed by storestore() here, so let's >>>>>>>>>>>> also put >>>>>>>>>>>> a storestore() in the other places that call >>>>>>>>>>>> mark_as_seen_on_stack(), >>>>>>>>>>>> one of which happens to be the safepoint cleanup code that we're >>>>>>>>>>>> discussing. 
(why the storestore() hasn't been put right into >>>>>>>>>>>> mark_as_seen_on_stack() I don't understand). In short, one >>>>>>>>>>>> storestore() >>>>>>>>>>>> really was necessary, the other looks like it has been put there >>>>>>>>>>>> 'for >>>>>>>>>>>> consistency' or just conservatively. But it shouldn't be >>>>>>>>>>>> necessary in >>>>>>>>>>>> the safepoint cleanup code that we're discussing. >>>>>>>>>>>> >>>>>>>>>>>> So what should we do? Remove the storestore() for good? >>>>>>>>>>>> Refactor the >>>>>>>>>>>> code so that both paths at least call the storestore() in the >>>>>>>>>>>> same >>>>>>>>>>>> place? (E.g. make mark_active_nmethods() use the closure and >>>>>>>>>>>> call >>>>>>>>>>>> storestore() in the dtor as proposed?) >>>>>>>>>>> I took a quick look, maybe I'm missing some stuff but: >>>>>>>>>>> >>>>>>>>>>> So there is a slight optimization when not running sweeper to >>>>>>>>>>> skip >>>>>>>>>>> compiler barrier/fence in stw. >>>>>>>>>>> >>>>>>>>>>> Don't think that matter, so I propose something like: >>>>>>>>>>> - long stack_traversal_mark() { return >>>>>>>>>>> _stack_traversal_mark; } >>>>>>>>>>> - void set_stack_traversal_mark(long l) { >>>>>>>>>>> _stack_traversal_mark = l; } >>>>>>>>>>> + long stack_traversal_mark() { return >>>>>>>>>>> OrderAccess::load_acquire(&_stack_traversal_mark); } >>>>>>>>>>> + void set_stack_traversal_mark(long l) { >>>>>>>>>>> OrderAccess::release_store(&_stack_traversal_mark, l); } >>>>>>>>>>> >>>>>>>>>>> Maybe make _stack_traversal_mark volatile also, just as a marking >>>>>>>>>>> that >>>>>>>>>>> it is concurrent accessed. >>>>>>>>>>> And remove both storestore. >>>>>>>>>>> >>>>>>>>>>> "Also neither of these state variables are volatile in >>>>>>>>>>> nmethod, so >>>>>>>>>>> even the compiler may reorder the stores" >>>>>>>>>>> Fortunately at least _state is volatile now. >>>>>>>>>>> >>>>>>>>>>> I think _state also should use la/rs semantics instead, but >>>>>>>>>>> that's >>>>>>>>>>> another story. 
>>>>>>>>>> Like this? >>>>>>>>>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.12/ >>>>>>>>>> >>>>>>>>>> >>>>>>>>> Yes, exactly, I like this! >>>>>>>>> Dan? Igor ? Tobias? >>>>>>>>> >>>>>>>> That seems correct. >>>>>>>> >>>>>>>> igor >>>>>>>> >>>>>>>>> Thanks Roman! >>>>>>>>> >>>>>>>>> BTW I'm going on vacation (5w) in a few hours, but I will follow >>>>>>>>> this >>>>>>>>> thread/changeset to the end! >>>>>>>>> >>>>>>>>> /Robbin >>>>>>>>> >>>>>>>>>> Roman > From kishor.kharbas at intel.com Fri Jul 21 01:34:44 2017 From: kishor.kharbas at intel.com (Kharbas, Kishor) Date: Fri, 21 Jul 2017 01:34:44 +0000 Subject: RFR(M): 8171181: Supporting heap allocation on alternative memory devices In-Reply-To: References: Message-ID: I have a new version of this patch at http://cr.openjdk.java.net/~kkharbas/8171181/webrev.06/ This version has been tested on Windows, Linux, Solaris and Mac OS. I could not get access to AIX for testing. I used tmpfs to test the functionality. Cases that were tested were. 1. Allocation of heap using file mapping when -XX:HeapDir= option is used. 2. Creation of nameless temporary file for Heap allocation which prevents access to file using its name. 3. Correct deletion and freeing up of space allocated for file under different exit conditions. 4. Error handling when path specified is not present, heap size is more than size of file system, etc. - Kishor From: Kharbas, Kishor Sent: Tuesday, July 11, 2017 6:40 PM To: 'hotspot-gc-dev at openjdk.java.net' Cc: Kharbas, Kishor Subject: RFR(M): 8171181: Supporting heap allocation on alternative memory devices Greetings, I have an updated patch for JEP https://bugs.openjdk.java.net/browse/JDK-8171181 at http://cr.openjdk.java.net/~kkharbas/8171181/webrev.05 This patch fixes the bugs pointed earlier and other suggestions to make the code less intrusive. I have also sent this to 'hotspot-runtime-dev' mailing list (included below). I would appreciate comments and feedback. 
Thanks Kishor From: Kharbas, Kishor Sent: Monday, July 10, 2017 1:53 PM To: hotspot-runtime-dev at openjdk.java.net Cc: Kharbas, Kishor > Subject: RFR(M): 8171181: Supporting heap allocation on alternative memory devices Hello all! I have an updated patch for https://bugs.openjdk.java.net/browse/JDK-8171181 at http://cr.openjdk.java.net/~kkharbas/8171181/webrev.05 I have lost the old email chain so had to start a fresh one. The archived conversation can be found at - http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2017-March/022733.html 1. I have worked on all the comments and fixed the bugs. Mainly bugs fixed are related to sigprocmask() and changed the implementation such that 'fd' is not passed all the way down the call stack. Thus minimizing function signature changes. 2. Patch supports all OS'es. Consolidated all Posix compliant OS's implementation in os_posix.cpp. 3. The patch is tested on Windows and Linux. Working on testing it on other OS'es. Let me know if this version looks clean and correct. Thanks Kishor -------------- next part -------------- An HTML attachment was scrubbed... URL: From mikael.gerdin at oracle.com Fri Jul 21 07:42:36 2017 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Fri, 21 Jul 2017 09:42:36 +0200 Subject: RFR (S): 8184347: Move G1CMBitMap and support classes into their own files In-Reply-To: <248b03ed-07eb-ae48-8cf3-b215e687fc35@redhat.com> References: <1500576060.2688.11.camel@oracle.com> <248b03ed-07eb-ae48-8cf3-b215e687fc35@redhat.com> Message-ID: Hi, On 2017-07-20 20:46, Aleksey Shipilev wrote: > On 07/20/2017 08:41 PM, Thomas Schatzl wrote: >> Webrev: >> http://cr.openjdk.java.net/~tschatzl/8184347/webrev/ > > Looks good to me. +1 /Mikael > > Would you like us to RFE moving this to gc/shared some time later? This would > quire probably need to decouple listeners from the otherwise GC agnostic code. 
> > Thanks, > -Aleksey > From rkennke at redhat.com Fri Jul 21 08:02:58 2017 From: rkennke at redhat.com (Roman Kennke) Date: Fri, 21 Jul 2017 10:02:58 +0200 Subject: RFR (S): 8184347: Move G1CMBitMap and support classes into their own files In-Reply-To: <1500576060.2688.11.camel@oracle.com> References: <1500576060.2688.11.camel@oracle.com> Message-ID: <66187932-2b94-edfc-4910-18acaf5a61a2@redhat.com> Hi Thomas, this change looks good to me. Roman (not an official reviewer) > Hi all, > > can I have reviews for this wrap-up of the G1CMBitmap cleanup? It > simply moves all G1CMBitmap related code into their own files. > > Although it's a large change, it's really only moving code. > > Depends on JDK-8184346, based on webrev.3. > > CR: > https://bugs.openjdk.java.net/browse/JDK-8184347 > Webrev: > http://cr.openjdk.java.net/~tschatzl/8184347/webrev/ > Testing: > jprt > > Thomas > From rkennke at redhat.com Fri Jul 21 08:06:17 2017 From: rkennke at redhat.com (Roman Kennke) Date: Fri, 21 Jul 2017 10:06:17 +0200 Subject: RFR (M): 8184346: Clean up G1CMBitmap In-Reply-To: <1500576042.2688.10.camel@oracle.com> References: <1500024904.3458.8.camel@oracle.com> <6c0d1cca-6c08-0bda-f980-d3fe20e663ff@redhat.com> <1500029912.3458.26.camel@oracle.com> <5bdc1a11-1159-7e81-ca31-fd96719f89c0@redhat.com> <1500548791.2924.6.camel@oracle.com> <1500576042.2688.10.camel@oracle.com> Message-ID: <0ac30a4c-752d-d27f-ce57-748265ac8eb6@redhat.com> Looks good to me. Roman > Hi again, > > a few more cleanups could be found that were worth picking up here. > > On Thu, 2017-07-20 at 13:06 +0200, Thomas Schatzl wrote: >> Hi all, >> >> Erik and Mikael had a look at it and suggested several further >> cleanups, removing about 40 LOC more. These included: >> >> - instead of G1CMBitMapR0 use properly const'ified G1CMBitmaps >> - change the _start and _word_size members into an equivalent >> MemRegion >> - minor cleanups, removing obsolete asserts, simplify code. 
>> >> Webrevs: >> http://cr.openjdk.java.net/~tschatzl/8184346/webrev.1_to_2/ (diff) >> http://cr.openjdk.java.net/~tschatzl/8184346/webrev.2/ (full) >> Testing: >> jprt > Webrevs: > http://cr.openjdk.java.net/~tschatzl/8184346/webrev.2_to_3/ (diff) > http://cr.openjdk.java.net/~tschatzl/8184346/webrev.3/ (full) > Testing: > jprt > > Thanks, > Thomas From rkennke at redhat.com Fri Jul 21 08:07:14 2017 From: rkennke at redhat.com (Roman Kennke) Date: Fri, 21 Jul 2017 10:07:14 +0200 Subject: RFR (S): 8184347: Move G1CMBitMap and support classes into their own files In-Reply-To: <248b03ed-07eb-ae48-8cf3-b215e687fc35@redhat.com> References: <1500576060.2688.11.camel@oracle.com> <248b03ed-07eb-ae48-8cf3-b215e687fc35@redhat.com> Message-ID: <6e8ff36c-8501-9b7c-9a86-efa19524c728@redhat.com> Am 20.07.2017 um 20:46 schrieb Aleksey Shipilev: > On 07/20/2017 08:41 PM, Thomas Schatzl wrote: >> Webrev: >> http://cr.openjdk.java.net/~tschatzl/8184347/webrev/ > Looks good to me. > > Would you like us to RFE moving this to gc/shared some time later? I think we already discussed this, and I believe the answer was yes? ;-) https://bugs.openjdk.java.net/browse/JDK-8180193 Roman From shade at redhat.com Fri Jul 21 08:08:55 2017 From: shade at redhat.com (Aleksey Shipilev) Date: Fri, 21 Jul 2017 10:08:55 +0200 Subject: RFR (S): 8184347: Move G1CMBitMap and support classes into their own files In-Reply-To: <6e8ff36c-8501-9b7c-9a86-efa19524c728@redhat.com> References: <1500576060.2688.11.camel@oracle.com> <248b03ed-07eb-ae48-8cf3-b215e687fc35@redhat.com> <6e8ff36c-8501-9b7c-9a86-efa19524c728@redhat.com> Message-ID: <123819a2-f2ca-e9ad-6ad4-e84bd8ba1231@redhat.com> On 07/21/2017 10:07 AM, Roman Kennke wrote: > Am 20.07.2017 um 20:46 schrieb Aleksey Shipilev: >> On 07/20/2017 08:41 PM, Thomas Schatzl wrote: >>> Webrev: >>> http://cr.openjdk.java.net/~tschatzl/8184347/webrev/ >> Looks good to me. >> >> Would you like us to RFE moving this to gc/shared some time later? 
> > I think we already discussed this, and I believe the answer was yes? ;-) > > https://bugs.openjdk.java.net/browse/JDK-8180193 Missed that! :) -Aleksey From mikael.gerdin at oracle.com Fri Jul 21 08:15:48 2017 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Fri, 21 Jul 2017 10:15:48 +0200 Subject: RFR (M): 8184346: Clean up G1CMBitmap In-Reply-To: References: <1500024904.3458.8.camel@oracle.com> <6c0d1cca-6c08-0bda-f980-d3fe20e663ff@redhat.com> <1500029912.3458.26.camel@oracle.com> <5bdc1a11-1159-7e81-ca31-fd96719f89c0@redhat.com> <1500548791.2924.6.camel@oracle.com> <1500576042.2688.10.camel@oracle.com> Message-ID: <9d2dbdbb-354a-bd0c-3bb1-dad8902c13b6@oracle.com> Hi Aleksey, On 2017-07-20 20:50, Aleksey Shipilev wrote: > On 07/20/2017 08:40 PM, Thomas Schatzl wrote: >> Webrevs: >> http://cr.openjdk.java.net/~tschatzl/8184346/webrev.2_to_3/ (diff) >> http://cr.openjdk.java.net/~tschatzl/8184346/webrev.3/ (full) > Looks fine to me too. > Generally good, comments: > > *) Long log_debug, log_warning, assert lines in g1CollectedHeap.cpp, > g1ConcurrentMark.cpp, g1ConcurrentMark.hpp > > *) It seems the field and method names are camel-cased and thus > style-inconsistent with the rest of the code? > 625 const G1CMBitMap* const prevMarkBitMap() const { return _prevMarkBitMap; } > 626 G1CMBitMap* nextMarkBitMap() const { return _nextMarkBitMap; } I think the idea is to perform that renaming in G1ConcurrentMark in a later change since this one tries to only concern G1CMBitMap.
/Mikael > > Thanks, > -Aleksey > From shade at redhat.com Fri Jul 21 08:17:22 2017 From: shade at redhat.com (Aleksey Shipilev) Date: Fri, 21 Jul 2017 10:17:22 +0200 Subject: RFR (M): 8184346: Clean up G1CMBitmap In-Reply-To: <9d2dbdbb-354a-bd0c-3bb1-dad8902c13b6@oracle.com> References: <1500024904.3458.8.camel@oracle.com> <6c0d1cca-6c08-0bda-f980-d3fe20e663ff@redhat.com> <1500029912.3458.26.camel@oracle.com> <5bdc1a11-1159-7e81-ca31-fd96719f89c0@redhat.com> <1500548791.2924.6.camel@oracle.com> <1500576042.2688.10.camel@oracle.com> <9d2dbdbb-354a-bd0c-3bb1-dad8902c13b6@oracle.com> Message-ID: <90f10619-2265-a6a9-7a6c-dd4c5e2a6082@redhat.com> On 07/21/2017 10:15 AM, Mikael Gerdin wrote: > On 2017-07-20 20:50, Aleksey Shipilev wrote: >> On 07/20/2017 08:40 PM, Thomas Schatzl wrote: >>> Webrevs: >>> http://cr.openjdk.java.net/~tschatzl/8184346/webrev.2_to_3/ (diff) >>> http://cr.openjdk.java.net/~tschatzl/8184346/webrev.3/ (full) >> > > Looks fine to me too. > >> Generally good, comments: >> >> *) Long log_debug, log_warning, assert lines in g1CollectedHeap.cpp, >> g1ConcurrentMark.cpp, g1ConcurrentMark.hpp >> >> *) It seems the field and method names are camel-cased and thus >> style-inconsistent with the rest of the code? >> 625 const G1CMBitMap* const prevMarkBitMap() const { return _prevMarkBitMap; } >> 626 G1CMBitMap* nextMarkBitMap() const { return _nextMarkBitMap; } > > I think the idea is to perform that renaming in G1ConcurrentMark in a later > change since this one tries to only concern G1CMBitMap. No problem! Fix the long asserts, and I am happy with the patch. -Aleksey From kirk at kodewerk.com Fri Jul 21 07:34:02 2017 From: kirk at kodewerk.com (Kirk Pepperdine) Date: Fri, 21 Jul 2017 10:34:02 +0300 Subject: Bug in G1 In-Reply-To: <1e6c2b24-63fe-cf5e-1635-990852c63a65@redhat.com> References: <1500024904.3458.8.camel@oracle.com> <6c0d1cca-6c08-0bda-f980-d3fe20e663ff@redhat.com> <1500029912.3458.26.camel@oracle.com> <990f6578-14d4-322d-7f51-9b93d92f8b20@redhat.com> <1500034180.3458.67.camel@oracle.com> <1e6c2b24-63fe-cf5e-1635-990852c63a65@redhat.com> Message-ID: <7991D723-8B1D-43A3-A9D4-E7D38B1D10E4@kodewerk.com> An HTML attachment was scrubbed... URL: From rkennke at redhat.com Fri Jul 21 10:13:24 2017 From: rkennke at redhat.com (Roman Kennke) Date: Fri, 21 Jul 2017 12:13:24 +0200 Subject: RFR: 8184751: Provide thread pool for parallel safepoint cleanup Message-ID: <8ec1092c-b01e-80a9-23dd-8447e30c675e@redhat.com> This is a follow-up to 8180932: Parallelize safepoint cleanup, which should land in JDK10 real soon now. In order to actually be able to parallelize safepoint cleanup, we now need the GC to provide some worker threads. In this change, I propose to create one such pool globally (i.e. for all GCs) in CollectedHeap, if ParallelSafepointCleanupThreads>1. The flag defaults to 0, which means cleanup is done by the VMThread (i.e. exactly the current behaviour). We have already discussed this, and came to the conclusion that it does not really make sense to share the GC's worker threads here, because they may not be idle, but only suspended from concurrent work (i.e. by SuspendibleThreadSet::synchronize() or similar). http://cr.openjdk.java.net/~rkennke/8184751/webrev.00/ What do you think?
Roman From thomas.schatzl at oracle.com Fri Jul 21 14:34:27 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Fri, 21 Jul 2017 16:34:27 +0200 Subject: Bug in G1 In-Reply-To: <7991D723-8B1D-43A3-A9D4-E7D38B1D10E4@kodewerk.com> References: <1500024904.3458.8.camel@oracle.com> <6c0d1cca-6c08-0bda-f980-d3fe20e663ff@redhat.com> <1500029912.3458.26.camel@oracle.com> <990f6578-14d4-322d-7f51-9b93d92f8b20@redhat.com> <1500034180.3458.67.camel@oracle.com> <1e6c2b24-63fe-cf5e-1635-990852c63a65@redhat.com> <7991D723-8B1D-43A3-A9D4-E7D38B1D10E4@kodewerk.com> Message-ID: <1500647667.2385.33.camel@oracle.com> Hi Kirk, On Fri, 2017-07-21 at 10:34 +0300, Kirk Pepperdine wrote: > Hi all, > > A while back I mentioned to Erik at JFokus that I was seeing a > puzzling behavior in the G1 where without any obvious failure, heap > occupancy after collections would spike which would trigger a full > which would (unexpectedly) completely recover everything down to the > expected live set. Yesterday while working with Simone Bordet on the > problem we came to the realization that we were seeing a pattern > prior to the ramp up to the Full: Survivor space would be > ergonomically resized to 0 -> 0. The only way to reset the situation > was to run a full collection. In our minds it doesn't make any > sense to reset survivor space to 0. So far this is an observation > from a single GC log but I recall seeing the pattern in many other > logs. Before I go through the exercise of building a super grep to > run over my G1 log repo I'd like to ask: under what conditions would > it make sense to have the survivor space resized to 0? And if not, > would this be a bug in G1? We tried reproducing the behavior in some > test applications but I fear we often only see this happening in > production applications that have been running for several days. It's > a behavior that I've seen in 1.7.0 and 1.8.0. No word on 9. Sounds similar to https://bugs.openjdk.java.net/browse/JDK-8037500.
Could you please post the type of collections for a few more gcs before the zero-sized ones? It would be particularly interesting if there is a mixed gc with to-space exhaustion just before this sequence. And if there are log messages with attempts to start marking too. As for why that bug has been closed as "won't fix": we do not have a reproducer (any more) to test any changes, in addition to the stated reason that the performance impact seemed minor at that time. There have been some changes in how the next gc is calculated in 9 too, so I do not know either if 9 is also affected (particularly one of these young-only gc's would not be issued any more). I can think of at least one more reason other than those stated in the CR why this occurs at least for 8u60+ builds. There is the possibility, particularly in conjunction with humongous object allocation, that after starting the mutator, immediately afterwards a young gc that reclaims zero space is issued, e.g.:

young-gc, has X regions left at the end, starts mutators
mutator 1 allocates exactly X regions as humongous objects
mutator 2 allocates, finds that there are no regions left, issues young-gc request; in this young-gc eden and survivor are obviously of zero size
[...and so on...]

Note that this pattern could repeat multiple times, as young gc may reclaim space from humongous objects (eager reclaim!), until at some point it runs into a full gc. The logging that shows humongous object allocation (something about reaching threshold and starting marking) could confirm this situation. No guarantees about that being the actual issue though. Thanks, Thomas From milan.mimica at gmail.com Sun Jul 23 08:31:37 2017 From: milan.mimica at gmail.com (Milan Mimica) Date: Sun, 23 Jul 2017 08:31:37 +0000 Subject: [PATCH] JDK-8176571: Fine bitmaps should be allocated as belonging to mtGC In-Reply-To: <1500536234.2924.0.camel@oracle.com> References: <1500536234.2924.0.camel@oracle.com> Message-ID: On Thu, 20 Jul 2017
at 09:37, Thomas Schatzl wrote: > > great! > > Looks good. I can sponsor as soon as Kim or anybody else gives his > okay. > Hi, I just noticed my heapBitMap_nmt.diff includes the other one. Find the corrected one in attachment. -------------- next part -------------- A non-text attachment was scrubbed... Name: heapBitMap_nmt.diff Type: text/x-patch Size: 5361 bytes Desc: not available URL: From kirk at kodewerk.com Sun Jul 23 10:51:39 2017 From: kirk at kodewerk.com (Kirk Pepperdine) Date: Sun, 23 Jul 2017 13:51:39 +0300 Subject: Bug in G1 In-Reply-To: <1500647667.2385.33.camel@oracle.com> References: <1500024904.3458.8.camel@oracle.com> <6c0d1cca-6c08-0bda-f980-d3fe20e663ff@redhat.com> <1500029912.3458.26.camel@oracle.com> <990f6578-14d4-322d-7f51-9b93d92f8b20@redhat.com> <1500034180.3458.67.camel@oracle.com> <1e6c2b24-63fe-cf5e-1635-990852c63a65@redhat.com> <7991D723-8B1D-43A3-A9D4-E7D38B1D10E4@kodewerk.com> <1500647667.2385.33.camel@oracle.com> Message-ID: Thanks for the information. I've shared the entire log with you on dropbox. Feel free to distribute it as you see fit. I see the to-space exhausted but there doesn't appear to be a mixed collection involved. Below is a single sequence up to and including the Full.
Kind regards, Kirk 2017-05-23T20:42:55.303-0400: 40580.396: Application time: 0.8539675 seconds 2017-05-23T20:42:55.304-0400: 40580.398: [GC pause (G1 Evacuation Pause) (young) Desired survivor size 169869312 bytes, new threshold 15 (max 15) - age 1: 3278808 bytes, 3278808 total - age 2: 71278552 bytes, 74557360 total - age 3: 533720 bytes, 75091080 total - age 4: 12897544 bytes, 87988624 total - age 5: 796672 bytes, 88785296 total - age 6: 503288 bytes, 89288584 total 2017-05-23T20:42:55.363-0400: 40580.457: [SoftReference, 0 refs, 0.0010011 secs]2017-05-23T20:42:55.364-0400: 40580.458: [WeakReference, 367 refs, 0.0006136 secs]2017-05-23T20:42:55.365-0400: 40580.458: [FinalReference, 7659 refs, 0.0014460 secs]2017-05-23T20:42:55.366-0400: 40580.460: [PhantomReference, 0 refs, 0 refs, 0.0011060 secs]2017-05-23T20:42:55.367-0400: 40580.461: [JNI Weak Reference, 0.0000647 secs], 0.0669684 secs] [Parallel Time: 57.9 ms, GC Workers: 8] [GC Worker Start (ms): Min: 40580398.1, Avg: 40580398.2, Max: 40580398.3, Diff: 0.2] [Ext Root Scanning (ms): Min: 1.6, Avg: 1.9, Max: 2.7, Diff: 1.0, Sum: 15.2] [Update RS (ms): Min: 15.2, Avg: 15.7, Max: 15.8, Diff: 0.6, Sum: 125.4] [Processed Buffers: Min: 44, Avg: 50.1, Max: 62, Diff: 18, Sum: 401] [Scan RS (ms): Min: 1.9, Avg: 2.0, Max: 2.1, Diff: 0.2, Sum: 15.9] [Code Root Scanning (ms): Min: 1.6, Avg: 1.7, Max: 1.7, Diff: 0.1, Sum: 13.5] [Object Copy (ms): Min: 36.0, Avg: 36.2, Max: 36.2, Diff: 0.2, Sum: 289.2] [Termination (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.2] [Termination Attempts: Min: 1, Avg: 1.0, Max: 1, Diff: 0, Sum: 8] [GC Worker Other (ms): Min: 0.1, Avg: 0.1, Max: 0.2, Diff: 0.1, Sum: 1.0] [GC Worker Total (ms): Min: 57.4, Avg: 57.5, Max: 57.6, Diff: 0.2, Sum: 460.3] [GC Worker End (ms): Min: 40580455.7, Avg: 40580455.7, Max: 40580455.8, Diff: 0.1] [Code Root Fixup: 0.2 ms] [Code Root Purge: 0.0 ms] [Clear CT: 0.8 ms] [Other: 8.1 ms] [Choose CSet: 0.0 ms] [Ref Proc: 4.7 ms] [Ref Enq: 0.3 ms] [Redirty 
Cards: 0.3 ms] [Humongous Register: 0.2 ms] [Humongous Reclaim: 0.1 ms] [Free CSet: 1.9 ms] [Eden: 2484.0M(2484.0M)->0.0B(3544.0M) Survivors: 98.0M->100.0M Heap: 5189.0M(7168.0M)->2708.0M(7168.0M)] [Times: user=0.45 sys=0.03, real=0.07 secs] 2017-05-23T20:42:55.372-0400: 40580.465: Total time for which application threads were stopped: 0.0685303 seconds, Stopping threads took: 0.0001346 seconds 2017-05-23T20:42:59.372-0400: 40584.465: Application time: 4.0004774 seconds 2017-05-23T20:42:59.376-0400: 40584.469: Total time for which application threads were stopped: 0.0036324 seconds, Stopping threads took: 0.0023017 seconds 2017-05-23T20:43:05.891-0400: 40590.984: Application time: 6.5149722 seconds 2017-05-23T20:43:05.892-0400: 40590.985: [GC pause (G1 Evacuation Pause) (young) Desired survivor size 239075328 bytes, new threshold 15 (max 15) - age 1: 4934368 bytes, 4934368 total - age 2: 2633808 bytes, 7568176 total - age 3: 71264464 bytes, 78832640 total - age 4: 527368 bytes, 79360008 total - age 5: 12893400 bytes, 92253408 total - age 6: 750128 bytes, 93003536 total - age 7: 432784 bytes, 93436320 total 2017-05-23T20:43:09.493-0400: 40594.586: [SoftReference, 0 refs, 0.0067938 secs]2017-05-23T20:43:09.500-0400: 40594.593: [WeakReference, 0 refs, 0.0033881 secs]2017-05-23T20:43:09.503-0400: 40594.597: [FinalReference, 0 refs, 0.0005787 secs]2017-05-23T20:43:09.504-0400: 40594.597: [PhantomReference, 0 refs, 0 refs, 0.0011377 secs]2017-05-23T20:43:09.505-0400: 40594.598: [JNI Weak Reference, 0.0000618 secs] (to-space exhausted), 4.8672247 secs] [Parallel Time: 3599.9 ms, GC Workers: 8] [GC Worker Start (ms): Min: 40590985.9, Avg: 40590986.0, Max: 40590986.1, Diff: 0.2] [Ext Root Scanning (ms): Min: 1.7, Avg: 1.9, Max: 2.3, Diff: 0.6, Sum: 15.2] [Update RS (ms): Min: 68.3, Avg: 68.4, Max: 68.5, Diff: 0.2, Sum: 547.6] [Processed Buffers: Min: 32, Avg: 49.0, Max: 72, Diff: 40, Sum: 392] [Scan RS (ms): Min: 2.8, Avg: 2.9, Max: 3.0, Diff: 0.1, Sum: 23.2] [Code Root 
Scanning (ms): Min: 2.4, Avg: 2.5, Max: 2.5, Diff: 0.1, Sum: 19.7] [Object Copy (ms): Min: 3523.7, Avg: 3523.8, Max: 3523.9, Diff: 0.2, Sum: 28190.6] [Termination (ms): Min: 0.0, Avg: 0.1, Max: 0.2, Diff: 0.2, Sum: 0.7] [Termination Attempts: Min: 1, Avg: 1.0, Max: 1, Diff: 0, Sum: 8] [GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.0, Sum: 0.5] [GC Worker Total (ms): Min: 3599.6, Avg: 3599.7, Max: 3599.8, Diff: 0.2, Sum: 28797.6] [GC Worker End (ms): Min: 40594585.6, Avg: 40594585.7, Max: 40594585.7, Diff: 0.1] [Code Root Fixup: 0.3 ms] [Code Root Purge: 0.0 ms] [Clear CT: 1.2 ms] [Other: 1265.8 ms] [Evacuation Failure: 1248.2 ms] [Choose CSet: 0.0 ms] [Ref Proc: 12.4 ms] [Ref Enq: 0.5 ms] [Redirty Cards: 2.1 ms] [Humongous Register: 0.2 ms] [Humongous Reclaim: 0.1 ms] [Free CSet: 1.5 ms] [Eden: 3544.0M(3544.0M)->0.0B(976.0M) Survivors: 100.0M->456.0M Heap: 6274.3M(7168.0M)->5978.2M(7168.0M)] [Times: user=13.58 sys=0.11, real=4.86 secs] 2017-05-23T20:43:10.760-0400: 40595.853: Total time for which application threads were stopped: 4.8690628 seconds, Stopping threads took: 0.0002136 seconds 2017-05-23T20:43:11.762-0400: 40596.855: Application time: 1.0019247 seconds 2017-05-23T20:43:11.763-0400: 40596.856: Total time for which application threads were stopped: 0.0015356 seconds, Stopping threads took: 0.0003107 seconds 2017-05-23T20:43:11.880-0400: 40596.973: Application time: 0.1164884 seconds 2017-05-23T20:43:11.881-0400: 40596.974: [GC pause (G1 Humongous Allocation) (young) (initial-mark) Desired survivor size 94371840 bytes, new threshold 1 (max 15) - age 1: 477501112 bytes, 477501112 total - age 2: 182296 bytes, 477683408 total - age 3: 78880 bytes, 477762288 total - age 4: 45376 bytes, 477807664 total - age 5: 92304 bytes, 477899968 total - age 6: 75448 bytes, 477975416 total - age 7: 86752 bytes, 478062168 total - age 8: 71408 bytes, 478133576 total 2017-05-23T20:43:17.335-0400: 40602.428: [SoftReference, 0 refs, 0.0071133 
secs]2017-05-23T20:43:17.342-0400: 40602.435: [WeakReference, 3 refs, 0.0007987 secs]2017-05-23T20:43:17.343-0400: 40602.436: [FinalReference, 182 refs, 0.0017603 secs]2017-05-23T20:43:17.345-0400: 40602.438: [PhantomReference, 0 refs, 0 refs, 0.0015961 secs]2017-05-23T20:43:17.346-0400: 40602.440: [JNI Weak Reference, 0.0000730 secs] (to-space exhausted), 6.1987667 secs] [Parallel Time: 5446.3 ms, GC Workers: 8] [GC Worker Start (ms): Min: 40596975.6, Avg: 40596975.7, Max: 40596975.8, Diff: 0.2] [Ext Root Scanning (ms): Min: 2.9, Avg: 3.1, Max: 3.2, Diff: 0.3, Sum: 24.4] [Update RS (ms): Min: 10.1, Avg: 10.3, Max: 10.5, Diff: 0.4, Sum: 82.6] [Processed Buffers: Min: 33, Avg: 40.2, Max: 51, Diff: 18, Sum: 322] [Scan RS (ms): Min: 30.7, Avg: 31.1, Max: 32.4, Diff: 1.8, Sum: 249.0] [Code Root Scanning (ms): Min: 0.1, Avg: 0.3, Max: 0.6, Diff: 0.5, Sum: 2.8] [Object Copy (ms): Min: 5399.2, Avg: 5400.6, Max: 5400.9, Diff: 1.7, Sum: 43204.5] [Termination (ms): Min: 0.0, Avg: 0.2, Max: 0.3, Diff: 0.3, Sum: 1.3] [Termination Attempts: Min: 1, Avg: 1.0, Max: 1, Diff: 0, Sum: 8] [GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.1, Sum: 0.5] [GC Worker Total (ms): Min: 5445.5, Avg: 5445.6, Max: 5445.7, Diff: 0.2, Sum: 43565.0] [GC Worker End (ms): Min: 40602421.3, Avg: 40602421.4, Max: 40602421.4, Diff: 0.1] [Code Root Fixup: 0.3 ms] [Code Root Purge: 0.0 ms] [Clear CT: 0.8 ms] [Other: 751.4 ms] [Evacuation Failure: 728.5 ms] [Choose CSet: 0.0 ms] [Ref Proc: 17.8 ms] [Ref Enq: 0.5 ms] [Redirty Cards: 2.1 ms] [Humongous Register: 0.1 ms] [Humongous Reclaim: 0.2 ms] [Free CSet: 0.8 ms] [Eden: 878.0M(976.0M)->0.0B(1424.0M) Survivors: 456.0M->8192.0K Heap: 6856.2M(7168.0M)->6908.2M(7168.0M)] [Times: user=11.66 sys=1.15, real=6.19 secs] 2017-05-23T20:43:18.080-0400: 40603.173: [GC concurrent-root-region-scan-start] 2017-05-23T20:43:18.080-0400: 40603.173: Total time for which application threads were stopped: 6.2005443 seconds, Stopping threads took: 0.0002322 seconds 
2017-05-23T20:43:18.080-0400: 40603.174: Application time: 0.0002882 seconds 2017-05-23T20:43:18.114-0400: 40603.207: [GC concurrent-root-region-scan-end, 0.0339339 secs] 2017-05-23T20:43:18.114-0400: 40603.207: [GC concurrent-mark-start] 2017-05-23T20:43:18.142-0400: 40603.235: Total time for which application threads were stopped: 0.0613820 seconds, Stopping threads took: 0.0001677 seconds 2017-05-23T20:43:18.142-0400: 40603.236: Application time: 0.0005017 seconds 2017-05-23T20:43:18.144-0400: 40603.237: Total time for which application threads were stopped: 0.0013197 seconds, Stopping threads took: 0.0001188 seconds 2017-05-23T20:43:18.144-0400: 40603.237: Application time: 0.0001781 seconds 2017-05-23T20:43:18.144-0400: 40603.238: Total time for which application threads were stopped: 0.0005735 seconds, Stopping threads took: 0.0000568 seconds 2017-05-23T20:43:18.728-0400: 40603.821: Application time: 0.5835349 seconds 2017-05-23T20:43:18.730-0400: 40603.823: [GC pause (G1 Evacuation Pause) (young) Desired survivor size 94371840 bytes, new threshold 15 (max 15) - age 1: 8388248 bytes, 8388248 total 2017-05-23T20:43:19.821-0400: 40604.914: [SoftReference, 0 refs, 0.0009673 secs]2017-05-23T20:43:19.822-0400: 40604.915: [WeakReference, 0 refs, 0.0006733 secs]2017-05-23T20:43:19.823-0400: 40604.916: [FinalReference, 0 refs, 0.0006260 secs]2017-05-23T20:43:19.823-0400: 40604.917: [PhantomReference, 0 refs, 0 refs, 0.0013002 secs]2017-05-23T20:43:19.825-0400: 40604.918: [JNI Weak Reference, 0.0000642 secs] (to-space exhausted), 1.2567408 secs] [Parallel Time: 1084.5 ms, GC Workers: 8] [GC Worker Start (ms): Min: 40603823.4, Avg: 40603823.5, Max: 40603823.6, Diff: 0.2] [Ext Root Scanning (ms): Min: 1.7, Avg: 1.9, Max: 2.4, Diff: 0.7, Sum: 15.3] [Update RS (ms): Min: 23.8, Avg: 24.0, Max: 24.2, Diff: 0.3, Sum: 191.7] [Processed Buffers: Min: 49, Avg: 53.5, Max: 60, Diff: 11, Sum: 428] [Scan RS (ms): Min: 1.0, Avg: 1.1, Max: 1.2, Diff: 0.2, Sum: 8.6] [Code Root 
Scanning (ms): Min: 0.1, Avg: 0.1, Max: 0.1, Diff: 0.1, Sum: 0.8] [Object Copy (ms): Min: 1056.4, Avg: 1056.8, Max: 1057.2, Diff: 0.8, Sum: 8454.7] [Termination (ms): Min: 0.0, Avg: 0.1, Max: 0.2, Diff: 0.2, Sum: 1.0] [Termination Attempts: Min: 1, Avg: 3.8, Max: 7, Diff: 6, Sum: 30] [GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.0, Sum: 0.5] [GC Worker Total (ms): Min: 1084.0, Avg: 1084.1, Max: 1084.2, Diff: 0.2, Sum: 8673.2] [GC Worker End (ms): Min: 40604907.6, Avg: 40604907.7, Max: 40604907.7, Diff: 0.0] [Code Root Fixup: 0.3 ms] [Code Root Purge: 0.0 ms] [Clear CT: 0.3 ms] [Other: 171.7 ms] [Evacuation Failure: 159.4 ms] [Choose CSet: 0.0 ms] [Ref Proc: 9.9 ms] [Ref Enq: 0.6 ms] [Redirty Cards: 0.6 ms] [Humongous Register: 0.2 ms] [Humongous Reclaim: 0.3 ms] [Free CSet: 0.2 ms] [Eden: 230.0M(1424.0M)->0.0B(1432.0M) Survivors: 8192.0K->0.0B Heap: 7139.5M(7168.0M)->7139.5M(7168.0M)] [Times: user=2.33 sys=0.34, real=1.26 secs] 2017-05-23T20:43:19.987-0400: 40605.080: Total time for which application threads were stopped: 1.2587489 seconds, Stopping threads took: 0.0002182 seconds 2017-05-23T20:43:19.987-0400: 40605.080: Application time: 0.0003101 seconds 2017-05-23T20:43:19.988-0400: 40605.082: [GC pause (G1 Evacuation Pause) (young) Desired survivor size 94371840 bytes, new threshold 15 (max 15) 2017-05-23T20:43:20.020-0400: 40605.113: [SoftReference, 0 refs, 0.0008856 secs]2017-05-23T20:43:20.020-0400: 40605.114: [WeakReference, 0 refs, 0.0005588 secs]2017-05-23T20:43:20.021-0400: 40605.114: [FinalReference, 0 refs, 0.0006006 secs]2017-05-23T20:43:20.022-0400: 40605.115: [PhantomReference, 0 refs, 0 refs, 0.0010837 secs]2017-05-23T20:43:20.023-0400: 40605.116: [JNI Weak Reference, 0.0000610 secs], 0.0356212 secs] [Parallel Time: 30.0 ms, GC Workers: 8] [GC Worker Start (ms): Min: 40605081.9, Avg: 40605082.0, Max: 40605082.1, Diff: 0.2] [Ext Root Scanning (ms): Min: 1.9, Avg: 2.0, Max: 2.5, Diff: 0.6, Sum: 16.1] [Update RS (ms): Min: 27.3, Avg: 
27.4, Max: 27.5, Diff: 0.2, Sum: 219.3] [Processed Buffers: Min: 82, Avg: 87.4, Max: 92, Diff: 10, Sum: 699] [Scan RS (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0] [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0] [Object Copy (ms): Min: 0.0, Avg: 0.2, Max: 0.3, Diff: 0.2, Sum: 1.4] [Termination (ms): Min: 0.0, Avg: 0.1, Max: 0.2, Diff: 0.2, Sum: 1.1] [Termination Attempts: Min: 1, Avg: 1.0, Max: 1, Diff: 0, Sum: 8] [GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.0, Sum: 0.4] [GC Worker Total (ms): Min: 29.7, Avg: 29.8, Max: 29.9, Diff: 0.2, Sum: 238.5] [GC Worker End (ms): Min: 40605111.8, Avg: 40605111.8, Max: 40605111.8, Diff: 0.0] [Code Root Fixup: 0.3 ms] [Code Root Purge: 0.0 ms] [Clear CT: 0.2 ms] [Other: 5.1 ms] [Choose CSet: 0.0 ms] [Ref Proc: 4.0 ms] [Ref Enq: 0.2 ms] [Redirty Cards: 0.2 ms] [Humongous Register: 0.1 ms] [Humongous Reclaim: 0.2 ms] [Free CSet: 0.1 ms] [Eden: 0.0B(1432.0M)->0.0B(1432.0M) Survivors: 0.0B->0.0B Heap: 7139.5M(7168.0M)->7139.5M(7168.0M)] [Times: user=0.25 sys=0.00, real=0.04 secs] 2017-05-23T20:43:20.024-0400: 40605.118: Total time for which application threads were stopped: 0.0372043 seconds, Stopping threads took: 0.0001640 seconds 2017-05-23T20:43:20.025-0400: 40605.118: Application time: 0.0002435 seconds 2017-05-23T20:43:20.026-0400: 40605.119: [GC pause (G1 Evacuation Pause) (young) Desired survivor size 94371840 bytes, new threshold 15 (max 15) 2017-05-23T20:43:20.030-0400: 40605.123: [SoftReference, 0 refs, 0.0009405 secs]2017-05-23T20:43:20.031-0400: 40605.124: [WeakReference, 0 refs, 0.0005771 secs]2017-05-23T20:43:20.032-0400: 40605.125: [FinalReference, 0 refs, 0.0005766 secs]2017-05-23T20:43:20.032-0400: 40605.125: [PhantomReference, 0 refs, 0 refs, 0.0011847 secs]2017-05-23T20:43:20.033-0400: 40605.127: [JNI Weak Reference, 0.0000549 secs], 0.0087717 secs] [Parallel Time: 3.0 ms, GC Workers: 8] [GC Worker Start (ms): Min: 40605119.3, Avg: 40605119.4, Max: 
40605119.5, Diff: 0.2] [Ext Root Scanning (ms): Min: 1.6, Avg: 1.8, Max: 2.6, Diff: 1.0, Sum: 14.8] [Update RS (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.2] [Processed Buffers: Min: 0, Avg: 0.5, Max: 2, Diff: 2, Sum: 4] [Scan RS (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1] [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0] [Object Copy (ms): Min: 0.0, Avg: 0.1, Max: 0.2, Diff: 0.2, Sum: 1.1] [Termination (ms): Min: 0.0, Avg: 0.6, Max: 0.6, Diff: 0.6, Sum: 4.4] [Termination Attempts: Min: 1, Avg: 1.0, Max: 1, Diff: 0, Sum: 8] [GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.1, Sum: 0.5] [GC Worker Total (ms): Min: 2.6, Avg: 2.6, Max: 2.7, Diff: 0.1, Sum: 21.1] [GC Worker End (ms): Min: 40605122.0, Avg: 40605122.1, Max: 40605122.1, Diff: 0.1] [Code Root Fixup: 0.3 ms] [Code Root Purge: 0.0 ms] [Clear CT: 0.3 ms] [Other: 5.2 ms] [Choose CSet: 0.0 ms] [Ref Proc: 4.1 ms] [Ref Enq: 0.3 ms] [Redirty Cards: 0.3 ms] [Humongous Register: 0.1 ms] [Humongous Reclaim: 0.1 ms] [Free CSet: 0.1 ms] [Eden: 0.0B(1432.0M)->0.0B(1432.0M) Survivors: 0.0B->0.0B Heap: 7139.5M(7168.0M)->7139.5M(7168.0M)] [Times: user=0.03 sys=0.00, real=0.01 secs] 2017-05-23T20:43:20.035-0400: 40605.128: Total time for which application threads were stopped: 0.0102350 seconds, Stopping threads took: 0.0000635 seconds 2017-05-23T20:43:20.035-0400: 40605.128: Application time: 0.0002150 seconds 2017-05-23T20:43:20.036-0400: 40605.129: [GC pause (G1 Evacuation Pause) (young) Desired survivor size 94371840 bytes, new threshold 15 (max 15) 2017-05-23T20:43:20.040-0400: 40605.133: [SoftReference, 0 refs, 0.0010156 secs]2017-05-23T20:43:20.041-0400: 40605.134: [WeakReference, 0 refs, 0.0006580 secs]2017-05-23T20:43:20.042-0400: 40605.135: [FinalReference, 0 refs, 0.0006435 secs]2017-05-23T20:43:20.042-0400: 40605.136: [PhantomReference, 0 refs, 0 refs, 0.0012604 secs]2017-05-23T20:43:20.044-0400: 40605.137: [JNI Weak Reference, 0.0000513 secs], 
0.0087896 secs] [Parallel Time: 2.7 ms, GC Workers: 8] [GC Worker Start (ms): Min: 40605129.6, Avg: 40605129.7, Max: 40605129.8, Diff: 0.2] [Ext Root Scanning (ms): Min: 1.7, Avg: 1.9, Max: 2.5, Diff: 0.8, Sum: 14.9] [Update RS (ms): Min: 0.0, Avg: 0.0, Max: 0.1, Diff: 0.1, Sum: 0.2] [Processed Buffers: Min: 0, Avg: 0.6, Max: 1, Diff: 1, Sum: 5] [Scan RS (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0] [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0] [Object Copy (ms): Min: 0.0, Avg: 0.2, Max: 0.2, Diff: 0.2, Sum: 1.3] [Termination (ms): Min: 0.0, Avg: 0.3, Max: 0.4, Diff: 0.4, Sum: 2.5] [Termination Attempts: Min: 1, Avg: 1.0, Max: 1, Diff: 0, Sum: 8] [GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.1, Sum: 0.5] [GC Worker Total (ms): Min: 2.3, Avg: 2.4, Max: 2.5, Diff: 0.2, Sum: 19.5] [GC Worker End (ms): Min: 40605132.1, Avg: 40605132.2, Max: 40605132.2, Diff: 0.0] [Code Root Fixup: 0.3 ms] [Code Root Purge: 0.0 ms] [Clear CT: 0.3 ms] [Other: 5.5 ms] [Choose CSet: 0.0 ms] [Ref Proc: 4.4 ms] [Ref Enq: 0.3 ms] [Redirty Cards: 0.3 ms] [Humongous Register: 0.1 ms] [Humongous Reclaim: 0.1 ms] [Free CSet: 0.1 ms] [Eden: 0.0B(1432.0M)->0.0B(1432.0M) Survivors: 0.0B->0.0B Heap: 7139.5M(7168.0M)->7139.5M(7168.0M)] [Times: user=0.04 sys=0.00, real=0.01 secs] 2017-05-23T20:43:20.045-0400: 40605.138: Total time for which application threads were stopped: 0.0101403 seconds, Stopping threads took: 0.0000614 seconds 2017-05-23T20:43:20.045-0400: 40605.139: Application time: 0.0001681 seconds 2017-05-23T20:43:20.046-0400: 40605.140: [GC pause (G1 Evacuation Pause) (young) Desired survivor size 94371840 bytes, new threshold 15 (max 15) 2017-05-23T20:43:20.050-0400: 40605.144: [SoftReference, 0 refs, 0.0008321 secs]2017-05-23T20:43:20.051-0400: 40605.145: [WeakReference, 0 refs, 0.0006103 secs]2017-05-23T20:43:20.052-0400: 40605.145: [FinalReference, 0 refs, 0.0007194 secs]2017-05-23T20:43:20.053-0400: 40605.146: [PhantomReference, 
0 refs, 0 refs, 0.0010705 secs]2017-05-23T20:43:20.054-0400: 40605.147: [JNI Weak Reference, 0.0000508 secs], 0.0084107 secs] [Parallel Time: 2.7 ms, GC Workers: 8] [GC Worker Start (ms): Min: 40605139.9, Avg: 40605140.0, Max: 40605140.1, Diff: 0.2] [Ext Root Scanning (ms): Min: 1.7, Avg: 1.9, Max: 2.4, Diff: 0.8, Sum: 15.1] [Update RS (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1] [Processed Buffers: Min: 0, Avg: 0.4, Max: 1, Diff: 1, Sum: 3] [Scan RS (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0] [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0] [Object Copy (ms): Min: 0.0, Avg: 0.1, Max: 0.2, Diff: 0.2, Sum: 1.1] [Termination (ms): Min: 0.0, Avg: 0.3, Max: 0.4, Diff: 0.4, Sum: 2.2] [Termination Attempts: Min: 1, Avg: 1.0, Max: 1, Diff: 0, Sum: 8] [GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.1, Sum: 0.5] [GC Worker Total (ms): Min: 2.3, Avg: 2.4, Max: 2.5, Diff: 0.2, Sum: 19.2] [GC Worker End (ms): Min: 40605142.4, Avg: 40605142.4, Max: 40605142.5, Diff: 0.1] [Code Root Fixup: 0.3 ms] [Code Root Purge: 0.0 ms] [Clear CT: 0.2 ms] [Other: 5.1 ms] [Choose CSet: 0.0 ms] [Ref Proc: 4.1 ms] [Ref Enq: 0.3 ms] [Redirty Cards: 0.2 ms] [Humongous Register: 0.1 ms] [Humongous Reclaim: 0.1 ms] [Free CSet: 0.1 ms] [Eden: 0.0B(1432.0M)->0.0B(1432.0M) Survivors: 0.0B->0.0B Heap: 7139.5M(7168.0M)->7139.5M(7168.0M)] [Times: user=0.03 sys=0.01, real=0.01 secs] 2017-05-23T20:43:20.055-0400: 40605.148: Total time for which application threads were stopped: 0.0097185 seconds, Stopping threads took: 0.0001029 seconds 2017-05-23T20:43:20.055-0400: 40605.148: Application time: 0.0001505 seconds 2017-05-23T20:43:20.056-0400: 40605.149: [Full GC (Allocation Failure) 2017-05-23T20:43:22.446-0400: 40607.540: [SoftReference, 1667 refs, 0.0003772 secs]2017-05-23T20:43:22.447-0400: 40607.541: [WeakReference, 5626 refs, 0.0008068 secs]2017-05-23T20:43:22.448-0400: 40607.541: [FinalReference, 4015 refs, 0.0015169 
secs]2017-05-23T20:43:22.450-0400: 40607.543: [PhantomReference, 1 refs, 372 refs, 0.0001585 secs]2017-05-23T20:43:22.450-0400: 40607.543: [JNI Weak Reference, 0.0000963 secs] 7139M->2327M(7168M), 9.7036499 secs] [Eden: 0.0B(1432.0M)->0.0B(1432.0M) Survivors: 0.0B->0.0B Heap: 7139.5M(7168.0M)->2327.6M(7168.0M)], [Metaspace: 108907K->108428K(1150976K)] [Times: user=13.22 sys=0.00, real=9.70 secs] 2017-05-23T20:43:29.760-0400: 40614.853: Total time for which application threads were stopped: 9.7047785 seconds, Stopping threads took: 0.0000566 seconds 2017-05-23T20:43:29.760-0400: 40614.854: [GC concurrent-mark-abort] 2017-05-23T20:43:29.763-0400: 40614.856: Application time: 0.0029444 seconds > On Jul 21, 2017, at 5:34 PM, Thomas Schatzl wrote: > > Hi Kirk, > > On Fri, 2017-07-21 at 10:34 +0300, Kirk Pepperdine wrote: >> Hi all, >> >> A while back I mentioned to Erik at JFokus that I was seeing a >> puzzling behavior in the G1 where without any obvious failure, heap >> occupancy after collections would spike which would trigger a full >> which would (unexpectedly) completely recover everything down to the >> expected live set. Yesterday while working with Simone Bordet on the >> problem we came to the realization that we were seeing a pattern >> prior to the ramp up to the Full, Survivor space would be >> ergonomically resized to 0 -> 0. The only way to reset the situation >> was to run a full collection. In our minds this doesn?t make any >> sense to reset survivor space to 0. So far this is an observation >> from a single GC log but I recall seeing the pattern in many other >> logs. Before I go through the exercise of building a super grep to >> run over my G1 log repo I?d like to ask; under what conditions would >> it make sense to have the survivor space resized to 0? And if not, >> would this be bug in G1? 
We tried reproducing the behavior in some >> test applications but I fear we often only see this happening in >> production applications that have been running for several days. It's >> a behavior that I've seen in 1.7.0 and 1.8.0. No word on 9. > > sounds similar to https://bugs.openjdk.java.net/browse/JDK-8037500. > Could you please post the type of collections for a few more gcs before > the zero-sized ones? It would be particularly interesting if there is a > mixed gc with to-space exhaustion just before this sequence. And if > there are log messages with attempts to start marking too. > > That bug has been closed as "won't fix" because we do not > have a reproducer (any more) to test any changes, in addition to the > stated reasons that the performance impact seemed minor at that time. > > There have been some changes in how the next gc is calculated in 9 too, > so I do not know either if 9 is also affected (particularly one of > these young-only gc's would not be issued any more). > > I can think of at least one more reason other than those stated in the CR > why this occurs, at least for 8u60+ builds. There is the possibility, > particularly in conjunction with humongous object allocation, that after > starting the mutator, immediately afterwards a young gc that reclaims > zero space is issued, e.g.: > > young-gc, has X regions left at the end, starts mutators > mutator 1 allocates exactly X regions as humongous objects > mutator 2 allocates, finds that there are no regions left, issues > young-gc request; in this young-gc eden and survivor are obviously > of zero size > [...and so on...] > > Note that this pattern could repeat multiple times, as young gc may > reclaim space from humongous objects (eager reclaim!), until at some > point it runs into a full gc. > > The logging that shows humongous object allocation (something about > reaching threshold and starting marking) could confirm this situation. 
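
[Editorial note: Kirk's "super grep" and Thomas's request for the collection types preceding the zero-sized pauses can be combined into a small log scan. The sketch below is an editorial illustration, not part of the original thread; it assumes the 8u-era log phrases shown in this thread ("[GC pause (...)", "[Full GC (...)", "Survivors: 0.0B->0.0B") and a plain-text log as input.]

```python
import re

# Pause headers as they appear in the 8u-era logs quoted in this thread,
# e.g. "[GC pause (G1 Evacuation Pause) (young)" or "[Full GC (Allocation Failure)".
PAUSE = re.compile(r"\[(GC pause \([^)]*\)[^,\]]*|Full GC \([^)]*\))")
ZERO_SURVIVORS = "Survivors: 0.0B->0.0B"

def zero_survivor_events(lines, context=3):
    """Return (line_no, preceding pause headers) for every line where the
    survivor space was resized to 0.0B->0.0B."""
    recent = []  # pause headers seen so far, oldest first
    hits = []
    for no, line in enumerate(lines, 1):
        m = PAUSE.search(line)
        if m:
            recent.append(m.group(1))
        if ZERO_SURVIVORS in line:
            hits.append((no, recent[-context:]))
    return hits

# Sample lines taken from the log quoted later in this thread:
sample = [
    "2017-05-23T20:43:20.026-0400: 40605.119: [GC pause (G1 Evacuation Pause) (young)",
    "   [Eden: 0.0B(1432.0M)->0.0B(1432.0M) Survivors: 0.0B->0.0B Heap: 7139.5M(7168.0M)->7139.5M(7168.0M)]",
    "2017-05-23T20:43:20.056-0400: 40605.149: [Full GC (Allocation Failure) 7139M->2327M(7168M), 9.7036499 secs]",
]
for line_no, headers in zero_survivor_events(sample):
    print(line_no, headers)
```

Run over a real log, each reported event carries the last few pause headers before it, so a mixed gc with to-space exhaustion just ahead of the zero-sized sequence would show up directly.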
> > No guarantees about that being the actual issue though. > > Thanks, > Thomas > From vitalyd at gmail.com Sun Jul 23 16:43:50 2017 From: vitalyd at gmail.com (Vitaly Davidovich) Date: Sun, 23 Jul 2017 16:43:50 +0000 Subject: Bug in G1 In-Reply-To: References: <1500024904.3458.8.camel@oracle.com> <6c0d1cca-6c08-0bda-f980-d3fe20e663ff@redhat.com> <1500029912.3458.26.camel@oracle.com> <990f6578-14d4-322d-7f51-9b93d92f8b20@redhat.com> <1500034180.3458.67.camel@oracle.com> <1e6c2b24-63fe-cf5e-1635-990852c63a65@redhat.com> <7991D723-8B1D-43A3-A9D4-E7D38B1D10E4@kodewerk.com> <1500647667.2385.33.camel@oracle.com> Message-ID: I've seen G1 get into a similar loop. Do you see any concurrent mark initiation? It's possible conc marking is still running and therefore mixed GCs aren't possible yet. There are some ways to tune G1 to initiate concurrent marking sooner (or more "aggressively" with more conc GC threads), but it would be good to first know if you're seeing that. On Sun, Jul 23, 2017 at 6:52 AM Kirk Pepperdine wrote: > Thanks for the information. I've shared the entire log with you on > dropbox. Feel free to distribute it as you see fit. > > I see the to-space exhausted but there doesn't appear to be a mixed > collection involved. Below is a single sequence up to and including the > Full. 
> > Kind regards, > Kirk > > > 2017-05-23T20:42:55.303-0400: 40580.396: Application time: 0.8539675 > seconds > 2017-05-23T20:42:55.304-0400: 40580.398: [GC pause (G1 Evacuation Pause) > (young) > Desired survivor size 169869312 bytes, new threshold 15 (max 15) > - age 1: 3278808 bytes, 3278808 total > - age 2: 71278552 bytes, 74557360 total > - age 3: 533720 bytes, 75091080 total > - age 4: 12897544 bytes, 87988624 total > - age 5: 796672 bytes, 88785296 total > - age 6: 503288 bytes, 89288584 total > 2017-05-23T20:42:55.363-0400: 40580.457: [SoftReference, 0 refs, 0.0010011 > secs]2017-05-23T20:42:55.364-0400: 40580.458: [WeakReference, 367 refs, > 0.0006136 secs]2017-05-23T20:42:55.365-0400: 40580.458: [FinalReference, > 7659 refs, 0.0014460 secs]2017-05-23T20:42:55.366-0400: 40580.460: > [PhantomReference, 0 refs, 0 refs, 0.0011060 > secs]2017-05-23T20:42:55.367-0400: 40580.461: [JNI Weak Reference, > 0.0000647 secs], 0.0669684 secs] > [Parallel Time: 57.9 ms, GC Workers: 8] > [GC Worker Start (ms): Min: 40580398.1, Avg: 40580398.2, Max: > 40580398.3, Diff: 0.2] > [Ext Root Scanning (ms): Min: 1.6, Avg: 1.9, Max: 2.7, Diff: 1.0, > Sum: 15.2] > [Update RS (ms): Min: 15.2, Avg: 15.7, Max: 15.8, Diff: 0.6, Sum: > 125.4] > [Processed Buffers: Min: 44, Avg: 50.1, Max: 62, Diff: 18, Sum: > 401] > [Scan RS (ms): Min: 1.9, Avg: 2.0, Max: 2.1, Diff: 0.2, Sum: 15.9] > [Code Root Scanning (ms): Min: 1.6, Avg: 1.7, Max: 1.7, Diff: 0.1, > Sum: 13.5] > [Object Copy (ms): Min: 36.0, Avg: 36.2, Max: 36.2, Diff: 0.2, Sum: > 289.2] > [Termination (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.2] > [Termination Attempts: Min: 1, Avg: 1.0, Max: 1, Diff: 0, Sum: 8] > [GC Worker Other (ms): Min: 0.1, Avg: 0.1, Max: 0.2, Diff: 0.1, Sum: > 1.0] > [GC Worker Total (ms): Min: 57.4, Avg: 57.5, Max: 57.6, Diff: 0.2, > Sum: 460.3] > [GC Worker End (ms): Min: 40580455.7, Avg: 40580455.7, Max: > 40580455.8, Diff: 0.1] > [Code Root Fixup: 0.2 ms] > [Code Root Purge: 0.0 ms] > [Clear 
CT: 0.8 ms] > [Other: 8.1 ms] > [Choose CSet: 0.0 ms] > [Ref Proc: 4.7 ms] > [Ref Enq: 0.3 ms] > [Redirty Cards: 0.3 ms] > [Humongous Register: 0.2 ms] > [Humongous Reclaim: 0.1 ms] > [Free CSet: 1.9 ms] > [Eden: 2484.0M(2484.0M)->0.0B(3544.0M) Survivors: 98.0M->100.0M Heap: > 5189.0M(7168.0M)->2708.0M(7168.0M)] > [Times: user=0.45 sys=0.03, real=0.07 secs] > 2017-05-23T20:42:55.372-0400: 40580.465: Total time for which application > threads were stopped: 0.0685303 seconds, Stopping threads took: 0.0001346 > seconds > 2017-05-23T20:42:59.372-0400: 40584.465: Application time: 4.0004774 > seconds > 2017-05-23T20:42:59.376-0400: 40584.469: Total time for which application > threads were stopped: 0.0036324 seconds, Stopping threads took: 0.0023017 > seconds > 2017-05-23T20:43:05.891-0400: 40590.984: Application time: 6.5149722 > seconds > 2017-05-23T20:43:05.892-0400: 40590.985: [GC pause (G1 Evacuation Pause) > (young) > Desired survivor size 239075328 bytes, new threshold 15 (max 15) > - age 1: 4934368 bytes, 4934368 total > - age 2: 2633808 bytes, 7568176 total > - age 3: 71264464 bytes, 78832640 total > - age 4: 527368 bytes, 79360008 total > - age 5: 12893400 bytes, 92253408 total > - age 6: 750128 bytes, 93003536 total > - age 7: 432784 bytes, 93436320 total > 2017-05-23T20:43:09.493-0400: 40594.586: [SoftReference, 0 refs, 0.0067938 > secs]2017-05-23T20:43:09.500-0400: 40594.593: [WeakReference, 0 refs, > 0.0033881 secs]2017-05-23T20:43:09.503-0400: 40594.597: [FinalReference, 0 > refs, 0.0005787 secs]2017-05-23T20:43:09.504-0400: 40594.597: > [PhantomReference, 0 refs, 0 refs, 0.0011377 > secs]2017-05-23T20:43:09.505-0400: 40594.598: [JNI Weak Reference, > 0.0000618 secs] (to-space exhausted), 4.8672247 secs] > [Parallel Time: 3599.9 ms, GC Workers: 8] > [GC Worker Start (ms): Min: 40590985.9, Avg: 40590986.0, Max: > 40590986.1, Diff: 0.2] > [Ext Root Scanning (ms): Min: 1.7, Avg: 1.9, Max: 2.3, Diff: 0.6, > Sum: 15.2] > [Update RS (ms): Min: 68.3, Avg: 68.4, 
Max: 68.5, Diff: 0.2, Sum: > 547.6] > [Processed Buffers: Min: 32, Avg: 49.0, Max: 72, Diff: 40, Sum: > 392] > [Scan RS (ms): Min: 2.8, Avg: 2.9, Max: 3.0, Diff: 0.1, Sum: 23.2] > [Code Root Scanning (ms): Min: 2.4, Avg: 2.5, Max: 2.5, Diff: 0.1, > Sum: 19.7] > [Object Copy (ms): Min: 3523.7, Avg: 3523.8, Max: 3523.9, Diff: 0.2, > Sum: 28190.6] > [Termination (ms): Min: 0.0, Avg: 0.1, Max: 0.2, Diff: 0.2, Sum: 0.7] > [Termination Attempts: Min: 1, Avg: 1.0, Max: 1, Diff: 0, Sum: 8] > [GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.0, Sum: > 0.5] > [GC Worker Total (ms): Min: 3599.6, Avg: 3599.7, Max: 3599.8, Diff: > 0.2, Sum: 28797.6] > [GC Worker End (ms): Min: 40594585.6, Avg: 40594585.7, Max: > 40594585.7, Diff: 0.1] > [Code Root Fixup: 0.3 ms] > [Code Root Purge: 0.0 ms] > [Clear CT: 1.2 ms] > [Other: 1265.8 ms] > [Evacuation Failure: 1248.2 ms] > [Choose CSet: 0.0 ms] > [Ref Proc: 12.4 ms] > [Ref Enq: 0.5 ms] > [Redirty Cards: 2.1 ms] > [Humongous Register: 0.2 ms] > [Humongous Reclaim: 0.1 ms] > [Free CSet: 1.5 ms] > [Eden: 3544.0M(3544.0M)->0.0B(976.0M) Survivors: 100.0M->456.0M Heap: > 6274.3M(7168.0M)->5978.2M(7168.0M)] > [Times: user=13.58 sys=0.11, real=4.86 secs] > 2017-05-23T20:43:10.760-0400: 40595.853: Total time for which application > threads were stopped: 4.8690628 seconds, Stopping threads took: 0.0002136 > seconds > 2017-05-23T20:43:11.762-0400: 40596.855: Application time: 1.0019247 > seconds > 2017-05-23T20:43:11.763-0400: 40596.856: Total time for which application > threads were stopped: 0.0015356 seconds, Stopping threads took: 0.0003107 > seconds > 2017-05-23T20:43:11.880-0400: 40596.973: Application time: 0.1164884 > seconds > 2017-05-23T20:43:11.881-0400: 40596.974: [GC pause (G1 Humongous > Allocation) (young) (initial-mark) > Desired survivor size 94371840 bytes, new threshold 1 (max 15) > - age 1: 477501112 bytes, 477501112 total > - age 2: 182296 bytes, 477683408 total > - age 3: 78880 bytes, 477762288 total > - age 4: 
45376 bytes, 477807664 total > - age 5: 92304 bytes, 477899968 total > - age 6: 75448 bytes, 477975416 total > - age 7: 86752 bytes, 478062168 total > - age 8: 71408 bytes, 478133576 total > 2017-05-23T20:43:17.335-0400: 40602.428: [SoftReference, 0 refs, 0.0071133 > secs]2017-05-23T20:43:17.342-0400: 40602.435: [WeakReference, 3 refs, > 0.0007987 secs]2017-05-23T20:43:17.343-0400: 40602.436: [FinalReference, > 182 refs, 0.0017603 secs]2017-05-23T20:43:17.345-0400: 40602.438: > [PhantomReference, 0 refs, 0 refs, 0.0015961 > secs]2017-05-23T20:43:17.346-0400: 40602.440: [JNI Weak Reference, > 0.0000730 secs] (to-space exhausted), 6.1987667 secs] > [Parallel Time: 5446.3 ms, GC Workers: 8] > [GC Worker Start (ms): Min: 40596975.6, Avg: 40596975.7, Max: > 40596975.8, Diff: 0.2] > [Ext Root Scanning (ms): Min: 2.9, Avg: 3.1, Max: 3.2, Diff: 0.3, > Sum: 24.4] > [Update RS (ms): Min: 10.1, Avg: 10.3, Max: 10.5, Diff: 0.4, Sum: > 82.6] > [Processed Buffers: Min: 33, Avg: 40.2, Max: 51, Diff: 18, Sum: > 322] > [Scan RS (ms): Min: 30.7, Avg: 31.1, Max: 32.4, Diff: 1.8, Sum: > 249.0] > [Code Root Scanning (ms): Min: 0.1, Avg: 0.3, Max: 0.6, Diff: 0.5, > Sum: 2.8] > [Object Copy (ms): Min: 5399.2, Avg: 5400.6, Max: 5400.9, Diff: 1.7, > Sum: 43204.5] > [Termination (ms): Min: 0.0, Avg: 0.2, Max: 0.3, Diff: 0.3, Sum: 1.3] > [Termination Attempts: Min: 1, Avg: 1.0, Max: 1, Diff: 0, Sum: 8] > [GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.1, Sum: > 0.5] > [GC Worker Total (ms): Min: 5445.5, Avg: 5445.6, Max: 5445.7, Diff: > 0.2, Sum: 43565.0] > [GC Worker End (ms): Min: 40602421.3, Avg: 40602421.4, Max: > 40602421.4, Diff: 0.1] > [Code Root Fixup: 0.3 ms] > [Code Root Purge: 0.0 ms] > [Clear CT: 0.8 ms] > [Other: 751.4 ms] > [Evacuation Failure: 728.5 ms] > [Choose CSet: 0.0 ms] > [Ref Proc: 17.8 ms] > [Ref Enq: 0.5 ms] > [Redirty Cards: 2.1 ms] > [Humongous Register: 0.1 ms] > [Humongous Reclaim: 0.2 ms] > [Free CSet: 0.8 ms] > [Eden: 878.0M(976.0M)->0.0B(1424.0M) 
Survivors: 456.0M->8192.0K Heap: > 6856.2M(7168.0M)->6908.2M(7168.0M)] > [Times: user=11.66 sys=1.15, real=6.19 secs] > 2017-05-23T20:43:18.080-0400: 40603.173: [GC > concurrent-root-region-scan-start] > 2017-05-23T20:43:18.080-0400: 40603.173: Total time for which application > threads were stopped: 6.2005443 seconds, Stopping threads took: 0.0002322 > seconds > 2017-05-23T20:43:18.080-0400: 40603.174: Application time: 0.0002882 > seconds > 2017-05-23T20:43:18.114-0400: 40603.207: [GC > concurrent-root-region-scan-end, 0.0339339 secs] > 2017-05-23T20:43:18.114-0400: 40603.207: [GC concurrent-mark-start] > 2017-05-23T20:43:18.142-0400: 40603.235: Total time for which application > threads were stopped: 0.0613820 seconds, Stopping threads took: 0.0001677 > seconds > 2017-05-23T20:43:18.142-0400: 40603.236: Application time: 0.0005017 > seconds > 2017-05-23T20:43:18.144-0400: 40603.237: Total time for which application > threads were stopped: 0.0013197 seconds, Stopping threads took: 0.0001188 > seconds > 2017-05-23T20:43:18.144-0400: 40603.237: Application time: 0.0001781 > seconds > 2017-05-23T20:43:18.144-0400: 40603.238: Total time for which application > threads were stopped: 0.0005735 seconds, Stopping threads took: 0.0000568 > seconds > 2017-05-23T20:43:18.728-0400: 40603.821: Application time: 0.5835349 > seconds > 2017-05-23T20:43:18.730-0400: 40603.823: [GC pause (G1 Evacuation Pause) > (young) > Desired survivor size 94371840 bytes, new threshold 15 (max 15) > - age 1: 8388248 bytes, 8388248 total > 2017-05-23T20:43:19.821-0400: 40604.914: [SoftReference, 0 refs, 0.0009673 > secs]2017-05-23T20:43:19.822-0400: 40604.915: [WeakReference, 0 refs, > 0.0006733 secs]2017-05-23T20:43:19.823-0400: 40604.916: [FinalReference, 0 > refs, 0.0006260 secs]2017-05-23T20:43:19.823-0400: 40604.917: > [PhantomReference, 0 refs, 0 refs, 0.0013002 > secs]2017-05-23T20:43:19.825-0400: 40604.918: [JNI Weak Reference, > 0.0000642 secs] (to-space exhausted), 1.2567408 secs] > 
[Parallel Time: 1084.5 ms, GC Workers: 8] > [GC Worker Start (ms): Min: 40603823.4, Avg: 40603823.5, Max: > 40603823.6, Diff: 0.2] > [Ext Root Scanning (ms): Min: 1.7, Avg: 1.9, Max: 2.4, Diff: 0.7, > Sum: 15.3] > [Update RS (ms): Min: 23.8, Avg: 24.0, Max: 24.2, Diff: 0.3, Sum: > 191.7] > [Processed Buffers: Min: 49, Avg: 53.5, Max: 60, Diff: 11, Sum: > 428] > [Scan RS (ms): Min: 1.0, Avg: 1.1, Max: 1.2, Diff: 0.2, Sum: 8.6] > [Code Root Scanning (ms): Min: 0.1, Avg: 0.1, Max: 0.1, Diff: 0.1, > Sum: 0.8] > [Object Copy (ms): Min: 1056.4, Avg: 1056.8, Max: 1057.2, Diff: 0.8, > Sum: 8454.7] > [Termination (ms): Min: 0.0, Avg: 0.1, Max: 0.2, Diff: 0.2, Sum: 1.0] > [Termination Attempts: Min: 1, Avg: 3.8, Max: 7, Diff: 6, Sum: 30] > [GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.0, Sum: > 0.5] > [GC Worker Total (ms): Min: 1084.0, Avg: 1084.1, Max: 1084.2, Diff: > 0.2, Sum: 8673.2] > [GC Worker End (ms): Min: 40604907.6, Avg: 40604907.7, Max: > 40604907.7, Diff: 0.0] > [Code Root Fixup: 0.3 ms] > [Code Root Purge: 0.0 ms] > [Clear CT: 0.3 ms] > [Other: 171.7 ms] > [Evacuation Failure: 159.4 ms] > [Choose CSet: 0.0 ms] > [Ref Proc: 9.9 ms] > [Ref Enq: 0.6 ms] > [Redirty Cards: 0.6 ms] > [Humongous Register: 0.2 ms] > [Humongous Reclaim: 0.3 ms] > [Free CSet: 0.2 ms] > [Eden: 230.0M(1424.0M)->0.0B(1432.0M) Survivors: 8192.0K->0.0B Heap: > 7139.5M(7168.0M)->7139.5M(7168.0M)] > [Times: user=2.33 sys=0.34, real=1.26 secs] > 2017-05-23T20:43:19.987-0400: 40605.080: Total time for which application > threads were stopped: 1.2587489 seconds, Stopping threads took: 0.0002182 > seconds > 2017-05-23T20:43:19.987-0400: 40605.080: Application time: 0.0003101 > seconds > 2017-05-23T20:43:19.988-0400: 40605.082: [GC pause (G1 Evacuation Pause) > (young) > Desired survivor size 94371840 bytes, new threshold 15 (max 15) > 2017-05-23T20:43:20.020-0400: 40605.113: [SoftReference, 0 refs, 0.0008856 > secs]2017-05-23T20:43:20.020-0400: 40605.114: [WeakReference, 0 refs, > 
0.0005588 secs]2017-05-23T20:43:20.021-0400: 40605.114: [FinalReference, 0 > refs, 0.0006006 secs]2017-05-23T20:43:20.022-0400: 40605.115: > [PhantomReference, 0 refs, 0 refs, 0.0010837 > secs]2017-05-23T20:43:20.023-0400: 40605.116: [JNI Weak Reference, > 0.0000610 secs], 0.0356212 secs] > [Parallel Time: 30.0 ms, GC Workers: 8] > [GC Worker Start (ms): Min: 40605081.9, Avg: 40605082.0, Max: > 40605082.1, Diff: 0.2] > [Ext Root Scanning (ms): Min: 1.9, Avg: 2.0, Max: 2.5, Diff: 0.6, > Sum: 16.1] > [Update RS (ms): Min: 27.3, Avg: 27.4, Max: 27.5, Diff: 0.2, Sum: > 219.3] > [Processed Buffers: Min: 82, Avg: 87.4, Max: 92, Diff: 10, Sum: > 699] > [Scan RS (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0] > [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, > Sum: 0.0] > [Object Copy (ms): Min: 0.0, Avg: 0.2, Max: 0.3, Diff: 0.2, Sum: 1.4] > [Termination (ms): Min: 0.0, Avg: 0.1, Max: 0.2, Diff: 0.2, Sum: 1.1] > [Termination Attempts: Min: 1, Avg: 1.0, Max: 1, Diff: 0, Sum: 8] > [GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.0, Sum: > 0.4] > [GC Worker Total (ms): Min: 29.7, Avg: 29.8, Max: 29.9, Diff: 0.2, > Sum: 238.5] > [GC Worker End (ms): Min: 40605111.8, Avg: 40605111.8, Max: > 40605111.8, Diff: 0.0] > [Code Root Fixup: 0.3 ms] > [Code Root Purge: 0.0 ms] > [Clear CT: 0.2 ms] > [Other: 5.1 ms] > [Choose CSet: 0.0 ms] > [Ref Proc: 4.0 ms] > [Ref Enq: 0.2 ms] > [Redirty Cards: 0.2 ms] > [Humongous Register: 0.1 ms] > [Humongous Reclaim: 0.2 ms] > [Free CSet: 0.1 ms] > [Eden: 0.0B(1432.0M)->0.0B(1432.0M) Survivors: 0.0B->0.0B Heap: > 7139.5M(7168.0M)->7139.5M(7168.0M)] > [Times: user=0.25 sys=0.00, real=0.04 secs] > 2017-05-23T20:43:20.024-0400: 40605.118: Total time for which application > threads were stopped: 0.0372043 seconds, Stopping threads took: 0.0001640 > seconds > 2017-05-23T20:43:20.025-0400: 40605.118: Application time: 0.0002435 > seconds > 2017-05-23T20:43:20.026-0400: 40605.119: [GC pause (G1 Evacuation Pause) > 
(young) > Desired survivor size 94371840 bytes, new threshold 15 (max 15) > 2017-05-23T20:43:20.030-0400: 40605.123: [SoftReference, 0 refs, 0.0009405 > secs]2017-05-23T20:43:20.031-0400: 40605.124: [WeakReference, 0 refs, > 0.0005771 secs]2017-05-23T20:43:20.032-0400: 40605.125: [FinalReference, 0 > refs, 0.0005766 secs]2017-05-23T20:43:20.032-0400: 40605.125: > [PhantomReference, 0 refs, 0 refs, 0.0011847 > secs]2017-05-23T20:43:20.033-0400: 40605.127: [JNI Weak Reference, > 0.0000549 secs], 0.0087717 secs] > [Parallel Time: 3.0 ms, GC Workers: 8] > [GC Worker Start (ms): Min: 40605119.3, Avg: 40605119.4, Max: > 40605119.5, Diff: 0.2] > [Ext Root Scanning (ms): Min: 1.6, Avg: 1.8, Max: 2.6, Diff: 1.0, > Sum: 14.8] > [Update RS (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.2] > [Processed Buffers: Min: 0, Avg: 0.5, Max: 2, Diff: 2, Sum: 4] > [Scan RS (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1] > [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, > Sum: 0.0] > [Object Copy (ms): Min: 0.0, Avg: 0.1, Max: 0.2, Diff: 0.2, Sum: 1.1] > [Termination (ms): Min: 0.0, Avg: 0.6, Max: 0.6, Diff: 0.6, Sum: 4.4] > [Termination Attempts: Min: 1, Avg: 1.0, Max: 1, Diff: 0, Sum: 8] > [GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.1, Sum: > 0.5] > [GC Worker Total (ms): Min: 2.6, Avg: 2.6, Max: 2.7, Diff: 0.1, Sum: > 21.1] > [GC Worker End (ms): Min: 40605122.0, Avg: 40605122.1, Max: > 40605122.1, Diff: 0.1] > [Code Root Fixup: 0.3 ms] > [Code Root Purge: 0.0 ms] > [Clear CT: 0.3 ms] > [Other: 5.2 ms] > [Choose CSet: 0.0 ms] > [Ref Proc: 4.1 ms] > [Ref Enq: 0.3 ms] > [Redirty Cards: 0.3 ms] > [Humongous Register: 0.1 ms] > [Humongous Reclaim: 0.1 ms] > [Free CSet: 0.1 ms] > [Eden: 0.0B(1432.0M)->0.0B(1432.0M) Survivors: 0.0B->0.0B Heap: > 7139.5M(7168.0M)->7139.5M(7168.0M)] > [Times: user=0.03 sys=0.00, real=0.01 secs] > 2017-05-23T20:43:20.035-0400: 40605.128: Total time for which application > threads were stopped: 0.0102350 
seconds, Stopping threads took: 0.0000635 > seconds > 2017-05-23T20:43:20.035-0400: 40605.128: Application time: 0.0002150 > seconds > 2017-05-23T20:43:20.036-0400: 40605.129: [GC pause (G1 Evacuation Pause) > (young) > Desired survivor size 94371840 bytes, new threshold 15 (max 15) > 2017-05-23T20:43:20.040-0400: 40605.133: [SoftReference, 0 refs, 0.0010156 > secs]2017-05-23T20:43:20.041-0400: 40605.134: [WeakReference, 0 refs, > 0.0006580 secs]2017-05-23T20:43:20.042-0400: 40605.135: [FinalReference, 0 > refs, 0.0006435 secs]2017-05-23T20:43:20.042-0400: 40605.136: > [PhantomReference, 0 refs, 0 refs, 0.0012604 > secs]2017-05-23T20:43:20.044-0400: 40605.137: [JNI Weak Reference, > 0.0000513 secs], 0.0087896 secs] > [Parallel Time: 2.7 ms, GC Workers: 8] > [GC Worker Start (ms): Min: 40605129.6, Avg: 40605129.7, Max: > 40605129.8, Diff: 0.2] > [Ext Root Scanning (ms): Min: 1.7, Avg: 1.9, Max: 2.5, Diff: 0.8, > Sum: 14.9] > [Update RS (ms): Min: 0.0, Avg: 0.0, Max: 0.1, Diff: 0.1, Sum: 0.2] > [Processed Buffers: Min: 0, Avg: 0.6, Max: 1, Diff: 1, Sum: 5] > [Scan RS (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0] > [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, > Sum: 0.0] > [Object Copy (ms): Min: 0.0, Avg: 0.2, Max: 0.2, Diff: 0.2, Sum: 1.3] > [Termination (ms): Min: 0.0, Avg: 0.3, Max: 0.4, Diff: 0.4, Sum: 2.5] > [Termination Attempts: Min: 1, Avg: 1.0, Max: 1, Diff: 0, Sum: 8] > [GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.1, Sum: > 0.5] > [GC Worker Total (ms): Min: 2.3, Avg: 2.4, Max: 2.5, Diff: 0.2, Sum: > 19.5] > [GC Worker End (ms): Min: 40605132.1, Avg: 40605132.2, Max: > 40605132.2, Diff: 0.0] > [Code Root Fixup: 0.3 ms] > [Code Root Purge: 0.0 ms] > [Clear CT: 0.3 ms] > [Other: 5.5 ms] > [Choose CSet: 0.0 ms] > [Ref Proc: 4.4 ms] > [Ref Enq: 0.3 ms] > [Redirty Cards: 0.3 ms] > [Humongous Register: 0.1 ms] > [Humongous Reclaim: 0.1 ms] > [Free CSet: 0.1 ms] > [Eden: 0.0B(1432.0M)->0.0B(1432.0M) Survivors: 
0.0B->0.0B Heap: > 7139.5M(7168.0M)->7139.5M(7168.0M)] > [Times: user=0.04 sys=0.00, real=0.01 secs] > 2017-05-23T20:43:20.045-0400: 40605.138: Total time for which application > threads were stopped: 0.0101403 seconds, Stopping threads took: 0.0000614 > seconds > 2017-05-23T20:43:20.045-0400: 40605.139: Application time: 0.0001681 > seconds > 2017-05-23T20:43:20.046-0400: 40605.140: [GC pause (G1 Evacuation Pause) > (young) > Desired survivor size 94371840 bytes, new threshold 15 (max 15) > 2017-05-23T20:43:20.050-0400: 40605.144: [SoftReference, 0 refs, 0.0008321 > secs]2017-05-23T20:43:20.051-0400: 40605.145: [WeakReference, 0 refs, > 0.0006103 secs]2017-05-23T20:43:20.052-0400: 40605.145: [FinalReference, 0 > refs, 0.0007194 secs]2017-05-23T20:43:20.053-0400: 40605.146: > [PhantomReference, 0 refs, 0 refs, 0.0010705 > secs]2017-05-23T20:43:20.054-0400: 40605.147: [JNI Weak Reference, > 0.0000508 secs], 0.0084107 secs] > [Parallel Time: 2.7 ms, GC Workers: 8] > [GC Worker Start (ms): Min: 40605139.9, Avg: 40605140.0, Max: > 40605140.1, Diff: 0.2] > [Ext Root Scanning (ms): Min: 1.7, Avg: 1.9, Max: 2.4, Diff: 0.8, > Sum: 15.1] > [Update RS (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1] > [Processed Buffers: Min: 0, Avg: 0.4, Max: 1, Diff: 1, Sum: 3] > [Scan RS (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0] > [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, > Sum: 0.0] > [Object Copy (ms): Min: 0.0, Avg: 0.1, Max: 0.2, Diff: 0.2, Sum: 1.1] > [Termination (ms): Min: 0.0, Avg: 0.3, Max: 0.4, Diff: 0.4, Sum: 2.2] > [Termination Attempts: Min: 1, Avg: 1.0, Max: 1, Diff: 0, Sum: 8] > [GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.1, Sum: > 0.5] > [GC Worker Total (ms): Min: 2.3, Avg: 2.4, Max: 2.5, Diff: 0.2, Sum: > 19.2] > [GC Worker End (ms): Min: 40605142.4, Avg: 40605142.4, Max: > 40605142.5, Diff: 0.1] > [Code Root Fixup: 0.3 ms] > [Code Root Purge: 0.0 ms] > [Clear CT: 0.2 ms] > [Other: 5.1 ms] > [Choose CSet: 
0.0 ms] > [Ref Proc: 4.1 ms] > [Ref Enq: 0.3 ms] > [Redirty Cards: 0.2 ms] > [Humongous Register: 0.1 ms] > [Humongous Reclaim: 0.1 ms] > [Free CSet: 0.1 ms] > [Eden: 0.0B(1432.0M)->0.0B(1432.0M) Survivors: 0.0B->0.0B Heap: > 7139.5M(7168.0M)->7139.5M(7168.0M)] > [Times: user=0.03 sys=0.01, real=0.01 secs] > 2017-05-23T20:43:20.055-0400: 40605.148: Total time for which application > threads were stopped: 0.0097185 seconds, Stopping threads took: 0.0001029 > seconds > 2017-05-23T20:43:20.055-0400: 40605.148: Application time: 0.0001505 > seconds > 2017-05-23T20:43:20.056-0400: 40605.149: [Full GC (Allocation Failure) > 2017-05-23T20:43:22.446-0400: 40607.540: [SoftReference, 1667 refs, > 0.0003772 secs]2017-05-23T20:43:22.447-0400: 40607.541: [WeakReference, > 5626 refs, 0.0008068 secs]2017-05-23T20:43:22.448-0400: 40607.541: > [FinalReference, 4015 refs, 0.0015169 secs]2017-05-23T20:43:22.450-0400: > 40607.543: [PhantomReference, 1 refs, 372 refs, 0.0001585 > secs]2017-05-23T20:43:22.450-0400: 40607.543: [JNI Weak Reference, > 0.0000963 secs] 7139M->2327M(7168M), 9.7036499 secs] > [Eden: 0.0B(1432.0M)->0.0B(1432.0M) Survivors: 0.0B->0.0B Heap: > 7139.5M(7168.0M)->2327.6M(7168.0M)], [Metaspace: 108907K->108428K(1150976K)] > [Times: user=13.22 sys=0.00, real=9.70 secs] > 2017-05-23T20:43:29.760-0400: 40614.853: Total time for which application > threads were stopped: 9.7047785 seconds, Stopping threads took: 0.0000566 > seconds > 2017-05-23T20:43:29.760-0400: 40614.854: [GC concurrent-mark-abort] > 2017-05-23T20:43:29.763-0400: 40614.856: Application time: 0.0029444 > seconds > > On Jul 21, 2017, at 5:34 PM, Thomas Schatzl > wrote: > > > > Hi Kirk, > > > > On Fri, 2017-07-21 at 10:34 +0300, Kirk Pepperdine wrote: > >> Hi all, > >> > >> A while back I mentioned to Erik at JFokus that I was seeing a > >> puzzling behavior in the G1 where without any obvious failure, heap > >> occupancy after collections would spike which would trigger a full > >> which would 
(unexpectedly) completely recover everything down to the > >> expected live set. Yesterday while working with Simone Bordet on the > >> problem we came to the realization that we were seeing a pattern: > >> prior to the ramp up to the Full, Survivor space would be > >> ergonomically resized to 0 -> 0. The only way to reset the situation > >> was to run a full collection. In our minds it doesn't make any > >> sense to reset survivor space to 0. So far this is an observation > >> from a single GC log but I recall seeing the pattern in many other > >> logs. Before I go through the exercise of building a super grep to > >> run over my G1 log repo I'd like to ask; under what conditions would > >> it make sense to have the survivor space resized to 0? And if not, > >> would this be a bug in G1? We tried reproducing the behavior in some > >> test applications but I fear we often only see this happening in > >> production applications that have been running for several days. It's > >> a behavior that I've seen in 1.7.0 and 1.8.0. No word on 9. > > > > sounds similar to https://bugs.openjdk.java.net/browse/JDK-8037500. > > Could you please post the type of collections for a few more gcs before > > the zero-sized ones? It would be particularly interesting if there is a > > mixed gc with to-space exhaustion just before this sequence. And if > > there are log messages with attempts to start marking too. > > > > That bug has been closed as "won't fix" because we do not > > have a reproducer (any more) to test any changes, in addition to the > > stated reasons that the performance impact seemed minor at that time. > > > > There have been some changes in how the next gc is calculated in 9 too, > > so I do not know either if 9 is also affected (particularly one of > > these young-only gc's would not be issued any more). > > > > I can think of at least one more reason other than those stated in the CR > > why this occurs, at least for 8u60+ builds. 
There is the possibility, > > particularly in conjunction with humongous object allocation, that after > > starting the mutator, immediately afterwards a young gc that reclaims > > zero space is issued, e.g.: > > > > young-gc, has X regions left at the end, starts mutators > > mutator 1 allocates exactly X regions as humongous objects > > mutator 2 allocates, finds that there are no regions left, issues > > young-gc request; in this young-gc eden and survivor are obviously > > of zero size > > [...and so on...] > > > > Note that this pattern could repeat multiple times, as young gc may > > reclaim space from humongous objects (eager reclaim!), until at some > > point it runs into a full gc. > > > > The logging that shows humongous object allocation (something about > > reaching threshold and starting marking) could confirm this situation. > > > > No guarantees about that being the actual issue though. > > > > Thanks, > > Thomas > > > > -- Sent from my phone From monica.beckwith at gmail.com Sun Jul 23 19:09:14 2017 From: monica.beckwith at gmail.com (monica beckwith) Date: Sun, 23 Jul 2017 21:09:14 +0200 Subject: Bug in G1 In-Reply-To: References: <1500024904.3458.8.camel@oracle.com> <6c0d1cca-6c08-0bda-f980-d3fe20e663ff@redhat.com> <1500029912.3458.26.camel@oracle.com> <990f6578-14d4-322d-7f51-9b93d92f8b20@redhat.com> <1500034180.3458.67.camel@oracle.com> <1e6c2b24-63fe-cf5e-1635-990852c63a65@redhat.com> <7991D723-8B1D-43A3-A9D4-E7D38B1D10E4@kodewerk.com> <1500647667.2385.33.camel@oracle.com> Message-ID: Hello Kirk and Thomas, I think the problem is that the heap is not sized to accommodate the humongous objects. I think this log is post 8 update 40, and that's why you see those young collections at the lowest young occupancy, since it's trying to reclaim humongous regions. Kirk, can you please show a log prior to 8u40? 
Thanks, Monica On Jul 23, 2017 5:52 AM, "Kirk Pepperdine" wrote: > Thanks for the information. I've shared the entire log with you on > dropbox. Feel free to distribute it as you see fit. > > I see the to-space exhausted but there doesn't appear to be a mixed > collection involved. Below is a single sequence up to and including the > Full. > > Kind regards, > Kirk > > > 2017-05-23T20:42:55.303-0400: 40580.396: Application time: 0.8539675 > seconds > 2017-05-23T20:42:55.304-0400: 40580.398: [GC pause (G1 Evacuation Pause) > (young) > Desired survivor size 169869312 bytes, new threshold 15 (max 15) > - age 1: 3278808 bytes, 3278808 total > - age 2: 71278552 bytes, 74557360 total > - age 3: 533720 bytes, 75091080 total > - age 4: 12897544 bytes, 87988624 total > - age 5: 796672 bytes, 88785296 total > - age 6: 503288 bytes, 89288584 total > 2017-05-23T20:42:55.363-0400: 40580.457: [SoftReference, 0 refs, 0.0010011 > secs]2017-05-23T20:42:55.364-0400: 40580.458: [WeakReference, 367 refs, > 0.0006136 secs]2017-05-23T20:42:55.365-0400: 40580.458: [FinalReference, > 7659 refs, 0.0014460 secs]2017-05-23T20:42:55.366-0400: 40580.460: > [PhantomReference, 0 refs, 0 refs, 0.0011060 secs]2017-05-23T20:42:55.367-0400: > 40580.461: [JNI Weak Reference, 0.0000647 secs], 0.0669684 secs] > [Parallel Time: 57.9 ms, GC Workers: 8] > [GC Worker Start (ms): Min: 40580398.1, Avg: 40580398.2, Max: > 40580398.3, Diff: 0.2] > [Ext Root Scanning (ms): Min: 1.6, Avg: 1.9, Max: 2.7, Diff: 1.0, > Sum: 15.2] > [Update RS (ms): Min: 15.2, Avg: 15.7, Max: 15.8, Diff: 0.6, Sum: > 125.4] > [Processed Buffers: Min: 44, Avg: 50.1, Max: 62, Diff: 18, Sum: > 401] > [Scan RS (ms): Min: 1.9, Avg: 2.0, Max: 2.1, Diff: 0.2, Sum: 15.9] > [Code Root Scanning (ms): Min: 1.6, Avg: 1.7, Max: 1.7, Diff: 0.1, > Sum: 13.5] > [Object Copy (ms): Min: 36.0, Avg: 36.2, Max: 36.2, Diff: 0.2, Sum: > 289.2] > [Termination (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.2] > [Termination Attempts: Min: 1, Avg: 1.0, 
Max: 1, Diff: 0, Sum: 8] > [GC Worker Other (ms): Min: 0.1, Avg: 0.1, Max: 0.2, Diff: 0.1, Sum: > 1.0] > [GC Worker Total (ms): Min: 57.4, Avg: 57.5, Max: 57.6, Diff: 0.2, > Sum: 460.3] > [GC Worker End (ms): Min: 40580455.7, Avg: 40580455.7, Max: > 40580455.8, Diff: 0.1] > [Code Root Fixup: 0.2 ms] > [Code Root Purge: 0.0 ms] > [Clear CT: 0.8 ms] > [Other: 8.1 ms] > [Choose CSet: 0.0 ms] > [Ref Proc: 4.7 ms] > [Ref Enq: 0.3 ms] > [Redirty Cards: 0.3 ms] > [Humongous Register: 0.2 ms] > [Humongous Reclaim: 0.1 ms] > [Free CSet: 1.9 ms] > [Eden: 2484.0M(2484.0M)->0.0B(3544.0M) Survivors: 98.0M->100.0M Heap: > 5189.0M(7168.0M)->2708.0M(7168.0M)] > [Times: user=0.45 sys=0.03, real=0.07 secs] > 2017-05-23T20:42:55.372-0400: 40580.465: Total time for which application > threads were stopped: 0.0685303 seconds, Stopping threads took: 0.0001346 > seconds > 2017-05-23T20:42:59.372-0400: 40584.465: Application time: 4.0004774 > seconds > 2017-05-23T20:42:59.376-0400: 40584.469: Total time for which application > threads were stopped: 0.0036324 seconds, Stopping threads took: 0.0023017 > seconds > 2017-05-23T20:43:05.891-0400: 40590.984: Application time: 6.5149722 > seconds > 2017-05-23T20:43:05.892-0400: 40590.985: [GC pause (G1 Evacuation Pause) > (young) > Desired survivor size 239075328 bytes, new threshold 15 (max 15) > - age 1: 4934368 bytes, 4934368 total > - age 2: 2633808 bytes, 7568176 total > - age 3: 71264464 bytes, 78832640 total > - age 4: 527368 bytes, 79360008 total > - age 5: 12893400 bytes, 92253408 total > - age 6: 750128 bytes, 93003536 total > - age 7: 432784 bytes, 93436320 total > 2017-05-23T20:43:09.493-0400: 40594.586: [SoftReference, 0 refs, 0.0067938 > secs]2017-05-23T20:43:09.500-0400: 40594.593: [WeakReference, 0 refs, > 0.0033881 secs]2017-05-23T20:43:09.503-0400: 40594.597: [FinalReference, > 0 refs, 0.0005787 secs]2017-05-23T20:43:09.504-0400: 40594.597: > [PhantomReference, 0 refs, 0 refs, 0.0011377 secs]2017-05-23T20:43:09.505-0400: > 
40594.598: [JNI Weak Reference, 0.0000618 secs] (to-space exhausted), > 4.8672247 secs] > [Parallel Time: 3599.9 ms, GC Workers: 8] > [GC Worker Start (ms): Min: 40590985.9, Avg: 40590986.0, Max: > 40590986.1, Diff: 0.2] > [Ext Root Scanning (ms): Min: 1.7, Avg: 1.9, Max: 2.3, Diff: 0.6, > Sum: 15.2] > [Update RS (ms): Min: 68.3, Avg: 68.4, Max: 68.5, Diff: 0.2, Sum: > 547.6] > [Processed Buffers: Min: 32, Avg: 49.0, Max: 72, Diff: 40, Sum: > 392] > [Scan RS (ms): Min: 2.8, Avg: 2.9, Max: 3.0, Diff: 0.1, Sum: 23.2] > [Code Root Scanning (ms): Min: 2.4, Avg: 2.5, Max: 2.5, Diff: 0.1, > Sum: 19.7] > [Object Copy (ms): Min: 3523.7, Avg: 3523.8, Max: 3523.9, Diff: 0.2, > Sum: 28190.6] > [Termination (ms): Min: 0.0, Avg: 0.1, Max: 0.2, Diff: 0.2, Sum: 0.7] > [Termination Attempts: Min: 1, Avg: 1.0, Max: 1, Diff: 0, Sum: 8] > [GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.0, Sum: > 0.5] > [GC Worker Total (ms): Min: 3599.6, Avg: 3599.7, Max: 3599.8, Diff: > 0.2, Sum: 28797.6] > [GC Worker End (ms): Min: 40594585.6, Avg: 40594585.7, Max: > 40594585.7, Diff: 0.1] > [Code Root Fixup: 0.3 ms] > [Code Root Purge: 0.0 ms] > [Clear CT: 1.2 ms] > [Other: 1265.8 ms] > [Evacuation Failure: 1248.2 ms] > [Choose CSet: 0.0 ms] > [Ref Proc: 12.4 ms] > [Ref Enq: 0.5 ms] > [Redirty Cards: 2.1 ms] > [Humongous Register: 0.2 ms] > [Humongous Reclaim: 0.1 ms] > [Free CSet: 1.5 ms] > [Eden: 3544.0M(3544.0M)->0.0B(976.0M) Survivors: 100.0M->456.0M Heap: > 6274.3M(7168.0M)->5978.2M(7168.0M)] > [Times: user=13.58 sys=0.11, real=4.86 secs] > 2017-05-23T20:43:10.760-0400: 40595.853: Total time for which application > threads were stopped: 4.8690628 seconds, Stopping threads took: 0.0002136 > seconds > 2017-05-23T20:43:11.762-0400: 40596.855: Application time: 1.0019247 > seconds > 2017-05-23T20:43:11.763-0400: 40596.856: Total time for which application > threads were stopped: 0.0015356 seconds, Stopping threads took: 0.0003107 > seconds > 2017-05-23T20:43:11.880-0400: 40596.973: 
Application time: 0.1164884 > seconds > 2017-05-23T20:43:11.881-0400: 40596.974: [GC pause (G1 Humongous > Allocation) (young) (initial-mark) > Desired survivor size 94371840 bytes, new threshold 1 (max 15) > - age 1: 477501112 bytes, 477501112 total > - age 2: 182296 bytes, 477683408 total > - age 3: 78880 bytes, 477762288 total > - age 4: 45376 bytes, 477807664 total > - age 5: 92304 bytes, 477899968 total > - age 6: 75448 bytes, 477975416 total > - age 7: 86752 bytes, 478062168 total > - age 8: 71408 bytes, 478133576 total > 2017-05-23T20:43:17.335-0400: 40602.428: [SoftReference, 0 refs, 0.0071133 > secs]2017-05-23T20:43:17.342-0400: 40602.435: [WeakReference, 3 refs, > 0.0007987 secs]2017-05-23T20:43:17.343-0400: 40602.436: [FinalReference, > 182 refs, 0.0017603 secs]2017-05-23T20:43:17.345-0400: 40602.438: > [PhantomReference, 0 refs, 0 refs, 0.0015961 secs]2017-05-23T20:43:17.346-0400: > 40602.440: [JNI Weak Reference, 0.0000730 secs] (to-space exhausted), > 6.1987667 secs] > [Parallel Time: 5446.3 ms, GC Workers: 8] > [GC Worker Start (ms): Min: 40596975.6, Avg: 40596975.7, Max: > 40596975.8, Diff: 0.2] > [Ext Root Scanning (ms): Min: 2.9, Avg: 3.1, Max: 3.2, Diff: 0.3, > Sum: 24.4] > [Update RS (ms): Min: 10.1, Avg: 10.3, Max: 10.5, Diff: 0.4, Sum: > 82.6] > [Processed Buffers: Min: 33, Avg: 40.2, Max: 51, Diff: 18, Sum: > 322] > [Scan RS (ms): Min: 30.7, Avg: 31.1, Max: 32.4, Diff: 1.8, Sum: > 249.0] > [Code Root Scanning (ms): Min: 0.1, Avg: 0.3, Max: 0.6, Diff: 0.5, > Sum: 2.8] > [Object Copy (ms): Min: 5399.2, Avg: 5400.6, Max: 5400.9, Diff: 1.7, > Sum: 43204.5] > [Termination (ms): Min: 0.0, Avg: 0.2, Max: 0.3, Diff: 0.3, Sum: 1.3] > [Termination Attempts: Min: 1, Avg: 1.0, Max: 1, Diff: 0, Sum: 8] > [GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.1, Sum: > 0.5] > [GC Worker Total (ms): Min: 5445.5, Avg: 5445.6, Max: 5445.7, Diff: > 0.2, Sum: 43565.0] > [GC Worker End (ms): Min: 40602421.3, Avg: 40602421.4, Max: > 40602421.4, Diff: 0.1] > 
[Code Root Fixup: 0.3 ms] > [Code Root Purge: 0.0 ms] > [Clear CT: 0.8 ms] > [Other: 751.4 ms] > [Evacuation Failure: 728.5 ms] > [Choose CSet: 0.0 ms] > [Ref Proc: 17.8 ms] > [Ref Enq: 0.5 ms] > [Redirty Cards: 2.1 ms] > [Humongous Register: 0.1 ms] > [Humongous Reclaim: 0.2 ms] > [Free CSet: 0.8 ms] > [Eden: 878.0M(976.0M)->0.0B(1424.0M) Survivors: 456.0M->8192.0K Heap: > 6856.2M(7168.0M)->6908.2M(7168.0M)] > [Times: user=11.66 sys=1.15, real=6.19 secs] > 2017-05-23T20:43:18.080-0400: 40603.173: [GC concurrent-root-region-scan- > start] > 2017-05-23T20:43:18.080-0400: 40603.173: Total time for which application > threads were stopped: 6.2005443 seconds, Stopping threads took: 0.0002322 > seconds > 2017-05-23T20:43:18.080-0400: 40603.174: Application time: 0.0002882 > seconds > 2017-05-23T20:43:18.114-0400: 40603.207: [GC concurrent-root-region-scan-end, > 0.0339339 secs] > 2017-05-23T20:43:18.114-0400: 40603.207: [GC concurrent-mark-start] > 2017-05-23T20:43:18.142-0400: 40603.235: Total time for which application > threads were stopped: 0.0613820 seconds, Stopping threads took: 0.0001677 > seconds > 2017-05-23T20:43:18.142-0400: 40603.236: Application time: 0.0005017 > seconds > 2017-05-23T20:43:18.144-0400: 40603.237: Total time for which application > threads were stopped: 0.0013197 seconds, Stopping threads took: 0.0001188 > seconds > 2017-05-23T20:43:18.144-0400: 40603.237: Application time: 0.0001781 > seconds > 2017-05-23T20:43:18.144-0400: 40603.238: Total time for which application > threads were stopped: 0.0005735 seconds, Stopping threads took: 0.0000568 > seconds > 2017-05-23T20:43:18.728-0400: 40603.821: Application time: 0.5835349 > seconds > 2017-05-23T20:43:18.730-0400: 40603.823: [GC pause (G1 Evacuation Pause) > (young) > Desired survivor size 94371840 bytes, new threshold 15 (max 15) > - age 1: 8388248 bytes, 8388248 total > 2017-05-23T20:43:19.821-0400: 40604.914: [SoftReference, 0 refs, 0.0009673 > secs]2017-05-23T20:43:19.822-0400: 
40604.915: [WeakReference, 0 refs, > 0.0006733 secs]2017-05-23T20:43:19.823-0400: 40604.916: [FinalReference, > 0 refs, 0.0006260 secs]2017-05-23T20:43:19.823-0400: 40604.917: > [PhantomReference, 0 refs, 0 refs, 0.0013002 secs]2017-05-23T20:43:19.825-0400: > 40604.918: [JNI Weak Reference, 0.0000642 secs] (to-space exhausted), > 1.2567408 secs] > [Parallel Time: 1084.5 ms, GC Workers: 8] > [GC Worker Start (ms): Min: 40603823.4, Avg: 40603823.5, Max: > 40603823.6, Diff: 0.2] > [Ext Root Scanning (ms): Min: 1.7, Avg: 1.9, Max: 2.4, Diff: 0.7, > Sum: 15.3] > [Update RS (ms): Min: 23.8, Avg: 24.0, Max: 24.2, Diff: 0.3, Sum: > 191.7] > [Processed Buffers: Min: 49, Avg: 53.5, Max: 60, Diff: 11, Sum: > 428] > [Scan RS (ms): Min: 1.0, Avg: 1.1, Max: 1.2, Diff: 0.2, Sum: 8.6] > [Code Root Scanning (ms): Min: 0.1, Avg: 0.1, Max: 0.1, Diff: 0.1, > Sum: 0.8] > [Object Copy (ms): Min: 1056.4, Avg: 1056.8, Max: 1057.2, Diff: 0.8, > Sum: 8454.7] > [Termination (ms): Min: 0.0, Avg: 0.1, Max: 0.2, Diff: 0.2, Sum: 1.0] > [Termination Attempts: Min: 1, Avg: 3.8, Max: 7, Diff: 6, Sum: 30] > [GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.0, Sum: > 0.5] > [GC Worker Total (ms): Min: 1084.0, Avg: 1084.1, Max: 1084.2, Diff: > 0.2, Sum: 8673.2] > [GC Worker End (ms): Min: 40604907.6, Avg: 40604907.7, Max: > 40604907.7, Diff: 0.0] > [Code Root Fixup: 0.3 ms] > [Code Root Purge: 0.0 ms] > [Clear CT: 0.3 ms] > [Other: 171.7 ms] > [Evacuation Failure: 159.4 ms] > [Choose CSet: 0.0 ms] > [Ref Proc: 9.9 ms] > [Ref Enq: 0.6 ms] > [Redirty Cards: 0.6 ms] > [Humongous Register: 0.2 ms] > [Humongous Reclaim: 0.3 ms] > [Free CSet: 0.2 ms] > [Eden: 230.0M(1424.0M)->0.0B(1432.0M) Survivors: 8192.0K->0.0B Heap: > 7139.5M(7168.0M)->7139.5M(7168.0M)] > [Times: user=2.33 sys=0.34, real=1.26 secs] > 2017-05-23T20:43:19.987-0400: 40605.080: Total time for which application > threads were stopped: 1.2587489 seconds, Stopping threads took: 0.0002182 > seconds > 2017-05-23T20:43:19.987-0400: 
40605.080: Application time: 0.0003101 > seconds > 2017-05-23T20:43:19.988-0400: 40605.082: [GC pause (G1 Evacuation Pause) > (young) > Desired survivor size 94371840 bytes, new threshold 15 (max 15) > 2017-05-23T20:43:20.020-0400: 40605.113: [SoftReference, 0 refs, 0.0008856 > secs]2017-05-23T20:43:20.020-0400: 40605.114: [WeakReference, 0 refs, > 0.0005588 secs]2017-05-23T20:43:20.021-0400: 40605.114: [FinalReference, > 0 refs, 0.0006006 secs]2017-05-23T20:43:20.022-0400: 40605.115: > [PhantomReference, 0 refs, 0 refs, 0.0010837 secs]2017-05-23T20:43:20.023-0400: > 40605.116: [JNI Weak Reference, 0.0000610 secs], 0.0356212 secs] > [Parallel Time: 30.0 ms, GC Workers: 8] > [GC Worker Start (ms): Min: 40605081.9, Avg: 40605082.0, Max: > 40605082.1, Diff: 0.2] > [Ext Root Scanning (ms): Min: 1.9, Avg: 2.0, Max: 2.5, Diff: 0.6, > Sum: 16.1] > [Update RS (ms): Min: 27.3, Avg: 27.4, Max: 27.5, Diff: 0.2, Sum: > 219.3] > [Processed Buffers: Min: 82, Avg: 87.4, Max: 92, Diff: 10, Sum: > 699] > [Scan RS (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0] > [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, > Sum: 0.0] > [Object Copy (ms): Min: 0.0, Avg: 0.2, Max: 0.3, Diff: 0.2, Sum: 1.4] > [Termination (ms): Min: 0.0, Avg: 0.1, Max: 0.2, Diff: 0.2, Sum: 1.1] > [Termination Attempts: Min: 1, Avg: 1.0, Max: 1, Diff: 0, Sum: 8] > [GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.0, Sum: > 0.4] > [GC Worker Total (ms): Min: 29.7, Avg: 29.8, Max: 29.9, Diff: 0.2, > Sum: 238.5] > [GC Worker End (ms): Min: 40605111.8, Avg: 40605111.8, Max: > 40605111.8, Diff: 0.0] > [Code Root Fixup: 0.3 ms] > [Code Root Purge: 0.0 ms] > [Clear CT: 0.2 ms] > [Other: 5.1 ms] > [Choose CSet: 0.0 ms] > [Ref Proc: 4.0 ms] > [Ref Enq: 0.2 ms] > [Redirty Cards: 0.2 ms] > [Humongous Register: 0.1 ms] > [Humongous Reclaim: 0.2 ms] > [Free CSet: 0.1 ms] > [Eden: 0.0B(1432.0M)->0.0B(1432.0M) Survivors: 0.0B->0.0B Heap: > 7139.5M(7168.0M)->7139.5M(7168.0M)] > [Times: 
user=0.25 sys=0.00, real=0.04 secs] > 2017-05-23T20:43:20.024-0400: 40605.118: Total time for which application > threads were stopped: 0.0372043 seconds, Stopping threads took: 0.0001640 > seconds > 2017-05-23T20:43:20.025-0400: 40605.118: Application time: 0.0002435 > seconds > 2017-05-23T20:43:20.026-0400: 40605.119: [GC pause (G1 Evacuation Pause) > (young) > Desired survivor size 94371840 bytes, new threshold 15 (max 15) > 2017-05-23T20:43:20.030-0400: 40605.123: [SoftReference, 0 refs, 0.0009405 > secs]2017-05-23T20:43:20.031-0400: 40605.124: [WeakReference, 0 refs, > 0.0005771 secs]2017-05-23T20:43:20.032-0400: 40605.125: [FinalReference, > 0 refs, 0.0005766 secs]2017-05-23T20:43:20.032-0400: 40605.125: > [PhantomReference, 0 refs, 0 refs, 0.0011847 secs]2017-05-23T20:43:20.033-0400: > 40605.127: [JNI Weak Reference, 0.0000549 secs], 0.0087717 secs] > [Parallel Time: 3.0 ms, GC Workers: 8] > [GC Worker Start (ms): Min: 40605119.3, Avg: 40605119.4, Max: > 40605119.5, Diff: 0.2] > [Ext Root Scanning (ms): Min: 1.6, Avg: 1.8, Max: 2.6, Diff: 1.0, > Sum: 14.8] > [Update RS (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.2] > [Processed Buffers: Min: 0, Avg: 0.5, Max: 2, Diff: 2, Sum: 4] > [Scan RS (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1] > [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, > Sum: 0.0] > [Object Copy (ms): Min: 0.0, Avg: 0.1, Max: 0.2, Diff: 0.2, Sum: 1.1] > [Termination (ms): Min: 0.0, Avg: 0.6, Max: 0.6, Diff: 0.6, Sum: 4.4] > [Termination Attempts: Min: 1, Avg: 1.0, Max: 1, Diff: 0, Sum: 8] > [GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.1, Sum: > 0.5] > [GC Worker Total (ms): Min: 2.6, Avg: 2.6, Max: 2.7, Diff: 0.1, Sum: > 21.1] > [GC Worker End (ms): Min: 40605122.0, Avg: 40605122.1, Max: > 40605122.1, Diff: 0.1] > [Code Root Fixup: 0.3 ms] > [Code Root Purge: 0.0 ms] > [Clear CT: 0.3 ms] > [Other: 5.2 ms] > [Choose CSet: 0.0 ms] > [Ref Proc: 4.1 ms] > [Ref Enq: 0.3 ms] > [Redirty Cards: 
0.3 ms] > [Humongous Register: 0.1 ms] > [Humongous Reclaim: 0.1 ms] > [Free CSet: 0.1 ms] > [Eden: 0.0B(1432.0M)->0.0B(1432.0M) Survivors: 0.0B->0.0B Heap: > 7139.5M(7168.0M)->7139.5M(7168.0M)] > [Times: user=0.03 sys=0.00, real=0.01 secs] > 2017-05-23T20:43:20.035-0400: 40605.128: Total time for which application > threads were stopped: 0.0102350 seconds, Stopping threads took: 0.0000635 > seconds > 2017-05-23T20:43:20.035-0400: 40605.128: Application time: 0.0002150 > seconds > 2017-05-23T20:43:20.036-0400: 40605.129: [GC pause (G1 Evacuation Pause) > (young) > Desired survivor size 94371840 bytes, new threshold 15 (max 15) > 2017-05-23T20:43:20.040-0400: 40605.133: [SoftReference, 0 refs, 0.0010156 > secs]2017-05-23T20:43:20.041-0400: 40605.134: [WeakReference, 0 refs, > 0.0006580 secs]2017-05-23T20:43:20.042-0400: 40605.135: [FinalReference, > 0 refs, 0.0006435 secs]2017-05-23T20:43:20.042-0400: 40605.136: > [PhantomReference, 0 refs, 0 refs, 0.0012604 secs]2017-05-23T20:43:20.044-0400: > 40605.137: [JNI Weak Reference, 0.0000513 secs], 0.0087896 secs] > [Parallel Time: 2.7 ms, GC Workers: 8] > [GC Worker Start (ms): Min: 40605129.6, Avg: 40605129.7, Max: > 40605129.8, Diff: 0.2] > [Ext Root Scanning (ms): Min: 1.7, Avg: 1.9, Max: 2.5, Diff: 0.8, > Sum: 14.9] > [Update RS (ms): Min: 0.0, Avg: 0.0, Max: 0.1, Diff: 0.1, Sum: 0.2] > [Processed Buffers: Min: 0, Avg: 0.6, Max: 1, Diff: 1, Sum: 5] > [Scan RS (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0] > [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, > Sum: 0.0] > [Object Copy (ms): Min: 0.0, Avg: 0.2, Max: 0.2, Diff: 0.2, Sum: 1.3] > [Termination (ms): Min: 0.0, Avg: 0.3, Max: 0.4, Diff: 0.4, Sum: 2.5] > [Termination Attempts: Min: 1, Avg: 1.0, Max: 1, Diff: 0, Sum: 8] > [GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.1, Sum: > 0.5] > [GC Worker Total (ms): Min: 2.3, Avg: 2.4, Max: 2.5, Diff: 0.2, Sum: > 19.5] > [GC Worker End (ms): Min: 40605132.1, Avg: 40605132.2, Max: 
> 40605132.2, Diff: 0.0] > [Code Root Fixup: 0.3 ms] > [Code Root Purge: 0.0 ms] > [Clear CT: 0.3 ms] > [Other: 5.5 ms] > [Choose CSet: 0.0 ms] > [Ref Proc: 4.4 ms] > [Ref Enq: 0.3 ms] > [Redirty Cards: 0.3 ms] > [Humongous Register: 0.1 ms] > [Humongous Reclaim: 0.1 ms] > [Free CSet: 0.1 ms] > [Eden: 0.0B(1432.0M)->0.0B(1432.0M) Survivors: 0.0B->0.0B Heap: > 7139.5M(7168.0M)->7139.5M(7168.0M)] > [Times: user=0.04 sys=0.00, real=0.01 secs] > 2017-05-23T20:43:20.045-0400: 40605.138: Total time for which application > threads were stopped: 0.0101403 seconds, Stopping threads took: 0.0000614 > seconds > 2017-05-23T20:43:20.045-0400: 40605.139: Application time: 0.0001681 > seconds > 2017-05-23T20:43:20.046-0400: 40605.140: [GC pause (G1 Evacuation Pause) > (young) > Desired survivor size 94371840 bytes, new threshold 15 (max 15) > 2017-05-23T20:43:20.050-0400: 40605.144: [SoftReference, 0 refs, 0.0008321 > secs]2017-05-23T20:43:20.051-0400: 40605.145: [WeakReference, 0 refs, > 0.0006103 secs]2017-05-23T20:43:20.052-0400: 40605.145: [FinalReference, > 0 refs, 0.0007194 secs]2017-05-23T20:43:20.053-0400: 40605.146: > [PhantomReference, 0 refs, 0 refs, 0.0010705 secs]2017-05-23T20:43:20.054-0400: > 40605.147: [JNI Weak Reference, 0.0000508 secs], 0.0084107 secs] > [Parallel Time: 2.7 ms, GC Workers: 8] > [GC Worker Start (ms): Min: 40605139.9, Avg: 40605140.0, Max: > 40605140.1, Diff: 0.2] > [Ext Root Scanning (ms): Min: 1.7, Avg: 1.9, Max: 2.4, Diff: 0.8, > Sum: 15.1] > [Update RS (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1] > [Processed Buffers: Min: 0, Avg: 0.4, Max: 1, Diff: 1, Sum: 3] > [Scan RS (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0] > [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, > Sum: 0.0] > [Object Copy (ms): Min: 0.0, Avg: 0.1, Max: 0.2, Diff: 0.2, Sum: 1.1] > [Termination (ms): Min: 0.0, Avg: 0.3, Max: 0.4, Diff: 0.4, Sum: 2.2] > [Termination Attempts: Min: 1, Avg: 1.0, Max: 1, Diff: 0, Sum: 8] > [GC Worker 
Other (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.1, Sum: > 0.5] > [GC Worker Total (ms): Min: 2.3, Avg: 2.4, Max: 2.5, Diff: 0.2, Sum: > 19.2] > [GC Worker End (ms): Min: 40605142.4, Avg: 40605142.4, Max: > 40605142.5, Diff: 0.1] > [Code Root Fixup: 0.3 ms] > [Code Root Purge: 0.0 ms] > [Clear CT: 0.2 ms] > [Other: 5.1 ms] > [Choose CSet: 0.0 ms] > [Ref Proc: 4.1 ms] > [Ref Enq: 0.3 ms] > [Redirty Cards: 0.2 ms] > [Humongous Register: 0.1 ms] > [Humongous Reclaim: 0.1 ms] > [Free CSet: 0.1 ms] > [Eden: 0.0B(1432.0M)->0.0B(1432.0M) Survivors: 0.0B->0.0B Heap: > 7139.5M(7168.0M)->7139.5M(7168.0M)] > [Times: user=0.03 sys=0.01, real=0.01 secs] > 2017-05-23T20:43:20.055-0400: 40605.148: Total time for which application > threads were stopped: 0.0097185 seconds, Stopping threads took: 0.0001029 > seconds > 2017-05-23T20:43:20.055-0400: 40605.148: Application time: 0.0001505 > seconds > 2017-05-23T20:43:20.056-0400: 40605.149: [Full GC (Allocation Failure) > 2017-05-23T20:43:22.446-0400: 40607.540: [SoftReference, 1667 refs, > 0.0003772 secs]2017-05-23T20:43:22.447-0400: 40607.541: [WeakReference, > 5626 refs, 0.0008068 secs]2017-05-23T20:43:22.448-0400: 40607.541: > [FinalReference, 4015 refs, 0.0015169 secs]2017-05-23T20:43:22.450-0400: > 40607.543: [PhantomReference, 1 refs, 372 refs, 0.0001585 > secs]2017-05-23T20:43:22.450-0400: 40607.543: [JNI Weak Reference, > 0.0000963 secs] 7139M->2327M(7168M), 9.7036499 secs] > [Eden: 0.0B(1432.0M)->0.0B(1432.0M) Survivors: 0.0B->0.0B Heap: > 7139.5M(7168.0M)->2327.6M(7168.0M)], [Metaspace: > 108907K->108428K(1150976K)] > [Times: user=13.22 sys=0.00, real=9.70 secs] > 2017-05-23T20:43:29.760-0400: 40614.853: Total time for which application > threads were stopped: 9.7047785 seconds, Stopping threads took: 0.0000566 > seconds > 2017-05-23T20:43:29.760-0400: 40614.854: [GC concurrent-mark-abort] > 2017-05-23T20:43:29.763-0400: 40614.856: Application time: 0.0029444 > seconds > > On Jul 21, 2017, at 5:34 PM, Thomas Schatzl > 
wrote: > > > > Hi Kirk, > > > > On Fri, 2017-07-21 at 10:34 +0300, Kirk Pepperdine wrote: > >> Hi all, > >> > >> A while back I mentioned to Erik at JFokus that I was seeing a > >> puzzling behavior in the G1 where without any obvious failure, heap > >> occupancy after collections would spike which would trigger a full > >> which would (unexpectedly) completely recover everything down to the > >> expected live set. Yesterday while working with Simone Bordet on the > >> problem we came to the realization that we were seeing a pattern: > >> prior to the ramp up to the Full, Survivor space would be > >> ergonomically resized to 0 -> 0. The only way to reset the situation > >> was to run a full collection. In our minds it doesn't make any > >> sense to reset survivor space to 0. So far this is an observation > >> from a single GC log but I recall seeing the pattern in many other > >> logs. Before I go through the exercise of building a super grep to > >> run over my G1 log repo I'd like to ask; under what conditions would > >> it make sense to have the survivor space resized to 0? And if not, > >> would this be a bug in G1? We tried reproducing the behavior in some > >> test applications but I fear we often only see this happening in > >> production applications that have been running for several days. It's > >> a behavior that I've seen in 1.7.0 and 1.8.0. No word on 9. > > > > sounds similar to https://bugs.openjdk.java.net/browse/JDK-8037500. > > Could you please post the type of collections for a few more gcs before > > the zero-sized ones? It would be particularly interesting if there is a > > mixed gc with to-space exhaustion just before this sequence. And if > > there are log messages with attempts to start marking too. > > > > As for why that bug has been closed as "won't fix": we do not > > have a reproducer (any more) to test any changes, in addition to the > > stated reasons that the performance impact seemed minor at that time. 
> > > > There have been some changes in how the next gc is calculated in 9 too, > > so I do not know either if 9 is also affected (particularly one of > > these young-only gc's would not be issued any more). > > > > I can think of at least one more reason other than stated in the CR > > why this occurs at least for 8u60+ builds. There is the possibility, > > particularly in conjunction with humongous object allocation, that after > > starting the mutator, immediately afterwards a young gc that reclaims > > zero space is issued, e.g.: > > > > young-gc, has X regions left at the end, starts mutators > > mutator 1 allocates exactly X regions as humongous objects > > mutator 2 allocates, finds that there are no regions left, issues > > young-gc request; in this young-gc eden and survivor are obviously > > of zero size > > [...and so on...] > > > > Note that this pattern could repeat multiple times as young gc may > > reclaim space from humongous objects (eager reclaim!) until at some > > point it runs into a full gc. > > > > The logging that shows humongous object allocation (something about > > reaching threshold and starting marking) could confirm this situation. > > > > No guarantees about that being the actual issue though. > > > > Thanks, > > Thomas > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From kirk at kodewerk.com Mon Jul 24 19:43:59 2017 From: kirk at kodewerk.com (Kirk Pepperdine) Date: Mon, 24 Jul 2017 21:43:59 +0200 Subject: Bug in G1 In-Reply-To: References: <1500024904.3458.8.camel@oracle.com> <6c0d1cca-6c08-0bda-f980-d3fe20e663ff@redhat.com> <1500029912.3458.26.camel@oracle.com> <990f6578-14d4-322d-7f51-9b93d92f8b20@redhat.com> <1500034180.3458.67.camel@oracle.com> <1e6c2b24-63fe-cf5e-1635-990852c63a65@redhat.com> <7991D723-8B1D-43A3-A9D4-E7D38B1D10E4@kodewerk.com> <1500647667.2385.33.camel@oracle.com> Message-ID: <2479C8EB-F38C-4804-94E5-EC613BC6457E@kodewerk.com> Hi Monica et al., 
I see this bug in all versions of 7 and 8. I can put up more GC logs once I get to a more stable internet connection. Kind regards, Kirk > On Jul 23, 2017, at 9:09 PM, monica beckwith wrote: > > Hello Kirk and Thomas, > > I think the problem is that the heap is not sized to accommodate the humongous objects. I think this log is post 8 update 40, and that's why you see those young collections at the lowest young occupancy since it's trying to reclaim humongous regions. Kirk, can you please show a log prior to 8u40? > > Thanks, > Monica > > On Jul 23, 2017 5:52 AM, "Kirk Pepperdine" > wrote: > Thanks for the information. I've shared the entire log with you on dropbox. Feel free to distribute it as you see fit. > > I see the to-space exhausted but there doesn't appear to be a mixed collection involved. Below is a single sequence up to and including the Full. > > Kind regards, > Kirk > > > 2017-05-23T20:42:55.303-0400: 40580.396: Application time: 0.8539675 seconds > 2017-05-23T20:42:55.304-0400: 40580.398: [GC pause (G1 Evacuation Pause) (young) > Desired survivor size 169869312 bytes, new threshold 15 (max 15) > - age 1: 3278808 bytes, 3278808 total > - age 2: 71278552 bytes, 74557360 total > - age 3: 533720 bytes, 75091080 total > - age 4: 12897544 bytes, 87988624 total > - age 5: 796672 bytes, 88785296 total > - age 6: 503288 bytes, 89288584 total > 2017-05-23T20:42:55.363-0400: 40580.457: [SoftReference, 0 refs, 0.0010011 secs]2017-05-23T20:42:55.364-0400: 40580.458: [WeakReference, 367 refs, 0.0006136 secs]2017-05-23T20:42:55.365-0400: 40580.458: [FinalReference, 7659 refs, 0.0014460 secs]2017-05-23T20:42:55.366-0400: 40580.460: [PhantomReference, 0 refs, 0 refs, 0.0011060 secs]2017-05-23T20:42:55.367-0400: 40580.461: [JNI Weak Reference, 0.0000647 secs], 0.0669684 secs] > [Parallel Time: 57.9 ms, GC Workers: 8] > [GC Worker Start (ms): Min: 40580398.1, Avg: 40580398.2, Max: 40580398.3, Diff: 0.2] > [Ext Root Scanning (ms): Min: 1.6, Avg: 1.9, Max: 2.7, Diff: 
1.0, Sum: 15.2] > [Update RS (ms): Min: 15.2, Avg: 15.7, Max: 15.8, Diff: 0.6, Sum: 125.4] > [Processed Buffers: Min: 44, Avg: 50.1, Max: 62, Diff: 18, Sum: 401] > [Scan RS (ms): Min: 1.9, Avg: 2.0, Max: 2.1, Diff: 0.2, Sum: 15.9] > [Code Root Scanning (ms): Min: 1.6, Avg: 1.7, Max: 1.7, Diff: 0.1, Sum: 13.5] > [Object Copy (ms): Min: 36.0, Avg: 36.2, Max: 36.2, Diff: 0.2, Sum: 289.2] > [Termination (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.2] > [Termination Attempts: Min: 1, Avg: 1.0, Max: 1, Diff: 0, Sum: 8] > [GC Worker Other (ms): Min: 0.1, Avg: 0.1, Max: 0.2, Diff: 0.1, Sum: 1.0] > [GC Worker Total (ms): Min: 57.4, Avg: 57.5, Max: 57.6, Diff: 0.2, Sum: 460.3] > [GC Worker End (ms): Min: 40580455.7, Avg: 40580455.7, Max: 40580455.8, Diff: 0.1] > [Code Root Fixup: 0.2 ms] > [Code Root Purge: 0.0 ms] > [Clear CT: 0.8 ms] > [Other: 8.1 ms] > [Choose CSet: 0.0 ms] > [Ref Proc: 4.7 ms] > [Ref Enq: 0.3 ms] > [Redirty Cards: 0.3 ms] > [Humongous Register: 0.2 ms] > [Humongous Reclaim: 0.1 ms] > [Free CSet: 1.9 ms] > [Eden: 2484.0M(2484.0M)->0.0B(3544.0M) Survivors: 98.0M->100.0M Heap: 5189.0M(7168.0M)->2708.0M(7168.0M)] > [Times: user=0.45 sys=0.03, real=0.07 secs] > 2017-05-23T20:42:55.372-0400: 40580.465: Total time for which application threads were stopped: 0.0685303 seconds, Stopping threads took: 0.0001346 seconds > 2017-05-23T20:42:59.372-0400: 40584.465: Application time: 4.0004774 seconds > 2017-05-23T20:42:59.376-0400: 40584.469: Total time for which application threads were stopped: 0.0036324 seconds, Stopping threads took: 0.0023017 seconds > 2017-05-23T20:43:05.891-0400: 40590.984: Application time: 6.5149722 seconds > 2017-05-23T20:43:05.892-0400: 40590.985: [GC pause (G1 Evacuation Pause) (young) > Desired survivor size 239075328 bytes, new threshold 15 (max 15) > - age 1: 4934368 bytes, 4934368 total > - age 2: 2633808 bytes, 7568176 total > - age 3: 71264464 bytes, 78832640 total > - age 4: 527368 bytes, 79360008 total > - age 5: 12893400 
bytes, 92253408 total > - age 6: 750128 bytes, 93003536 total > - age 7: 432784 bytes, 93436320 total > 2017-05-23T20:43:09.493-0400: 40594.586: [SoftReference, 0 refs, 0.0067938 secs]2017-05-23T20:43:09.500-0400: 40594.593: [WeakReference, 0 refs, 0.0033881 secs]2017-05-23T20:43:09.503-0400: 40594.597: [FinalReference, 0 refs, 0.0005787 secs]2017-05-23T20:43:09.504-0400: 40594.597: [PhantomReference, 0 refs, 0 refs, 0.0011377 secs]2017-05-23T20:43:09.505-0400: 40594.598: [JNI Weak Reference, 0.0000618 secs] (to-space exhausted), 4.8672247 secs] > [Parallel Time: 3599.9 ms, GC Workers: 8] > [GC Worker Start (ms): Min: 40590985.9, Avg: 40590986.0, Max: 40590986.1, Diff: 0.2] > [Ext Root Scanning (ms): Min: 1.7, Avg: 1.9, Max: 2.3, Diff: 0.6, Sum: 15.2] > [Update RS (ms): Min: 68.3, Avg: 68.4, Max: 68.5, Diff: 0.2, Sum: 547.6] > [Processed Buffers: Min: 32, Avg: 49.0, Max: 72, Diff: 40, Sum: 392] > [Scan RS (ms): Min: 2.8, Avg: 2.9, Max: 3.0, Diff: 0.1, Sum: 23.2] > [Code Root Scanning (ms): Min: 2.4, Avg: 2.5, Max: 2.5, Diff: 0.1, Sum: 19.7] > [Object Copy (ms): Min: 3523.7, Avg: 3523.8, Max: 3523.9, Diff: 0.2, Sum: 28190.6] > [Termination (ms): Min: 0.0, Avg: 0.1, Max: 0.2, Diff: 0.2, Sum: 0.7] > [Termination Attempts: Min: 1, Avg: 1.0, Max: 1, Diff: 0, Sum: 8] > [GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.0, Sum: 0.5] > [GC Worker Total (ms): Min: 3599.6, Avg: 3599.7, Max: 3599.8, Diff: 0.2, Sum: 28797.6] > [GC Worker End (ms): Min: 40594585.6, Avg: 40594585.7, Max: 40594585.7, Diff: 0.1] > [Code Root Fixup: 0.3 ms] > [Code Root Purge: 0.0 ms] > [Clear CT: 1.2 ms] > [Other: 1265.8 ms] > [Evacuation Failure: 1248.2 ms] > [Choose CSet: 0.0 ms] > [Ref Proc: 12.4 ms] > [Ref Enq: 0.5 ms] > [Redirty Cards: 2.1 ms] > [Humongous Register: 0.2 ms] > [Humongous Reclaim: 0.1 ms] > [Free CSet: 1.5 ms] > [Eden: 3544.0M(3544.0M)->0.0B(976.0M) Survivors: 100.0M->456.0M Heap: 6274.3M(7168.0M)->5978.2M(7168.0M)] > [Times: user=13.58 sys=0.11, real=4.86 secs] > 
2017-05-23T20:43:10.760-0400: 40595.853: Total time for which application threads were stopped: 4.8690628 seconds, Stopping threads took: 0.0002136 seconds > 2017-05-23T20:43:11.762-0400: 40596.855: Application time: 1.0019247 seconds > 2017-05-23T20:43:11.763-0400: 40596.856: Total time for which application threads were stopped: 0.0015356 seconds, Stopping threads took: 0.0003107 seconds > 2017-05-23T20:43:11.880-0400: 40596.973: Application time: 0.1164884 seconds > 2017-05-23T20:43:11.881-0400: 40596.974: [GC pause (G1 Humongous Allocation) (young) (initial-mark) > Desired survivor size 94371840 bytes, new threshold 1 (max 15) > - age 1: 477501112 bytes, 477501112 total > - age 2: 182296 bytes, 477683408 total > - age 3: 78880 bytes, 477762288 total > - age 4: 45376 bytes, 477807664 total > - age 5: 92304 bytes, 477899968 total > - age 6: 75448 bytes, 477975416 total > - age 7: 86752 bytes, 478062168 total > - age 8: 71408 bytes, 478133576 total > 2017-05-23T20:43:17.335-0400: 40602.428: [SoftReference, 0 refs, 0.0071133 secs]2017-05-23T20:43:17.342-0400: 40602.435: [WeakReference, 3 refs, 0.0007987 secs]2017-05-23T20:43:17.343-0400: 40602.436: [FinalReference, 182 refs, 0.0017603 secs]2017-05-23T20:43:17.345-0400: 40602.438: [PhantomReference, 0 refs, 0 refs, 0.0015961 secs]2017-05-23T20:43:17.346-0400: 40602.440: [JNI Weak Reference, 0.0000730 secs] (to-space exhausted), 6.1987667 secs] > [Parallel Time: 5446.3 ms, GC Workers: 8] > [GC Worker Start (ms): Min: 40596975.6, Avg: 40596975.7, Max: 40596975.8, Diff: 0.2] > [Ext Root Scanning (ms): Min: 2.9, Avg: 3.1, Max: 3.2, Diff: 0.3, Sum: 24.4] > [Update RS (ms): Min: 10.1, Avg: 10.3, Max: 10.5, Diff: 0.4, Sum: 82.6] > [Processed Buffers: Min: 33, Avg: 40.2, Max: 51, Diff: 18, Sum: 322] > [Scan RS (ms): Min: 30.7, Avg: 31.1, Max: 32.4, Diff: 1.8, Sum: 249.0] > [Code Root Scanning (ms): Min: 0.1, Avg: 0.3, Max: 0.6, Diff: 0.5, Sum: 2.8] > [Object Copy (ms): Min: 5399.2, Avg: 5400.6, Max: 5400.9, Diff: 1.7, Sum: 
43204.5] > [Termination (ms): Min: 0.0, Avg: 0.2, Max: 0.3, Diff: 0.3, Sum: 1.3] > [Termination Attempts: Min: 1, Avg: 1.0, Max: 1, Diff: 0, Sum: 8] > [GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.1, Sum: 0.5] > [GC Worker Total (ms): Min: 5445.5, Avg: 5445.6, Max: 5445.7, Diff: 0.2, Sum: 43565.0] > [GC Worker End (ms): Min: 40602421.3, Avg: 40602421.4, Max: 40602421.4, Diff: 0.1] > [Code Root Fixup: 0.3 ms] > [Code Root Purge: 0.0 ms] > [Clear CT: 0.8 ms] > [Other: 751.4 ms] > [Evacuation Failure: 728.5 ms] > [Choose CSet: 0.0 ms] > [Ref Proc: 17.8 ms] > [Ref Enq: 0.5 ms] > [Redirty Cards: 2.1 ms] > [Humongous Register: 0.1 ms] > [Humongous Reclaim: 0.2 ms] > [Free CSet: 0.8 ms] > [Eden: 878.0M(976.0M)->0.0B(1424.0M) Survivors: 456.0M->8192.0K Heap: 6856.2M(7168.0M)->6908.2M(7168.0M)] > [Times: user=11.66 sys=1.15, real=6.19 secs] > 2017-05-23T20:43:18.080-0400: 40603.173: [GC concurrent-root-region-scan-start] > 2017-05-23T20:43:18.080-0400: 40603.173: Total time for which application threads were stopped: 6.2005443 seconds, Stopping threads took: 0.0002322 seconds > 2017-05-23T20:43:18.080-0400: 40603.174: Application time: 0.0002882 seconds > 2017-05-23T20:43:18.114-0400: 40603.207: [GC concurrent-root-region-scan-end, 0.0339339 secs] > 2017-05-23T20:43:18.114-0400: 40603.207: [GC concurrent-mark-start] > 2017-05-23T20:43:18.142-0400: 40603.235: Total time for which application threads were stopped: 0.0613820 seconds, Stopping threads took: 0.0001677 seconds > 2017-05-23T20:43:18.142-0400: 40603.236: Application time: 0.0005017 seconds > 2017-05-23T20:43:18.144-0400: 40603.237: Total time for which application threads were stopped: 0.0013197 seconds, Stopping threads took: 0.0001188 seconds > 2017-05-23T20:43:18.144-0400: 40603.237: Application time: 0.0001781 seconds > 2017-05-23T20:43:18.144-0400: 40603.238: Total time for which application threads were stopped: 0.0005735 seconds, Stopping threads took: 0.0000568 seconds > 
2017-05-23T20:43:18.728-0400: 40603.821: Application time: 0.5835349 seconds
> 2017-05-23T20:43:18.730-0400: 40603.823: [GC pause (G1 Evacuation Pause) (young)
> Desired survivor size 94371840 bytes, new threshold 15 (max 15)
> - age 1: 8388248 bytes, 8388248 total
> 2017-05-23T20:43:19.821-0400: 40604.914: [SoftReference, 0 refs, 0.0009673 secs]2017-05-23T20:43:19.822-0400: 40604.915: [WeakReference, 0 refs, 0.0006733 secs]2017-05-23T20:43:19.823-0400: 40604.916: [FinalReference, 0 refs, 0.0006260 secs]2017-05-23T20:43:19.823-0400: 40604.917: [PhantomReference, 0 refs, 0 refs, 0.0013002 secs]2017-05-23T20:43:19.825-0400: 40604.918: [JNI Weak Reference, 0.0000642 secs] (to-space exhausted), 1.2567408 secs]
> [Parallel Time: 1084.5 ms, GC Workers: 8]
> [GC Worker Start (ms): Min: 40603823.4, Avg: 40603823.5, Max: 40603823.6, Diff: 0.2]
> [Ext Root Scanning (ms): Min: 1.7, Avg: 1.9, Max: 2.4, Diff: 0.7, Sum: 15.3]
> [Update RS (ms): Min: 23.8, Avg: 24.0, Max: 24.2, Diff: 0.3, Sum: 191.7]
> [Processed Buffers: Min: 49, Avg: 53.5, Max: 60, Diff: 11, Sum: 428]
> [Scan RS (ms): Min: 1.0, Avg: 1.1, Max: 1.2, Diff: 0.2, Sum: 8.6]
> [Code Root Scanning (ms): Min: 0.1, Avg: 0.1, Max: 0.1, Diff: 0.1, Sum: 0.8]
> [Object Copy (ms): Min: 1056.4, Avg: 1056.8, Max: 1057.2, Diff: 0.8, Sum: 8454.7]
> [Termination (ms): Min: 0.0, Avg: 0.1, Max: 0.2, Diff: 0.2, Sum: 1.0]
> [Termination Attempts: Min: 1, Avg: 3.8, Max: 7, Diff: 6, Sum: 30]
> [GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.0, Sum: 0.5]
> [GC Worker Total (ms): Min: 1084.0, Avg: 1084.1, Max: 1084.2, Diff: 0.2, Sum: 8673.2]
> [GC Worker End (ms): Min: 40604907.6, Avg: 40604907.7, Max: 40604907.7, Diff: 0.0]
> [Code Root Fixup: 0.3 ms]
> [Code Root Purge: 0.0 ms]
> [Clear CT: 0.3 ms]
> [Other: 171.7 ms]
> [Evacuation Failure: 159.4 ms]
> [Choose CSet: 0.0 ms]
> [Ref Proc: 9.9 ms]
> [Ref Enq: 0.6 ms]
> [Redirty Cards: 0.6 ms]
> [Humongous Register: 0.2 ms]
> [Humongous Reclaim: 0.3 ms]
> [Free CSet: 0.2 ms]
> [Eden: 230.0M(1424.0M)->0.0B(1432.0M) Survivors: 8192.0K->0.0B Heap: 7139.5M(7168.0M)->7139.5M(7168.0M)]
> [Times: user=2.33 sys=0.34, real=1.26 secs]
> 2017-05-23T20:43:19.987-0400: 40605.080: Total time for which application threads were stopped: 1.2587489 seconds, Stopping threads took: 0.0002182 seconds
> 2017-05-23T20:43:19.987-0400: 40605.080: Application time: 0.0003101 seconds
> 2017-05-23T20:43:19.988-0400: 40605.082: [GC pause (G1 Evacuation Pause) (young)
> Desired survivor size 94371840 bytes, new threshold 15 (max 15)
> 2017-05-23T20:43:20.020-0400: 40605.113: [SoftReference, 0 refs, 0.0008856 secs]2017-05-23T20:43:20.020-0400: 40605.114: [WeakReference, 0 refs, 0.0005588 secs]2017-05-23T20:43:20.021-0400: 40605.114: [FinalReference, 0 refs, 0.0006006 secs]2017-05-23T20:43:20.022-0400: 40605.115: [PhantomReference, 0 refs, 0 refs, 0.0010837 secs]2017-05-23T20:43:20.023-0400: 40605.116: [JNI Weak Reference, 0.0000610 secs], 0.0356212 secs]
> [Parallel Time: 30.0 ms, GC Workers: 8]
> [GC Worker Start (ms): Min: 40605081.9, Avg: 40605082.0, Max: 40605082.1, Diff: 0.2]
> [Ext Root Scanning (ms): Min: 1.9, Avg: 2.0, Max: 2.5, Diff: 0.6, Sum: 16.1]
> [Update RS (ms): Min: 27.3, Avg: 27.4, Max: 27.5, Diff: 0.2, Sum: 219.3]
> [Processed Buffers: Min: 82, Avg: 87.4, Max: 92, Diff: 10, Sum: 699]
> [Scan RS (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
> [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
> [Object Copy (ms): Min: 0.0, Avg: 0.2, Max: 0.3, Diff: 0.2, Sum: 1.4]
> [Termination (ms): Min: 0.0, Avg: 0.1, Max: 0.2, Diff: 0.2, Sum: 1.1]
> [Termination Attempts: Min: 1, Avg: 1.0, Max: 1, Diff: 0, Sum: 8]
> [GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.0, Sum: 0.4]
> [GC Worker Total (ms): Min: 29.7, Avg: 29.8, Max: 29.9, Diff: 0.2, Sum: 238.5]
> [GC Worker End (ms): Min: 40605111.8, Avg: 40605111.8, Max: 40605111.8, Diff: 0.0]
> [Code Root Fixup: 0.3 ms]
> [Code Root Purge: 0.0 ms]
> [Clear CT: 0.2 ms]
> [Other: 5.1 ms]
> [Choose CSet: 0.0 ms]
> [Ref Proc: 4.0 ms]
> [Ref Enq: 0.2 ms]
> [Redirty Cards: 0.2 ms]
> [Humongous Register: 0.1 ms]
> [Humongous Reclaim: 0.2 ms]
> [Free CSet: 0.1 ms]
> [Eden: 0.0B(1432.0M)->0.0B(1432.0M) Survivors: 0.0B->0.0B Heap: 7139.5M(7168.0M)->7139.5M(7168.0M)]
> [Times: user=0.25 sys=0.00, real=0.04 secs]
> 2017-05-23T20:43:20.024-0400: 40605.118: Total time for which application threads were stopped: 0.0372043 seconds, Stopping threads took: 0.0001640 seconds
> 2017-05-23T20:43:20.025-0400: 40605.118: Application time: 0.0002435 seconds
> 2017-05-23T20:43:20.026-0400: 40605.119: [GC pause (G1 Evacuation Pause) (young)
> Desired survivor size 94371840 bytes, new threshold 15 (max 15)
> 2017-05-23T20:43:20.030-0400: 40605.123: [SoftReference, 0 refs, 0.0009405 secs]2017-05-23T20:43:20.031-0400: 40605.124: [WeakReference, 0 refs, 0.0005771 secs]2017-05-23T20:43:20.032-0400: 40605.125: [FinalReference, 0 refs, 0.0005766 secs]2017-05-23T20:43:20.032-0400: 40605.125: [PhantomReference, 0 refs, 0 refs, 0.0011847 secs]2017-05-23T20:43:20.033-0400: 40605.127: [JNI Weak Reference, 0.0000549 secs], 0.0087717 secs]
> [Parallel Time: 3.0 ms, GC Workers: 8]
> [GC Worker Start (ms): Min: 40605119.3, Avg: 40605119.4, Max: 40605119.5, Diff: 0.2]
> [Ext Root Scanning (ms): Min: 1.6, Avg: 1.8, Max: 2.6, Diff: 1.0, Sum: 14.8]
> [Update RS (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.2]
> [Processed Buffers: Min: 0, Avg: 0.5, Max: 2, Diff: 2, Sum: 4]
> [Scan RS (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1]
> [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
> [Object Copy (ms): Min: 0.0, Avg: 0.1, Max: 0.2, Diff: 0.2, Sum: 1.1]
> [Termination (ms): Min: 0.0, Avg: 0.6, Max: 0.6, Diff: 0.6, Sum: 4.4]
> [Termination Attempts: Min: 1, Avg: 1.0, Max: 1, Diff: 0, Sum: 8]
> [GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.1, Sum: 0.5]
> [GC Worker Total (ms): Min: 2.6, Avg: 2.6, Max: 2.7, Diff: 0.1, Sum: 21.1]
> [GC Worker End (ms): Min: 40605122.0, Avg: 40605122.1, Max: 40605122.1, Diff: 0.1]
> [Code Root Fixup: 0.3 ms]
> [Code Root Purge: 0.0 ms]
> [Clear CT: 0.3 ms]
> [Other: 5.2 ms]
> [Choose CSet: 0.0 ms]
> [Ref Proc: 4.1 ms]
> [Ref Enq: 0.3 ms]
> [Redirty Cards: 0.3 ms]
> [Humongous Register: 0.1 ms]
> [Humongous Reclaim: 0.1 ms]
> [Free CSet: 0.1 ms]
> [Eden: 0.0B(1432.0M)->0.0B(1432.0M) Survivors: 0.0B->0.0B Heap: 7139.5M(7168.0M)->7139.5M(7168.0M)]
> [Times: user=0.03 sys=0.00, real=0.01 secs]
> 2017-05-23T20:43:20.035-0400: 40605.128: Total time for which application threads were stopped: 0.0102350 seconds, Stopping threads took: 0.0000635 seconds
> 2017-05-23T20:43:20.035-0400: 40605.128: Application time: 0.0002150 seconds
> 2017-05-23T20:43:20.036-0400: 40605.129: [GC pause (G1 Evacuation Pause) (young)
> Desired survivor size 94371840 bytes, new threshold 15 (max 15)
> 2017-05-23T20:43:20.040-0400: 40605.133: [SoftReference, 0 refs, 0.0010156 secs]2017-05-23T20:43:20.041-0400: 40605.134: [WeakReference, 0 refs, 0.0006580 secs]2017-05-23T20:43:20.042-0400: 40605.135: [FinalReference, 0 refs, 0.0006435 secs]2017-05-23T20:43:20.042-0400: 40605.136: [PhantomReference, 0 refs, 0 refs, 0.0012604 secs]2017-05-23T20:43:20.044-0400: 40605.137: [JNI Weak Reference, 0.0000513 secs], 0.0087896 secs]
> [Parallel Time: 2.7 ms, GC Workers: 8]
> [GC Worker Start (ms): Min: 40605129.6, Avg: 40605129.7, Max: 40605129.8, Diff: 0.2]
> [Ext Root Scanning (ms): Min: 1.7, Avg: 1.9, Max: 2.5, Diff: 0.8, Sum: 14.9]
> [Update RS (ms): Min: 0.0, Avg: 0.0, Max: 0.1, Diff: 0.1, Sum: 0.2]
> [Processed Buffers: Min: 0, Avg: 0.6, Max: 1, Diff: 1, Sum: 5]
> [Scan RS (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
> [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
> [Object Copy (ms): Min: 0.0, Avg: 0.2, Max: 0.2, Diff: 0.2, Sum: 1.3]
> [Termination (ms): Min: 0.0, Avg: 0.3, Max: 0.4, Diff: 0.4, Sum: 2.5]
> [Termination Attempts: Min: 1, Avg: 1.0, Max: 1, Diff: 0, Sum: 8]
> [GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.1, Sum: 0.5]
> [GC Worker Total (ms): Min: 2.3, Avg: 2.4, Max: 2.5, Diff: 0.2, Sum: 19.5]
> [GC Worker End (ms): Min: 40605132.1, Avg: 40605132.2, Max: 40605132.2, Diff: 0.0]
> [Code Root Fixup: 0.3 ms]
> [Code Root Purge: 0.0 ms]
> [Clear CT: 0.3 ms]
> [Other: 5.5 ms]
> [Choose CSet: 0.0 ms]
> [Ref Proc: 4.4 ms]
> [Ref Enq: 0.3 ms]
> [Redirty Cards: 0.3 ms]
> [Humongous Register: 0.1 ms]
> [Humongous Reclaim: 0.1 ms]
> [Free CSet: 0.1 ms]
> [Eden: 0.0B(1432.0M)->0.0B(1432.0M) Survivors: 0.0B->0.0B Heap: 7139.5M(7168.0M)->7139.5M(7168.0M)]
> [Times: user=0.04 sys=0.00, real=0.01 secs]
> 2017-05-23T20:43:20.045-0400: 40605.138: Total time for which application threads were stopped: 0.0101403 seconds, Stopping threads took: 0.0000614 seconds
> 2017-05-23T20:43:20.045-0400: 40605.139: Application time: 0.0001681 seconds
> 2017-05-23T20:43:20.046-0400: 40605.140: [GC pause (G1 Evacuation Pause) (young)
> Desired survivor size 94371840 bytes, new threshold 15 (max 15)
> 2017-05-23T20:43:20.050-0400: 40605.144: [SoftReference, 0 refs, 0.0008321 secs]2017-05-23T20:43:20.051-0400: 40605.145: [WeakReference, 0 refs, 0.0006103 secs]2017-05-23T20:43:20.052-0400: 40605.145: [FinalReference, 0 refs, 0.0007194 secs]2017-05-23T20:43:20.053-0400: 40605.146: [PhantomReference, 0 refs, 0 refs, 0.0010705 secs]2017-05-23T20:43:20.054-0400: 40605.147: [JNI Weak Reference, 0.0000508 secs], 0.0084107 secs]
> [Parallel Time: 2.7 ms, GC Workers: 8]
> [GC Worker Start (ms): Min: 40605139.9, Avg: 40605140.0, Max: 40605140.1, Diff: 0.2]
> [Ext Root Scanning (ms): Min: 1.7, Avg: 1.9, Max: 2.4, Diff: 0.8, Sum: 15.1]
> [Update RS (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1]
> [Processed Buffers: Min: 0, Avg: 0.4, Max: 1, Diff: 1, Sum: 3]
> [Scan RS (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
> [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
> [Object Copy (ms): Min: 0.0, Avg: 0.1, Max: 0.2, Diff: 0.2, Sum: 1.1]
> [Termination (ms): Min: 0.0, Avg: 0.3, Max: 0.4, Diff: 0.4, Sum: 2.2]
> [Termination Attempts: Min: 1, Avg: 1.0, Max: 1, Diff: 0, Sum: 8]
> [GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.1, Sum: 0.5]
> [GC Worker Total (ms): Min: 2.3, Avg: 2.4, Max: 2.5, Diff: 0.2, Sum: 19.2]
> [GC Worker End (ms): Min: 40605142.4, Avg: 40605142.4, Max: 40605142.5, Diff: 0.1]
> [Code Root Fixup: 0.3 ms]
> [Code Root Purge: 0.0 ms]
> [Clear CT: 0.2 ms]
> [Other: 5.1 ms]
> [Choose CSet: 0.0 ms]
> [Ref Proc: 4.1 ms]
> [Ref Enq: 0.3 ms]
> [Redirty Cards: 0.2 ms]
> [Humongous Register: 0.1 ms]
> [Humongous Reclaim: 0.1 ms]
> [Free CSet: 0.1 ms]
> [Eden: 0.0B(1432.0M)->0.0B(1432.0M) Survivors: 0.0B->0.0B Heap: 7139.5M(7168.0M)->7139.5M(7168.0M)]
> [Times: user=0.03 sys=0.01, real=0.01 secs]
> 2017-05-23T20:43:20.055-0400: 40605.148: Total time for which application threads were stopped: 0.0097185 seconds, Stopping threads took: 0.0001029 seconds
> 2017-05-23T20:43:20.055-0400: 40605.148: Application time: 0.0001505 seconds
> 2017-05-23T20:43:20.056-0400: 40605.149: [Full GC (Allocation Failure) 2017-05-23T20:43:22.446-0400: 40607.540: [SoftReference, 1667 refs, 0.0003772 secs]2017-05-23T20:43:22.447-0400: 40607.541: [WeakReference, 5626 refs, 0.0008068 secs]2017-05-23T20:43:22.448-0400: 40607.541: [FinalReference, 4015 refs, 0.0015169 secs]2017-05-23T20:43:22.450-0400: 40607.543: [PhantomReference, 1 refs, 372 refs, 0.0001585 secs]2017-05-23T20:43:22.450-0400: 40607.543: [JNI Weak Reference, 0.0000963 secs] 7139M->2327M(7168M), 9.7036499 secs]
> [Eden: 0.0B(1432.0M)->0.0B(1432.0M) Survivors: 0.0B->0.0B Heap: 7139.5M(7168.0M)->2327.6M(7168.0M)], [Metaspace: 108907K->108428K(1150976K)]
> [Times: user=13.22 sys=0.00, real=9.70 secs]
> 2017-05-23T20:43:29.760-0400: 40614.853: Total time for which application threads were stopped: 9.7047785 seconds, Stopping threads took: 0.0000566 seconds
> 2017-05-23T20:43:29.760-0400: 40614.854: [GC concurrent-mark-abort]
> 2017-05-23T20:43:29.763-0400: 40614.856: Application time: 0.0029444 seconds

> On Jul 21, 2017, at 5:34 PM, Thomas Schatzl wrote:
> >
> > Hi Kirk,
> >
> > On Fri, 2017-07-21 at 10:34 +0300, Kirk Pepperdine wrote:
> >> Hi all,
> >>
> >> A while back I mentioned to Erik at JFokus that I was seeing a
> >> puzzling behavior in the G1 where without any obvious failure, heap
> >> occupancy after collections would spike which would trigger a full
> >> which would (unexpectedly) completely recover everything down to the
> >> expected live set. Yesterday while working with Simone Bordet on the
> >> problem we came to the realization that we were seeing a pattern
> >> prior to the ramp up to the Full, Survivor space would be
> >> ergonomically resized to 0 -> 0. The only way to reset the situation
> >> was to run a full collection. In our minds this doesn't make any
> >> sense to reset survivor space to 0. So far this is an observation
> >> from a single GC log but I recall seeing the pattern in many other
> >> logs. Before I go through the exercise of building a super grep to
> >> run over my G1 log repo I'd like to ask; under what conditions would
> >> it make sense to have the survivor space resized to 0? And if not,
> >> would this be a bug in G1? We tried reproducing the behavior in some
> >> test applications but I fear we often only see this happening in
> >> production applications that have been running for several days. It's
> >> a behavior that I've seen in 1.7.0 and 1.8.0. No word on 9.
> >
> > sounds similar to https://bugs.openjdk.java.net/browse/JDK-8037500 .
> > Could you please post the type of collections for a few more gcs before
> > the zero-sized ones? It would be particularly interesting if there is a
> > mixed gc with to-space exhaustion just before this sequence. And if
> > there are log messages with attempts to start marking too.
> >
> > As for why that bug has been closed as "won't fix": because we do not
> > have a reproducer (any more) to test any changes, in addition to the
> > stated reasons that the performance impact seemed minor at that time.
> >
> > There have been some changes in how the next gc is calculated in 9 too,
> > so I do not know either if 9 is also affected (particularly one of
> > these young-only gc's would not be issued any more).
> >
> > I can think of at least one more reason other than stated in the CR
> > why this occurs at least for 8u60+ builds. There is the possibility,
> > particularly in conjunction with humongous object allocation, that after
> > starting the mutator, immediately afterwards a young gc that reclaims
> > zero space is issued, e.g.:
> >
> > young-gc, has X regions left at the end, starts mutators
> > mutator 1 allocates exactly X regions as humongous objects
> > mutator 2 allocates, finds that there are no regions left, issues
> > young-gc request; in this young-gc eden and survivor are obviously
> > of zero size
> > [...and so on...]
> >
> > Note that this pattern could repeat multiple times as young gc may
> > reclaim space from humongous objects (eager reclaim!) until at some
> > point it ran into full gc.
> >
> > The logging that shows humongous object allocation (something about
> > reaching threshold and starting marking) could confirm this situation.
> >
> > No guarantees about that being the actual issue though.
> >
> > Thanks,
> > Thomas
> >

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From kim.barrett at oracle.com  Mon Jul 24 20:41:02 2017
From: kim.barrett at oracle.com (Kim Barrett)
Date: Mon, 24 Jul 2017 16:41:02 -0400
Subject: [PATCH] JDK-8176571: Fine bitmaps should be allocated as belonging to mtGC
In-Reply-To: 
References: <1500536234.2924.0.camel@oracle.com>
Message-ID: 

> On Jul 23, 2017, at 4:31 AM, Milan Mimica wrote:
>
> On Thu, Jul 20, 2017 at 09:37, Thomas Schatzl wrote:
>
> great!
> > Looks good. I can sponsor as soon as Kim or anybody else gives his
> > okay.
>
> Hi
>
> I just noticed my heapBitMap_nmt.diff includes the other one. Find the
> corrected one in attachment.

Thomas passed off the sponsoring to me. I noticed that problem as well,
and had adjusted for it. Unfortunately, I ran into a test failure, which
I haven't had time yet to really investigate. I doubt it's related to
these changes, but won't really know until I get time to dig into it,
which might be a couple of days.

From rkennke at redhat.com  Tue Jul 25 10:15:00 2017
From: rkennke at redhat.com (Roman Kennke)
Date: Tue, 25 Jul 2017 12:15:00 +0200
Subject: RFR: 8184751: Provide thread pool for parallel safepoint cleanup
In-Reply-To: <8ec1092c-b01e-80a9-23dd-8447e30c675e@redhat.com>
References: <8ec1092c-b01e-80a9-23dd-8447e30c675e@redhat.com>
Message-ID: 

I have discussed this with Robbin Ehn offline. There is not much
interest in this change from Oracle engineering to have this upstream.
Unless somebody speaks up, I will close the bug and withdraw the review
by the end of today. I will build this into Shenandoah-only instead in
this case.

Roman

> This is a follow-up to 8180932: Parallelize safepoint cleanup, which
> should land in JDK10 real soon now.
>
> In order to actually be able to parallelize safepoint cleanup, we now
> need the GC to provide some worker threads.
>
> In this change, I propose to create one globally (i.e. for all GCs) in
> CollectedHeap, if ParallelSafepointCleanupThreads>1. The flag defaults
> to 0, which means it's doing cleanup using the VMThread (i.e. exactly
> current behaviour).
>
> We have already discussed this, and came to the conclusion that it does
> not really make sense to share the GC's worker threads here, because
> they may not be idle, but only suspended from concurrent work (i.e. by
> SuspendibleThreadSet::synchronize() or similar).
>
> http://cr.openjdk.java.net/~rkennke/8184751/webrev.00/
>
> What do you think?
>
> Roman
>

From erik.osterlund at oracle.com  Tue Jul 25 11:29:33 2017
From: erik.osterlund at oracle.com (Erik Österlund)
Date: Tue, 25 Jul 2017 13:29:33 +0200
Subject: RFR (S): 8185141: Generalize scavengeable nmethod root handling
Message-ID: <59772B9D.9000100@oracle.com>

Hi,

Bug:
https://bugs.openjdk.java.net/browse/JDK-8185141

Webrev:
http://cr.openjdk.java.net/~eosterlund/8185141/webrev.00/

There seem to be different ways of handling scavengeable nmethod roots
in hotspot.

The primary way of dealing with them is to use the CodeCache scavenge
root nmethod list that maintains a list of all nmethods with
scavengeable nmethods. However, G1 does not use this list as it has its
own mechanism of keeping track of nmethods with scavengeable roots
pointing into the heap. To handle this, the current CodeCache code is
full of special cases for G1. In multiple cases we check if (UseG1GC)
and then return.

We seemingly need a better way of communicating to the GC what
scavengeable nmethod roots there are to be able to get rid of the
if (UseG1GC)... code.

As a solution, I propose to make CollectedHeap::register_nmethod the
primary way of registering to the GC that there might be a new nmethod
to keep track of. It is then up to the specific GC to take appropriate
action. The default appropriate action of CollectedHeap is to add the
nmethod to the shared scavenge root nmethod list if it is not already
on the list and it detected the existence of a scavengeable root oop
in the nmethod. G1 on the other hand, will use its closures to figure
out what remembered set it should be added to.

When using G1, the CodeCache scavenge list will be empty, and so a lot
of G1-centric code for exiting before we walk the list of nmethods on
the list can be removed where the list is processed in a for loop.
Because since the list is empty, it does not matter that G1 runs this
code too - it will just iterate 0 times in the loop since it is empty.
But that's because the list was empty, not because we are using G1 -
it just happens to be that the list is always empty when we use G1.

Testing: JPRT with hotspot testset, RBT hs-tier3.

Thanks,
/Erik

From rkennke at redhat.com  Tue Jul 25 11:36:08 2017
From: rkennke at redhat.com (Roman Kennke)
Date: Tue, 25 Jul 2017 13:36:08 +0200
Subject: RFR (S): 8185141: Generalize scavengeable nmethod root handling
In-Reply-To: <59772B9D.9000100@oracle.com>
References: <59772B9D.9000100@oracle.com>
Message-ID: <88b57631-10cf-40ec-2f71-485df0f4180e@redhat.com>

Hi Erik,

the change looks mostly good to me. This really needed cleanup.

However, I question doing the default impl in CollectedHeap and relying
on G1 to override it. Shenandoah's not using the scavenge roots list
either. It seems odd to have a default impl in the superclass that is
used by only 2 subclasses (GCH and PSH), and 2 other subclasses not
using it. And potential future implementors are required to override it
to not do that stuff. Think Epsilon GC too: it doesn't need it, and must
add code to not do it. It just seems wrong. I'd just add the impl to
both GCH and PSH, and leave the superclass empty.

Roman

On 25.07.2017 13:29, Erik Österlund wrote:
> Hi,
>
> Bug:
> https://bugs.openjdk.java.net/browse/JDK-8185141
>
> Webrev:
> http://cr.openjdk.java.net/~eosterlund/8185141/webrev.00/
>
> There seem to be different ways of handling scavengeable nmethod
> roots in hotspot.
>
> The primary way of dealing with them is to use the CodeCache scavenge
> root nmethod list that maintains a list of all nmethods with
> scavengeable nmethods.
> However, G1 does not use this list as it has its own mechanism of
> keeping track of nmethods with scavengeable roots pointing into the heap.
> To handle this, the current CodeCache code is full of special cases
> for G1. In multiple cases we check if (UseG1GC) and then return.
>
> We seemingly need a better way of communicating to the GC what
> scavengeable nmethod roots there are to be able to get rid of the if
> (UseG1GC)... code.
>
> As a solution, I propose to make CollectedHeap::register_nmethod the
> primary way of registering to the GC that there might be a new nmethod
> to keep track of. It is then up to the specific GC to take appropriate
> action. The default appropriate action of CollectedHeap is to add the
> nmethod to the shared scavenge root nmethod list if it is not already
> on the list and it detected the existence of a scavengeable root oop
> in the nmethod. G1 on the other hand, will use its closures to figure
> out what remembered set it should be added to.
>
> When using G1, the CodeCache scavenge list will be empty, and so a lot
> of G1-centric code for exiting before we walk the list of nmethods on
> the list can be removed where the list is processed in a for loop.
> Because since the list is empty, it does not matter that G1 runs this
> code too - it will just iterate 0 times in the loop since it is empty.
> But that's because the list was empty, not because we are using G1 -
> it just happens to be that the list is always empty when we use G1.
>
> Testing: JPRT with hotspot testset, RBT hs-tier3.
>
> Thanks,
> /Erik

From erik.osterlund at oracle.com  Tue Jul 25 12:34:04 2017
From: erik.osterlund at oracle.com (Erik Österlund)
Date: Tue, 25 Jul 2017 14:34:04 +0200
Subject: RFR (S): 8185141: Generalize scavengeable nmethod root handling
In-Reply-To: <88b57631-10cf-40ec-2f71-485df0f4180e@redhat.com>
References: <59772B9D.9000100@oracle.com> <88b57631-10cf-40ec-2f71-485df0f4180e@redhat.com>
Message-ID: <59773ABC.1020506@oracle.com>

Hi Roman,

I see your point. From my perspective, the default for any GC is to use
the shared CodeCache scavenge root list, and anything else
(G1/Shenandoah) is an exception and can override to do something else
instead.
Having said that, I agree we could easily move that default
implementation to CodeCache from CollectedHeap and call it explicitly
where it is used, so that we do not accidentally mess up when we build
a new GC.

However, then I think we should also move verify_nmethod_roots() into
those GCs, as it is closely related to which list it is on.

New full webrev:
http://cr.openjdk.java.net/~eosterlund/8185141/webrev.01/

New incremental webrev:
http://cr.openjdk.java.net/~eosterlund/8185141/webrev.00_01/

What do you think?

Thanks,
/Erik

On 2017-07-25 13:36, Roman Kennke wrote:
> Hi Erik,
>
> the change looks mostly good to me. This really needed cleanup.
>
> However, I question to do the default impl in CollectedHeap, and rely on
> G1 to override it. Shenandoah's not using the scavenge roots list
> either. It seems odd to have a default impl in the superclass that is
> used by only 2 subclasses (GCH and PSH), and 2 other subclasses not
> using it. And potential future implementors require to override it to
> not do that stuff. Think Epsilon GC too: it doesn't need it, and must
> add code to not do it. It just seems wrong. I'd just add the impl to
> both GCH and PSH, and leave the superclass empty.
>
> Roman
>
> On 25.07.2017 13:29, Erik Österlund wrote:
>> Hi,
>>
>> Bug:
>> https://bugs.openjdk.java.net/browse/JDK-8185141
>>
>> Webrev:
>> http://cr.openjdk.java.net/~eosterlund/8185141/webrev.00/
>>
>> There seem to be different ways of handling scavengeable nmethod
>> roots in hotspot.
>>
>> The primary way of dealing with them is to use the CodeCache scavenge
>> root nmethod list that maintains a list of all nmethods with
>> scavengeable nmethods.
>> However, G1 does not use this list as it has its own mechanism of
>> keeping track of nmethods with scavengeable roots pointing into the heap.
>> To handle this, the current CodeCache code is full of special cases
>> for G1. In multiple cases we check if (UseG1GC) and then return.
>>
>> We seemingly need a better way of communicating to the GC what
>> scavengeable nmethod roots there are to be able to get rid of the if
>> (UseG1GC)... code.
>>
>> As a solution, I propose to make CollectedHeap::register_nmethod the
>> primary way of registering to the GC that there might be a new nmethod
>> to keep track of. It is then up to the specific GC to take appropriate
>> action. The default appropriate action of CollectedHeap is to add the
>> nmethod to the shared scavenge root nmethod list if it is not already
>> on the list and it detected the existence of a scavengeable root oop
>> in the nmethod. G1 on the other hand, will use its closures to figure
>> out what remembered set it should be added to.
>>
>> When using G1, the CodeCache scavenge list will be empty, and so a lot
>> of G1-centric code for exiting before we walk the list of nmethods on
>> the list can be removed where the list is processed in a for loop.
>> Because since the list is empty, it does not matter that G1 runs this
>> code too - it will just iterate 0 times in the loop since it is empty.
>> But that's because the list was empty, not because we are using G1 -
>> it just happens to be that the list is always empty when we use G1.
>>
>> Testing: JPRT with hotspot testset, RBT hs-tier3.
>>
>> Thanks,
>> /Erik
>

From rkennke at redhat.com  Tue Jul 25 13:28:20 2017
From: rkennke at redhat.com (Roman Kennke)
Date: Tue, 25 Jul 2017 15:28:20 +0200
Subject: RFR (S): 8185141: Generalize scavengeable nmethod root handling
In-Reply-To: <59773ABC.1020506@oracle.com>
References: <59772B9D.9000100@oracle.com> <88b57631-10cf-40ec-2f71-485df0f4180e@redhat.com> <59773ABC.1020506@oracle.com>
Message-ID: <8c133671-a0b2-6aac-d474-0aed1a52b931@redhat.com>

Much better! Good to go for me.

Roman

> Hi Roman,
>
> I see your point. From my perspective, the default for any GC is to
> use the shared CodeCache scavenge root list, and anything else
> (G1/Shenandoah) is an exception and can override to do something else
> instead.
>
> Having said that, I agree we could easily move that default
> implementation to CodeCache from CollectedHeap and call it explicitly
> where it is used so that we do not accidentally mess up when we build
> a new GC.
>
> However, then I think we should also move verify_nmethod_roots() into
> those GCs then, as it is closely related to which list it is on.
>
> New full webrev:
> http://cr.openjdk.java.net/~eosterlund/8185141/webrev.01/
>
> New incremental webrev:
> http://cr.openjdk.java.net/~eosterlund/8185141/webrev.00_01/
>
> What do you think?
>
> Thanks,
> /Erik
>
> On 2017-07-25 13:36, Roman Kennke wrote:
>> Hi Erik,
>>
>> the change looks mostly good to me. This really needed cleanup.
>>
>> However, I question to do the default impl in CollectedHeap, and rely on
>> G1 to override it. Shenandoah's not using the scavenge roots list
>> either. It seems odd to have a default impl in the superclass that is
>> used by only 2 subclasses (GCH and PSH), and 2 other subclasses not
>> using it. And potential future implementors require to override it to
>> not do that stuff. Think Epsilon GC too: it doesn't need it, and must
>> add code to not do it. It just seems wrong. I'd just add the impl to
>> both GCH and PSH, and leave the superclass empty.
>>
>> Roman
>>
>> On 25.07.2017 13:29, Erik Österlund wrote:
>>> Hi,
>>>
>>> Bug:
>>> https://bugs.openjdk.java.net/browse/JDK-8185141
>>>
>>> Webrev:
>>> http://cr.openjdk.java.net/~eosterlund/8185141/webrev.00/
>>>
>>> There seem to be different ways of handling scavengeable nmethod
>>> roots in hotspot.
>>>
>>> The primary way of dealing with them is to use the CodeCache scavenge
>>> root nmethod list that maintains a list of all nmethods with
>>> scavengeable nmethods.
>>> However, G1 does not use this list as it has its own mechanism of
>>> keeping track of nmethods with scavengeable roots pointing into the
>>> heap.
>>> To handle this, the current CodeCache code is full of special cases
>>> for G1. In multiple cases we check if (UseG1GC) and then return.
>>>
>>> We seemingly need a better way of communicating to the GC what
>>> scavengeable nmethod roots there are to be able to get rid of the if
>>> (UseG1GC)... code.
>>>
>>> As a solution, I propose to make CollectedHeap::register_nmethod the
>>> primary way of registering to the GC that there might be a new nmethod
>>> to keep track of. It is then up to the specific GC to take appropriate
>>> action. The default appropriate action of CollectedHeap is to add the
>>> nmethod to the shared scavenge root nmethod list if it is not already
>>> on the list and it detected the existence of a scavengeable root oop
>>> in the nmethod. G1 on the other hand, will use its closures to figure
>>> out what remembered set it should be added to.
>>>
>>> Testing: JPRT with hotspot testset, RBT hs-tier3.
>>>
>>> Thanks,
>>> /Erik
>>

From erik.osterlund at oracle.com  Tue Jul 25 13:47:38 2017
From: erik.osterlund at oracle.com (Erik Österlund)
Date: Tue, 25 Jul 2017 15:47:38 +0200
Subject: RFR (S): 8185141: Generalize scavengeable nmethod root handling
In-Reply-To: <8c133671-a0b2-6aac-d474-0aed1a52b931@redhat.com>
References: <59772B9D.9000100@oracle.com> <88b57631-10cf-40ec-2f71-485df0f4180e@redhat.com> <59773ABC.1020506@oracle.com> <8c133671-a0b2-6aac-d474-0aed1a52b931@redhat.com>
Message-ID: <59774BFA.50700@oracle.com>

Hi,

Thanks for the review Roman!

/Erik

On 2017-07-25 15:28, Roman Kennke wrote:
> Much better! Good to go for me.
>
> Roman
>
>> Hi Roman,
>>
>> I see your point. From my perspective, the default for any GC is to
>> use the shared CodeCache scavenge root list, and anything else
>> (G1/Shenandoah) is an exception and can override to do something else
>> instead.
>>
>> Having said that, I agree we could easily move that default
>> implementation to CodeCache from CollectedHeap and call it explicitly
>> where it is used so that we do not accidentally mess up when we build
>> a new GC.
>>
>> However, then I think we should also move verify_nmethod_roots() into
>> those GCs then, as it is closely related to which list it is on.
>>
>> New full webrev:
>> http://cr.openjdk.java.net/~eosterlund/8185141/webrev.01/
>>
>> New incremental webrev:
>> http://cr.openjdk.java.net/~eosterlund/8185141/webrev.00_01/
>>
>> What do you think?
>>
>> Thanks,
>> /Erik
>>
>> On 2017-07-25 13:36, Roman Kennke wrote:
>>> Hi Erik,
>>>
>>> the change looks mostly good to me. This really needed cleanup.
>>>
>>> However, I question to do the default impl in CollectedHeap, and rely on
>>> G1 to override it. Shenandoah's not using the scavenge roots list
>>> either. It seems odd to have a default impl in the superclass that is
>>> used by only 2 subclasses (GCH and PSH), and 2 other subclasses not
>>> using it. And potential future implementors require to override it to
>>> not do that stuff. Think Epsilon GC too: it doesn't need it, and must
>>> add code to not do it. It just seems wrong. I'd just add the impl to
>>> both GCH and PSH, and leave the superclass empty.
>>>
>>> Roman
>>>
>>> On 25.07.2017 13:29, Erik Österlund wrote:
>>>> Hi,
>>>>
>>>> Bug:
>>>> https://bugs.openjdk.java.net/browse/JDK-8185141
>>>>
>>>> Webrev:
>>>> http://cr.openjdk.java.net/~eosterlund/8185141/webrev.00/
>>>>
>>>> There seem to be different ways of handling scavengeable nmethod
>>>> roots in hotspot.
>>>>
>>>> The primary way of dealing with them is to use the CodeCache scavenge
>>>> root nmethod list that maintains a list of all nmethods with
>>>> scavengeable nmethods.
>>>> However, G1 does not use this list as it has its own mechanism of
>>>> keeping track of nmethods with scavengeable roots pointing into the
>>>> heap.
>>>> To handle this, the current CodeCache code is full of special cases
>>>> for G1. In multiple cases we check if (UseG1GC) and then return.
>>>>
>>>> We seemingly need a better way of communicating to the GC what
>>>> scavengeable nmethod roots there are to be able to get rid of the if
>>>> (UseG1GC)... code.
>>>>
>>>> As a solution, I propose to make CollectedHeap::register_nmethod the
>>>> primary way of registering to the GC that there might be a new nmethod
>>>> to keep track of. It is then up to the specific GC to take appropriate
>>>> action. The default appropriate action of CollectedHeap is to add the
>>>> nmethod to the shared scavenge root nmethod list if it is not already
>>>> on the list and it detected the existence of a scavengeable root oop
>>>> in the nmethod. G1 on the other hand, will use its closures to figure
>>>> out what remembered set it should be added to.
When using G1, the CodeCache scavenge list will be empty, and so a lot >>>> of G1-centric code for exiting before we walk the list of nmethods on >>>> the list can be removed where the list is processed in a for loop. >>>> Since the list is empty, it does not matter that G1 runs this >>>> code too - it will just iterate 0 times in the loop since it is empty. >>>> But that's because the list was empty, not because we are using G1 - >>>> it just happens to be that the list is always empty when we use G1. >>>> >>>> Testing: JPRT with hotspot testset, RBT hs-tier3. >>>> >>>> Thanks, >>>> /Erik From alexander.harlap at oracle.com Tue Jul 25 14:24:18 2017 From: alexander.harlap at oracle.com (Alexander Harlap) Date: Tue, 25 Jul 2017 10:24:18 -0400 Subject: Need sponsor to push attached 8184045 into jdk10/hs/hotspot Message-ID: I need a sponsor to push attached 8184045.patch - . Patch should go into jdk10/hs/hotspot Reviewed by Daniel D. Daugherty and Erik Helin. Thank you, Alex -------------- next part -------------- # HG changeset patch # User aharlap # Date 1500992129 14400 # Node ID a780a9bf31f1ded1d008964d5c079892c0a97590 # Parent 0a22e4ef496e290dc1f4d87b87763c551f72cf23 8184045: TestSystemGCWithG1.java times out on Solaris SPARC Summary: Avoid extra round of stressing Reviewed-by: dcubed, ehelin diff -r 0a22e4ef496e -r a780a9bf31f1 test/gc/stress/systemgc/TestSystemGC.java --- a/test/gc/stress/systemgc/TestSystemGC.java Mon Jul 24 22:56:43 2017 +0000 +++ b/test/gc/stress/systemgc/TestSystemGC.java Tue Jul 25 10:15:29 2017 -0400 @@ -182,9 +182,11 @@ } public static void main(String[] args) { - // First allocate the long lived objects and then run all phases twice. + // First allocate the long lived objects and then run all phases.
populateLongLived(); runAllPhases(); - runAllPhases(); + if (args.length > 0 && args[0].equals("long")) { + runAllPhases(); + } } } From alexander.harlap at oracle.com Tue Jul 25 17:37:41 2017 From: alexander.harlap at oracle.com (Alexander Harlap) Date: Tue, 25 Jul 2017 13:37:41 -0400 Subject: Need sponsor to push attached 8183973 into jdk10/hs/hotspot Message-ID: <406c98a0-42a2-729b-1a95-a105c52f5dc5@oracle.com> I need a sponsor to push attached 8183973.patch - gc/TestFullGCALot.java fails in JDK10-hs nightly Patch should go into jdk10/hs/hotspot Reviewed by Mikael Gerdin and Erik Osterlund. Thank you, Alex -------------- next part -------------- # HG changeset patch # User aharlap # Date 1501003694 14400 # Node ID 04c3d66bb13df8553920ec275fb246f96190783a # Parent 0a22e4ef496e290dc1f4d87b87763c551f72cf23 8183973: gc/TestFullGCALot.java fails in JDK10-hs nightly Summary: Provide extra NewSize to avoid failure in running test with UseDeterministicG1GC option. Reviewed-by: mgerdin, eosterlund diff -r 0a22e4ef496e -r 04c3d66bb13d test/gc/TestFullGCALot.java --- a/test/gc/TestFullGCALot.java Mon Jul 24 22:56:43 2017 +0000 +++ b/test/gc/TestFullGCALot.java Tue Jul 25 13:28:14 2017 -0400 @@ -25,9 +25,9 @@ * @test TestFullGCALot * @key gc * @bug 4187687 - * @summary Ensure no acess violation when using FullGCALot + * @summary Ensure no access violation when using FullGCALot * @requires vm.debug - * @run main/othervm -XX:+FullGCALot -XX:FullGCALotInterval=120 TestFullGCALot + * @run main/othervm -XX:NewSize=10m -XX:+FullGCALot -XX:FullGCALotInterval=120 TestFullGCALot */ public class TestFullGCALot { From Derek.White at cavium.com Tue Jul 25 22:08:24 2017 From: Derek.White at cavium.com (White, Derek) Date: Tue, 25 Jul 2017 22:08:24 +0000 Subject: RFR: 8184751: Provide thread pool for parallel safepoint cleanup In-Reply-To: References: <8ec1092c-b01e-80a9-23dd-8447e30c675e@redhat.com> Message-ID: Hi Roman, We might be interested in seeing this in Par GC and/or G1 at some
point, but we can push that when the time comes. Thanks for working this issue though Roman, looking forward to trying it out in Shenandoah. - Derek White, Cavium > -----Original Message----- > From: hotspot-runtime-dev [mailto:hotspot-runtime-dev- > bounces at openjdk.java.net] On Behalf Of Roman Kennke > Sent: Tuesday, July 25, 2017 6:15 AM > To: hotspot-gc-dev openjdk.java.net > Cc: hotspot-runtime-dev at openjdk.java.net > Subject: Re: RFR: 8184751: Provide thread pool for parallel safepoint cleanup > > I have discussed this with Robbin Ehn offline. There is not much interest in > this change from Oracle engineering to have this upstream. > Unless somebody speaks up, I will close the bug and withdraw the review by > the end of today. > > I will build this into Shenandoah-only instead in this case. > > Roman > > > This is a follow-up to 8180932: Parallelize safepoint cleanup, which > > should land in JDK10 real soon now. > > > > In order to actually be able to parallelize safepoint cleanup, we now > > need the GC to provide some worker threads. > > > > In this change, I propose to create one globally (i.e. for all GCs) in > > CollectedHeap, if ParallelSafepointCleanupThreads>1. The flag defaults > > to 0, which means it's doing cleanup using the VMThread (i.e. exactly > > current behaviour). > > > > We have already discussed this, and came to the conclusion that it > > does not really make sense to share the GC's worker threads here, > > because they may not be idle, but only suspended from concurrent work > > (i.e. by > > SuspendibleThreadSet::synchronize() or similar). > > > > http://cr.openjdk.java.net/~rkennke/8184751/webrev.00/ > > > > > > What do you think? 
> > > > Roman > > > > From mark.reinhold at oracle.com Wed Jul 26 21:10:48 2017 From: mark.reinhold at oracle.com (mark.reinhold at oracle.com) Date: Wed, 26 Jul 2017 14:10:48 -0700 (PDT) Subject: JEP 307: Parallel Full GC for G1 Message-ID: <20170726211048.5A5EA983B1@eggemoggin.niobe.net> New JEP Candidate: http://openjdk.java.net/jeps/307 - Mark From mark.reinhold at oracle.com Wed Jul 26 21:11:46 2017 From: mark.reinhold at oracle.com (mark.reinhold at oracle.com) Date: Wed, 26 Jul 2017 14:11:46 -0700 (PDT) Subject: JEP 308: G1 Ergonomics Message-ID: <20170726211146.8E6E4983B7@eggemoggin.niobe.net> New JEP Candidate: http://openjdk.java.net/jeps/308 - Mark From kirk at kodewerk.com Thu Jul 27 07:45:46 2017 From: kirk at kodewerk.com (Kirk Pepperdine) Date: Thu, 27 Jul 2017 09:45:46 +0200 Subject: JEP 308: G1 Ergonomics In-Reply-To: <20170726211146.8E6E4983B7@eggemoggin.niobe.net> References: <20170726211146.8E6E4983B7@eggemoggin.niobe.net> Message-ID: <6E6896B5-7658-4ABB-8B6B-B8C63FC64872@kodewerk.com> Hi, Great to see more work being done on improving G1 heuristics. From the data we've collected this year I can say that when G1 has needed to be tuned, one of the most useful levers has been -XX:G1NewSizePercent. 5% is often too small, which then prematurely pushes data into tenured spaces. The next lever has been increasing reserved size from 5% to something bigger. This seems to help the collector cope with applications that seem to have bursty humongous allocation behavior (aka JSON serialization). Third would be G1MixedGCLiveThresholdPercent as even at 85% that can sometimes be too low a setting. Finally, balancing out mixed collection counts often helps stabilize pause times. Quite frequently the mixed collection count is 1 for just about every collection. Getting that to be mostly 8 is better.
Kind regards, Kirk Pepperdine From milan.mimica at gmail.com Thu Jul 27 08:15:49 2017 From: milan.mimica at gmail.com (Milan Mimica) Date: Thu, 27 Jul 2017 08:15:49 +0000 Subject: JEP 307: Parallel Full GC for G1 In-Reply-To: <20170726211048.5A5EA983B1@eggemoggin.niobe.net> References: <20170726211048.5A5EA983B1@eggemoggin.niobe.net> Message-ID: Hi Can I have just a short explanation why G1 Full GC wasn't implemented as parallel in the first place, given "the assumption that nothing in the fundamental design of G1 prevents a parallel full GC."? On Wed, 26 Jul 2017 at 23:11, wrote: > New JEP Candidate: http://openjdk.java.net/jeps/307 > > - Mark > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rkennke at redhat.com Thu Jul 27 08:30:14 2017 From: rkennke at redhat.com (Roman Kennke) Date: Thu, 27 Jul 2017 10:30:14 +0200 Subject: JEP 307: Parallel Full GC for G1 In-Reply-To: References: <20170726211048.5A5EA983B1@eggemoggin.niobe.net> Message-ID: Hi Milan, I cannot give an authoritative answer to that, but since Shenandoah is very similar in this respect, and from my experience with Shenandoah, I think that full GC is not a very high priority. It is meant as a last-ditch collection, when all else fails to free enough space. In a good world, with perfect GC heuristics and well behaving applications, it should never happen, and thus performance shouldn't matter much. However, this world is not ideal, and full GC performance does matter, especially when you have a large heap, and run into it and lose *seconds* (or even minutes) on it. That being said, we do have a parallel full GC in Shenandoah, and its performance gets close to, and even sometimes exceeds, parallel GC. Maybe it's worth adopting it for G1? It should be relatively straightforward, because both G1 and Shenandoah are region based. It does compact objects towards the bottom of the heap, while mostly retaining their relative order.
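[Editor's note: the region-based sliding compaction described above can be sketched as a toy model. All names and the heap representation below are invented for illustration; this is not HotSpot or Shenandoah code, just the forwarding-address phase of a mark-compact cycle that packs live objects toward the bottom of the heap while retaining their relative order.]

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of sliding full-GC compaction: live objects are assigned
// forwarding addresses packed toward the bottom of the heap, in address
// order, so their relative order is preserved and dead gaps are squeezed out.
public class SlidingCompactionSketch {
    static class Obj {
        final int addr;      // current address
        final int size;
        final boolean live;
        int forwardee = -1;  // new address after compaction
        Obj(int addr, int size, boolean live) {
            this.addr = addr; this.size = size; this.live = live;
        }
    }

    // Compute forwarding addresses bottom-up over an address-ordered heap.
    static void computeForwarding(List<Obj> heap) {
        int compactTop = 0; // bottom of the heap
        for (Obj o : heap) {
            if (o.live) {
                o.forwardee = compactTop;
                compactTop += o.size;
            }
        }
    }

    public static void main(String[] args) {
        List<Obj> heap = new ArrayList<>();
        heap.add(new Obj(0, 2, true));
        heap.add(new Obj(2, 3, false)); // dead object leaves a gap
        heap.add(new Obj(5, 1, true));
        computeForwarding(heap);
        // The second live object slides down over the dead gap.
        System.out.println(heap.get(0).forwardee); // 0
        System.out.println(heap.get(2).forwardee); // 2
    }
}
```

A real collector would follow this with pointer-adjustment and copy phases; the sketch only shows why relative order is retained.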
Roman On 27.07.2017 at 10:15, Milan Mimica wrote: > Hi > > Can I have just a short explanation why G1 Full GC wasn't implemented > as parallel in the first place, given "the assumption that nothing in > the fundamental design of G1 prevents a parallel full GC."? > > > > On Wed, 26 Jul 2017 at 23:11, wrote: > > New JEP Candidate: http://openjdk.java.net/jeps/307 > > - Mark > -------------- next part -------------- An HTML attachment was scrubbed... URL: From hohensee at amazon.com Thu Jul 27 15:45:51 2017 From: hohensee at amazon.com (Hohensee, Paul) Date: Thu, 27 Jul 2017 15:45:51 +0000 Subject: JEP 307: Parallel Full GC for G1 In-Reply-To: References: <20170726211048.5A5EA983B1@eggemoggin.niobe.net> Message-ID: <29745620-D9B1-4CA7-826D-43D09CCE00BE@amazon.com> Imo, all existing collectors can be replaced by variations on G1. The first step was replacing CMS (though admittedly there's still some way to go with that). The second is to replace the parallel collector with a throughput-oriented G1 mode, which requires a parallel STW full GC. Full collections should indeed equal or exceed parallel GC performance, because the old gen is mostly compacted already, so you don't have to do anything with most old gen regions. You just run the equivalent of a mixed collection that includes all not-mostly-full old regions and promote the entire young gen. If you set throughput mode at VM startup, you shouldn't need remembered sets either, just the card table. The third step is concurrent/parallel evacuation and continuous concurrent/parallel collection. Shenandoah is almost there, Azul's C4 is completely there. You can see this progression in Android too, btw. O-dessert (ships next month) includes a concurrent/parallel region-based GC that replaces the previous variation-on-CMS collector.
Paul From: hotspot-gc-dev on behalf of Roman Kennke Date: Thursday, July 27, 2017 at 1:30 AM To: Milan Mimica , "hotspot-gc-dev at openjdk.java.net openjdk.java.net" Subject: Re: JEP 307: Parallel Full GC for G1 Hi Milan, I cannot give an authoritative answer to that, but since Shenandoah is very similar in this respect, and from my experience with Shenandoah, I think that full GC is not a very high priority. It is meant as a last-ditch collection, when all else fails to free enough space. In a good world, with perfect GC heuristics and well behaving applications, it should never happen, and thus performance shouldn't matter much. However, this world is not ideal, and full GC performance does matter, especially when you have a large heap, and run into it and lose *seconds* (or even minutes) on it. That being said, we do have a parallel full GC in Shenandoah, and its performance gets close to, and even sometimes exceeds, parallel GC. Maybe it's worth adopting it for G1? It should be relatively straightforward, because both G1 and Shenandoah are region based. It does compact objects towards the bottom of the heap, while mostly retaining their relative order. Roman On 27.07.2017 at 10:15, Milan Mimica wrote: Hi Can I have just a short explanation why G1 Full GC wasn't implemented as parallel in the first place, given "the assumption that nothing in the fundamental design of G1 prevents a parallel full GC."? On Wed, 26 Jul 2017 at 23:11, wrote: New JEP Candidate: http://openjdk.java.net/jeps/307 - Mark -------------- next part -------------- An HTML attachment was scrubbed... URL: From kirk at kodewerk.com Tue Jul 11 11:28:09 2017 From: kirk at kodewerk.com (Kirk Pepperdine) Date: Tue, 11 Jul 2017 11:28:09 -0000 Subject: Spikes in G1 Message-ID: Hi all, This is the mysterious G1 behavior that I've briefly mentioned to Erik, and that I keep seeing over and over again.
I've just seen it again and this time I managed to get enough context to come up with a hypothesis of why this is happening. For quite some time I've noted that the G1 has a tendency to get into a condition where collections start to fail and occupancy spikes to the point where the condition can only be resolved with a Full GC. The Full GC will typically recover all of the memory consumed by the spike (and then some). This is a bit unexpected: if the data is referenced, which it appears to be since other (mixed) attempts to collect do fail, then the full GC should fail to collect and occupancy should remain high. In this case weak references appear to be involved in the sequence of events that lead up to the Full GC. You can see in this case that the number of weak references processed does spike during the full. I need to go back and review other logs to see if this is the same for past occurrences. I'm curious to understand if there is some unintended interplay between G1GC and WeakReferences that is ultimately responsible for heap occupancy suddenly spiking only to be completely reclaimed by a Full (even though mixed collections are running prior to the full). Kind regards Kirk -------------- next part -------------- A non-text attachment was scrubbed... Name: gc.log.20170709.150502.zip Type: application/zip Size: 4957843 bytes Desc: not available URL: From jiangli.zhou at oracle.com Thu Jul 27 19:00:19 2017 From: jiangli.zhou at oracle.com (Jiangli Zhou) Date: Thu, 27 Jul 2017 12:00:19 -0700 Subject: RFR: 8179302: Pre-resolve constant pool string entries and cache resolved_reference arrays in CDS archive Message-ID: <74D26CA6-E3A8-4ABB-A6E9-D37E3AD2BAD6@oracle.com> Hi, Please help review the changes for JDK-8179302 (Pre-resolve constant pool string entries and cache resolved_reference arrays in CDS archive). Currently, the CDS archive can contain cached class metadata and interned java.lang.String objects. This RFE adds the constant pool 'resolved_references'
arrays (hotspot specific) to the archive for startup/runtime performance enhancement. The 'resolved_references' arrays are used to hold references of resolved constant pool entries including Strings, mirrors, etc. With the 'resolved_references' being cached, string constants in shared classes can now be resolved to existing interned java.lang.Strings at CDS dump time. G1 and 64-bit platforms are required. The GC changes in the RFE were discussed and guided by Thomas Schatzl and the GC team. Part of the changes were contributed by Thomas himself. RFE: JDK-8179302 hotspot: http://cr.openjdk.java.net/~jiangli/8179302/webrev.hotspot.01/ whitebox: http://cr.openjdk.java.net/~jiangli/8179302/webrev.whitebox.01/ Please see below for details of supporting cached 'resolved_references' and pre-resolving string constants. Types of Pinned G1 Heap Regions The pinned region type is a super type of all archive region types, which include the open archive type and the closed archive type. 00100 0 [ 8] Pinned Mask 01000 0 [16] Old Mask 10000 0 [32] Archive Mask 11100 0 [56] Open Archive: ArchiveMask | PinnedMask | OldMask 11100 1 [57] Closed Archive: ArchiveMask | PinnedMask | OldMask + 1 Pinned Regions Objects within the region are 'pinned', which means GC does not move any live objects. GC scans and marks objects in the pinned region as normal, but skips forwarding live objects. Pointers in live objects are updated. Dead objects (unreachable) can be collected and freed. Archive Regions The archive types are sub-types of 'pinned'. There are two types of archive region currently, open archive and closed archive. Both can support caching java heap objects via the CDS archive. An archive region is also an old region by design. Open Archive (GC-RW) Regions Open archive region is GC writable. GC scans & marks objects within the region and adjusts (updates) pointers in live objects the same way as a pinned region. Live objects (reachable) are pinned and not forwarded by GC.
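[Editor's note: the region-type encoding quoted above composes single-bit masks. A tiny self-contained check of that arithmetic, using values straight from the table (the constant names are invented for illustration, not HotSpot identifiers):]

```java
// Sanity check of the G1 heap-region type encoding described above:
// Pinned = 8, Old = 16, Archive = 32; Open Archive ORs all three (56),
// and Closed Archive is Open Archive + 1 (57). Names are illustrative.
public class RegionTypeMasks {
    static final int PINNED_MASK  = 0b001000;  //  8
    static final int OLD_MASK     = 0b010000;  // 16
    static final int ARCHIVE_MASK = 0b100000;  // 32

    static final int OPEN_ARCHIVE   = ARCHIVE_MASK | PINNED_MASK | OLD_MASK; // 56
    static final int CLOSED_ARCHIVE = OPEN_ARCHIVE + 1;                      // 57

    // Because the masks are single bits, region properties are cheap tests.
    static boolean isPinned(int tag)  { return (tag & PINNED_MASK) != 0; }
    static boolean isArchive(int tag) { return (tag & ARCHIVE_MASK) != 0; }

    public static void main(String[] args) {
        System.out.println(OPEN_ARCHIVE);           // 56
        System.out.println(CLOSED_ARCHIVE);         // 57
        System.out.println(isPinned(OPEN_ARCHIVE));    // true
        System.out.println(isArchive(CLOSED_ARCHIVE)); // true
    }
}
```

This mirrors how an archive region is simultaneously pinned, old, and archive: every super-type bit remains set in the subtype value.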
Open archive region does not have 'dead' objects. Unreachable objects are 'dormant' objects. Dormant objects are not collected and freed by GC. Adjustable Outgoing Pointers As GC can adjust pointers within the live objects in open archive heap region, objects can have outgoing pointers to another java heap region, including closed archive region, open archive region, pinned (or humongous) region, and normal generational region. When a referenced object is moved by GC, the pointer within the open archive region is updated accordingly. Closed Archive (GC-RO) Regions The closed archive region is GC read-only region. GC cannot write into the region. Objects are not scanned and marked by GC. Objects are pinned and not forwarded. Pointers are not updated by GC either. Hence, objects within the archive region cannot have any outgoing pointers to another java heap region. Objects however can still have pointers to other objects within the closed archive regions (we might allow pointers to open archive regions in the future). That restricts the type of java objects that can be supported by the archive region. In JDK 9 we support archive Strings with the archive regions. The GC-readonly archive region makes java heap memory sharable among different JVM processes. NOTE: synchronization on the objects within the archive heap region can still cause writes to the memory page. Dormant Objects Dormant objects are unreachable java objects within the open archive heap region. A java object in the open archive heap region is a live object if it can be reached during scanning. Some of the java objects in the region may not be reachable during scanning. Those objects are considered as dormant, but not dead. For example, a constant pool 'resolved_references' array is reachable via the klass root if its container klass (shared) is already loaded at the time during GC scanning. 
If a shared klass is not yet loaded, the klass root is not scanned and its constant pool 'resolved_reference' array (A) in the open archive region is not reachable. Then A is a dormant object. Object State Transition All java objects are initially dormant objects when open archive heap regions are mapped to the runtime java heap. A dormant object becomes a live object when the associated shared class is loaded at runtime. An explicit call to G1SATBCardTableModRefBS::enqueue() needs to be made when a dormant object becomes live. That should be the case for cached objects with strong roots as well, since strong roots are only scanned at the start of GC marking (the initial marking) but not during Remarking/Final marking. If a cached object becomes live during the concurrent marking phase, G1 may not find it and mark it live unless a call to G1SATBCardTableModRefBS::enqueue() is made for the object. Currently, a live object in the open archive heap region cannot become dormant again. This restriction simplifies the GC requirements and guarantees all outgoing pointers are updated by GC correctly. Only objects for shared classes from the builtin class loaders (boot, PlatformClassLoaders, and AppClassLoaders) are supported for caching. Caching Java Objects at Archive Dump Time The closed archive and open archive regions are allocated near the top of the dump time java heap. Archived java objects are copied into the designated archive heap regions. For example, String objects and the underlying 'value' arrays are copied into the closed archive regions. All references to the archived objects (from shared class metadata, string table, etc) are set to the new heap locations. A hash table is used to keep track of all archived java objects during the copying process to make sure a java object is not archived more than once if reached from different roots. It also makes sure references to the same archived object are updated using the same new address location.
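[Editor's note: the "archive each object at most once" bookkeeping described above can be modeled with a map from original object identity to its archived copy. This is a sketch with invented names, not the actual CDS dump-time implementation:]

```java
import java.util.IdentityHashMap;
import java.util.Map;

// Toy model of the dump-time dedup table described above: each heap object
// is assigned an archive slot at most once, and every root that reaches it
// resolves to the same archived location. Names are illustrative only.
public class ArchiveDedupSketch {
    // Identity-based map: two distinct-but-equal objects are archived separately.
    static final Map<Object, Integer> archived = new IdentityHashMap<>();
    static int nextArchiveAddr = 0;

    // Returns the archive "address" of obj, allocating one only on first sight.
    static int archive(Object obj) {
        return archived.computeIfAbsent(obj, o -> nextArchiveAddr++);
    }

    public static void main(String[] args) {
        String s = "interned";
        int a1 = archive(s);
        int a2 = archive(s);           // reached again from a second root
        int b  = archive(new Object());
        System.out.println(a1 == a2);  // true: same object archived once
        System.out.println(a1 != b);   // true: distinct objects, distinct slots
    }
}
```

The identity map matters: the dedup is per object, not per value, so references from different roots collapse onto one archived copy exactly as the mail describes.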
Caching Constant Pool resolved_references Array The 'resolved_references' is an array that holds references of resolved constant pool entries including Strings, mirrors and methodTypes, etc. Each loaded class has one 'resolved_references' array (in ConstantPoolCache). The 'resolved_references' arrays are copied into the open archive regions during the dump process. Prior to copying the 'resolved_references' arrays, the JVM iterates through constant pool entries and resolves all JVM_CONSTANT_String entries to existing interned Strings for all archived classes. When resolving, the JVM only looks up the string table and finds existing interned Strings without inserting new ones. If a string entry cannot be resolved to an existing interned String, the constant pool entry remains unresolved. That prevents memory waste if a constant pool string entry is never used at runtime. All String objects referenced by the string table are copied first into the closed archive regions. The string table entry is updated with the new location when each String object is archived. The JVM updates the resolved constant pool string entries with the new object locations when copying the 'resolved_references' arrays to the open archive regions. References to the 'resolved_references' arrays in the ConstantPoolCache are also updated. At runtime as part of ConstantPool::restore_unshareable_info() work, call G1SATBCardTableModRefBS::enqueue() to let the GC know the 'resolved_references' is becoming live. A handle is created for the cached object and added to the loader_data's handles. Runtime Java Heap With Cached Java Objects The closed archive regions (the string regions) and open archive regions are mapped to the runtime java heap at the same offsets as the dump time offsets from the runtime java heap base.
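[Editor's note: mapping "at the same offsets from the heap base" means a dump-time address can be translated to its runtime location with plain offset arithmetic. A minimal sketch; the concrete base values below are arbitrary examples, not real CDS addresses:]

```java
// Sketch of the "same offset from the heap base" mapping described above:
// what is preserved across dump and runtime is the offset from the heap
// base, so translation is a subtraction and an addition.
public class ArchiveOffsetMapping {
    static long toRuntime(long dumpAddr, long dumpHeapBase, long runtimeHeapBase) {
        long offset = dumpAddr - dumpHeapBase; // offset is the invariant
        return runtimeHeapBase + offset;
    }

    public static void main(String[] args) {
        long dumpBase    = 0x7000_0000L;  // example dump-time heap base
        long runtimeBase = 0x2000_0000L;  // example runtime heap base
        long dumpAddr    = 0x7000_1234L;  // an archived object at offset 0x1234
        long runtimeAddr = toRuntime(dumpAddr, dumpBase, runtimeBase);
        System.out.println(Long.toHexString(runtimeAddr)); // 20001234
    }
}
```

Keeping the offsets identical is what lets references recorded at dump time remain valid after the archive regions are mapped into a differently-based runtime heap.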
Preliminary test execution and status: JPRT: passed Tier2-rt: passed Tier2-gc: passed Tier2-comp: passed Tier3-rt: passed Tier3-gc: passed Tier3-comp: passed Tier4-rt: passed Tier4-gc: passed Tier4-comp: 6 jobs timed out, all other tests passed Tier5-rt: one test failed but passed when running locally, all other tests passed Tier5-gc: passed Tier5-comp: running hotspot_gc: two jobs timed out, all other tests passed hotspot_gc in CDS mode: two jobs timed out, all other tests passed vm.gc: passed vm.gc in CDS mode: passed Kitchensink: passed Kitchensink in CDS mode: passed Thanks, Jiangli -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Heap%20Regions-2.jpeg Type: image/jpeg Size: 14517 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Runtime%20Java%20Heap%20with%20Cached%20Objects.jpeg Type: image/jpeg Size: 20448 bytes Desc: not available URL: From jiangli.zhou at oracle.com Thu Jul 27 20:37:16 2017 From: jiangli.zhou at oracle.com (Jiangli Zhou) Date: Thu, 27 Jul 2017 13:37:16 -0700 Subject: RFR: 8179302: Pre-resolve constant pool string entries and cache resolved_reference arrays in CDS archive In-Reply-To: <74D26CA6-E3A8-4ABB-A6E9-D37E3AD2BAD6@oracle.com> References: <74D26CA6-E3A8-4ABB-A6E9-D37E3AD2BAD6@oracle.com> Message-ID: <32C9BE41-D4C3-4242-A8D7-C1E1A5B2E0F3@oracle.com> Sorry, the mail didn't handle the rich text well. I fixed the format below. Please help review the changes for JDK-8179302 (Pre-resolve constant pool string entries and cache resolved_reference arrays in CDS archive). Currently, the CDS archive can contain cached class metadata and interned java.lang.String objects. This RFE adds the constant pool 'resolved_references' arrays (hotspot specific) to the archive for startup/runtime performance enhancement.
The 'resolved_references' arrays are used to hold references of resolved constant pool entries including Strings, mirrors, etc. With the 'resolved_references' being cached, string constants in shared classes can now be resolved to existing interned java.lang.Strings at CDS dump time. G1 and 64-bit platforms are required. The GC changes in the RFE were discussed and guided by Thomas Schatzl and the GC team. Part of the changes were contributed by Thomas himself. RFE: https://bugs.openjdk.java.net/browse/JDK-8179302 hotspot: http://cr.openjdk.java.net/~jiangli/8179302/webrev.hotspot.01/ whitebox: http://cr.openjdk.java.net/~jiangli/8179302/webrev.whitebox.01/ Please see below for details of supporting cached 'resolved_references' and pre-resolving string constants. Types of Pinned G1 Heap Regions The pinned region type is a super type of all archive region types, which include the open archive type and the closed archive type. 00100 0 [ 8] Pinned Mask 01000 0 [16] Old Mask 10000 0 [32] Archive Mask 11100 0 [56] Open Archive: ArchiveMask | PinnedMask | OldMask 11100 1 [57] Closed Archive: ArchiveMask | PinnedMask | OldMask + 1 Pinned Regions Objects within the region are 'pinned', which means GC does not move any live objects. GC scans and marks objects in the pinned region as normal, but skips forwarding live objects. Pointers in live objects are updated. Dead objects (unreachable) can be collected and freed. Archive Regions The archive types are sub-types of 'pinned'. There are two types of archive region currently, open archive and closed archive. Both can support caching java heap objects via the CDS archive. An archive region is also an old region by design. Open Archive (GC-RW) Regions Open archive region is GC writable. GC scans & marks objects within the region and adjusts (updates) pointers in live objects the same way as a pinned region. Live objects (reachable) are pinned and not forwarded by GC. Open archive region does not have 'dead' objects.
Unreachable objects are 'dormant' objects. Dormant objects are not collected and freed by GC. Adjustable Outgoing Pointers As GC can adjust pointers within the live objects in open archive heap region, objects can have outgoing pointers to another java heap region, including closed archive region, open archive region, pinned (or humongous) region, and normal generational region. When a referenced object is moved by GC, the pointer within the open archive region is updated accordingly. Closed Archive (GC-RO) Regions The closed archive region is GC read-only region. GC cannot write into the region. Objects are not scanned and marked by GC. Objects are pinned and not forwarded. Pointers are not updated by GC either. Hence, objects within the archive region cannot have any outgoing pointers to another java heap region. Objects however can still have pointers to other objects within the closed archive regions (we might allow pointers to open archive regions in the future). That restricts the type of java objects that can be supported by the archive region. In JDK 9 we support archive Strings with the archive regions. The GC-readonly archive region makes java heap memory sharable among different JVM processes. NOTE: synchronization on the objects within the archive heap region can still cause writes to the memory page. Dormant Objects Dormant objects are unreachable java objects within the open archive heap region. A java object in the open archive heap region is a live object if it can be reached during scanning. Some of the java objects in the region may not be reachable during scanning. Those objects are considered as dormant, but not dead. For example, a constant pool 'resolved_references' array is reachable via the klass root if its container klass (shared) is already loaded at the time during GC scanning. 
If a shared klass is not yet loaded, the klass root is not scanned and its constant pool 'resolved_reference' array (A) in the open archive region is not reachable. Then A is a dormant object. Object State Transition All java objects are initially dormant objects when open archive heap regions are mapped to the runtime java heap. A dormant object becomes a live object when the associated shared class is loaded at runtime. An explicit call to G1SATBCardTableModRefBS::enqueue() needs to be made when a dormant object becomes live. That should be the case for cached objects with strong roots as well, since strong roots are only scanned at the start of GC marking (the initial marking) but not during Remarking/Final marking. If a cached object becomes live during the concurrent marking phase, G1 may not find it and mark it live unless a call to G1SATBCardTableModRefBS::enqueue() is made for the object. Currently, a live object in the open archive heap region cannot become dormant again. This restriction simplifies the GC requirements and guarantees all outgoing pointers are updated by GC correctly. Only objects for shared classes from the builtin class loaders (boot, PlatformClassLoaders, and AppClassLoaders) are supported for caching. Caching Java Objects at Archive Dump Time The closed archive and open archive regions are allocated near the top of the dump time java heap. Archived java objects are copied into the designated archive heap regions. For example, String objects and the underlying 'value' arrays are copied into the closed archive regions. All references to the archived objects (from shared class metadata, string table, etc) are set to the new heap locations. A hash table is used to keep track of all archived java objects during the copying process to make sure a java object is not archived more than once if reached from different roots. It also makes sure references to the same archived object are updated using the same new address location.
Caching Constant Pool resolved_references Array The 'resolved_references' is an array that holds references of resolved constant pool entries including Strings, mirrors and methodTypes, etc. Each loaded class has one 'resolved_references' array (in ConstantPoolCache). The 'resolved_references' arrays are copied into the open archive regions during the dump process. Prior to copying the 'resolved_references' arrays, the JVM iterates through constant pool entries and resolves all JVM_CONSTANT_String entries to existing interned Strings for all archived classes. When resolving, the JVM only looks up the string table and finds existing interned Strings without inserting new ones. If a string entry cannot be resolved to an existing interned String, the constant pool entry remains unresolved. That prevents memory waste if a constant pool string entry is never used at runtime. All String objects referenced by the string table are copied first into the closed archive regions. The string table entry is updated with the new location when each String object is archived. The JVM updates the resolved constant pool string entries with the new object locations when copying the 'resolved_references' arrays to the open archive regions. References to the 'resolved_references' arrays in the ConstantPoolCache are also updated. At runtime as part of ConstantPool::restore_unshareable_info() work, call G1SATBCardTableModRefBS::enqueue() to let the GC know the 'resolved_references' is becoming live. A handle is created for the cached object and added to the loader_data's handles. Runtime Java Heap With Cached Java Objects The closed archive regions (the string regions) and open archive regions are mapped to the runtime java heap at the same offsets as the dump time offsets from the runtime java heap base.
Preliminary test execution and status:

JPRT: passed
Tier2-rt: passed
Tier2-gc: passed
Tier2-comp: passed
Tier3-rt: passed
Tier3-gc: passed
Tier3-comp: passed
Tier4-rt: passed
Tier4-gc: passed
Tier4-comp: 6 jobs timed out, all other tests passed
Tier5-rt: one test failed but passed when running locally, all other tests passed
Tier5-gc: passed
Tier5-comp: running
hotspot_gc: two jobs timed out, all other tests passed
hotspot_gc in CDS mode: two jobs timed out, all other tests passed
vm.gc: passed
vm.gc in CDS mode: passed
Kitchensink: passed
Kitchensink in CDS mode: passed

Thanks,
Jiangli
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Heap%20Regions-2.jpeg
Type: image/jpeg
Size: 14517 bytes
Desc: not available
URL: 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Runtime%20Java%20Heap%20with%20Cached%20Objects.jpeg
Type: image/jpeg
Size: 20448 bytes
Desc: not available
URL: 

From mikael.gerdin at oracle.com Fri Jul 28 12:50:46 2017
From: mikael.gerdin at oracle.com (Mikael Gerdin)
Date: Fri, 28 Jul 2017 14:50:46 +0200
Subject: RFR (9) 8185133: Reference pending list root might not get marked
Message-ID: 

Hi all,

Please review this fix to a tricky reference processing / conc marking bug affecting G1 in 9.

The bug occurs when a weak reference WR is promoted to old and discovered during an initial mark pause. The WR is the referent of a soft reference SR. The concurrent reference processor determines that SR should be treated as a weak reference due to shortage of memory, and now WR is reachable only from the reference pending list but not explicitly marked in the bitmap, since objects promoted during the initial mark pause are not marked immediately.
The reason we are not saved by the SATB pre-barrier here is that clearing of the referent field of a reference object does not trigger the pre-barrier (and that would kind of defeat its purpose).

Before JDK-8156500 this worked because the reference pending list was a static field in the Reference class, and the Reference class was scanned during concurrent marking, so we would never lose track of the pending list head.

My suggested fix is to explicitly mark the reference pending list head oop during initial mark, after the reference enqueue phase. This mirrors how other roots are handled in initial mark, see G1Mark::G1MarkPromotedFromRoots.

Webrev: http://cr.openjdk.java.net/~mgerdin/8185133/webrev.0
Bug: https://bugs.openjdk.java.net/browse/JDK-8185133

Testing: JPRT, tier2-5 gc tests, a LOT of runs of the failing test.

Many thanks to Kim and Erik Ö for discussions around this issue!

Thanks
/Mikael

From erik.osterlund at oracle.com Fri Jul 28 13:00:23 2017
From: erik.osterlund at oracle.com (=?UTF-8?Q?Erik_=c3=96sterlund?=)
Date: Fri, 28 Jul 2017 15:00:23 +0200
Subject: RFR (9) 8185133: Reference pending list root might not get marked
In-Reply-To: 
References: 
Message-ID: <597B3567.50307@oracle.com>

Hi Mikael,

Looks good.

/Erik

On 2017-07-28 14:50, Mikael Gerdin wrote:
> Hi all,
>
> Please review this fix to a tricky reference processing / conc marking
> bug affecting G1 in 9.
>
> The bug occurs when a weak reference WR is promoted to old and
> discovered during an initial mark pause. The WR is the referent of a
> soft reference SR. The concurrent reference processor determines that
> SR should be treated as a weak reference due to shortage of memory and
> now WR is reachable only from the reference pending list but not
> explicitly marked in the bitmap since objects promoted during the
> initial mark pause are not marked immediately.
> > The reason we are not saved by the SATB pre-barrier here is that > clearing of the referent field of a reference object does not trigger > the pre-barrier (and that would kind of defeat its purpose). > > Before JDK-8156500 this worked because the reference pending list was > a static field in the Reference class and the reference class was > scanned during concurrent marking, so we would never lose track of the > pending list head. > > My suggested fix is to explicitly mark the reference pending list head > oop during initial mark, after the reference enqueue phase. > This mirrors how other roots are handled in initial mark, see > G1Mark::G1MarkPromotedFromRoots. > > Webrev: http://cr.openjdk.java.net/~mgerdin/8185133/webrev.0 > Bug: https://bugs.openjdk.java.net/browse/JDK-8185133 > > Testing: JPRT, tier2-5 gc tests, a LOT of runs of the failing test. > > Many thanks to Kim and Erik ? for discussions around this issue! > > Thanks > /Mikael From rkennke at redhat.com Fri Jul 28 14:53:57 2017 From: rkennke at redhat.com (Roman Kennke) Date: Fri, 28 Jul 2017 16:53:57 +0200 Subject: RFR (9) 8185133: Reference pending list root might not get marked In-Reply-To: References: Message-ID: <61ef3599-6295-7444-7b3c-e731c52c10fe@redhat.com> Hi Mikael, I don't really understand what the problem is. The WR ends up on the RPL, with its referent cleared, i.e. no longer pointing to the SR? But we want to keep the SR alive? Also, Universe::oops_do() already marks the RPL head, doesn't it? Roman > Hi all, > > Please review this fix to a tricky reference processing / conc marking > bug affecting G1 in 9. > > The bug occurs when a weak reference WR is promoted to old and > discovered during an initial mark pause. The WR is the referent of a > soft reference SR. 
The concurrent reference processor determines that
> SR should be treated as a weak reference due to shortage of memory and
> now WR is reachable only from the reference pending list but not
> explicitly marked in the bitmap since objects promoted during the
> initial mark pause are not marked immediately.
>
> The reason we are not saved by the SATB pre-barrier here is that
> clearing of the referent field of a reference object does not trigger
> the pre-barrier (and that would kind of defeat its purpose).
>
> Before JDK-8156500 this worked because the reference pending list was
> a static field in the Reference class and the reference class was
> scanned during concurrent marking, so we would never lose track of the
> pending list head.
>
> My suggested fix is to explicitly mark the reference pending list head
> oop during initial mark, after the reference enqueue phase.
> This mirrors how other roots are handled in initial mark, see
> G1Mark::G1MarkPromotedFromRoots.
>
> Webrev: http://cr.openjdk.java.net/~mgerdin/8185133/webrev.0
> Bug: https://bugs.openjdk.java.net/browse/JDK-8185133
>
> Testing: JPRT, tier2-5 gc tests, a LOT of runs of the failing test.
>
> Many thanks to Kim and Erik Ö for discussions around this issue!
>
> Thanks
> /Mikael

From sangheon.kim at oracle.com Fri Jul 28 16:10:20 2017
From: sangheon.kim at oracle.com (sangheon)
Date: Fri, 28 Jul 2017 09:10:20 -0700
Subject: Need sponsor to push attached 8184045 into jdk10/hs/hotspot
In-Reply-To: 
References: 
Message-ID: <097a40d0-b3e7-2a05-e59f-0b9b1b5965b7@oracle.com>

Hi Alex,

I can sponsor this patch.

Thanks,
Sangheon

On 07/25/2017 07:24 AM, Alexander Harlap wrote:
> I need a sponsor to push attached 8184045.patch - .
>
> Patch should go into jdk10/hs/hotspot
>
> Reviewed by Daniel D. Daugherty and Erik Helin.
>
> Thank you,
>
> Alex
>

From sangheon.kim at oracle.com Fri Jul 28 16:11:19 2017
From: sangheon.kim at oracle.com (sangheon)
Date: Fri, 28 Jul 2017 09:11:19 -0700
Subject: Need sponsor to push attached 8183973 into jdk10/hs/hotspot
In-Reply-To: <406c98a0-42a2-729b-1a95-a105c52f5dc5@oracle.com>
References: <406c98a0-42a2-729b-1a95-a105c52f5dc5@oracle.com>
Message-ID: <08d27861-e447-4f77-cb57-e0c90ba62b72@oracle.com>

Hi Alex,

I can sponsor this too.

Thanks,
Sangheon

On 07/25/2017 10:37 AM, Alexander Harlap wrote:
> I need a sponsor to push attached 8183973.patch -
> gc/TestFullGCALot.java fails in JDK10-hs nightly
>
> Patch should go into jdk10/hs/hotspot
>
> Reviewed by Mikael Gerdin and Erik Osterlund.
>
> Thank you,
>
> Alex
>

From erik.osterlund at oracle.com Fri Jul 28 17:20:02 2017
From: erik.osterlund at oracle.com (Erik Osterlund)
Date: Fri, 28 Jul 2017 19:20:02 +0200
Subject: RFR (9) 8185133: Reference pending list root might not get marked
In-Reply-To: <61ef3599-6295-7444-7b3c-e731c52c10fe@redhat.com>
References: <61ef3599-6295-7444-7b3c-e731c52c10fe@redhat.com>
Message-ID: <5420167D-79DF-4D66-8BC6-82B6D55A428D@oracle.com>

Hi Roman,

> On 28 Jul 2017, at 16:53, Roman Kennke wrote:
>
> Hi Mikael,
>
> I don't really understand what the problem is. The WR ends up on the
> RPL, with its referent cleared, i.e. no longer pointing to the SR? But
> we want to keep the SR alive?

No. The WR gets promoted to old during the initial mark evacuation as it was strongly reachable by a SR in young. The referent of the WR died, and therefore it gets discovered. The assumption is then that since it was strongly reachable from the SR in young, the WR will be found during concurrent marking due to SATB. This is normally a safe assumption.

However, just before finishing the initial mark pause and letting concurrent marking start tracing through the heap, soft references may change strength to suddenly become weak.
Therefore, the WR in old never gets marked during concurrent marking unless the GC is made aware of the existence of this new strong edge to the pending list head. This is a problem, because the pending list was in this scenario exposed to Java land through the pending list head, without the concurrent marking knowing about it, violating GC completeness. > Also, Universe::oops_do() already marks the RPL head, doesn't it? Reference processing is done after root processing. Therefore the edge to the pending list, created during reference processing, is not yet made available at that time. Hope that made sense. Thanks, /Erik > Roman > >> Hi all, >> >> Please review this fix to a tricky reference processing / conc marking >> bug affecting G1 in 9. >> >> The bug occurs when a weak reference WR is promoted to old and >> discovered during an initial mark pause. The WR is the referent of a >> soft reference SR. The concurrent reference processor determines that >> SR should be treated as a weak reference due to shortage of memory and >> now WR is reachable only from the reference pending list but not >> explicitly marked in the bitmap since objects promoted during the >> initial mark pause are not marked immediately. >> >> The reason we are not saved by the SATB pre-barrier here is that >> clearing of the referent field of a reference object does not trigger >> the pre-barrier (and that would kind of defeat its purpose). >> >> Before JDK-8156500 this worked because the reference pending list was >> a static field in the Reference class and the reference class was >> scanned during concurrent marking, so we would never lose track of the >> pending list head. >> >> My suggested fix is to explicitly mark the reference pending list head >> oop during initial mark, after the reference enqueue phase. >> This mirrors how other roots are handled in initial mark, see >> G1Mark::G1MarkPromotedFromRoots. 
>> >> Webrev: http://cr.openjdk.java.net/~mgerdin/8185133/webrev.0 >> Bug: https://bugs.openjdk.java.net/browse/JDK-8185133 >> >> Testing: JPRT, tier2-5 gc tests, a LOT of runs of the failing test. >> >> Many thanks to Kim and Erik ? for discussions around this issue! >> >> Thanks >> /Mikael > > From kim.barrett at oracle.com Fri Jul 28 18:52:52 2017 From: kim.barrett at oracle.com (Kim Barrett) Date: Fri, 28 Jul 2017 14:52:52 -0400 Subject: RFR (9) 8185133: Reference pending list root might not get marked In-Reply-To: References: Message-ID: <5A6A5D4F-16F4-4843-8539-1164558DF333@oracle.com> > On Jul 28, 2017, at 8:50 AM, Mikael Gerdin wrote: > > Hi all, > > Please review this fix to a tricky reference processing / conc marking bug affecting G1 in 9. > > The bug occurs when a weak reference WR is promoted to old and discovered during an initial mark pause. The WR is the referent of a soft reference SR. The concurrent reference processor determines that SR should be treated as a weak reference due to shortage of memory and now WR is reachable only from the reference pending list but not explicitly marked in the bitmap since objects promoted during the initial mark pause are not marked immediately. > > The reason we are not saved by the SATB pre-barrier here is that clearing of the referent field of a reference object does not trigger the pre-barrier (and that would kind of defeat its purpose). > > Before JDK-8156500 this worked because the reference pending list was a static field in the Reference class and the reference class was scanned during concurrent marking, so we would never lose track of the pending list head. > > My suggested fix is to explicitly mark the reference pending list head oop during initial mark, after the reference enqueue phase. > This mirrors how other roots are handled in initial mark, see G1Mark::G1MarkPromotedFromRoots. 
>
> Webrev: http://cr.openjdk.java.net/~mgerdin/8185133/webrev.0
> Bug: https://bugs.openjdk.java.net/browse/JDK-8185133
>
> Testing: JPRT, tier2-5 gc tests, a LOT of runs of the failing test.
>
> Many thanks to Kim and Erik Ö for discussions around this issue!
>
> Thanks
> /Mikael

------------------------------------------------------------------------------
src/share/vm/memory/universe.cpp
 499 oop Universe::reference_pending_list() {
 500   if (Thread::current()->is_VM_thread()) {
 501     assert_pll_locked(is_locked);
 502   } else {
 503     assert_pll_ownership();
 504   }
 505   return _reference_pending_list;
 506 }

I was afraid that conditionalization might be needed.

I think I'd like distinct readers for the different locking context use cases. However, I'd be fine with such a distinction being deferred to JDK 10.

------------------------------------------------------------------------------

Looks good.

From kim.barrett at oracle.com Fri Jul 28 19:53:43 2017
From: kim.barrett at oracle.com (Kim Barrett)
Date: Fri, 28 Jul 2017 15:53:43 -0400
Subject: RFR (9) 8185133: Reference pending list root might not get marked
In-Reply-To: <5420167D-79DF-4D66-8BC6-82B6D55A428D@oracle.com>
References: <61ef3599-6295-7444-7b3c-e731c52c10fe@redhat.com> <5420167D-79DF-4D66-8BC6-82B6D55A428D@oracle.com>
Message-ID: <2971B73D-8175-4E1B-9C73-BD83454CC024@oracle.com>

> On Jul 28, 2017, at 1:20 PM, Erik Osterlund wrote:
>
> Hi Roman,
>
>> On 28 Jul 2017, at 16:53, Roman Kennke wrote:
>>
>> Hi Mikael,
>>
>> I don't really understand what the problem is. The WR ends up on the
>> RPL, with its referent cleared, i.e. no longer pointing to the SR? But
>> we want to keep the SR alive?
>
> No. The WR gets promoted to old during the initial mark evacuation as it was strongly reachable by a SR in young. The referent of the WR died, and therefore it gets discovered. The assumption is then that since it was strongly reachable from the SR in young, the WR will be found during concurrent marking due to SATB.
This is normally a safe assumption.
>
> However, just before finishing the initial mark pause and letting concurrent marking start tracing through the heap, soft references may change strength to suddenly become weak. Therefore, the WR in old never gets marked during concurrent marking unless the GC is made aware of the existence of this new strong edge to the pending list head.
>
> This is a problem, because the pending list was in this scenario exposed to Java land through the pending list head, without the concurrent marking knowing about it, violating GC completeness.

I think SR also needs to be promoted by the initial-mark pause. If SR is young and not promoted, then it will be a survivor of the initial-mark pause, and so will be scanned by scan_root_regions. scan_root_regions doesn't do reference processing, so the scan of the survivor SR will mark WR.

Here's my understanding of the problem scenario:

(1) initial state

SR => WR => O
WR and O are young.
WR and O are unreachable except through the chain from SR.
SR has not expired.

(2) initial_mark

SR and WR are both promoted to oldgen.
SR is not discovered, because it has not expired.
WR is discovered and enqueued, because O is unreachable.
WR ends up at the head of the pending list. This happens after the initial root scan has examined the head of the pending list.

(3) SR expires

We now have an oldgen WR in the pending list, and no certain path by which concurrent marking will reach it, even though it is accessible. (The Java reference processing thread might process and discard it before any damage is actually done, but that's far from certain.)

So it requires a fairly unlikely sequence of events.

Note: If WR ends up anywhere other than at the head of the pending list, it will eventually be visited, either by scan_root_region or normal concurrent marking, depending on its predecessor in the list. (Assuming its predecessor is not another similar case that *did* end up at the head of the list.)
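For readers following the SATB discussion in this thread, here is a toy model of the pre-write barrier behavior being described: ordinary reference stores log the old value, while clearing a Reference's referent intentionally does not. All types and names below are invented for illustration; real G1 enqueues oops into per-thread SATB buffers via G1SATBCardTableModRefBS::enqueue().

```cpp
#include <vector>

// Toy SATB model: the pre-write barrier logs the *old* value of a reference
// field before it is overwritten, so concurrent marking can still reach every
// object that was live in the snapshot taken at marking start.
struct ToySatbQueue {
    std::vector<int> logged;                 // plain ids stand in for oops
    void enqueue(int old_oop) {
        if (old_oop != 0) logged.push_back(old_oop);
    }
};

// An ordinary reference store runs the pre-barrier first...
void oop_store(ToySatbQueue& q, int& field, int new_oop) {
    q.enqueue(field);                        // old value survives for marking
    field = new_oop;
}

// ...but clearing a Reference's referent deliberately skips the barrier:
// logging the dying referent would keep alive exactly the object the
// collector is trying to let go of.
void clear_referent(int& referent_field) {
    referent_field = 0;                      // no enqueue; marking won't see it
}
```

This is why, as discussed above, an object reachable only through a newly created pending-list edge needs an explicit enqueue (or explicit marking) rather than relying on the barrier.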
From kim.barrett at oracle.com Fri Jul 28 19:56:10 2017 From: kim.barrett at oracle.com (Kim Barrett) Date: Fri, 28 Jul 2017 15:56:10 -0400 Subject: RFR (9) 8185133: Reference pending list root might not get marked In-Reply-To: <5A6A5D4F-16F4-4843-8539-1164558DF333@oracle.com> References: <5A6A5D4F-16F4-4843-8539-1164558DF333@oracle.com> Message-ID: <03C36633-1BC2-4787-A541-725ED87C15BE@oracle.com> > On Jul 28, 2017, at 2:52 PM, Kim Barrett wrote: > Looks good. Remember to update copyrights. From ioi.lam at oracle.com Mon Jul 31 04:07:50 2017 From: ioi.lam at oracle.com (Ioi Lam) Date: Sun, 30 Jul 2017 21:07:50 -0700 Subject: RFR: 8179302: Pre-resolve constant pool string entries and cache resolved_reference arrays in CDS archive In-Reply-To: <32C9BE41-D4C3-4242-A8D7-C1E1A5B2E0F3@oracle.com> References: <74D26CA6-E3A8-4ABB-A6E9-D37E3AD2BAD6@oracle.com> <32C9BE41-D4C3-4242-A8D7-C1E1A5B2E0F3@oracle.com> Message-ID: <4172849a-55e6-7133-90d6-2c75b58b0391@oracle.com> Hi Jiangli, Here are my comments. I've not reviewed the GC code and I'll leave that to the GC experts :-) stringTable.cpp: StringTable::archive_string add assert for DumpSharedSpaces only filemap.cpp 525 void FileMapInfo::write_archive_heap_regions(GrowableArray *regions, 526 int first_region, int num_regions) { When I first read this function, I found it hard to follow, especially this part that coalesces the trailing regions: 537 int len = regions->length(); 538 if (len > 1) { 539 start = (char*)regions->at(1).start(); 540 size = (char*)regions->at(len - 1).end() - start; 541 } 542 } The rest of filemap.cpp always perform identical operations on MemRegion arrays, which are either 1 or 2 in size. However, this function doesn't follow that pattern; it also has a very different notion of "region", and the confusing part is regions->size() is not the same as num_regions. How about we change the API to something like the following? 
Before calling this API, the caller needs to coalesce the trailing G1 regions into a single MemRegion.

FileMapInfo::write_archive_heap_regions(MemRegion *regions, int first, int num_regions) {
  if (first == MetaspaceShared::first_string) {
    assert(num_regions <= MetaspaceShared::max_strings, "...");
  } else {
    assert(first == MetaspaceShared::first_open_archive_heap_region, "...");
    assert(num_regions <= MetaspaceShared::max_open_archive_heap_region, "...");
  }
  ....

756 if (!string_data_mapped) {
757   StringTable::ignore_shared_strings(true);
758   assert(string_ranges == NULL && num_string_ranges == 0, "sanity");
759 }
760
761 if (open_archive_heap_data_mapped) {
762   MetaspaceShared::set_open_archive_heap_region_mapped();
763 } else {
764   assert(open_archive_heap_ranges == NULL && num_open_archive_heap_ranges == 0, "sanity");
765 }

Maybe the two "if" statements should be more consistent? Instead of StringTable::ignore_shared_strings, how about StringTable::set_shared_strings_region_mapped()?

FileMapInfo::map_heap_data() --

818 char* addr = (char*)regions[i].start();
819 char* base = os::map_memory(_fd, _full_path, si->_file_offset,
820                             addr, regions[i].byte_size(), si->_read_only,
821                             si->_allow_exec);

What happens when the first region succeeds in mapping but the second region fails to map? Will both regions be unmapped?

I don't see where you store the return value (base) from os::map_memory(). Does that mean the code assumes that (addr == base)? If so, we need an assert here.

constantPool.cpp

Handle refs_handle;
...
refs_handle = Handle(THREAD, (oop)archived);

This will first create a NULL handle, then construct a temporary handle, and then assign the temp handle back to the null handle. This means two handles will be pushed onto THREAD->metadata_handles(). I think it's more efficient if you merge these into a single statement:

Handle refs_handle(THREAD, (oop)archived);

Is this experimental code? Maybe it should be removed?
664   if (tag_at(index).is_unresolved_klass()) {
665 #if 0
666     CPSlot entry = cp->slot_at(index);
667     Symbol* name = entry.get_symbol();
668     Klass* k = SystemDictionary::find(name, NULL, NULL, THREAD);
669     if (k != NULL) {
670       klass_at_put(index, k);
671     }
672 #endif
673   } else

cpCache.hpp:

u8 _archived_references

Shouldn't this be declared as a narrowOop to avoid the type casts when it's used?

cpCache.cpp: add asserts so that one of these is used only at dump time and the other only at run time?

610 oop ConstantPoolCache::archived_references() {
611   return oopDesc::decode_heap_oop((narrowOop)_archived_references);
612 }
613
614 void ConstantPoolCache::set_archived_references(oop o) {
615   _archived_references = (u8)oopDesc::encode_heap_oop(o);
616 }

Thanks!
- Ioi

On 7/27/17 1:37 PM, Jiangli Zhou wrote:
> Sorry, the mail didn't handle the rich text well. I fixed the format below.
>
> Please help review the changes for JDK-8179302 (Pre-resolve constant pool string entries and cache resolved_reference arrays in CDS archive). Currently, the CDS archive can contain cached class metadata and interned java.lang.String objects. This RFE adds the constant pool 'resolved_references' arrays (hotspot specific) to the archive for startup/runtime performance enhancement. The 'resolved_references' arrays are used to hold references of resolved constant pool entries including Strings, mirrors, etc. With the 'resolved_references' being cached, string constants in shared classes can now be resolved to existing interned java.lang.Strings at CDS dump time. G1 and 64-bit platforms are required.
>
> The GC changes in the RFE were discussed and guided by Thomas Schatzl and the GC team. Part of the changes were contributed by Thomas himself.
> RFE: https://bugs.openjdk.java.net/browse/JDK-8179302
> hotspot: http://cr.openjdk.java.net/~jiangli/8179302/webrev.hotspot.01/
> whitebox: http://cr.openjdk.java.net/~jiangli/8179302/webrev.whitebox.01/
>
> Please see below for details of supporting cached 'resolved_references' and pre-resolving string constants.
>
> Types of Pinned G1 Heap Regions
>
> The pinned region type is a super type of all archive region types, which include the open archive type and the closed archive type.
>
> 00100 0 [ 8] Pinned Mask
> 01000 0 [16] Old Mask
> 10000 0 [32] Archive Mask
> 11100 0 [56] Open Archive: ArchiveMask | PinnedMask | OldMask
> 11100 1 [57] Closed Archive: ArchiveMask | PinnedMask | OldMask + 1
>
>
> Pinned Regions
>
> Objects within the region are 'pinned', which means GC does not move any live objects. GC scans and marks objects in the pinned region as normal, but skips forwarding live objects. Pointers in live objects are updated. Dead objects (unreachable) can be collected and freed.
>
> Archive Regions
>
> The archive types are sub-types of 'pinned'. There are two types of archive region currently, open archive and closed archive. Both can support caching java heap objects via the CDS archive.
>
> An archive region is also an old region by design.
>
> Open Archive (GC-RW) Regions
>
> Open archive region is GC writable. GC scans & marks objects within the region and adjusts (updates) pointers in live objects the same way as a pinned region. Live objects (reachable) are pinned and not forwarded by GC.
> Open archive region does not have 'dead' objects. Unreachable objects are 'dormant' objects. Dormant objects are not collected and freed by GC.
>
> Adjustable Outgoing Pointers
>
> As GC can adjust pointers within the live objects in open archive heap region, objects can have outgoing pointers to another java heap region, including closed archive region, open archive region, pinned (or humongous) region, and normal generational region.
When a referenced object is moved by GC, the pointer within the open archive region is updated accordingly. > > Closed Archive (GC-RO) Regions > > The closed archive region is GC read-only region. GC cannot write into the region. Objects are not scanned and marked by GC. Objects are pinned and not forwarded. Pointers are not updated by GC either. Hence, objects within the archive region cannot have any outgoing pointers to another java heap region. Objects however can still have pointers to other objects within the closed archive regions (we might allow pointers to open archive regions in the future). That restricts the type of java objects that can be supported by the archive region. > In JDK 9 we support archive Strings with the archive regions. > > The GC-readonly archive region makes java heap memory sharable among different JVM processes. NOTE: synchronization on the objects within the archive heap region can still cause writes to the memory page. > > Dormant Objects > > Dormant objects are unreachable java objects within the open archive heap region. > A java object in the open archive heap region is a live object if it can be reached during scanning. Some of the java objects in the region may not be reachable during scanning. Those objects are considered as dormant, but not dead. For example, a constant pool 'resolved_references' array is reachable via the klass root if its container klass (shared) is already loaded at the time during GC scanning. If a shared klass is not yet loaded, the klass root is not scanned and it's constant pool 'resolved_reference' array (A) in the open archive region is not reachable. Then A is a dormant object. > > Object State Transition > > All java objects are initially dormant objects when open archive heap regions are mapped to the runtime java heap. A dormant object becomes live object when the associated shared class is loaded at runtime. 
Explicit call to G1SATBCardTableModRefBS::enqueue() needs to be made when a dormant object becomes live. That should be the case for cached objects with strong roots as well, since strong roots are only scanned at the start of GC marking (the initial marking) but not during Remarking/Final marking. If a cached object becomes live during concurrent marking phase, G1 may not find it and mark it live unless a call to G1SATBCardTableModRefBS::enqueue() is made for the object. > > Currently, a live object in the open archive heap region cannot become dormant again. This restriction simplifies GC requirement and guarantees all outgoing pointers are updated by GC correctly. Only objects for shared classes from the builtin class loaders (boot, PlatformClassLoaders, and AppClassLoaders) are supported for caching. > > Caching Java Objects at Archive Dump Time > > The closed archive and open archive regions are allocated near the top of the dump time java heap. Archived java objects are copied into the designated archive heap regions. For example, String objects and the underlying 'value' arrays are copied into the closed archive regions. All references to the archived objects (from shared class metadata, string table, etc) are set to the new heap locations. A hash table is used to keep track of all archived java objects during the copying process to make sure java object is not archived more than once if reached from different roots. It also makes sure references to the same archived object are updated using the same new address location. > > Caching Constant Pool resolved_references Array > > The 'resolved_references' is an array that holds references of resolved constant pool entries including Strings, mirrors and methodTypes, etc. Each loaded class has one 'resolved_references' array (in ConstantPoolCache). The 'resolved_references' arrays are copied into the open archive regions during dump process. 
Prior to copying the 'resolved_references' arrays, JVM iterates through constant pool entries and resolves all JVM_CONSTANT_String entries to existing interned Strings for all archived classes. When resolving, JVM only looks up the string table and finds existing interned Strings without inserting new ones. If a string entry cannot be resolved to an existing interned String, the constant pool entry remain as unresolved. That prevents memory waste if a constant pool string entry is never used at runtime. > > All String objects referenced by the string table are copied first into the closed archive regions. The string table entry is updated with the new location when each String object is archived. The JVM updates the resolved constant pool string entries with the new object locations when copying the 'resolved_references' arrays to the open archive regions. References to the 'resolved_references' arrays in the ConstantPoolCache are also updated. > At runtime as part of ConstantPool::restore_unshareable_info() work, call G1SATBCardTableModRefBS::enqueue() to let GC know the 'resolved_references' is becoming live. A handle is created for the cached object and added to the loader_data's handles. > > Runtime Java Heap With Cached Java Objects > > > The closed archive regions (the string regions) and open archive regions are mapped to the runtime java heap at the same offsets as the dump time offsets from the runtime java heap base. 
> > Preliminary test execution and status: > > JPRT: passed > Tier2-rt: passed > Tier2-gc: passed > Tier2-comp: passed > Tier3-rt: passed > Tier3-gc: passed > Tier3-comp: passed > Tier4-rt: passed > Tier4-gc: passed > Tier4-comp:6 jobs timed out, all other tests passed > Tier5-rt: one test failed but passed when running locally, all other tests passed > Tier5-gc: passed > Tier5-comp: running > hotspot_gc: two jobs timed out, all other tests passed > hotspot_gc in CDS mode: two jobs timed out, all other tests passed > vm.gc: passed > vm.gc in CDS mode: passed > Kichensink: passed > Kichensink in CDS mode: passed > > Thanks, > Jiangli From mikael.gerdin at oracle.com Mon Jul 31 05:40:19 2017 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Mon, 31 Jul 2017 07:40:19 +0200 Subject: RFR (9) 8185133: Reference pending list root might not get marked In-Reply-To: <2971B73D-8175-4E1B-9C73-BD83454CC024@oracle.com> References: <61ef3599-6295-7444-7b3c-e731c52c10fe@redhat.com> <5420167D-79DF-4D66-8BC6-82B6D55A428D@oracle.com> <2971B73D-8175-4E1B-9C73-BD83454CC024@oracle.com> Message-ID: Hi Kim, On 2017-07-28 21:53, Kim Barrett wrote: >> On Jul 28, 2017, at 1:20 PM, Erik Osterlund wrote: >> >> Hi Roman, >> >>> On 28 Jul 2017, at 16:53, Roman Kennke wrote: >>> >>> Hi Mikael, >>> >>> I don't really understand what the problem is. The WR ends up on the >>> RPL, with its referent cleared, i.e. no longer pointing to the SR? But >>> we want to keep the SR alive? >> >> No. The WR gets promoted to old during the initial mark evacuation as it was strongly reachable by a SR in young. The referent of the WR died, and therefore it gets discovered. The assumption is then that since it was strongly reachable from the SR in young, the WR will be found during concurrent marking due to SATB. This is normally a safe assumption. 
>> >> However, just before finishing the initial mark pause and letting concurrent marking start trace through the heap, soft references may change strength to suddenly become weak. Therefore, the WR in old never gets marked during concurrent marking unless the GC is made aware of the existence of this new strong edge to the pending list head. >> >> This is a problem, because the pending list was in this scenario exposed to Java land through the pending list head, without the concurrent marking knowing about it, violating GC completeness. > > I think SR also needs to be promoted by the initial-mark pause. If SR > is young and not promoted, then it will be a survivor of the > initial-mark pause, and so will be scanned by scan_root_regions. > scan_root_regions doesn't do reference processing, so the scan of the > survivor SR will mark WR. > > Here's my understanding of the problem scenario: > > (1) initial state > > SR => WR => O > WR, and O are young > WR and O are unreachable except through the chain from SR > SR has not expired > > (2) initial_mark > > SR and WR are both promoted to oldgen. > SR is not discovered, because it has not expired. > WR is discovered and enqueued, because O is unreachable. > WR ends up at the head of the pending list. This happens after the > initial root scan has examined the head of the pending list. > > (3) SR expires > > We now have an oldgen WR in the pending list, and no certain path by > which concurrent marking will reach it, even though it is accessible. > (The Java reference processing thread might process and discard it > before any damage is actually done, but that's far from certain.) > > So it requires a fairly unlikely sequence of events. > > Note: If WR ends up anywhere other than at the head of the pending > list, it will eventually be visited, either by scan_root_region or > normal concurrent marking, depending on its predecessor in the list. 
> (Assuming its predecessor is not another similar case that *did* end > up at the head of the list.) Thanks for this detailed explanation. /Mikael > From mikael.gerdin at oracle.com Mon Jul 31 05:40:38 2017 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Mon, 31 Jul 2017 07:40:38 +0200 Subject: RFR (9) 8185133: Reference pending list root might not get marked In-Reply-To: <03C36633-1BC2-4787-A541-725ED87C15BE@oracle.com> References: <5A6A5D4F-16F4-4843-8539-1164558DF333@oracle.com> <03C36633-1BC2-4787-A541-725ED87C15BE@oracle.com> Message-ID: <817a3f9f-c78f-7093-59c4-164f63a495d9@oracle.com> Hi Kim, On 2017-07-28 21:56, Kim Barrett wrote: >> On Jul 28, 2017, at 2:52 PM, Kim Barrett wrote: >> Looks good. > > Remember to update copyrights. > Will do, thanks for the review! /Mikael From thomas.schatzl at oracle.com Mon Jul 31 13:25:23 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Mon, 31 Jul 2017 15:25:23 +0200 Subject: RFR (9) 8185133: Reference pending list root might not get marked In-Reply-To: <5A6A5D4F-16F4-4843-8539-1164558DF333@oracle.com> References: <5A6A5D4F-16F4-4843-8539-1164558DF333@oracle.com> Message-ID: <1501507523.2394.2.camel@oracle.com> Hi, On Fri, 2017-07-28 at 14:52 -0400, Kim Barrett wrote: > > > > On Jul 28, 2017, at 8:50 AM, Mikael Gerdin > m> wrote: > > > > Hi all, > > > > Please review this fix to a tricky reference processing / conc > > marking bug affecting G1 in 9. > > > > The bug occurs when a weak reference WR is promoted to old > > and[...] My suggested fix is to explicitly mark the reference > > pending list > > head oop during initial mark, after the reference enqueue phase. > > This mirrors how other roots are handled in initial mark, see > > G1Mark::G1MarkPromotedFromRoots. > > > > Webrev: http://cr.openjdk.java.net/~mgerdin/8185133/webrev.0 > > Bug: https://bugs.openjdk.java.net/browse/JDK-8185133 > > > > Testing: JPRT, tier2-5 gc tests, a LOT of runs of the failing test. 
> > > > Many thanks to Kim and Erik Ö for discussions around this issue! > > Thanks > > /Mikael > ------------------------------------------------------------------- > ----------- > src/share/vm/memory/universe.cpp > 499 oop Universe::reference_pending_list() { > 500   if (Thread::current()->is_VM_thread()) { > 501     assert_pll_locked(is_locked); > 502   } else { > 503     assert_pll_ownership(); > 504   } > 505   return _reference_pending_list; > 506 } > > I was afraid that conditionalization might be needed. > > I think I'd like distinct readers for the different locking context > use cases. However, I'd be fine with such a distinction being > deferred to JDK 10. > I agree that this code looks ugly, and with Kim that fixing this can wait. Looks good, and great work. Thomas From daniel.daugherty at oracle.com Mon Jul 31 14:24:31 2017 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Mon, 31 Jul 2017 08:24:31 -0600 Subject: URGENT RFR (S): fix for Test8004741.java crashes with SIGSEGV in JDK10-hs nightly (8185273) Message-ID: <6f413b14-e250-cd2f-85ad-e5eb345e9603@oracle.com> Greetings, I have a fix for the following P1 JDK10-hs integration_blocker bug: 8185273 Test8004741.java crashes with SIGSEGV in JDK10-hs nightly https://bugs.openjdk.java.net/browse/JDK-8185273 The fix is 2 lines and the comment describing the fix is 4 lines: src/share/vm/runtime/thread.cpp: L3388: void Threads::parallel_java_threads_do(ThreadClosure* tc) { L3395: // This function is used by ParallelSPCleanupTask in safepoint.cpp L3396: // for cleaning up JavaThreads, but we have to keep the VMThread's L3397: // _oops_do_parity field in sync so we don't miss a parallel GC on L3398: // the VMThread. L3399: VMThread* vmt = VMThread::vm_thread(); L3400: (void)vmt->claim_oops_do(true, cp); I'm also including some new logging for the VMThread (tag == 'vmthread') that came in useful during this bug hunt. 
Lastly, I've fixed a few minor typos that I ran across in the areas where I was hunting. Webrev URL: http://cr.openjdk.java.net/~dcubed/8185273-webrev/0/ There's lots of discussion in the bug. The evaluation comment that I added on Sunday, July 30 is probably the most complete and hopefully the most clear. For context, here's the webrev for 8180932 and another bug fix: http://cr.openjdk.java.net/~rkennke/8180932/webrev.18/ http://cr.openjdk.java.net/~rkennke/8185102/webrev.01/ Testing: - JPRT - Test8004741.java has been running in a forever loop with 'fastdebug' bits (17200+ iterations) and 'slowdebug' bits (13400+ iterations) Comments, questions and feedback are welcome. Dan P.S. Roman and I were also thinking about updating Threads::assert_all_threads_claimed() to verify that the VMThread is also claimed... Obviously that's not part of the current patch... From shade at redhat.com Mon Jul 31 14:35:28 2017 From: shade at redhat.com (Aleksey Shipilev) Date: Mon, 31 Jul 2017 16:35:28 +0200 Subject: URGENT RFR (S): fix for Test8004741.java crashes with SIGSEGV in JDK10-hs nightly (8185273) In-Reply-To: <6f413b14-e250-cd2f-85ad-e5eb345e9603@oracle.com> References: <6f413b14-e250-cd2f-85ad-e5eb345e9603@oracle.com> Message-ID: On 07/31/2017 04:24 PM, Daniel D. Daugherty wrote: > Greetings, > > I have a fix for the following P1 JDK10-hs integration_blocker bug: > > 8185273 Test8004741.java crashes with SIGSEGV in JDK10-hs nightly > https://bugs.openjdk.java.net/browse/JDK-8185273 > > The fix is 2 lines and the comment describing the fix is 4 lines: > > src/share/vm/runtime/thread.cpp: > > L3388: void Threads::parallel_java_threads_do(ThreadClosure* tc) { > > L3395: // This function is used by ParallelSPCleanupTask in safepoint.cpp > L3396: // for cleaning up JavaThreads, but we have to keep the VMThread's > L3397: // _oops_do_parity field in sync so we don't miss a parallel GC on > L3398: // the VMThread. 
> L3399: VMThread* vmt = VMThread::vm_thread(); > L3400: (void)vmt->claim_oops_do(true, cp); > > I'm also including some new logging for the VMThread (tag == 'vmthread') > that came in useful during this bug hunt. Lastly, I've fixed a few minor > typos that I ran across in the areas where I was hunting. > > Webrev URL: http://cr.openjdk.java.net/~dcubed/8185273-webrev/0/ Those changes make sense, thanks. It is probably worth mentioning that Threads::parallel_java_threads_do should be in sync with Threads::possibly_parallel_oops_do? It gets easier to point out the symmetry: possibly_parallel_... claims all Java threads and the VMThread, so this should also claim the VMThread. -Aleksey From rkennke at redhat.com Mon Jul 31 14:42:21 2017 From: rkennke at redhat.com (Roman Kennke) Date: Mon, 31 Jul 2017 16:42:21 +0200 Subject: URGENT RFR (S): fix for Test8004741.java crashes with SIGSEGV in JDK10-hs nightly (8185273) In-Reply-To: <6f413b14-e250-cd2f-85ad-e5eb345e9603@oracle.com> References: <6f413b14-e250-cd2f-85ad-e5eb345e9603@oracle.com> Message-ID: Hi Dan, You could also do_thread() on the VMThread, and let the ThreadClosurer filter it. I believe the ThreadClosure in safepoint.cpp (currently only consumer) already filters it. This would make it consistent with Threads::possibly_parallel_oops_do() (and infact, that latter method could just use the new Threads::parallel_java_threads_do() but this is beyond the scope). I leave that to you to decide though. I'd also include the fix for assert_all_threads_claimed() because it's related (and the cause for me not noticing this slip). But that is up to you too. ;-) In other words, thumbs up, unless you want to add the above points. And sorry for making such a mess! 
Roman > Greetings, > > I have a fix for the following P1 JDK10-hs integration_blocker bug: > > 8185273 Test8004741.java crashes with SIGSEGV in JDK10-hs nightly > https://bugs.openjdk.java.net/browse/JDK-8185273 > > The fix is 2 lines and the comment describing the fix is 4 lines: > > src/share/vm/runtime/thread.cpp: > > L3388: void Threads::parallel_java_threads_do(ThreadClosure* tc) { > > L3395: // This function is used by ParallelSPCleanupTask in > safepoint.cpp > L3396: // for cleaning up JavaThreads, but we have to keep the > VMThread's > L3397: // _oops_do_parity field in sync so we don't miss a parallel > GC on > L3398: // the VMThread. > L3399: VMThread* vmt = VMThread::vm_thread(); > L3400: (void)vmt->claim_oops_do(true, cp); > > I'm also including some new logging for the VMThread (tag == 'vmthread') > that came in useful during this bug hunt. Lastly, I've fixed a few minor > typos that I ran across in the areas where I was hunting. > > Webrev URL: http://cr.openjdk.java.net/~dcubed/8185273-webrev/0/ > > There's lots of discussion in the bug. The evaluation comment that I > added > on Sunday, July 30 is probably the most complete and hopefully the > most clear. > > For context, here's the webrev for 8180932 and another bug fix: > > http://cr.openjdk.java.net/~rkennke/8180932/webrev.18/ > http://cr.openjdk.java.net/~rkennke/8185102/webrev.01/ > > > Testing: > - JPRT > - Test8004741.java has been running in a forever loop with 'fastdebug' > bits (17200+ iterations) and 'slowdebug' bits (13400+ iterations) > > Comments, questions and feedback are welcome. > > Dan > > P.S. > Roman and I were also thinking about updating > Threads::assert_all_threads_claimed() to verify > that the VMThread is also claimed... Obviously > that's not part of the current patch... From daniel.daugherty at oracle.com Mon Jul 31 15:09:38 2017 From: daniel.daugherty at oracle.com (Daniel D. 
Daugherty) Date: Mon, 31 Jul 2017 09:09:38 -0600 Subject: URGENT RFR (S): fix for Test8004741.java crashes with SIGSEGV in JDK10-hs nightly (8185273) In-Reply-To: References: <6f413b14-e250-cd2f-85ad-e5eb345e9603@oracle.com> Message-ID: <46716761-6f2c-aed1-e26e-3541e1fd84dc@oracle.com> On 7/31/17 8:35 AM, Aleksey Shipilev wrote: > On 07/31/2017 04:24 PM, Daniel D. Daugherty wrote: >> Greetings, >> >> I have a fix for the following P1 JDK10-hs integration_blocker bug: >> >> 8185273 Test8004741.java crashes with SIGSEGV in JDK10-hs nightly >> https://bugs.openjdk.java.net/browse/JDK-8185273 >> >> The fix is 2 lines and the comment describing the fix is 4 lines: >> >> src/share/vm/runtime/thread.cpp: >> >> L3388: void Threads::parallel_java_threads_do(ThreadClosure* tc) { >> >> L3395: // This function is used by ParallelSPCleanupTask in safepoint.cpp >> L3396: // for cleaning up JavaThreads, but we have to keep the VMThread's >> L3397: // _oops_do_parity field in sync so we don't miss a parallel GC on >> L3398: // the VMThread. >> L3399: VMThread* vmt = VMThread::vm_thread(); >> L3400: (void)vmt->claim_oops_do(true, cp); >> >> I'm also including some new logging for the VMThread (tag == 'vmthread') >> that came in useful during this bug hunt. Lastly, I've fixed a few minor >> typos that I ran across in the areas where I was hunting. >> >> Webrev URL: http://cr.openjdk.java.net/~dcubed/8185273-webrev/0/ > Those changes make sense, thanks. Thanks for the fast review! > It is probably worth mentioning that Threads::parallel_java_threads_do should be in sync with > Threads::possibly_parallel_oops_do? It gets easier to point out the symmetry: possibly_parallel_... > claims all Java threads and the VMThread, so this should also claim the VMThread. We would have to be careful about how we phrase that. Threads::possibly_parallel_oops_do() claims and applies the closure to all the threads it claims. 
Threads::parallel_java_threads_do() is missing the claim for the VMThread (this bug), but does not apply the closure to the VMThread. I think we'll be in good shape once Threads::assert_all_threads_claimed() is updated to make sure that the VMThread is claimed. Once that happens, anyone that uses StrongRootsScope to manage the "claim" protocol will have a sanity check in place. Dan > > -Aleksey > From daniel.daugherty at oracle.com Mon Jul 31 15:14:45 2017 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Mon, 31 Jul 2017 09:14:45 -0600 Subject: URGENT RFR (S): fix for Test8004741.java crashes with SIGSEGV in JDK10-hs nightly (8185273) In-Reply-To: References: <6f413b14-e250-cd2f-85ad-e5eb345e9603@oracle.com> Message-ID: On 7/31/17 8:42 AM, Roman Kennke wrote: > Hi Dan, > > You could also do_thread() on the VMThread, and let the ThreadClosurer > filter it. I believe the ThreadClosure in safepoint.cpp (currently only > consumer) already filters it. This would make it consistent with > Threads::possibly_parallel_oops_do() (and infact, that latter method > could just use the new Threads::parallel_java_threads_do() but this is > beyond the scope). I leave that to you to decide though. I'm good with just adding the missing part of the "claims" protocol. I'm not comfortable with applying the closure to the VMThread since I'm just visiting the GC sandbox as it were... :-) > I'd also include the fix for assert_all_threads_claimed() because it's > related (and the cause for me not noticing this slip). But that is up to > you too. ;-) Yes, I plan to kick off another JPRT run with the additional fix for assert_all_threads_claimed()... If that goes well, then I'll include it... > > In other words, thumbs up, unless you want to add the above points. Thanks for the review! > And sorry for making such a mess! No worries. We have it covered. Dan P.S. Reminder: you're supposed to be on vacation! (But I do appreciate you taking the time to chime in here...) 
> > Roman > >> Greetings, >> >> I have a fix for the following P1 JDK10-hs integration_blocker bug: >> >> 8185273 Test8004741.java crashes with SIGSEGV in JDK10-hs nightly >> https://bugs.openjdk.java.net/browse/JDK-8185273 >> >> The fix is 2 lines and the comment describing the fix is 4 lines: >> >> src/share/vm/runtime/thread.cpp: >> >> L3388: void Threads::parallel_java_threads_do(ThreadClosure* tc) { >> >> L3395: // This function is used by ParallelSPCleanupTask in >> safepoint.cpp >> L3396: // for cleaning up JavaThreads, but we have to keep the >> VMThread's >> L3397: // _oops_do_parity field in sync so we don't miss a parallel >> GC on >> L3398: // the VMThread. >> L3399: VMThread* vmt = VMThread::vm_thread(); >> L3400: (void)vmt->claim_oops_do(true, cp); >> >> I'm also including some new logging for the VMThread (tag == 'vmthread') >> that came in useful during this bug hunt. Lastly, I've fixed a few minor >> typos that I ran across in the areas where I was hunting. >> >> Webrev URL: http://cr.openjdk.java.net/~dcubed/8185273-webrev/0/ >> >> There's lots of discussion in the bug. The evaluation comment that I >> added >> on Sunday, July 30 is probably the most complete and hopefully the >> most clear. >> >> For context, here's the webrev for 8180932 and another bug fix: >> >> http://cr.openjdk.java.net/~rkennke/8180932/webrev.18/ >> http://cr.openjdk.java.net/~rkennke/8185102/webrev.01/ >> >> >> Testing: >> - JPRT >> - Test8004741.java has been running in a forever loop with 'fastdebug' >> bits (17200+ iterations) and 'slowdebug' bits (13400+ iterations) >> >> Comments, questions and feedback are welcome. >> >> Dan >> >> P.S. >> Roman and I were also thinking about updating >> Threads::assert_all_threads_claimed() to verify >> that the VMThread is also claimed... Obviously >> that's not part of the current patch... 
> From shade at redhat.com Mon Jul 31 15:43:42 2017 From: shade at redhat.com (Aleksey Shipilev) Date: Mon, 31 Jul 2017 17:43:42 +0200 Subject: URGENT RFR (S): fix for Test8004741.java crashes with SIGSEGV in JDK10-hs nightly (8185273) In-Reply-To: <46716761-6f2c-aed1-e26e-3541e1fd84dc@oracle.com> References: <6f413b14-e250-cd2f-85ad-e5eb345e9603@oracle.com> <46716761-6f2c-aed1-e26e-3541e1fd84dc@oracle.com> Message-ID: On 07/31/2017 05:09 PM, Daniel D. Daugherty wrote: > On 7/31/17 8:35 AM, Aleksey Shipilev wrote: >>> Webrev URL: http://cr.openjdk.java.net/~dcubed/8185273-webrev/0/ >> Those changes make sense, thanks. > > Thanks for the fast review! > > >> It is probably worth mentioning that Threads::parallel_java_threads_do should be in sync with >> Threads::possibly_parallel_oops_do? It gets easier to point out the symmetry: possibly_parallel_... >> claims all Java threads and the VMThread, so this should also claim the VMThread. > > We would have to be careful about how we phrase that. > Threads::possibly_parallel_oops_do() claims and applies > the closure to all the threads it claims. > > Threads::parallel_java_threads_do() is missing the claim > for the VMThread (this bug), but does not apply the > closure to the VMThread. Yeah. It's just I had to work upwards from the gory details explained in the comment to the actual setup for the bug to appear. I think details about ParallelSPCleanupTask, safepoint.cpp, parity, etc. are too low-level here, and capture only the current state of affairs. E.g. what if there are more callers to parallel_java_threads_do in future? What if Parallel SP cleanup ceases to call it? Would the comment get outdated? Does Threads::parallel_java_threads_do make sense without Parallel SP cleanup? Yes, it does. Would it make sense to cherry-pick it somewhere else with that comment as stated? Not really. AFAIU, the high-level bug is because we have to claim the same subset of threads on all paths. 
From that, it becomes obvious that if possibly_parallel_java_threads_do claims VMThread, all other paths should claim it too. Something like this: Threads::parallel_java_threads_do(ThreadClosure* tc) { ... // Thread claiming protocol requires us to claim the same interesting threads // on all paths. Notably, Threads::possibly_parallel_threads_do claims all // Java threads *and* the VMThread. To avoid breaking the claiming protocol, // we have to appear to claim VMThread on this path too, even if we would not // process the VMThread oops. VMThread* vmt = VMThread::vm_thread(); (void)vmt->claim_oops_do(true, cp); ...and then the assert fix would seal the deal. Thanks, -Aleksey From daniel.daugherty at oracle.com Mon Jul 31 16:42:53 2017 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Mon, 31 Jul 2017 10:42:53 -0600 Subject: URGENT RFR (S): fix for Test8004741.java crashes with SIGSEGV in JDK10-hs nightly (8185273) In-Reply-To: References: <6f413b14-e250-cd2f-85ad-e5eb345e9603@oracle.com> Message-ID: <3c7259d1-22ab-c407-3259-a59a71893b13@oracle.com> On 7/31/17 9:14 AM, Daniel D. Daugherty wrote: > On 7/31/17 8:42 AM, Roman Kennke wrote: >> Hi Dan, >> >> You could also do_thread() on the VMThread, and let the ThreadClosurer >> filter it. I believe the ThreadClosure in safepoint.cpp (currently only >> consumer) already filters it. This would make it consistent with >> Threads::possibly_parallel_oops_do() (and infact, that latter method >> could just use the new Threads::parallel_java_threads_do() but this is >> beyond the scope). I leave that to you to decide though. > > I'm good with just adding the missing part of the "claims" protocol. > I'm not comfortable with applying the closure to the VMThread since > I'm just visiting the GC sandbox as it were...
:-) > >> I'd also include the fix for assert_all_threads_claimed() because it's >> related (and the cause for me not noticing this slip). But that is up to >> you too. ;-) > > Yes, I plan to kick off another JPRT run with the additional fix for > assert_all_threads_claimed()... If that goes well, then I'll include > it... Here's the addition of the assert: $ diff -C 6 src/share/vm/runtime/thread.cpp.cr0 src/share/vm/runtime/thread.cpp *** src/share/vm/runtime/thread.cpp.cr0 Sun Jul 30 18:49:06 2017 --- src/share/vm/runtime/thread.cpp Mon Jul 31 08:22:47 2017 *************** *** 4360,4371 **** --- 4360,4375 ---- void Threads::assert_all_threads_claimed() { ALL_JAVA_THREADS(p) { const int thread_parity = p->oops_do_parity(); assert((thread_parity == _thread_claim_parity), "Thread " PTR_FORMAT " has incorrect parity %d != %d", p2i(p), thread_parity, _thread_claim_parity); } + VMThread* vmt = VMThread::vm_thread(); + const int thread_parity = vmt->oops_do_parity(); + assert((thread_parity == _thread_claim_parity), + "VMThread " PTR_FORMAT " has incorrect parity %d != %d", p2i(vmt), thread_parity, _thread_claim_parity); } #endif // ASSERT void Threads::possibly_parallel_oops_do(bool is_par, OopClosure* f, CodeBlobClosure* cf) { int cp = Threads::thread_claim_parity(); ALL_JAVA_THREADS(p) { I ran a test JPRT job and there were no problems. Aleksey and Roman, are you two good with the assert? Dan > >> >> In other words, thumbs up, unless you want to add the above points. > > Thanks for the review! > > >> And sorry for making such a mess! > > No worries. We have it covered. > > Dan > > P.S. > Reminder: you're supposed to be on vacation! (But I do appreciate > you taking the time to chime in here...) 
> > >> >> Roman >> >>> Greetings, >>> >>> I have a fix for the following P1 JDK10-hs integration_blocker bug: >>> >>> 8185273 Test8004741.java crashes with SIGSEGV in JDK10-hs nightly >>> https://bugs.openjdk.java.net/browse/JDK-8185273 >>> >>> The fix is 2 lines and the comment describing the fix is 4 lines: >>> >>> src/share/vm/runtime/thread.cpp: >>> >>> L3388: void Threads::parallel_java_threads_do(ThreadClosure* tc) { >>> >>> L3395: // This function is used by ParallelSPCleanupTask in >>> safepoint.cpp >>> L3396: // for cleaning up JavaThreads, but we have to keep the >>> VMThread's >>> L3397: // _oops_do_parity field in sync so we don't miss a parallel >>> GC on >>> L3398: // the VMThread. >>> L3399: VMThread* vmt = VMThread::vm_thread(); >>> L3400: (void)vmt->claim_oops_do(true, cp); >>> >>> I'm also including some new logging for the VMThread (tag == >>> 'vmthread') >>> that came in useful during this bug hunt. Lastly, I've fixed a few >>> minor >>> typos that I ran across in the areas where I was hunting. >>> >>> Webrev URL: http://cr.openjdk.java.net/~dcubed/8185273-webrev/0/ >>> >>> There's lots of discussion in the bug. The evaluation comment that I >>> added >>> on Sunday, July 30 is probably the most complete and hopefully the >>> most clear. >>> >>> For context, here's the webrev for 8180932 and another bug fix: >>> >>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.18/ >>> http://cr.openjdk.java.net/~rkennke/8185102/webrev.01/ >>> >>> >>> Testing: >>> - JPRT >>> - Test8004741.java has been running in a forever loop with >>> 'fastdebug' >>> bits (17200+ iterations) and 'slowdebug' bits (13400+ iterations) >>> >>> Comments, questions and feedback are welcome. >>> >>> Dan >>> >>> P.S. >>> Roman and I were also thinking about updating >>> Threads::assert_all_threads_claimed() to verify >>> that the VMThread is also claimed... Obviously >>> that's not part of the current patch... 
>> > > From rkennke at redhat.com Mon Jul 31 16:46:14 2017 From: rkennke at redhat.com (Roman Kennke) Date: Mon, 31 Jul 2017 18:46:14 +0200 Subject: URGENT RFR (S): fix for Test8004741.java crashes with SIGSEGV in JDK10-hs nightly (8185273) In-Reply-To: References: <6f413b14-e250-cd2f-85ad-e5eb345e9603@oracle.com> Message-ID: <2b846656-a319-a2b3-9c32-554498e5f7ba@redhat.com> Am 31.07.2017 um 17:14 schrieb Daniel D. Daugherty: > On 7/31/17 8:42 AM, Roman Kennke wrote: >> Hi Dan, >> >> You could also do_thread() on the VMThread, and let the ThreadClosurer >> filter it. I believe the ThreadClosure in safepoint.cpp (currently only >> consumer) already filters it. This would make it consistent with >> Threads::possibly_parallel_oops_do() (and infact, that latter method >> could just use the new Threads::parallel_java_threads_do() but this is >> beyond the scope). I leave that to you to decide though. > > I'm good with just adding the missing part of the "claims" protocol. > I'm not comfortable with applying the closure to the VMThread since > I'm just visiting the GC sandbox as it were... :-) Ok. I'll do it in a followup then. IMO it would be best if there is *one* place that does the claiming protocol (i.e. parallel_java_threads_do() which should probably be renamed to parallel_threads_do() ), and have possibly_parallel_oops_do() use that via a private ThreadClosure. Best to do it asap, as long as there's only 1 user it's easy to see that it's correct ;-) >> I'd also include the fix for assert_all_threads_claimed() because it's >> related (and the cause for me not noticing this slip). But that is up to >> you too. ;-) > > Yes, I plan to kick off another JPRT run with the additional fix for > assert_all_threads_claimed()... If that goes well, then I'll include > it... Great! > > P.S. > Reminder: you're supposed to be on vacation! (But I do appreciate > you taking the time to chime in here...) 
Yeah, I should be at the Atlantic already, but my son got sick and we have to delay travel a little bit... And in reply to Aleksey: yes there will be more callers of Threads::parallel_java_threads_do() in the future :-) We've got one in Shenandoah already... Thanks for doing this! Cheers, Roman From daniel.daugherty at oracle.com Mon Jul 31 16:47:02 2017 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Mon, 31 Jul 2017 10:47:02 -0600 Subject: URGENT RFR (S): fix for Test8004741.java crashes with SIGSEGV in JDK10-hs nightly (8185273) In-Reply-To: References: <6f413b14-e250-cd2f-85ad-e5eb345e9603@oracle.com> <46716761-6f2c-aed1-e26e-3541e1fd84dc@oracle.com> Message-ID: <666db742-9fae-768b-fb86-93a086381ec6@oracle.com> On 7/31/17 9:43 AM, Aleksey Shipilev wrote: > On 07/31/2017 05:09 PM, Daniel D. Daugherty wrote: >> On 7/31/17 8:35 AM, Aleksey Shipilev wrote: >>>> Webrev URL: http://cr.openjdk.java.net/~dcubed/8185273-webrev/0/ >>> Those changes make sense, thanks. >> Thanks for the fast review! >> >> >>> It is probably worth mentioning that Threads::parallel_java_threads_do should be in sync with >>> Threads::possibly_parallel_oops_do? It gets easier to point out the symmetry: possibly_parallel_... >>> claims all Java threads and the VMThread, so this should also claim the VMThread. >> We would have to be careful about how we phrase that. >> Threads::possibly_parallel_oops_do() claims and applies >> the closure to all the threads it claims. >> >> Threads::parallel_java_threads_do() is missing the claim >> for the VMThread (this bug), but does not apply the >> closure to the VMThread. > Yeah. It's just I had to work upwards from the gory details explained in the comment to the actual > setup for the bug to appear. I think details about ParallelSPCleanupTask, safepoint.cpp, parity, > etc. are too low-level here, and capture only the current state of affairs. E.g. what if there are > more callers to parallel_java_threads_do in future? 
What if Parallel SP cleanup ceases to call it? > Would the comment get outdated? Does Threads::parallel_java_threads_do make sense without Parallel > SP cleanup? Yes, it does. Would it make sense to cherry-pick it somewhere else with that comment as > stated? Not really. > > AFAIU, the high-level bug is because we have to claim the same subset of threads on all paths. From > that, it becomes obvious that if possibly_parallel_java_threads_do claims VMThread, all other paths > should claim it too. > > Something like this: > > Threads::parallel_java_threads_do(ThreadClosure* tc) { > ... > > // Thread claiming protocol requires us to claim the same interesting threads > // on all paths. Notably, Threads::possibly_parallel_threads_do claims all > // Java threads *and* the VMThread. To avoid breaking the claiming protocol, > // we have to appear to claim VMThread on this path too, even if we would not > // process the VMThread oops. > VMThread* vmt = VMThread::vm_thread(); > (void)vmt->claim_oops_do(true, cp); I like your comment better than mine, with a slight tweak: // Thread claiming protocol requires us to claim the same interesting threads // on all paths. Notably, Threads::possibly_parallel_threads_do claims all // Java threads *and* the VMThread. To avoid breaking the claiming protocol, // we have to claim VMThread on this path too, even if we do not apply the // closure to the VMThread. > > ...and then the assert fix would seal the deal. The assert diffs are applied and tested via JPRT. Please see my other e-mail on this thread... Dan > > Thanks, > -Aleksey > From daniel.daugherty at oracle.com Mon Jul 31 16:50:24 2017 From: daniel.daugherty at oracle.com (Daniel D. 
Daugherty) Date: Mon, 31 Jul 2017 10:50:24 -0600 Subject: URGENT RFR (S): fix for Test8004741.java crashes with SIGSEGV in JDK10-hs nightly (8185273) In-Reply-To: <2b846656-a319-a2b3-9c32-554498e5f7ba@redhat.com> References: <6f413b14-e250-cd2f-85ad-e5eb345e9603@oracle.com> <2b846656-a319-a2b3-9c32-554498e5f7ba@redhat.com> Message-ID: <4eb7bf5f-bf1c-3bf6-41fa-59a3c4e8f69b@oracle.com> On 7/31/17 10:46 AM, Roman Kennke wrote: > Am 31.07.2017 um 17:14 schrieb Daniel D. Daugherty: >> On 7/31/17 8:42 AM, Roman Kennke wrote: >>> Hi Dan, >>> >>> You could also do_thread() on the VMThread, and let the ThreadClosurer >>> filter it. I believe the ThreadClosure in safepoint.cpp (currently only >>> consumer) already filters it. This would make it consistent with >>> Threads::possibly_parallel_oops_do() (and infact, that latter method >>> could just use the new Threads::parallel_java_threads_do() but this is >>> beyond the scope). I leave that to you to decide though. >> I'm good with just adding the missing part of the "claims" protocol. >> I'm not comfortable with applying the closure to the VMThread since >> I'm just visiting the GC sandbox as it were... :-) > Ok. I'll do it in a followup then. IMO it would be best if there is > *one* place that does the claiming protocol (i.e. > parallel_java_threads_do() which should probably be renamed to > parallel_threads_do() ), and have possibly_parallel_oops_do() use that > via a private ThreadClosure. Best to do it asap, as long as there's only > 1 user it's easy to see that it's correct ;-) Thanks. I can file a follow up bug: cleanup parallel_java_threads_do() and possibly_parallel_oops_do() and assign it to you if you like... >>> I'd also include the fix for assert_all_threads_claimed() because it's >>> related (and the cause for me not noticing this slip). But that is up to >>> you too. ;-) >> Yes, I plan to kick off another JPRT run with the additional fix for >> assert_all_threads_claimed()... 
If that goes well, then I'll include >> it... > Great! Done. And I sent out the diffs... >> P.S. >> Reminder: you're supposed to be on vacation! (But I do appreciate >> you taking the time to chime in here...) > Yeah, I should be at the Atlantic already, but my son got sick and we > have to delay travel a little bit... Hope your son gets well soon... > And in reply to Aleksey: yes there will be more callers of > Threads::parallel_java_threads_do() in the future :-) We've got one in > Shenandoah already... > > Thanks for doing this! > Cheers, > Roman > Dan From daniel.daugherty at oracle.com Mon Jul 31 17:07:14 2017 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Mon, 31 Jul 2017 11:07:14 -0600 Subject: URGENT RFR (S): fix for Test8004741.java crashes with SIGSEGV in JDK10-hs nightly (8185273) In-Reply-To: <666db742-9fae-768b-fb86-93a086381ec6@oracle.com> References: <6f413b14-e250-cd2f-85ad-e5eb345e9603@oracle.com> <46716761-6f2c-aed1-e26e-3541e1fd84dc@oracle.com> <666db742-9fae-768b-fb86-93a086381ec6@oracle.com> Message-ID: <797922fb-4893-bfbd-8edd-b47b22f64d36@oracle.com> Latest webrev: http://cr.openjdk.java.net/~dcubed/8185273-webrev/1/ Only src/share/vm/runtime/thread.cpp is changed relative to round 0: - Revised the comment in Threads::parallel_java_threads_do. - Added the assert to Threads::assert_all_threads_claimed(). Comments, questions and feedback are welcome. Dan On 7/31/17 10:47 AM, Daniel D. Daugherty wrote: > On 7/31/17 9:43 AM, Aleksey Shipilev wrote: >> On 07/31/2017 05:09 PM, Daniel D. Daugherty wrote: >>> On 7/31/17 8:35 AM, Aleksey Shipilev wrote: >>>>> Webrev URL: http://cr.openjdk.java.net/~dcubed/8185273-webrev/0/ >>>> Those changes make sense, thanks. >>> Thanks for the fast review! >>> >>> >>>> It is probably worth mentioning that >>>> Threads::parallel_java_threads_do should be in sync with >>>> Threads::possibly_parallel_oops_do? It gets easier to point out the >>>> symmetry: possibly_parallel_... 
>>>> claims all Java threads and the VMThread, so this should also claim >>>> the VMThread. >>> We would have to be careful about how we phrase that. >>> Threads::possibly_parallel_oops_do() claims and applies >>> the closure to all the threads it claims. >>> >>> Threads::parallel_java_threads_do() is missing the claim >>> for the VMThread (this bug), but does not apply the >>> closure to the VMThread. >> Yeah. It's just I had to work upwards from the gory details explained >> in the comment to the actual >> setup for the bug to appear. I think details about >> ParallelSPCleanupTask, safepoint.cpp, parity, >> etc. are too low-level here, and capture only the current state of >> affairs. E.g. what if there are >> more callers to parallel_java_threads_do in future? What if Parallel >> SP cleanup ceases to call it? >> Would the comment get outdated? Does >> Threads::parallel_java_threads_do make sense without Parallel >> SP cleanup? Yes, it does. Would it make sense to cherry-pick it >> somewhere else with that comment as >> stated? Not really. >> >> AFAIU, the high-level bug is because we have to claim the same subset >> of threads on all paths. From >> that, it becomes obvious that if possibly_parallel_java_threads_do >> claims VMThread, all other paths >> should claim it too. >> >> Something like this: >> >> Threads::parallel_java_threads_do(ThreadClosure* tc) { >> ... >> >> // Thread claiming protocol requires us to claim the same >> interesting threads >> // on all paths. Notably, Threads::possibly_parallel_threads_do >> claims all >> // Java threads *and* the VMThread. To avoid breaking the >> claiming protocol, >> // we have to appear to claim VMThread on this path too, even if >> we would not >> // process the VMThread oops. 
>> VMThread* vmt = VMThread::vm_thread(); >> (void)vmt->claim_oops_do(true, cp); > > I like your comment better than mine, with a slight tweak: > > // Thread claiming protocol requires us to claim the same > interesting threads > // on all paths. Notably, Threads::possibly_parallel_threads_do > claims all > // Java threads *and* the VMThread. To avoid breaking the claiming > protocol, > // we have to claim VMThread on this path too, even if we do not > apply the > // closure to the VMThread. > >> >> ...and then the assert fix would seal the deal. > > The assert diffs are applied and tested via JPRT. Please see my > other e-mail on this thread... > > Dan > > >> >> Thanks, >> -Aleksey >> > > From rkennke at redhat.com Mon Jul 31 17:10:54 2017 From: rkennke at redhat.com (Roman Kennke) Date: Mon, 31 Jul 2017 19:10:54 +0200 Subject: URGENT RFR (S): fix for Test8004741.java crashes with SIGSEGV in JDK10-hs nightly (8185273) In-Reply-To: <797922fb-4893-bfbd-8edd-b47b22f64d36@oracle.com> References: <6f413b14-e250-cd2f-85ad-e5eb345e9603@oracle.com> <46716761-6f2c-aed1-e26e-3541e1fd84dc@oracle.com> <666db742-9fae-768b-fb86-93a086381ec6@oracle.com> <797922fb-4893-bfbd-8edd-b47b22f64d36@oracle.com> Message-ID: <436d8741-a387-8442-1133-cb0a3e78f8c3@redhat.com> Looks good! Roman (not an official reviewer) PS: I've filed JDK-8185580: Refactor Threads::possibly_parallel_oops_do() to use Threads::parallel_java_threads_do() to take care of the rest for when I get back from vacation. > Latest webrev: http://cr.openjdk.java.net/~dcubed/8185273-webrev/1/ > > Only src/share/vm/runtime/thread.cpp is changed relative to round 0: > > - Revised the comment in Threads::parallel_java_threads_do. > - Added the assert to Threads::assert_all_threads_claimed(). > > Comments, questions and feedback are welcome. > > Dan > > > On 7/31/17 10:47 AM, Daniel D. Daugherty wrote: >> On 7/31/17 9:43 AM, Aleksey Shipilev wrote: >>> On 07/31/2017 05:09 PM, Daniel D. 
Daugherty wrote: >>>> On 7/31/17 8:35 AM, Aleksey Shipilev wrote: >>>>>> Webrev URL: http://cr.openjdk.java.net/~dcubed/8185273-webrev/0/ >>>>> Those changes make sense, thanks. >>>> Thanks for the fast review! >>>> >>>> >>>>> It is probably worth mentioning that >>>>> Threads::parallel_java_threads_do should be in sync with >>>>> Threads::possibly_parallel_oops_do? It gets easier to point out >>>>> the symmetry: possibly_parallel_... >>>>> claims all Java threads and the VMThread, so this should also >>>>> claim the VMThread. >>>> We would have to be careful about how we phrase that. >>>> Threads::possibly_parallel_oops_do() claims and applies >>>> the closure to all the threads it claims. >>>> >>>> Threads::parallel_java_threads_do() is missing the claim >>>> for the VMThread (this bug), but does not apply the >>>> closure to the VMThread. >>> Yeah. It's just I had to work upwards from the gory details >>> explained in the comment to the actual >>> setup for the bug to appear. I think details about >>> ParallelSPCleanupTask, safepoint.cpp, parity, >>> etc. are too low-level here, and capture only the current state of >>> affairs. E.g. what if there are >>> more callers to parallel_java_threads_do in future? What if Parallel >>> SP cleanup ceases to call it? >>> Would the comment get outdated? Does >>> Threads::parallel_java_threads_do make sense without Parallel >>> SP cleanup? Yes, it does. Would it make sense to cherry-pick it >>> somewhere else with that comment as >>> stated? Not really. >>> >>> AFAIU, the high-level bug is because we have to claim the same >>> subset of threads on all paths. From >>> that, it becomes obvious that if possibly_parallel_java_threads_do >>> claims VMThread, all other paths >>> should claim it too. >>> >>> Something like this: >>> >>> Threads::parallel_java_threads_do(ThreadClosure* tc) { >>> ... >>> >>> // Thread claiming protocol requires us to claim the same >>> interesting threads >>> // on all paths. 
Notably, Threads::possibly_parallel_threads_do >>> claims all >>> // Java threads *and* the VMThread. To avoid breaking the >>> claiming protocol, >>> // we have to appear to claim VMThread on this path too, even if >>> we would not >>> // process the VMThread oops. >>> VMThread* vmt = VMThread::vm_thread(); >>> (void)vmt->claim_oops_do(true, cp); >> >> I like your comment better than mine, with a slight tweak: >> >> // Thread claiming protocol requires us to claim the same >> interesting threads >> // on all paths. Notably, Threads::possibly_parallel_threads_do >> claims all >> // Java threads *and* the VMThread. To avoid breaking the claiming >> protocol, >> // we have to claim VMThread on this path too, even if we do not >> apply the >> // closure to the VMThread. >> >>> >>> ...and then the assert fix would seal the deal. >> >> The assert diffs are applied and tested via JPRT. Please see my >> other e-mail on this thread... >> >> Dan >> >> >>> >>> Thanks, >>> -Aleksey >>> >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From daniel.daugherty at oracle.com Mon Jul 31 17:13:25 2017 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Mon, 31 Jul 2017 11:13:25 -0600 Subject: URGENT RFR (S): fix for Test8004741.java crashes with SIGSEGV in JDK10-hs nightly (8185273) In-Reply-To: <436d8741-a387-8442-1133-cb0a3e78f8c3@redhat.com> References: <6f413b14-e250-cd2f-85ad-e5eb345e9603@oracle.com> <46716761-6f2c-aed1-e26e-3541e1fd84dc@oracle.com> <666db742-9fae-768b-fb86-93a086381ec6@oracle.com> <797922fb-4893-bfbd-8edd-b47b22f64d36@oracle.com> <436d8741-a387-8442-1133-cb0a3e78f8c3@redhat.com> Message-ID: Thanks Roman! Dan On 7/31/17 11:10 AM, Roman Kennke wrote: > Looks good! > > Roman (not an official reviewer) > > PS: I've filed JDK-8185580: Refactor > Threads::possibly_parallel_oops_do() to use > Threads::parallel_java_threads_do() > to take care of the > rest for when I get back from vacation. 
> > >> Latest webrev: http://cr.openjdk.java.net/~dcubed/8185273-webrev/1/ >> >> Only src/share/vm/runtime/thread.cpp is changed relative to round 0: >> >> - Revised the comment in Threads::parallel_java_threads_do. >> - Added the assert to Threads::assert_all_threads_claimed(). >> >> Comments, questions and feedback are welcome. >> >> Dan >> >> >> On 7/31/17 10:47 AM, Daniel D. Daugherty wrote: >>> On 7/31/17 9:43 AM, Aleksey Shipilev wrote: >>>> On 07/31/2017 05:09 PM, Daniel D. Daugherty wrote: >>>>> On 7/31/17 8:35 AM, Aleksey Shipilev wrote: >>>>>>> Webrev URL: http://cr.openjdk.java.net/~dcubed/8185273-webrev/0/ >>>>>> Those changes make sense, thanks. >>>>> Thanks for the fast review! >>>>> >>>>> >>>>>> It is probably worth mentioning that >>>>>> Threads::parallel_java_threads_do should be in sync with >>>>>> Threads::possibly_parallel_oops_do? It gets easier to point out >>>>>> the symmetry: possibly_parallel_... >>>>>> claims all Java threads and the VMThread, so this should also >>>>>> claim the VMThread. >>>>> We would have to be careful about how we phrase that. >>>>> Threads::possibly_parallel_oops_do() claims and applies >>>>> the closure to all the threads it claims. >>>>> >>>>> Threads::parallel_java_threads_do() is missing the claim >>>>> for the VMThread (this bug), but does not apply the >>>>> closure to the VMThread. >>>> Yeah. It's just I had to work upwards from the gory details >>>> explained in the comment to the actual >>>> setup for the bug to appear. I think details about >>>> ParallelSPCleanupTask, safepoint.cpp, parity, >>>> etc. are too low-level here, and capture only the current state of >>>> affairs. E.g. what if there are >>>> more callers to parallel_java_threads_do in future? What if >>>> Parallel SP cleanup ceases to call it? >>>> Would the comment get outdated? Does >>>> Threads::parallel_java_threads_do make sense without Parallel >>>> SP cleanup? Yes, it does. 
Would it make sense to cherry-pick it >>>> somewhere else with that comment as >>>> stated? Not really. >>>> >>>> AFAIU, the high-level bug is because we have to claim the same >>>> subset of threads on all paths. From >>>> that, it becomes obvious that if possibly_parallel_java_threads_do >>>> claims VMThread, all other paths >>>> should claim it too. >>>> >>>> Something like this: >>>> >>>> Threads::parallel_java_threads_do(ThreadClosure* tc) { >>>> ... >>>> >>>> // Thread claiming protocol requires us to claim the same >>>> interesting threads >>>> // on all paths. Notably, Threads::possibly_parallel_threads_do >>>> claims all >>>> // Java threads *and* the VMThread. To avoid breaking the >>>> claiming protocol, >>>> // we have to appear to claim VMThread on this path too, even >>>> if we would not >>>> // process the VMThread oops. >>>> VMThread* vmt = VMThread::vm_thread(); >>>> (void)vmt->claim_oops_do(true, cp); >>> >>> I like your comment better than mine, with a slight tweak: >>> >>> // Thread claiming protocol requires us to claim the same >>> interesting threads >>> // on all paths. Notably, Threads::possibly_parallel_threads_do >>> claims all >>> // Java threads *and* the VMThread. To avoid breaking the >>> claiming protocol, >>> // we have to claim VMThread on this path too, even if we do not >>> apply the >>> // closure to the VMThread. >>> >>>> >>>> ...and then the assert fix would seal the deal. >>> >>> The assert diffs are applied and tested via JPRT. Please see my >>> other e-mail on this thread... >>> >>> Dan >>> >>> >>>> >>>> Thanks, >>>> -Aleksey >>>> >>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... 
From shade at redhat.com Mon Jul 31 17:24:11 2017 From: shade at redhat.com (Aleksey Shipilev) Date: Mon, 31 Jul 2017 19:24:11 +0200 Subject: URGENT RFR (S): fix for Test8004741.java crashes with SIGSEGV in JDK10-hs nightly (8185273) In-Reply-To: <797922fb-4893-bfbd-8edd-b47b22f64d36@oracle.com> References: <6f413b14-e250-cd2f-85ad-e5eb345e9603@oracle.com> <46716761-6f2c-aed1-e26e-3541e1fd84dc@oracle.com> <666db742-9fae-768b-fb86-93a086381ec6@oracle.com> <797922fb-4893-bfbd-8edd-b47b22f64d36@oracle.com> Message-ID: <1509ee63-b50f-84d1-9510-cad10b6a4411@redhat.com> On 07/31/2017 07:07 PM, Daniel D. Daugherty wrote: > Latest webrev: http://cr.openjdk.java.net/~dcubed/8185273-webrev/1/ > > Only src/share/vm/runtime/thread.cpp is changed relative to round 0: > > - Revised the comment in Threads::parallel_java_threads_do. > - Added the assert to Threads::assert_all_threads_claimed(). > > Comments, questions and feedback are welcome. Looks good! -Aleksey P.S. Roman: I'm going to cherry-pick that to Shenandoah after this lands to jdk10/hs. From daniel.daugherty at oracle.com Mon Jul 31 17:26:11 2017 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Mon, 31 Jul 2017 11:26:11 -0600 Subject: URGENT RFR (S): fix for Test8004741.java crashes with SIGSEGV in JDK10-hs nightly (8185273) In-Reply-To: <1509ee63-b50f-84d1-9510-cad10b6a4411@redhat.com> References: <6f413b14-e250-cd2f-85ad-e5eb345e9603@oracle.com> <46716761-6f2c-aed1-e26e-3541e1fd84dc@oracle.com> <666db742-9fae-768b-fb86-93a086381ec6@oracle.com> <797922fb-4893-bfbd-8edd-b47b22f64d36@oracle.com> <1509ee63-b50f-84d1-9510-cad10b6a4411@redhat.com> Message-ID: <56d71fea-a6f2-9103-73be-95cd76cd17e7@oracle.com> Thanks for the re-review! (and for the reworded comment...) 
Dan On 7/31/17 11:24 AM, Aleksey Shipilev wrote: > On 07/31/2017 07:07 PM, Daniel D. Daugherty wrote: >> Latest webrev: http://cr.openjdk.java.net/~dcubed/8185273-webrev/1/ >> >> Only src/share/vm/runtime/thread.cpp is changed relative to round 0: >> >> - Revised the comment in Threads::parallel_java_threads_do. >> - Added the assert to Threads::assert_all_threads_claimed(). >> >> Comments, questions and feedback are welcome. > Looks good! > > -Aleksey > > P.S. Roman: I'm going to cherry-pick that to Shenandoah after this lands to jdk10/hs. > From vladimir.kozlov at oracle.com Mon Jul 31 18:39:35 2017 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Mon, 31 Jul 2017 11:39:35 -0700 Subject: URGENT RFR (S): fix for Test8004741.java crashes with SIGSEGV in JDK10-hs nightly (8185273) In-Reply-To: <56d71fea-a6f2-9103-73be-95cd76cd17e7@oracle.com> References: <6f413b14-e250-cd2f-85ad-e5eb345e9603@oracle.com> <46716761-6f2c-aed1-e26e-3541e1fd84dc@oracle.com> <666db742-9fae-768b-fb86-93a086381ec6@oracle.com> <797922fb-4893-bfbd-8edd-b47b22f64d36@oracle.com> <1509ee63-b50f-84d1-9510-cad10b6a4411@redhat.com> <56d71fea-a6f2-9103-73be-95cd76cd17e7@oracle.com> Message-ID: Dan Can you put new code which used for assert check under #ifdef ASSERT to avoid side effects in product code? Thanks Vladimir > On Jul 31, 2017, at 10:26 AM, Daniel D. Daugherty wrote: > > Thanks for the re-review! (and for the reworded comment...) > > Dan > >> On 7/31/17 11:24 AM, Aleksey Shipilev wrote: >>> On 07/31/2017 07:07 PM, Daniel D. Daugherty wrote: >>> Latest webrev: http://cr.openjdk.java.net/~dcubed/8185273-webrev/1/ >>> >>> Only src/share/vm/runtime/thread.cpp is changed relative to round 0: >>> >>> - Revised the comment in Threads::parallel_java_threads_do. >>> - Added the assert to Threads::assert_all_threads_claimed(). >>> >>> Comments, questions and feedback are welcome. >> Looks good! >> >> -Aleksey >> >> P.S. 
Roman: I'm going to cherry-pick that to Shenandoah after this lands to jdk10/hs. >> > From daniel.daugherty at oracle.com Mon Jul 31 18:49:36 2017 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Mon, 31 Jul 2017 12:49:36 -0600 Subject: URGENT RFR (S): fix for Test8004741.java crashes with SIGSEGV in JDK10-hs nightly (8185273) In-Reply-To: References: <6f413b14-e250-cd2f-85ad-e5eb345e9603@oracle.com> <46716761-6f2c-aed1-e26e-3541e1fd84dc@oracle.com> <666db742-9fae-768b-fb86-93a086381ec6@oracle.com> <797922fb-4893-bfbd-8edd-b47b22f64d36@oracle.com> <1509ee63-b50f-84d1-9510-cad10b6a4411@redhat.com> <56d71fea-a6f2-9103-73be-95cd76cd17e7@oracle.com> Message-ID: <18aa8b5e-d128-8279-c5cc-b6effc79714b@oracle.com> On 7/31/17 12:39 PM, Vladimir Kozlov wrote: > Dan > > Can you put new code which used for assert check under #ifdef ASSERT to avoid side effects in product code? That entire function is in a #ifdef ASSERT: 4360 #ifdef ASSERT 4361 void Threads::assert_all_threads_claimed() { 4362 ALL_JAVA_THREADS(p) { 4363 const int thread_parity = p->oops_do_parity(); 4364 assert((thread_parity == _thread_claim_parity), 4365 "Thread " PTR_FORMAT " has incorrect parity %d != %d", p2i(p), thread_parity, _thread_claim_parity); 4366 } 4367 VMThread* vmt = VMThread::vm_thread(); 4368 const int thread_parity = vmt->oops_do_parity(); 4369 assert((thread_parity == _thread_claim_parity), 4370 "VMThread " PTR_FORMAT " has incorrect parity %d != %d", p2i(vmt), thread_parity, _thread_claim_parity); 4371 } 4372 #endif // ASSERT Thanks for the review! Dan > > Thanks > Vladimir > >> On Jul 31, 2017, at 10:26 AM, Daniel D. Daugherty wrote: >> >> Thanks for the re-review! (and for the reworded comment...) >> >> Dan >> >>> On 7/31/17 11:24 AM, Aleksey Shipilev wrote: >>>> On 07/31/2017 07:07 PM, Daniel D. 
Daugherty wrote: >>>> Latest webrev: http://cr.openjdk.java.net/~dcubed/8185273-webrev/1/ >>>> >>>> Only src/share/vm/runtime/thread.cpp is changed relative to round 0: >>>> >>>> - Revised the comment in Threads::parallel_java_threads_do. >>>> - Added the assert to Threads::assert_all_threads_claimed(). >>>> >>>> Comments, questions and feedback are welcome. >>> Looks good! >>> >>> -Aleksey >>> >>> P.S. Roman: I'm going to cherry-pick that to Shenandoah after this lands to jdk10/hs. >>> From vladimir.kozlov at oracle.com Mon Jul 31 20:14:46 2017 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Mon, 31 Jul 2017 13:14:46 -0700 Subject: URGENT RFR (S): fix for Test8004741.java crashes with SIGSEGV in JDK10-hs nightly (8185273) In-Reply-To: <18aa8b5e-d128-8279-c5cc-b6effc79714b@oracle.com> References: <6f413b14-e250-cd2f-85ad-e5eb345e9603@oracle.com> <46716761-6f2c-aed1-e26e-3541e1fd84dc@oracle.com> <666db742-9fae-768b-fb86-93a086381ec6@oracle.com> <797922fb-4893-bfbd-8edd-b47b22f64d36@oracle.com> <1509ee63-b50f-84d1-9510-cad10b6a4411@redhat.com> <56d71fea-a6f2-9103-73be-95cd76cd17e7@oracle.com> <18aa8b5e-d128-8279-c5cc-b6effc79714b@oracle.com> Message-ID: <07323FAA-AACA-4CA4-ADFF-73816102C23C@oracle.com> This is what happens when you do review on phone ;) Sorry for noise. Looks good. Vladimir Sent from my iPhone > On Jul 31, 2017, at 11:49 AM, Daniel D. Daugherty wrote: > >> On 7/31/17 12:39 PM, Vladimir Kozlov wrote: >> Dan >> >> Can you put new code which used for assert check under #ifdef ASSERT to avoid side effects in product code? 
> > That entire function is in a #ifdef ASSERT: > > 4360 #ifdef ASSERT > 4361 void Threads::assert_all_threads_claimed() { > 4362 ALL_JAVA_THREADS(p) { > 4363 const int thread_parity = p->oops_do_parity(); > 4364 assert((thread_parity == _thread_claim_parity), > 4365 "Thread " PTR_FORMAT " has incorrect parity %d != %d", p2i(p), thread_parity, _thread_claim_parity); > 4366 } > 4367 VMThread* vmt = VMThread::vm_thread(); > 4368 const int thread_parity = vmt->oops_do_parity(); > 4369 assert((thread_parity == _thread_claim_parity), > 4370 "VMThread " PTR_FORMAT " has incorrect parity %d != %d", p2i(vmt), thread_parity, _thread_claim_parity); > 4371 } > 4372 #endif // ASSERT > > Thanks for the review! > > Dan > > >> >> Thanks >> Vladimir >> >>> On Jul 31, 2017, at 10:26 AM, Daniel D. Daugherty wrote: >>> >>> Thanks for the re-review! (and for the reworded comment...) >>> >>> Dan >>> >>>>> On 7/31/17 11:24 AM, Aleksey Shipilev wrote: >>>>> On 07/31/2017 07:07 PM, Daniel D. Daugherty wrote: >>>>> Latest webrev: http://cr.openjdk.java.net/~dcubed/8185273-webrev/1/ >>>>> >>>>> Only src/share/vm/runtime/thread.cpp is changed relative to round 0: >>>>> >>>>> - Revised the comment in Threads::parallel_java_threads_do. >>>>> - Added the assert to Threads::assert_all_threads_claimed(). >>>>> >>>>> Comments, questions and feedback are welcome. >>>> Looks good! >>>> >>>> -Aleksey >>>> >>>> P.S. Roman: I'm going to cherry-pick that to Shenandoah after this lands to jdk10/hs. >>>> > From daniel.daugherty at oracle.com Mon Jul 31 20:56:20 2017 From: daniel.daugherty at oracle.com (Daniel D. 
Daugherty) Date: Mon, 31 Jul 2017 14:56:20 -0600 Subject: URGENT RFR (S): fix for Test8004741.java crashes with SIGSEGV in JDK10-hs nightly (8185273) In-Reply-To: <07323FAA-AACA-4CA4-ADFF-73816102C23C@oracle.com> References: <6f413b14-e250-cd2f-85ad-e5eb345e9603@oracle.com> <46716761-6f2c-aed1-e26e-3541e1fd84dc@oracle.com> <666db742-9fae-768b-fb86-93a086381ec6@oracle.com> <797922fb-4893-bfbd-8edd-b47b22f64d36@oracle.com> <1509ee63-b50f-84d1-9510-cad10b6a4411@redhat.com> <56d71fea-a6f2-9103-73be-95cd76cd17e7@oracle.com> <18aa8b5e-d128-8279-c5cc-b6effc79714b@oracle.com> <07323FAA-AACA-4CA4-ADFF-73816102C23C@oracle.com> Message-ID: <3871c04c-a7f2-8a61-cdd0-67aff6ad0bf0@oracle.com> Thanks! Dan On 7/31/17 2:14 PM, Vladimir Kozlov wrote: > This is what happens when you do review on phone ;) > Sorry for noise. Looks good. > > Vladimir > > Sent from my iPhone > >> On Jul 31, 2017, at 11:49 AM, Daniel D. Daugherty wrote: >> >>> On 7/31/17 12:39 PM, Vladimir Kozlov wrote: >>> Dan >>> >>> Can you put new code which used for assert check under #ifdef ASSERT to avoid side effects in product code? >> That entire function is in a #ifdef ASSERT: >> >> 4360 #ifdef ASSERT >> 4361 void Threads::assert_all_threads_claimed() { >> 4362 ALL_JAVA_THREADS(p) { >> 4363 const int thread_parity = p->oops_do_parity(); >> 4364 assert((thread_parity == _thread_claim_parity), >> 4365 "Thread " PTR_FORMAT " has incorrect parity %d != %d", p2i(p), thread_parity, _thread_claim_parity); >> 4366 } >> 4367 VMThread* vmt = VMThread::vm_thread(); >> 4368 const int thread_parity = vmt->oops_do_parity(); >> 4369 assert((thread_parity == _thread_claim_parity), >> 4370 "VMThread " PTR_FORMAT " has incorrect parity %d != %d", p2i(vmt), thread_parity, _thread_claim_parity); >> 4371 } >> 4372 #endif // ASSERT >> >> Thanks for the review! >> >> Dan >> >> >>> Thanks >>> Vladimir >>> >>>> On Jul 31, 2017, at 10:26 AM, Daniel D. Daugherty wrote: >>>> >>>> Thanks for the re-review! (and for the reworded comment...) 
>>>> >>>> Dan >>>> >>>>>> On 7/31/17 11:24 AM, Aleksey Shipilev wrote: >>>>>> On 07/31/2017 07:07 PM, Daniel D. Daugherty wrote: >>>>>> Latest webrev: http://cr.openjdk.java.net/~dcubed/8185273-webrev/1/ >>>>>> >>>>>> Only src/share/vm/runtime/thread.cpp is changed relative to round 0: >>>>>> >>>>>> - Revised the comment in Threads::parallel_java_threads_do. >>>>>> - Added the assert to Threads::assert_all_threads_claimed(). >>>>>> >>>>>> Comments, questions and feedback are welcome. >>>>> Looks good! >>>>> >>>>> -Aleksey >>>>> >>>>> P.S. Roman: I'm going to cherry-pick that to Shenandoah after this lands to jdk10/hs. >>>>> From daniel.daugherty at oracle.com Mon Jul 31 22:57:21 2017 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Mon, 31 Jul 2017 16:57:21 -0600 Subject: URGENT RFR (S): fix for Test8004741.java crashes with SIGSEGV in JDK10-hs nightly (8185273) In-Reply-To: <3871c04c-a7f2-8a61-cdd0-67aff6ad0bf0@oracle.com> References: <6f413b14-e250-cd2f-85ad-e5eb345e9603@oracle.com> <46716761-6f2c-aed1-e26e-3541e1fd84dc@oracle.com> <666db742-9fae-768b-fb86-93a086381ec6@oracle.com> <797922fb-4893-bfbd-8edd-b47b22f64d36@oracle.com> <1509ee63-b50f-84d1-9510-cad10b6a4411@redhat.com> <56d71fea-a6f2-9103-73be-95cd76cd17e7@oracle.com> <18aa8b5e-d128-8279-c5cc-b6effc79714b@oracle.com> <07323FAA-AACA-4CA4-ADFF-73816102C23C@oracle.com> <3871c04c-a7f2-8a61-cdd0-67aff6ad0bf0@oracle.com> Message-ID: <8df591e7-60fe-c0f9-bcf2-a974a6d810d1@oracle.com> Final local testing numbers for this fix: 20062 runs on slowdebug bits; of those, 802 were in the right sequence of VM-ops for the crash 28523 runs on fastdebug bits; of those, 1915 were in the right sequence of VM-ops for the crash Dan On 7/31/17 2:56 PM, Daniel D. Daugherty wrote: > Thanks! > > Dan > > > On 7/31/17 2:14 PM, Vladimir Kozlov wrote: >> This is what happens when you do review on phone ;) >> Sorry for noise. Looks good. 
>> >> Vladimir >> >> Sent from my iPhone >> >>> On Jul 31, 2017, at 11:49 AM, Daniel D. Daugherty >>> wrote: >>> >>>> On 7/31/17 12:39 PM, Vladimir Kozlov wrote: >>>> Dan >>>> >>>> Can you put new code which used for assert check under #ifdef >>>> ASSERT to avoid side effects in product code? >>> That entire function is in a #ifdef ASSERT: >>> >>> 4360 #ifdef ASSERT >>> 4361 void Threads::assert_all_threads_claimed() { >>> 4362 ALL_JAVA_THREADS(p) { >>> 4363 const int thread_parity = p->oops_do_parity(); >>> 4364 assert((thread_parity == _thread_claim_parity), >>> 4365 "Thread " PTR_FORMAT " has incorrect parity %d != >>> %d", p2i(p), thread_parity, _thread_claim_parity); >>> 4366 } >>> 4367 VMThread* vmt = VMThread::vm_thread(); >>> 4368 const int thread_parity = vmt->oops_do_parity(); >>> 4369 assert((thread_parity == _thread_claim_parity), >>> 4370 "VMThread " PTR_FORMAT " has incorrect parity %d != >>> %d", p2i(vmt), thread_parity, _thread_claim_parity); >>> 4371 } >>> 4372 #endif // ASSERT >>> >>> Thanks for the review! >>> >>> Dan >>> >>> >>>> Thanks >>>> Vladimir >>>> >>>>> On Jul 31, 2017, at 10:26 AM, Daniel D. Daugherty >>>>> wrote: >>>>> >>>>> Thanks for the re-review! (and for the reworded comment...) >>>>> >>>>> Dan >>>>> >>>>>>> On 7/31/17 11:24 AM, Aleksey Shipilev wrote: >>>>>>> On 07/31/2017 07:07 PM, Daniel D. Daugherty wrote: >>>>>>> Latest webrev: http://cr.openjdk.java.net/~dcubed/8185273-webrev/1/ >>>>>>> >>>>>>> Only src/share/vm/runtime/thread.cpp is changed relative to >>>>>>> round 0: >>>>>>> >>>>>>> - Revised the comment in Threads::parallel_java_threads_do. >>>>>>> - Added the assert to Threads::assert_all_threads_claimed(). >>>>>>> >>>>>>> Comments, questions and feedback are welcome. >>>>>> Looks good! >>>>>> >>>>>> -Aleksey >>>>>> >>>>>> P.S. Roman: I'm going to cherry-pick that to Shenandoah after >>>>>> this lands to jdk10/hs. >>>>>> > >