From thomas.schatzl at oracle.com  Tue Oct  1 08:59:30 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Tue, 1 Oct 2019 10:59:30 +0200
Subject: G1 patch of elastic Java heap
In-Reply-To: <a201e27d-d231-4787-8bba-55f5266206d1.maoliang.ml@alibaba-inc.com>
References: <6270ce59-4a8e-431e-9ccf-f6d2c0f927eb.maoliang.ml@alibaba-inc.com>
 <d82e704831f0afbc61f8a3fb6b69bb1463b7ede8.camel@oracle.com>
 <e4ffd4d9-3ec0-4592-ac8c-d5a77c6b2e75.maoliang.ml@alibaba-inc.com>
 <1267a5dd2cf6cc1d03df64d07a06ba0f45195951.camel@oracle.com>
 <3140197d-8cab-4a86-af92-58431c74cb6b.maoliang.ml@alibaba-inc.com>
 <a201e27d-d231-4787-8bba-55f5266206d1.maoliang.ml@alibaba-inc.com>
Message-ID: <73a32f24-3f6e-8396-779f-5f21284200e9@oracle.com>

Hi Liang,

   just to let you know: I am looking into your changes. I need some time to 
think about what you wrote here and to find out how this works in 
the patch.

Thanks,
   Thomas


From shade at redhat.com  Tue Oct  1 10:48:35 2019
From: shade at redhat.com (Aleksey Shipilev)
Date: Tue, 1 Oct 2019 12:48:35 +0200
Subject: RFR (S) 8231667: Shenandoah: Full GC should take empty regions into
 slices for compaction
Message-ID: <0b56c1ea-00c5-dbdc-7e2a-556c4147f26f@redhat.com>

Bug:
  https://bugs.openjdk.java.net/browse/JDK-8231667

Fix:
  https://cr.openjdk.java.net/~shade/8231667/wevrev.01/

There is a problem with current Full GC that makes some tests fail with OOME unnecessarily. See
details in the bug report.

Testing: {x86_64, x86_32} hotspot_gc_shenandoah, affected tests

-- 
Thanks,
-Aleksey


From rkennke at redhat.com  Tue Oct  1 12:07:38 2019
From: rkennke at redhat.com (Roman Kennke)
Date: Tue, 1 Oct 2019 14:07:38 +0200
Subject: RFR (S) 8231667: Shenandoah: Full GC should take empty regions
 into slices for compaction
In-Reply-To: <0b56c1ea-00c5-dbdc-7e2a-556c4147f26f@redhat.com>
References: <0b56c1ea-00c5-dbdc-7e2a-556c4147f26f@redhat.com>
Message-ID: <a970a586-1376-c10e-725b-a5f361f6f959@redhat.com>

Ok.

Thanks,
Roman

> Bug:
>   https://bugs.openjdk.java.net/browse/JDK-8231667
> 
> Fix:
>   https://cr.openjdk.java.net/~shade/8231667/wevrev.01/
> 
> There is a problem with current Full GC that makes some tests fail with OOME unnecessarily. See
> details in the bug report.
> 
> Testing: {x86_64, x86_32} hotspot_gc_shenandoah, affected tests
> 


From sangheon.kim at oracle.com  Tue Oct  1 16:43:58 2019
From: sangheon.kim at oracle.com (sangheon.kim at oracle.com)
Date: Tue, 1 Oct 2019 09:43:58 -0700
Subject: RFR(XL): 8220310: Implementation: NUMA-Aware Memory Allocation
 for G1, Mutator (1/3)
In-Reply-To: <74ACAF31-8233-482A-892E-0D2E7CA72F4F@oracle.com>
References: <e4c60a5c-cb08-004a-ce77-e4d20d4d6891@oracle.com>
 <06ACBF87-ADBE-499F-B668-0274E4925B26@oracle.com>
 <2b37edd6-3e0f-013d-1616-9d003f8ac1ed@oracle.com>
 <74ACAF31-8233-482A-892E-0D2E7CA72F4F@oracle.com>
Message-ID: <4afe9f43-4cfa-9384-f45f-f985399629dd@oracle.com>

Hi Kim and others,

webrev.2 is simplified a bit more after changing the 'heap expansion' 
approach.
Previously the heap could expand with a preferred numa id, which meant 
contiguous heap regions with the same numa id could exist. The current 
version assumes the heap regions are evenly split across nodes, i.e. on a 
4 numa node system the heap regions will be 012301230123, so if we know 
the address or heap region index, we know the preferred numa id.

Much of the code supporting the previous style of expansion was removed.
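
As an illustration of the evenly split layout described above, the preferred node for a region can be derived from its index alone. This is only a sketch: the function name preferred_node_index and the plain modulo mapping are assumptions made for this example, not code taken from the webrev.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not from the webrev): with heap regions striped
// round-robin over the active NUMA nodes (0123 0123 ...), the preferred
// node index follows from the region index alone.
inline uint32_t preferred_node_index(uint32_t region_index,
                                     uint32_t num_active_nodes) {
  assert(num_active_nodes > 0 && "need at least one active node");
  return region_index % num_active_nodes;
}
```

With 4 active nodes, region 5 would map to node index 1 and region 11 to node index 3, matching the 012301230123 pattern.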


On 9/24/19 6:44 PM, Kim Barrett wrote:
>> On Sep 21, 2019, at 1:19 AM, sangheon.kim at oracle.com wrote:
>>
>> webrev:
>> http://cr.openjdk.java.net/~sangheki/8220310/webrev.1
>> http://cr.openjdk.java.net/~sangheki/8220310/webrev.1.inc  (this may not help much! :) )
>> Testing: hs-tier 1 ~ 5 (with/without UseNUMA)
> ------------------------------------------------------------------------------
> src/hotspot/share/gc/g1/g1AllocRegion.hpp
>    96   uint _node_index;
>
> Protected; should be private.
_node_index is used from derived classes.
Are you suggesting to add a getter?

> ------------------------------------------------------------------------------
> src/hotspot/share/gc/g1/g1Allocator.cpp
>    42   _mutator_alloc_region(NULL),
>
> Should be _mutator_alloc_regions (plural), since it's now an array.
>
> Similarly, these should be pluralized:
>    67 void G1Allocator::init_mutator_alloc_region() {
>    74 void G1Allocator::release_mutator_alloc_region() {
>
> And this
>    48   // The number of MutatorAllocRegions used, one per memory node.
>    49   size_t _num_alloc_region;
Done

> ------------------------------------------------------------------------------
> src/hotspot/share/gc/g1/g1Allocator.cpp
>    53 G1Allocator::~G1Allocator() {
>    54   for (uint i = 0; i < _num_alloc_region; i++) {
>    55     _mutator_alloc_region[i].~MutatorAllocRegion();
>    56   }
>    57   FREE_C_HEAP_ARRAY(MutatorAllocRegion, _mutator_alloc_region);
>    58 }
>
> --- should also be calling _mutator_alloc_region[i].release() ??
> --- or does destructor do that?
No, release() is never called.
release() does not actually release allocated resources; it sets pointers 
to null and increments/decrements some counters such as used bytes. So I 
was thinking we don't need to call release().

> ------------------------------------------------------------------------------
> src/hotspot/share/gc/g1/g1Arguments.cpp
>   161   if (UseNUMA) {
>   162     if (FLAG_IS_DEFAULT(AlwaysPreTouch)) {
>   163       FLAG_SET_DEFAULT(AlwaysPreTouch, true);
>   164     }
>   165     if (!AlwaysPreTouch && FLAG_IS_CMDLINE(AlwaysPreTouch)) {
>   166       warning("Disabling AlwaysPreTouch is incompatible with UseNUMA. Disabling UseNUMA.");
>   167       FLAG_SET_ERGO(UseNUMA, false);
>   168     }
>   169   }
>
> Stefan asked about why AlwaysPreTouch is required when UseNUMA. I have
> a different question. Assuming UseNUMA does require AlwaysPreTouch,
> why is !AlwaysPreTouch winning here? Why not have UseNUMA win if they
> are conflicting?
As webrev.2 removes above code, we can skip this discussion?

> But see discussion below about
> G1RegionsSmallerThanCommitSizeMapper::commit_regions(), which
> suggested AlwaysPreTouch is required.
>
> ------------------------------------------------------------------------------
> src/hotspot/share/gc/g1/g1PageBasedVirtualSpace.cpp
>    83 G1PageBasedVirtualSpace::~G1PageBasedVirtualSpace() {
> ...
>    92   _numa                   = NULL;
>    93 }
>
> [pre-existing] Destructors are for resource management. Nulling out /
> zeroing out members in a destructor generally isn't useful. This is
> really a comment on the existing code rather than a request to change
> anything. The addition of line 92 is okay in context, just the context
> is not good.
Agreed on pre-existing.
The intent here is to align with existing context, so leave as is?

> ------------------------------------------------------------------------------
> src/hotspot/share/gc/g1/g1RegionToSpaceMapper.cpp
>   108         _next_node_index(G1MemoryNodeManager::mgr()->num_active_nodes() - 1),
>   109         _max_node_index(G1MemoryNodeManager::mgr()->num_active_nodes()) {
>
> Consider reversing the order of these members and their initializers,
> so the _next_node_index can use _max_node_index rather than another
> call to num_active_nodes().
Good point!
However this newly added part is removed.

> ------------------------------------------------------------------------------
> src/hotspot/share/gc/g1/g1RegionToSpaceMapper.cpp
>   113     uint next_node_index() const {
>   114       return _next_node_index;
>   115     }
>
> I think this is mis-named.  It's the current index for the
> distributor.  I think it should just be called "node_index".
Agree, but this line is also removed.

> ------------------------------------------------------------------------------
> src/hotspot/share/gc/g1/g1RegionToSpaceMapper.cpp
>
> I'm confused by G1RegionsSmallerThanCommitSizeMapper::commit_regions().
>
> For the LargerThan mapper, we have a sequence of regions that
> completely covers a sequence of pages, and we commit all of the
> associated pages using the requested node_index.
>
> For the SmallerThan mapper, we have a sequence of regions split up
> into subsequences that are each contained in a single page.  The first
> such subsequence might be on an already committed page.  Similarly for
> the last subsequence.  Nothing is done to those pages.
>
> In between there may be a series of region sequences, with each region
> sequence on a single page.  If there are more than one of these region
> sequences then more than one page will need to be committed.
>
> As we step through the sequence of pages and commit them, we also step
> the numa index to use for each page.
>
> Stefan asked a question in this area about the mechanism by which the
> node stepping is provided, and you responded with what sounds like an
> improvement.  But I have a different question.
>
> Why are we committing different pages on different numa nodes? The
> caller requested these regions be on the requested node. Why are we
> not honoring that request (as much as possible within the constraints
> of possible leading and trailing regions being on already committed
> pages.) The comment for G1NodeDistributor discusses (at a high level)
> what it's doing (e.g. a short summary of the above description), but
> there is no discussion of why that distribution is needed or
> desirable.
If I understand your question correctly, we do honor 'requested node 
index' at G1RegionSmallerThan case.
Please look at 'G1NodeDistributor::next()'.

    void next() {
      if (_requested_node_index == G1MemoryNodeManager::AnyNodeIndex) {
        _node_index = (_node_index + 1) % _max_node_index;
      } else {
        _node_index = _requested_node_index;
      }

If _requested_node_index is AnyNodeIndex, we cycle through valid node 
indices.
This code is also removed.
So G1NUMA is now responsible for deciding the preferred node, and the 
upper APIs only decide whether to expand or not.

> There might be a good reason for this behavior, in which case your
> response with an improvement sounds good.  But if so, I'm guessing I
> won't be the only one who doesn't know what that reason might be, and
> it would be good to provide an explanatory comment.  And of course, if
> there isn't a good reason...
>
> I think there is also a problem here if AlwaysPreTouch is false. (As
> discussed earlier, maybe it isn't required to be true.) The node index
> for the committed regions gets set (in make_regions_available) via the
> result of the syscall, so we really need pretouch to have been done.
> The alternative would be to assume commit_regions used the requested
> numa node.  But with the request stepping that wouldn't hold.  Of
> course, it also doesn't hold for any leading or trailing regions that
> were covered by already committed pages.
>
> I think this is the basis of your argument that AlwaysPreTouch is
> required for UseNUMA, and I think I'm now agreeing.  Otherwise we may
> think the leading and trailing regions in the sequence are on a
> different node than they actually are, since the associated pages may
> have already been committed on a different node than requested, but
> not yet touched.
The leading regions are only committed, so other regions which belong to 
the same page will not actually be committed, so the touching issue doesn't 
happen.
The SmallerThan class is supposed to handle this situation.

> But I still don't know why we would want to cycle through nodes for
> the "middle" pages.
>
> ------------------------------------------------------------------------------
> src/hotspot/share/gc/g1/heapRegionManager.cpp
>   127     if (G1VerifyNUMAIdOfHeapRegions) {
>   128       // Read actual node index via system call.
>   129       uint actual_node_index = mgr->index_of_address(hr->bottom());
>   130       if (hr->node_index() != actual_node_index) {
>
> Can we actually do this here?  I thought the system call only gave a
> useful answer if the addressed location has been paged in.  I'm not
> sure that's necessarily happened at this point.
At webrev.1:
We can do this here because webrev.1 assumes AlwaysPreTouch is enabled.
So at the time of commit, we pretouch as soon as commit is finished. And 
we can check actual node id here.

At webrev.2:
AlwaysPreTouch is NOT coupled with UseNUMA. We trust the OS that the 
requested memory will be located on the preferred node, i.e. we don't 
actually touch the memory.

> I think Stefan suggested the logging of mismatches between requested
> and actual numa node for a region should occur at region retirement.
> We could log mismatches there and correct the region's information.
>
> But see discussion above about
> G1RegionsSmallerThanCommitSizeMapper::commit_regions().  If
> AlwaysPreTouch is indeed required, then this code is okay.
>
> ------------------------------------------------------------------------------
> src/hotspot/share/gc/g1/heapRegionSet.inline.hpp
>   156   HeapRegion * cur;
>
> s/HeapRegion */HeapRegion*/
Done

> ------------------------------------------------------------------------------
> src/hotspot/share/gc/g1/g1MemoryNodeManager.hpp
>    38   static const uint InvalidNodeIndex = (uint)os::InvalidId;
>    39   static const uint AnyNodeIndex = (uint)os::AnyId;
>
> I complained about these in my review of webrev.0. These are making
> very strong assumptions about the values of the os Id values, for no
> good reason that I can see. You responded
>
> "But the intend is to make same value after casting for same meaning
> constants instead of randomly chosen ones."
>
> I don't buy that. There aren't any casts that I can see between NUMA
> ids and indexes. Nor should there be any such casts. If there were,
> I'd strongly question them, as being units mismatches.
Okay,
Fixed similar to your previous comment.

> ------------------------------------------------------------------------------
> src/hotspot/share/gc/g1/g1MemoryNodeManager.hpp
>    54   virtual const int* node_ids() const { static int dummy_id = 0; return &dummy_id; }
>
> dummy_id should be const.
>
> I would probably put that definition in the .cpp file. I've run into
> multiple compiler bugs with function scoped static variables in inline
> functions. Not recently, but I'm paranoid.
Good to know.
Done

> ------------------------------------------------------------------------------
> src/hotspot/share/gc/g1/g1MemoryNodeManager.hpp
>    34 class G1MemoryNodeManager : public CHeapObj<mtGC> {
> ...
>    43   static G1MemoryNodeManager* create();
>
> Given that we have a factory function that should be used for
> creation, the constructor ought to be non-public.  It needs to be
> protected so the derived G1MemoryNodeManager can refer to it.
Changed to protected.

> A different approach would have G1MemoryNodeManager be abstract (with
> all virtuals but the destructor being pure), with hidden (possibly
> private nested) classes for the single-node and multi-node cases.
>
> ------------------------------------------------------------------------------
> src/hotspot/share/gc/g1/g1MemoryNodeManager.cpp
>    62   if (UseNUMA SOLARIS_ONLY(&& false)) {
>
> I thought we were only providing Linux support.  This seems like it
> would attempt to work (and probably fail somewhere later) on other
> platforms (anything not Linux or Solaris).
You are right.
Changed to use LINUX_ONLY() macro.
Please correct me if I misunderstood your point. :)

All platforms are allowed to set +UseNUMA and eventually get some 
benefit from UseNUMAInterleaving.
Treating Windows and Mac is easy because those have only one active 
node. However, Solaris may have multiple active nodes, which is why the 
above line was added. I believe that is one of the simplest ways to filter 
out the Solaris case on top of the existing filtering logic (active node 
check).

But as you pointed out, the previous version was not that clear, so I 
changed it to use LINUX_ONLY().

> ------------------------------------------------------------------------------
> src/hotspot/share/gc/g1/g1MemoryNodeManager.cpp
>    64     G1NUMA* numa = new G1NUMA();
>    65
>    66     if (numa != NULL) {
>
> numa cannot be NULL here.
Done

> ------------------------------------------------------------------------------
> src/hotspot/share/gc/g1/g1MemoryNodeManager.cpp
>    86 G1MemoryMultiNodeManager::~G1MemoryMultiNodeManager() {
>    87   delete _numa;
>    88 }
>
> This is leaving a stale pointer to the G1NUMA object in wherever
> G1NUMA::set_numa stashed it.
G1NUMA::_inst = NULL
is added at the dtor of G1NUMA because G1NUMA::set_numa() sets '_inst'.
Correct me if I misunderstood.
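
To make the stale-pointer concern concrete, here is a minimal sketch of the fix described above. The class Numa is a hypothetical stand-in for G1NUMA, and the `_inst == this` guard is an addition for this example; the webrev itself may simply assign NULL.

```cpp
#include <cassert>

// Hypothetical stand-in for G1NUMA (not the webrev code): a class that
// stashes a pointer to its single instance in a static member.
class Numa {
  static Numa* _inst;

public:
  static Numa* numa() { return _inst; }

  static void set_numa(Numa* n) {
    assert(n != nullptr && "instance must not be null");
    _inst = n;
  }

  // Clear the stashed pointer on destruction so that later callers of
  // numa() do not see a dangling pointer. The guard is an assumption of
  // this sketch: only reset if the stashed pointer is this object.
  ~Numa() {
    if (_inst == this) {
      _inst = nullptr;
    }
  }
};

Numa* Numa::_inst = nullptr;
```

After the owning manager deletes its object, numa() returns a null pointer rather than the address of freed memory.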

> ------------------------------------------------------------------------------
> src/hotspot/share/gc/g1/g1CollectedHeap.cpp
> 1500   _mem_node_mgr(G1MemoryNodeManager::create()),
>
> Maybe this manager should be created in G1CH::initialize() (and
> initialized here to NULL).
No.
We need the number of active node ids at G1Allocator::G1Allocator.
This is the reason why G1MemoryNodeManager is created earlier. 
Previously HeapRegionManager also had dependency but it is removed now.

> Then the page_size could be passed to create, and there wouldn't be a
> need to later set the page size of the manager and pass that along to
> the G1NUMA, instead both getting it as a constructor argument.  Then
> the associated setters go away too.
>
> ------------------------------------------------------------------------------
> src/hotspot/share/gc/g1/g1NUMA.hpp
>    95   // Gets a next valid numa id.
>    96   inline int next_numa_id();
>
> Appears to be unused.
Removed.

> ------------------------------------------------------------------------------
> src/hotspot/share/gc/g1/g1MemoryNodeManager.cpp
>   102 uint G1MemoryMultiNodeManager::index_of_current_thread() const {
>   103   int node_id = os::numa_get_group_id();
>   104   return _numa->index_of_numa_id(node_id);
>   105 }
>
> Other than here, os::numa_xxx usage is encapsulated in G1NUMA, with
> the manager forwarding to the G1NUMA object as needed.  I suggest
> doing that here too.  (Note that this file doesn't #include os.hpp.)
> I think doing so eliminates the need for G1NUMA::index_of_numa_id(),
> which also seems like a good thing.
Done.
Added G1NUMA::index_of_current_thread() to remove the os call.
However, we still need G1NUMA::index_of_numa_id(), which is used in 
G1NUMA::index_of_address(HeapWord*).

> ------------------------------------------------------------------------------
> src/hotspot/share/gc/g1/g1NUMA.cpp
>    92 void G1NUMA::touch_memory(address aligned_address, size_t size_in_bytes, uint numa_index) {
>
> Assert aligned_address is page aligned?
> Assert size_in_bytes is a page aligned?
Added 2 assertions as you commented.
The first one 'aligned_address' means page size aligned though.
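
The two checks could look roughly like the following sketch. The helper names is_page_aligned and check_touch_args are hypothetical, the page size is passed in explicitly, and it is assumed to be a power of two; the real change presumably uses HotSpot's own alignment utilities instead.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-alone alignment check; page_size is assumed to be
// a non-zero power of two.
inline bool is_page_aligned(uintptr_t value, size_t page_size) {
  assert(page_size != 0 && (page_size & (page_size - 1)) == 0 &&
         "page size must be a power of two");
  return (value & (page_size - 1)) == 0;
}

// Sketch of the argument checks at the top of a touch_memory-like
// function: both the start address and the size must be page aligned.
inline void check_touch_args(const void* aligned_address,
                             size_t size_in_bytes,
                             size_t page_size) {
  assert(is_page_aligned(reinterpret_cast<uintptr_t>(aligned_address),
                         page_size) && "address must be page aligned");
  assert(is_page_aligned(size_in_bytes, page_size) &&
         "size must be page aligned");
}
```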

> ------------------------------------------------------------------------------
> src/hotspot/share/gc/g1/g1NUMA.cpp
>    42   memset(_numa_id_to_index_map,
>    43          G1MemoryNodeManager::InvalidNodeIndex,
>    44          sizeof(uint) * _len_numa_id_to_index_map);
>
> memset only works here because all bytes of InvalidNodeIndex happen to
> have the same value.  I would prefer an explicit fill loop rather than
> memset here.  Or a static assert on the value, but that's probably
> more code.
Changed to fill in a loop.
I'm aware of this, and the only reason for changing InvalidNodeIndex from 
0xfffe to 0xffff was to use memset here.
I thought you were okay with memset, as you suggested using memset 
in your previous email. :)
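
For readers following along, the point about memset can be shown with a small sketch. The function names and the 32-bit element type are assumptions for this example. memset replicates a single byte, so it only produces the intended value when every byte of that value is identical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// memset-based fill: every BYTE of the array is set to the low byte of
// 'value'. The elements end up equal to 'value' only if all bytes of
// 'value' are the same (e.g. 0xFFFFFFFF), not for e.g. 0xFFFFFFFE.
inline void memset_fill(uint32_t* map, size_t len, uint32_t value) {
  assert(map != nullptr);
  memset(map, static_cast<int>(value & 0xFFu), sizeof(uint32_t) * len);
}

// Explicit fill loop: correct for any sentinel value.
inline void loop_fill(uint32_t* map, size_t len, uint32_t value) {
  assert(map != nullptr);
  for (size_t i = 0; i < len; i++) {
    map[i] = value;
  }
}
```

With a sentinel of 0xFFFFFFFE, memset_fill would leave 0xFEFEFEFE in each element, while loop_fill stores the sentinel itself.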

webrev:
http://cr.openjdk.java.net/~sangheki/8220310/webrev.2/
http://cr.openjdk.java.net/~sangheki/8220310/webrev.2.inc
Testing: hs-tier1 ~ 5 +-UseNUMA

Thanks,
Sangheon


> ------------------------------------------------------------------------------
>


From sangheon.kim at oracle.com  Tue Oct  1 16:53:16 2019
From: sangheon.kim at oracle.com (sangheon.kim at oracle.com)
Date: Tue, 1 Oct 2019 09:53:16 -0700
Subject: RFR(M): 8220311: Implementation: NUMA-Aware Memory Allocation for
 G1, Survivor (2/3)
In-Reply-To: <d153b49b-fbb2-0d73-37e0-ff1534a83086@oracle.com>
References: <d153b49b-fbb2-0d73-37e0-ff1534a83086@oracle.com>
Message-ID: <de0f8e9a-27d8-eaf3-99a7-7b57e1b419c1@oracle.com>

Hi all,

As JDK-8220310 changed a lot, I'm posting the next webrev.
The previous webrev just conflicts.

Webrev:
http://cr.openjdk.java.net/~sangheki/8220311/webrev.1
http://cr.openjdk.java.net/~sangheki/8220311/webrev.1.inc
Testing: hs-tier 1 ~ 5 with +- UseNUMA

Thanks,
Sangheon


On 9/4/19 12:16 AM, sangheon.kim at oracle.com wrote:
> Hi all,
>
> Please review this patch making G1 NUMA aware.
> This is the second part of G1 NUMA implementation:
> - Making Survivor region NUMA aware.
>
> CR: https://bugs.openjdk.java.net/browse/JDK-8220311
> Webrev: http://cr.openjdk.java.net/~sangheki/8220311/webrev.0
> Testing: hs-tier 1 ~ 5 with +- UseNUMA
>
> Thanks,
> Sangheon


From kim.barrett at oracle.com  Wed Oct  2 01:08:32 2019
From: kim.barrett at oracle.com (Kim Barrett)
Date: Tue, 1 Oct 2019 21:08:32 -0400
Subject: RFR: 8231153: Improve concurrent refinement statistics
In-Reply-To: <4a851a19-0979-c696-0c80-1165bd755834@oracle.com>
References: <1DADC595-3106-4CE7-BA5D-7B6C7EE0E81E@oracle.com>
 <4a851a19-0979-c696-0c80-1165bd755834@oracle.com>
Message-ID: <BACFB18B-7CA1-4C58-8597-016D197CDCDF@oracle.com>

> On Sep 25, 2019, at 5:40 AM, Thomas Schatzl <thomas.schatzl at oracle.com> wrote:
>> https://bugs.openjdk.java.net/browse/JDK-8231153
>> Webrev:
>> https://cr.openjdk.java.net/~kbarrett/8231153/open.00/
>> Testing:
>> mach5 tier1-5
>> some local by-hand testing to look at the rate tracking.

I've made some updates to names, per some offline discussion with
Thomas.  The main one is to use a "total_" prefix in various places
for accumulated refinement time and accumulated number of cards
refined.  Also backed out any "num_" prefix removals from the original
change.  Removed the unsynchronized accumulation of the number of
concurrently scanned cards.

I also fixed a bug in the card logging rate calculation.  The function
I was using to get the end time for the last GC pause
(last_known_gc_end_time_sec) doesn't do any such thing, but has a
confusing name (JDK-8231638).


> When reading the change I had the following thoughts to improve readability:
> 
> - maybe some comment somewhere what "scanned" really means compared to "refined". Initially I was surprised with the change at G1RemSet::_num_conc_scanned_cards, but some thinking made me aware of the difference.

No longer relevant, as that's been removed.

> - the change reuses the "processed" term for counted cards in a few places, and it is unclear to me what the difference to just "refined" cards would be in some cases.

Fixed.

> - I would also suggest to add a "num_" prefix to numbers/counts of values.

Using "total_" prefix.

> - G1Policy::_pending_cards should be renamed to "_pending_cards_at_start_of_gc" since we also now have a "_pending_cards_after_last_gc", to distinguish their use a little better?

Updated names:
pending_cards_at_gc_start
pending_cards_at_prev_gc_end


> - pre-existing: probably rename G1RemSet::_num_conc_scanned_cards and G1RemSetSummary::_conc_scanned_cards to "_concurrent_scanned_cards" to match the "_concurrent_refined_cards".

Fixed by code deletion.

> - not sure, but I think exposing size() and start() in G1FreeIdSet seems unnecessary: the only user is G1DirtyCardQueueSet anyway, and it is already the owner of G1FreeIdSet. I.e. it knows these values already (and passes them to the initializer of the G1FreeIdSet instance, and already has a getter for the size() value), so getting them back from G1FreeIdSet seems a bit strange to me, but I am okay with the current code.

I've backed out the changes to G1FreeIdSet, and instead introduced in
G1DirtyCardQueue a private function providing the start index (always
returning 0) and used it in the two relevant places, along with noting
that there is code elsewhere that is assuming a 0 value.  That can be
cleaned up later (JDK-8231734).

New webrevs:
full: https://cr.openjdk.java.net/~kbarrett/8231153/open.01/
incr: https://cr.openjdk.java.net/~kbarrett/8231153/open.01.inc/

Testing:
mach5 tier1-5
some local by-hand testing to look at the rate tracking.


From thomas.schatzl at oracle.com  Wed Oct  2 09:57:06 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Wed, 2 Oct 2019 11:57:06 +0200
Subject: RFR: 8231153: Improve concurrent refinement statistics
In-Reply-To: <BACFB18B-7CA1-4C58-8597-016D197CDCDF@oracle.com>
References: <1DADC595-3106-4CE7-BA5D-7B6C7EE0E81E@oracle.com>
 <4a851a19-0979-c696-0c80-1165bd755834@oracle.com>
 <BACFB18B-7CA1-4C58-8597-016D197CDCDF@oracle.com>
Message-ID: <4ac5b37a-ba45-a0ec-359f-e15501af639e@oracle.com>

Hi,

On 02.10.19 03:08, Kim Barrett wrote:
>> On Sep 25, 2019, at 5:40 AM, Thomas Schatzl <thomas.schatzl at oracle.com> wrote:
>>> https://bugs.openjdk.java.net/browse/JDK-8231153
>>> Webrev:
>>> https://cr.openjdk.java.net/~kbarrett/8231153/open.00/
>>> Testing:
>>> mach5 tier1-5
>>> some local by-hand testing to look at the rate tracking.
> 
> I've made some updates to names, per some offline discussion with
> Thomas.  The main one is to use a "total_" prefix in various places
> for accumulated refinement time and accumulated number of cards
> refined.  Also backed out any "num_" prefix removals from the original
> change.  Removed the unsynchronized accumulation of the number of
> concurrently scanned cards.
> 
> I also fixed a bug in the card logging rate calculation.  The function
> I was using to get the end time for the last GC pause
> (last_known_gc_end_time_sec) doesn't do any such thing, but has a
> confusing name (JDK-8231638).
> 
> 
>> When reading the change I had the following thoughts to improve readability:
>>
[...]

Thanks a lot for considering my comments.

> 
>> - not sure, but I think exposing size() and start() in G1FreeIdSet seems unnecessary: the only user is G1DirtyCardQueueSet anyway, and it is already the owner of G1FreeIdSet. I.e. it knows these values already (and passes them to the initializer of the G1FreeIdSet instance, and already has a getter for the size() value), so getting them back from G1FreeIdSet seems a bit strange to me, but I am okay with the current code.
> 
> I've backed out the changes to G1FreeIdSet, and instead introduced in
> G1DirtyCardQueue a private function providing the start index (always
> returning 0) and used it in the two relevant places, along with noting
> that there is code elsewhere that is assuming a 0 value.  That can be
> cleaned up later (JDK-8231734).
> 
> New webrevs:
> full: https://cr.openjdk.java.net/~kbarrett/8231153/open.01/
> incr: https://cr.openjdk.java.net/~kbarrett/8231153/open.01.inc/
> 
> Testing:
> mach5 tier1-5
> some local by-hand testing to look at the rate tracking.
> 

   looks good.

Thanks,
   Thomas


From sangheon.kim at oracle.com  Wed Oct  2 17:11:26 2019
From: sangheon.kim at oracle.com (sangheon.kim at oracle.com)
Date: Wed, 2 Oct 2019 10:11:26 -0700
Subject: RFR(L): 8220312: Implementation: NUMA-Aware Memory Allocation for
 G1, Logging (3/3)
In-Reply-To: <e7c52f60-a5c7-072a-4e3b-65c608907679@oracle.com>
References: <e7c52f60-a5c7-072a-4e3b-65c608907679@oracle.com>
Message-ID: <e903223b-90a5-9d01-5421-a47011bd5985@oracle.com>

Hi,

Here's the rebased webrev with minor changes.

Webrev:
http://cr.openjdk.java.net/~sangheki/8220312/webrev.1
http://cr.openjdk.java.net/~sangheki/8220312/webrev.1.inc
Testing: hs-tier 1 ~ 5 with +- UseNUMA

FYI, here's the full patch including JDK-8220310, 8220311, 8220312.
http://cr.openjdk.java.net/~sangheki/8220312/webrev.full/

Thanks,
Sangheon


On 9/4/19 12:16 AM, sangheon.kim at oracle.com wrote:
> Hi all,
>
> Please review this patch making G1 NUMA aware.
> This is the last part of G1 NUMA implementation:
> - Adding logs and stat.
>
> CR: https://bugs.openjdk.java.net/browse/JDK-8220312
> Webrev: http://cr.openjdk.java.net/~sangheki/8220312/webrev.0
> Testing: hs-tier 1 ~ 8 with +- UseNUMA
>
> Thanks,
> Sangheon


From per.liden at oracle.com  Wed Oct  2 21:53:16 2019
From: per.liden at oracle.com (Per Liden)
Date: Wed, 2 Oct 2019 23:53:16 +0200
Subject: RFR: 8231774: ZGC: ZVirtualMemoryManager unmaps incorrect address
Message-ID: <d813bce0-21ef-1a7c-ccc6-c730e50d4ea0@oracle.com>

When failing to map the requested address, map() in 
ZVirtualMemoryManager.cpp, incorrectly calls unmap(start, size) instead 
of unmap(res, size).
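
The pattern at issue can be sketched as follows. This is not the ZGC source: reserve_at is a hypothetical stand-alone function, and it assumes a Linux-like mmap that treats the address as a hint and may place the mapping elsewhere.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <sys/mman.h>

// Hypothetical sketch: try to reserve exactly [start, start + size).
// Returns true only if the kernel placed the mapping at 'start'.
inline bool reserve_at(uintptr_t start, size_t size) {
  assert(size > 0);
  void* const res = mmap(reinterpret_cast<void*>(start), size, PROT_NONE,
                         MAP_ANONYMOUS | MAP_PRIVATE | MAP_NORESERVE, -1, 0);
  if (res == MAP_FAILED) {
    // Nothing was mapped, so there is nothing to unmap.
    return false;
  }
  if (reinterpret_cast<uintptr_t>(res) != start) {
    // The mapping landed somewhere else. Unmap what we actually got
    // (res). Unmapping 'start' instead would leak the new mapping and
    // could tear down whatever is already mapped at 'start'.
    munmap(res, size);
    return false;
  }
  return true;
}
```

Calling reserve_at with the address of an existing mapping is one way to exercise the failure path: the request is placed elsewhere, the function returns false, and the existing mapping stays intact.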

Bug: https://bugs.openjdk.java.net/browse/JDK-8231774
Webrev: http://cr.openjdk.java.net/~pliden/8231774/webrev.0

/Per


From per.liden at oracle.com  Wed Oct  2 22:28:26 2019
From: per.liden at oracle.com (Per Liden)
Date: Thu, 3 Oct 2019 00:28:26 +0200
Subject: RFR: 8231776: ZGC: Fix incorrect address space description
Message-ID: <0942d5a7-d6d7-2e81-97ef-b8fc55880f02@oracle.com>

After JDK-8224820, the space between the Remapped heap view and the 
Marked1 heap view is no longer reserved. The ASCII art describing the 
address space layout should be updated to reflect that.

Bug: https://bugs.openjdk.java.net/browse/JDK-8231776
Webrev: http://cr.openjdk.java.net/~pliden/8231776/webrev.0

/Per


From kim.barrett at oracle.com  Wed Oct  2 23:55:05 2019
From: kim.barrett at oracle.com (Kim Barrett)
Date: Wed, 2 Oct 2019 19:55:05 -0400
Subject: RFR: 8231774: ZGC: ZVirtualMemoryManager unmaps incorrect address
In-Reply-To: <d813bce0-21ef-1a7c-ccc6-c730e50d4ea0@oracle.com>
References: <d813bce0-21ef-1a7c-ccc6-c730e50d4ea0@oracle.com>
Message-ID: <8D5D60EF-5F2C-4C7A-A50C-79ABB1AE0254@oracle.com>

> On Oct 2, 2019, at 5:53 PM, Per Liden <per.liden at oracle.com> wrote:
> 
> When failing to map the requested address, map() in ZVirtualMemoryManager.cpp, incorrectly calls unmap(start, size) instead of unmap(res, size).
> 
> Bug: https://bugs.openjdk.java.net/browse/JDK-8231774
> Webrev: http://cr.openjdk.java.net/~pliden/8231774/webrev.0
> 
> /Per

Looks good.


From per.liden at oracle.com  Thu Oct  3 04:24:55 2019
From: per.liden at oracle.com (Per Liden)
Date: Thu, 3 Oct 2019 06:24:55 +0200
Subject: RFR: 8231774: ZGC: ZVirtualMemoryManager unmaps incorrect address
In-Reply-To: <8D5D60EF-5F2C-4C7A-A50C-79ABB1AE0254@oracle.com>
References: <8D5D60EF-5F2C-4C7A-A50C-79ABB1AE0254@oracle.com>
Message-ID: <D7FDFB06-6DB3-455E-A31A-7FD3E78DAB21@oracle.com>

Thanks Kim!

/Per

> On 3 Oct 2019, at 01:55, Kim Barrett <kim.barrett at oracle.com> wrote:
> 
> 
>> 
>> On Oct 2, 2019, at 5:53 PM, Per Liden <per.liden at oracle.com> wrote:
>> 
>> When failing to map the requested address, map() in ZVirtualMemoryManager.cpp, incorrectly calls unmap(start, size) instead of unmap(res, size).
>> 
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8231774
>> Webrev: http://cr.openjdk.java.net/~pliden/8231774/webrev.0
>> 
>> /Per
> 
> Looks good.
> 


From stefan.karlsson at oracle.com  Thu Oct  3 06:27:30 2019
From: stefan.karlsson at oracle.com (Stefan Karlsson)
Date: Thu, 3 Oct 2019 08:27:30 +0200
Subject: RFR: 8231774: ZGC: ZVirtualMemoryManager unmaps incorrect address
In-Reply-To: <d813bce0-21ef-1a7c-ccc6-c730e50d4ea0@oracle.com>
References: <d813bce0-21ef-1a7c-ccc6-c730e50d4ea0@oracle.com>
Message-ID: <0c38d6bd-bfb0-a16d-c247-ee28f3902883@oracle.com>

Looks good.

StefanK

On 2019-10-02 23:53, Per Liden wrote:
> When failing to map the requested address, map() in 
> ZVirtualMemoryManager.cpp, incorrectly calls unmap(start, size) instead 
> of unmap(res, size).
> 
> Bug: https://bugs.openjdk.java.net/browse/JDK-8231774
> Webrev: http://cr.openjdk.java.net/~pliden/8231774/webrev.0
> 
> /Per


From per.liden at oracle.com  Thu Oct  3 06:39:46 2019
From: per.liden at oracle.com (Per Liden)
Date: Thu, 3 Oct 2019 08:39:46 +0200
Subject: RFR: 8231774: ZGC: ZVirtualMemoryManager unmaps incorrect address
In-Reply-To: <0c38d6bd-bfb0-a16d-c247-ee28f3902883@oracle.com>
References: <d813bce0-21ef-1a7c-ccc6-c730e50d4ea0@oracle.com>
 <0c38d6bd-bfb0-a16d-c247-ee28f3902883@oracle.com>
Message-ID: <727f9841-ef20-db44-6edf-d610a8b8d5c7@oracle.com>

Thanks!

/Per

On 10/3/19 8:27 AM, Stefan Karlsson wrote:
> Looks good.
> 
> StefanK
> 
> On 2019-10-02 23:53, Per Liden wrote:
>> When failing to map the requested address, map() in 
>> ZVirtualMemoryManager.cpp, incorrectly calls unmap(start, size) 
>> instead of unmap(res, size).
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8231774
>> Webrev: http://cr.openjdk.java.net/~pliden/8231774/webrev.0
>>
>> /Per


From stefan.karlsson at oracle.com  Thu Oct  3 07:48:25 2019
From: stefan.karlsson at oracle.com (Stefan Karlsson)
Date: Thu, 3 Oct 2019 09:48:25 +0200
Subject: RFR: 8231563: ZGC: Fails to warn when user sets the max heap size
 to larger than 16TB
In-Reply-To: <f8b4080b-5929-e0c3-0acc-228d808e78c0@oracle.com>
References: <da73ba14-683a-625a-db08-0bca1964901e@oracle.com>
 <f8b4080b-5929-e0c3-0acc-228d808e78c0@oracle.com>
Message-ID: <869788d7-3cf4-6891-81aa-eeb9508f0f5c@oracle.com>

Thanks, Thomas.

StefanK

On 2019-09-27 10:46, Thomas Schatzl wrote:
> Hi,
> 
> On 27.09.19 09:16, Stefan Karlsson wrote:
>> Hi all,
>>
>> Please review this small patch to fix the max heap size check in ZGC.
>>
>> https://cr.openjdk.java.net/~stefank/8231563/webrev.01/
>> https://bugs.openjdk.java.net/browse/JDK-8231563
>>
>> After this fix the JVM refuses to start if a too high -Xmx is set:
>>
>> $ java -XX:+UnlockExperimentalVMOptions -XX:+UseZGC -Xmx17t -version:
>> Error occurred during initialization of VM
>> Java heap too large
> 
>    looks good to me.
> 
> Thomas
> 
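
The kind of check discussed in this thread can be sketched as below. This is
an illustration only: the 16TB limit is taken from the subject line, the error
text from the quoted output, and the names bytes_from_tb, kMaxHeapBytes and
check_max_heap_size are hypothetical, not the names used in the webrev.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Convert terabytes to bytes; guarded so the shift cannot overflow 64 bits.
uint64_t bytes_from_tb(uint64_t tb) {
  assert(tb < (uint64_t(1) << 24));
  return tb << 40;
}

// Assumed limit, taken from the 16TB figure in the subject line.
const uint64_t kMaxHeapBytes = uint64_t(16) << 40;

// Returns an empty string if the requested max heap size is acceptable,
// otherwise the error message that would abort VM initialization.
std::string check_max_heap_size(uint64_t max_heap_bytes) {
  if (max_heap_bytes > kMaxHeapBytes) {
    return "Java heap too large";
  }
  return std::string();
}
```

With these assumptions, a request equal to the limit passes and a 17TB request
(as in -Xmx17t) is rejected.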


From stefan.karlsson at oracle.com  Thu Oct  3 07:48:36 2019
From: stefan.karlsson at oracle.com (Stefan Karlsson)
Date: Thu, 3 Oct 2019 09:48:36 +0200
Subject: RFR: 8231563: ZGC: Fails to warn when user sets the max heap size
 to larger than 16TB
In-Reply-To: <18a67b8d-79b1-ccfd-daa5-9f8552ee2f9a@oracle.com>
References: <da73ba14-683a-625a-db08-0bca1964901e@oracle.com>
 <18a67b8d-79b1-ccfd-daa5-9f8552ee2f9a@oracle.com>
Message-ID: <6a72c5b7-f4a9-67a1-5b9e-cc9608c4b44a@oracle.com>

Thanks, Per.

StefanK

On 2019-09-27 15:18, Per Liden wrote:
> Looks good!
> 
> /Per
> 
> On 9/27/19 9:16 AM, Stefan Karlsson wrote:
>> Hi all,
>>
>> Please review this small patch to fix the max heap size check in ZGC.
>>
>> https://cr.openjdk.java.net/~stefank/8231563/webrev.01/
>> https://bugs.openjdk.java.net/browse/JDK-8231563
>>
>> After this fix the JVM refuses to start if a too high -Xmx is set:
>>
>> $ java -XX:+UnlockExperimentalVMOptions -XX:+UseZGC -Xmx17t -version:
>> Error occurred during initialization of VM
>> Java heap too large
>>
>> Thanks,
>> StefanK


From per.liden at oracle.com  Thu Oct  3 08:47:30 2019
From: per.liden at oracle.com (Per Liden)
Date: Thu, 3 Oct 2019 10:47:30 +0200
Subject: RFR: 8231489: GC watermark_0_1 failed due to "metaspace.gc.Fault: GC
 has happened too rare"
Message-ID: <2aa33fe4-5d04-cd58-f786-d7a12b977ccf@oracle.com>

vmTestbase/metaspace/gc/HighWaterMarkTest relies on timing and fails 
when "Metaspace GC Threshold" isn't handled in a STW pause.

The problem can be reproduced on both G1 and ZGC, but it's hard, as the 
window is small. However, it reproduces every time when injecting a 
100ms delay to prolong the GC cycle a bit. This test used to be disabled 
for G1 with ClassUnloadingWithConcurrentMark, but JDK-8204163 enabled it 
about a year ago.

Fixing the test properly is tricky. As far as I can see, we can either:
1) Disable this test for G1+ClassUnloadingWithConcurrentMark and ZGC, or
2) Add a sleep in the test loop, to make the race less likely to happen, or
3) Remove the test completely, with the rationale that it's a buggy, 
low-value test.

I've gone with 1) here. The test is already disabled for CMS today, with 
code in the test itself (i.e. not using @requires), so I did two 
alternative patches:

A) Follows the existing style to disable the other GCs:
http://cr.openjdk.java.net/~pliden/8231489/webrev.0-alt1

B) Adds @requires to the tests using the HighWaterMarkTest class, and 
removes the old check to disable CMS:
http://cr.openjdk.java.net/~pliden/8231489/webrev.0-alt2

I prefer B, but I don't have a strong opinion on which way to go.

Bug: https://bugs.openjdk.java.net/browse/JDK-8231489

/Per


From erik.osterlund at oracle.com  Thu Oct  3 08:56:48 2019
From: erik.osterlund at oracle.com (erik.osterlund at oracle.com)
Date: Thu, 3 Oct 2019 10:56:48 +0200
Subject: RFR: 8231776: ZGC: Fix incorrect address space description
In-Reply-To: <0942d5a7-d6d7-2e81-97ef-b8fc55880f02@oracle.com>
References: <0942d5a7-d6d7-2e81-97ef-b8fc55880f02@oracle.com>
Message-ID: <f61e74d2-31bc-3445-8e80-a484f5582942@oracle.com>

Hi Per,

Looks good.

/Erik

On 10/3/19 12:28 AM, Per Liden wrote:
> After JDK-8224820, the space between the Remapped heap view and the 
> Marked1 heap view is no longer reserved. The ASCII art describing the 
> address space layout should be updated to reflect that.
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8231776
> Webrev: http://cr.openjdk.java.net/~pliden/8231776/webrev.0
>
> /Per


From per.liden at oracle.com  Thu Oct  3 08:59:58 2019
From: per.liden at oracle.com (Per Liden)
Date: Thu, 3 Oct 2019 10:59:58 +0200
Subject: RFR: 8231776: ZGC: Fix incorrect address space description
In-Reply-To: <f61e74d2-31bc-3445-8e80-a484f5582942@oracle.com>
References: <0942d5a7-d6d7-2e81-97ef-b8fc55880f02@oracle.com>
 <f61e74d2-31bc-3445-8e80-a484f5582942@oracle.com>
Message-ID: <3ccac031-22da-0b21-dfc6-23643eddf42b@oracle.com>

Thanks!

/Per

On 10/3/19 10:56 AM, erik.osterlund at oracle.com wrote:
> Hi Per,
> 
> Looks good.
> 
> /Erik
> 
> On 10/3/19 12:28 AM, Per Liden wrote:
>> After JDK-8224820, the space between the Remapped heap view and the 
>> Marked1 heap view is no longer reserved. The ASCII art describing the 
>> address space layout should be updated to reflect that.
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8231776
>> Webrev: http://cr.openjdk.java.net/~pliden/8231776/webrev.0
>>
>> /Per
> 


From per.liden at oracle.com  Thu Oct  3 09:34:13 2019
From: per.liden at oracle.com (Per Liden)
Date: Thu, 3 Oct 2019 11:34:13 +0200
Subject: RFR: 8231825: ZGC: Remove ZMaxHeapSize and ZMaxHeapSizeShift
Message-ID: <acb08828-2eca-3769-fbb1-e2320e72328f@oracle.com>

The global constants ZMaxHeapSize, ZMaxHeapSizeShift and their 
respective platform versions can be removed. The single case where 
ZMaxHeapSize is used can be replaced by ZAddressOffsetMax.

Bug: https://bugs.openjdk.java.net/browse/JDK-8231825
Webrev: http://cr.openjdk.java.net/~pliden/8231825/webrev.0

/Per


From per.liden at oracle.com  Thu Oct  3 09:45:47 2019
From: per.liden at oracle.com (Per Liden)
Date: Thu, 3 Oct 2019 11:45:47 +0200
Subject: RFR: 8231552: ZGC: Refine address space reservation
Message-ID: <5015ca7b-3e3e-b2bd-c3f8-0a83ecdb41d8@oracle.com>

We could be slightly more sophisticated and do a better job reserving 
address space in situations where parts of the address space are already 
occupied or when the process is running with address space limitations.

Bug: https://bugs.openjdk.java.net/browse/JDK-8231552
Webrev: http://cr.openjdk.java.net/~pliden/8231552/webrev.0

/Per
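
One possible strategy for reserving address space that is partly occupied is
to retry with smaller pieces. The sketch below is an assumption for
illustration, not necessarily what the webrev does; reserve_best_effort and
the TryReserve callback are hypothetical names, and the callback stands in for
the real OS reservation call.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>

// Hypothetical callback: returns true if [start, start + size) was reserved.
using TryReserve = std::function<bool(uintptr_t start, size_t size)>;

// Reserve as much of [start, start + size) as possible, splitting a range in
// half whenever it cannot be reserved in one piece. Pieces smaller than
// min_range are given up on. Returns the number of bytes reserved.
size_t reserve_best_effort(uintptr_t start, size_t size, size_t min_range,
                           const TryReserve& try_reserve) {
  assert(min_range > 0);
  if (size < min_range) {
    return 0;
  }
  if (try_reserve(start, size)) {
    return size;  // The whole range was free
  }
  const size_t half = size / 2;
  if (half < min_range) {
    return 0;     // Too small to split any further
  }
  return reserve_best_effort(start, half, min_range, try_reserve) +
         reserve_best_effort(start + half, size - half, min_range, try_reserve);
}
```

With a range of 8 units in which only unit 4 is occupied, this reserves the
other 7 units in three pieces rather than failing outright.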


From erik.osterlund at oracle.com  Thu Oct  3 10:30:56 2019
From: erik.osterlund at oracle.com (erik.osterlund at oracle.com)
Date: Thu, 3 Oct 2019 12:30:56 +0200
Subject: RFR: 8231825: ZGC: Remove ZMaxHeapSize and ZMaxHeapSizeShift
In-Reply-To: <acb08828-2eca-3769-fbb1-e2320e72328f@oracle.com>
References: <acb08828-2eca-3769-fbb1-e2320e72328f@oracle.com>
Message-ID: <399f8de7-1e5b-5d4c-62f0-4ec4d173f1cf@oracle.com>

Hi Per,

Looks good.

Thanks,
/Erik

On 10/3/19 11:34 AM, Per Liden wrote:
> The global constants ZMaxHeapSize, ZMaxHeapSizeShift and their 
> respective platform versions can be removed. The single case where 
> ZMaxHeapSize is used can be replaced by ZAddressOffsetMax.
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8231825
> Webrev: http://cr.openjdk.java.net/~pliden/8231825/webrev.0
>
> /Per


From thomas.schatzl at oracle.com  Thu Oct  3 11:47:20 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Thu, 3 Oct 2019 13:47:20 +0200
Subject: RFR: 8231825: ZGC: Remove ZMaxHeapSize and ZMaxHeapSizeShift
In-Reply-To: <acb08828-2eca-3769-fbb1-e2320e72328f@oracle.com>
References: <acb08828-2eca-3769-fbb1-e2320e72328f@oracle.com>
Message-ID: <4041d344-c605-0817-5aae-89f4e9ea48c2@oracle.com>

Hi,

On 03.10.19 11:34, Per Liden wrote:
> The global constants ZMaxHeapSize, ZMaxHeapSizeShift and their 
> respective platform versions can be removed. The single case where 
> ZMaxHeapSize is used can be replaced by ZAddressOffsetMax.
> 
> Bug: https://bugs.openjdk.java.net/browse/JDK-8231825
> Webrev: http://cr.openjdk.java.net/~pliden/8231825/webrev.0
> 
> /Per


   looks good.

Thomas


From per.liden at oracle.com  Thu Oct  3 12:17:17 2019
From: per.liden at oracle.com (Per Liden)
Date: Thu, 3 Oct 2019 14:17:17 +0200
Subject: RFR: 8231825: ZGC: Remove ZMaxHeapSize and ZMaxHeapSizeShift
In-Reply-To: <399f8de7-1e5b-5d4c-62f0-4ec4d173f1cf@oracle.com>
References: <acb08828-2eca-3769-fbb1-e2320e72328f@oracle.com>
 <399f8de7-1e5b-5d4c-62f0-4ec4d173f1cf@oracle.com>
Message-ID: <2756ea20-a408-6038-2592-6248573d4e66@oracle.com>

Thanks Erik!

/Per

On 10/3/19 12:30 PM, erik.osterlund at oracle.com wrote:
> Hi Per,
> 
> Looks good.
> 
> Thanks,
> /Erik
> 
> On 10/3/19 11:34 AM, Per Liden wrote:
>> The global constants ZMaxHeapSize, ZMaxHeapSizeShift and their 
>> respective platform versions can be removed. The single case where 
>> ZMaxHeapSize is used can be replaced by ZAddressOffsetMax.
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8231825
>> Webrev: http://cr.openjdk.java.net/~pliden/8231825/webrev.0
>>
>> /Per
> 


From per.liden at oracle.com  Thu Oct  3 13:15:27 2019
From: per.liden at oracle.com (Per Liden)
Date: Thu, 3 Oct 2019 15:15:27 +0200
Subject: RFR: 8231825: ZGC: Remove ZMaxHeapSize and ZMaxHeapSizeShift
In-Reply-To: <4041d344-c605-0817-5aae-89f4e9ea48c2@oracle.com>
References: <acb08828-2eca-3769-fbb1-e2320e72328f@oracle.com>
 <4041d344-c605-0817-5aae-89f4e9ea48c2@oracle.com>
Message-ID: <b5eb9686-6185-398e-52da-9ec3c62c5d3e@oracle.com>

Thanks Thomas!

/Per

On 10/3/19 1:47 PM, Thomas Schatzl wrote:
> Hi,
> 
> On 03.10.19 11:34, Per Liden wrote:
>> The global constants ZMaxHeapSize, ZMaxHeapSizeShift and their 
>> respective platform versions can be removed. The single case where 
>> ZMaxHeapSize is used can be replaced by ZAddressOffsetMax.
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8231825
>> Webrev: http://cr.openjdk.java.net/~pliden/8231825/webrev.0
>>
>> /Per
> 
> 
>    looks good.
> 
> Thomas


From mark.reinhold at oracle.com  Thu Oct  3 22:11:44 2019
From: mark.reinhold at oracle.com (mark.reinhold at oracle.com)
Date: Thu,  3 Oct 2019 15:11:44 -0700 (PDT)
Subject: New candidate JEP: 363: Remove the Concurrent Mark Sweep (CMS)
 Garbage Collector
Message-ID: <20191003221144.85C84309580@eggemoggin.niobe.net>

https://openjdk.java.net/jeps/363

- Mark


From kishor.kharbas at intel.com  Fri Oct  4 01:00:16 2019
From: kishor.kharbas at intel.com (Kharbas, Kishor)
Date: Fri, 4 Oct 2019 01:00:16 +0000
Subject: RFR(S): 8215893: Add better abstraction for pinning G1 concurrent
 marking bitmaps.
Message-ID: <F89640DCD01A85489FCBA68183A6A0F3CB569D68@ORSMSX116.amr.corp.intel.com>

Hi,
When I worked on JDK-8211425<https://bugs.openjdk.java.net/browse/JDK-8211425>, there was a request for better abstraction for pinning G1's CM bitmaps. RFE for the request is here - JDK-8215893<https://bugs.openjdk.java.net/browse/JDK-8215893>.

Here is a proposal : http://cr.openjdk.java.net/~kkharbas/8215893/webrev.00/

Here G1PageBasedVirtualSpace pins the entire reserved memory to memory during construction. The constructor takes an additional bool flag which says "does it need to pin the memory".
If the memory is pinned, the '_special' flag is set to true. I piggyback on the _special flag's behavior, which is to not do actual OS (un-)commits on calls to (un)commit().
The rest of the changes are the mechanism for passing this flag from CM bitmap creation in G1CollectedHeap all the way to G1PageBasedVirtualSpace.

Let me know if this is a good abstraction and if there is any better way.

Thanks
Kishor
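
The proposed behaviour can be illustrated with a toy model. This is a sketch
under stated assumptions, not the HotSpot class: PinnedAwareSpace, os_commit,
os_uncommit and os_calls are hypothetical names, and the "OS calls" are only
counted, not performed.

```cpp
#include <cassert>
#include <cstddef>

// Toy stand-in for a page-based virtual space (not G1PageBasedVirtualSpace).
class PinnedAwareSpace {
 public:
  PinnedAwareSpace(size_t num_pages, bool pin)
      : _num_pages(num_pages), _special(false), _os_calls(0) {
    if (pin) {
      os_commit(0, num_pages);  // Back the whole range once, up front
      _special = true;          // From now on (un)commit skips the OS
    }
  }

  void commit(size_t start_page, size_t num_pages) {
    assert(start_page + num_pages <= _num_pages);
    if (!_special) {
      os_commit(start_page, num_pages);
    }
  }

  void uncommit(size_t start_page, size_t num_pages) {
    assert(start_page + num_pages <= _num_pages);
    if (!_special) {
      os_uncommit(start_page, num_pages);
    }
  }

  size_t os_calls() const { return _os_calls; }

 private:
  // Simulated OS interactions; a real implementation would map/unmap here.
  void os_commit(size_t, size_t) { _os_calls++; }
  void os_uncommit(size_t, size_t) { _os_calls++; }

  size_t _num_pages;
  bool _special;     // true: commit()/uncommit() never touch the OS
  size_t _os_calls;  // Number of simulated OS commit/uncommit calls
};
```

A pinned instance performs exactly one simulated OS call (in the constructor)
no matter how often commit() and uncommit() are called afterwards.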


From thomas.schatzl at oracle.com  Fri Oct  4 12:11:42 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Fri, 4 Oct 2019 14:11:42 +0200
Subject: RFR(XL): 8220310: Implementation: NUMA-Aware Memory Allocation
 for G1, Mutator (1/3)
In-Reply-To: <4afe9f43-4cfa-9384-f45f-f985399629dd@oracle.com>
References: <e4c60a5c-cb08-004a-ce77-e4d20d4d6891@oracle.com>
 <06ACBF87-ADBE-499F-B668-0274E4925B26@oracle.com>
 <2b37edd6-3e0f-013d-1616-9d003f8ac1ed@oracle.com>
 <74ACAF31-8233-482A-892E-0D2E7CA72F4F@oracle.com>
 <4afe9f43-4cfa-9384-f45f-f985399629dd@oracle.com>
Message-ID: <43f26130-f340-f23a-11fd-773f696998a9@oracle.com>

Hi Sangheon,

   thanks for your hard work on this!

On 01.10.19 18:43, sangheon.kim at oracle.com wrote:
> Hi Kim and others,
> 
> This webrev.2 simplified a bit more after changing 'heap expansion' 
> approach.
[...]
> webrev:
> http://cr.openjdk.java.net/~sangheki/8220310/webrev.2/
> http://cr.openjdk.java.net/~sangheki/8220310/webrev.2.inc
> Testing: hs-tier1 ~ 5 +-UseNUMA
> 

Comments:

- os_solaris.cpp:2236: indentation addition

- os_windows.cpp: os::get_address_id(): the "return 0" is in the same 
line as the method declaration, while the change uses extra lines in 
os_bsd. Please make this uniform.

- g1_globals.cpp: unnecessary whitespace change

- G1Allocator::unsafe_max_alloc(): I need to think some more if this is 
correct - should that really be node specific? (and not the max of all 
nodes).

Otoh I think this is fine.

- g1Allocator::used_in_alloc_regions(): not sure why the assert has been 
removed

- os_linux.cpp: os::numa_get_address_id(): I think "id" should be an 
"int", not uint32_t according to 
http://man7.org/linux/man-pages/man2/get_mempolicy.2.html

And I think you can initialize it with os::InvalidId;

- in G1Allocator::current_node_index() retrieving the current node index 
is part of the G1MemoryNodeManager; I would really prefer if this were 
some property of a Thread. Not sure what others think.

That value could be put into the G1ThreadLocalData.

In any case, G1Allocator should probably cache the reference to the 
G1MemoryNodeManager for faster access.

- G1CollectedHeap::expand_single_region(): the log output in the first 
line looks more like some debug code than generally interesting information.

- G1CollectedHeap::expand_single_region(): pre-existing: it should add 
to the in-safepoint expansion time like expand(); okay to just file a CR.

- instead of the "late initialization method" set_page_size() I would 
prefer to have this value passed in the constructor. It is not required 
to me to have the create() call in the initialization list of 
g1CollectedHeap at all costs... it could be put right after we determine 
the page size in the body of the G1CollectedHeap constructor.

- g1CollectedHeap.hpp:940: no need to delete the newline.

- g1MemoryNodeManager.cpp:41: that comment does not add information imho

- G1NUMA::index_of_current_thread() needs a comment

- G1NUMA::index_of_num_id/is_valid_numa_id/ should be private

- not sure why G1NUMA::initialize()/set_numa() are needed. Its only 
call is right after instantiating a G1NUMA instance

- G1NUMA::request_memory_on_node needs a comment.

- I observed that a *lot* of G1NUMA methods are only used by 
G1MemoryNodeManager; and G1MemoryNodeManager just forwards to G1NUMA a 
lot. Maybe these two can be merged?

- G1NUMA::preferred_index_for_address/request_memory_on_node: I would 
prefer if these methods were not hardcoded with HeapRegion metrics as 
example.

I.e. for preferred_index_for_address(), instead of the address it is 
probably better to pass it the zero-based index directly, that is used 
for calculating the node index. I.e. all callers know the HeapRegion's 
index anyway *and* this would make the method independent of 
G1CollectedHeap.

I.e. something like preferred_node_index_for_index(<region-index>), 
because then the same method can be reused for other data structures 
than the heap/heap region.

G1NUMA::request_memory_on_node() could also be moved to 
G1PageBasedVirtualSpace, using the chunk sizes of page based virtual 
space instead of hardcoding HeapRegion::GrainBytes (i.e. hardcode the 
method to HeapRegion) - or pass in the "chunk size" calculated there 
from G1PageBasedVirtualSpace.

I think this would increase the generality and usefulness of 
G1NUMA/G1MemoryNodeManager a lot without "passing in too many node 
indices everywhere".
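
An index-based variant along the lines suggested above could look like this.
It is a sketch only: it assumes memory is striped across nodes in units of
the larger of page size and region size, and every name and parameter is
hypothetical rather than taken from the webrev.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: preferred node index for a zero-based region index.
// It takes only plain sizes, so it does not depend on G1CollectedHeap and
// could be reused for data structures other than heap regions.
unsigned preferred_node_index_for_index(size_t region_index,
                                        size_t region_size,
                                        size_t page_size,
                                        unsigned num_nodes) {
  assert(region_size > 0 && page_size > 0 && num_nodes > 0);
  // Regions that share one large page necessarily share one node.
  const size_t regions_per_page =
      page_size > region_size ? page_size / region_size : 1;
  return static_cast<unsigned>((region_index / regions_per_page) % num_nodes);
}
```

With four nodes and pages no larger than a region, consecutive regions map to
nodes 0, 1, 2, 3, 0, 1, 2, 3 and so on. With 1GB pages, 32MB regions and two
nodes, regions 0 to 31 map to node 0 and regions 32 to 63 to node 1.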

- G1PageBasedVirtualSpace: the _numa member seems to be used exactly in 
one method where performance does not look critical. Maybe it is better 
to reference it directly there via G1CollectedHeap.

- heapRegionManager.cpp:verify_actual_node_index(): not sure if that 
should be debug level.

I would also kind of prefer a method that iterates over all regions and 
prints a summary status (and potentially drop this per-region checking 
at least when allocating a new free region). It is sufficient to print a 
summary of expected/actual values, at most a summary per node. I.e.

"NUMA Node index verification: Nodes: X_0/Y_0 X_1/Y_1 ... X_N/Y_N 
Unknown: Z Total: X/Y"

where X(_i) is the number of matching indexes (for node index i) and 
Y(_i) the number of expected (for node index i).

Also, the correct word to use here is "mismatch" not "different" (to what?)
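
The summary line sketched above could be produced roughly as follows. This is
an illustration only; NodeTally and numa_verification_summary are hypothetical
names, and the real code would use HotSpot's logging framework rather than
std::string.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Per-node counters: regions found on the expected node vs. regions expected.
struct NodeTally {
  unsigned matching;
  unsigned expected;
};

// Builds one summary line with a matching/expected pair per node, followed
// by the number of regions with unknown node and the overall totals.
std::string numa_verification_summary(const std::vector<NodeTally>& nodes,
                                      unsigned unknown) {
  std::ostringstream out;
  unsigned total_matching = 0;
  unsigned total_expected = 0;
  out << "NUMA Node index verification: Nodes:";
  for (const NodeTally& n : nodes) {
    assert(n.matching <= n.expected);
    out << ' ' << n.matching << '/' << n.expected;
    total_matching += n.matching;
    total_expected += n.expected;
  }
  out << " Unknown: " << unknown
      << " Total: " << total_matching << '/' << total_expected;
  return out.str();
}
```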

- in some discussion we talked about the "node_index" lifecycle, and 
what I remember is the following:

   - initially, when we commit/make the region available, we set that 
HeapRegion's node_index to "Unknown" (with AlwaysPretouch on we can of 
course immediately set the correct one).
   - in HeapRegion::node_index() we do something like the following 
pseudo-code:

   {
     if (_node_index == Unknown) {
       // try to get actual node index from OS, and update _node_index
       // if we could get the information
     }

     if (_node_index == Unknown) { // Still unknown
       // return _preferred_ node index *without* updating _node_index
     }
     return _node_index;
   }

   - now, during the "verification" pass, we use whether 
HeapRegion::node_index() == preferred_node_index to determine if the 
region is on the correct node.

The change only sets the node index during making the region available, 
and immediately to the preferred node index.

I.e. we eventually end up with the actual node index reported by the OS 
in HeapRegion::_node_index.
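
A compilable version of the lifecycle described above might look like this.
It is a sketch under assumptions: Region stands in for HeapRegion, the
UnknownNodeIndex sentinel and the query_os callback are hypothetical, and the
callback replaces whatever OS query the real code would use.

```cpp
#include <cassert>
#include <functional>
#include <utility>

const int UnknownNodeIndex = -1;  // Assumed sentinel for "not known yet"

// Stand-in for HeapRegion, reduced to the node index handling.
class Region {
 public:
  // query_os returns the actual node index, or UnknownNodeIndex if the OS
  // cannot tell yet (for example because the memory was never touched).
  Region(int preferred_node_index, std::function<int()> query_os)
      : _node_index(UnknownNodeIndex),
        _preferred_node_index(preferred_node_index),
        _query_os(std::move(query_os)) {
    assert(preferred_node_index != UnknownNodeIndex);
  }

  int node_index() {
    if (_node_index == UnknownNodeIndex) {
      _node_index = _query_os();     // Cache the answer once the OS has one
    }
    if (_node_index == UnknownNodeIndex) {
      return _preferred_node_index;  // Fall back without updating the cache
    }
    return _node_index;
  }

  // The check a verification pass would use.
  bool is_on_preferred_node() {
    return node_index() == _preferred_node_index;
  }

 private:
  int _node_index;
  int _preferred_node_index;
  std::function<int()> _query_os;
};
```

Until the OS reports a node, node_index() returns the preferred index; after
that the reported value is cached and the OS is not asked again.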


- for the expression "G1MemoryNodeManager::num_active_nodes() > 1" it 
would be nice to have an extra method in G1MemoryNodeManager instead of 
repeating it over and over.

- heapRegionManager.cpp:print_node_id_of_regions: that method will print 
a huge amount of lines. Better to print the summary I sketched out above.

- in FreeRegionList::remove_region_with_node_index(), the maximum search 
depth must take into account how many regions are there per page.

Consider 1GB pages, 32M region size, meaning that we get 32 consecutive 
regions/page.
Now with a node amount of 2, the maximum search depth will be 6 - which 
is too low :)
The intention is probably 3  * MAX(page_size / region size, 1) * 
numa->num_active_numa_ids().

I think it is useful to put that expression into 
G1NUMA/G1MemoryNodeManager (or somewhere else appropriate - 
HeapRegionManager?) to avoid that part having too much info about page size.
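
The search depth expression above, written out as a helper. The function name
max_search_depth is hypothetical; the formula is the one given in the text.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Maximum number of free-list entries to inspect when looking for a region
// on a given node: 3 * MAX(page_size / region_size, 1) * active nodes.
size_t max_search_depth(size_t page_size, size_t region_size,
                        size_t num_active_nodes) {
  assert(region_size > 0);
  const size_t regions_per_page =
      std::max<size_t>(page_size / region_size, 1);
  return 3 * regions_per_page * num_active_nodes;
}
```

For 1GB pages, 32MB regions and two nodes this gives 3 * 32 * 2 = 192, while
pages smaller than a region give the original depth of 6 for two nodes.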

- os.hpp: the new Enum values might or might not need some description.

Btw, there is no regression in performance from the .0/.1 versions of 
this code in our benchmarks.

Thanks,
   Thomas


From stefan.johansson at oracle.com  Fri Oct  4 12:23:40 2019
From: stefan.johansson at oracle.com (Stefan Johansson)
Date: Fri, 4 Oct 2019 14:23:40 +0200
Subject: RFR(XL): 8220310: Implementation: NUMA-Aware Memory Allocation
 for G1, Mutator (1/3)
In-Reply-To: <4afe9f43-4cfa-9384-f45f-f985399629dd@oracle.com>
References: <e4c60a5c-cb08-004a-ce77-e4d20d4d6891@oracle.com>
 <06ACBF87-ADBE-499F-B668-0274E4925B26@oracle.com>
 <2b37edd6-3e0f-013d-1616-9d003f8ac1ed@oracle.com>
 <74ACAF31-8233-482A-892E-0D2E7CA72F4F@oracle.com>
 <4afe9f43-4cfa-9384-f45f-f985399629dd@oracle.com>
Message-ID: <25fe4295-7de8-bbc3-3dd3-6b750d4982f2@oracle.com>

Hi Sangheon,

First of all, thanks for this updated version incorporating a lot of our 
comments. I think we are getting closer to the goal, but I still have 
some more comments :)

On 2019-10-01 18:43, sangheon.kim at oracle.com wrote:
> Hi Kim and others,
> 
> This webrev.2 simplified a bit more after changing 'heap expansion' 
> approach.
> Previously heap may expand with preferred numa id which means contiguous 
> same numa id heap regions may exist but current version is assuming to 
> have evenly split heap regions. i.e. 4 numa node system, heap regions 
> will be 012301230123, so if we know address or heap region index, we can 
> know preferred numa id.
> 
> Many codes related to support previous style expansion were removed.
> 
> ...
> 
> webrev:
> http://cr.openjdk.java.net/~sangheki/8220310/webrev.2/
> http://cr.openjdk.java.net/~sangheki/8220310/webrev.2.inc

src/hotspot/share/gc/g1/g1Allocator.cpp
---
   31 #include "gc/g1/g1NUMA.hpp"
I don't see why this include is needed, but you might want to include 
gc/g1/g1MemoryNodeManager.hpp instead.
---

hotspot/share/gc/g1/g1CollectedHeap.cpp
---
1518   _mem_node_mgr(G1MemoryNodeManager::create()),

I saw your response to Kim regarding G1Allocator needing it to be 
initialized and I get that, but have you looked at moving the creation 
of G1Allocator to initialize() as well? I think its first use is 
actually below:
1802   _mem_node_mgr->set_page_size(page_size);
here:
1851   _allocator->init_mutator_alloc_regions();

I might be missing some other place where it gets called, but I think it 
should be safe to create both the node manager and the allocator early 
in initialize().
---

src/hotspot/share/gc/g1/g1RegionToSpaceMapper.hpp
---
28 #include "gc/g1/g1MemoryNodeManager.hpp"

Remove this include.
---

src/hotspot/share/gc/g1/g1_globals.hpp
---
326                range(0, 100)

Remove the backslash and add back the removed line to leave the file 
unchanged.
---

src/hotspot/share/gc/g1/heapRegionManager.cpp
---
  142   if (hr != NULL) {
  143     assert(hr->next() == NULL, "Single region should not have next");
  144     assert(is_available(hr->hrm_index()), "Must be committed");
  145
  146     verify_actual_node_index(hr->bottom(), hr->node_index());
  147   }

I don't think this is a good place to do the verification, we allocate 
the free region while holding a lock and I think we should avoid doing a 
system call there. I would rather see this done during a safepoint, 
having a closure that iterates the heap and verify all regions.

I also think it would be nice to have two levels of the output, the one 
line for each region on trace level and on debug we can have a summary, 
something like:
NUMA Node 1: expected=25, actual=23
NUMA Node 2: expected=25, actual=27

What do you (and others) think about that?
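
The debug-level summary proposed here could be computed by a single pass over
all regions, for instance at a safepoint. The sketch below is illustrative:
RegionInfo, NodeCounts, tally_nodes and summary_line are hypothetical names,
and the actual node of each region is assumed to have been queried already.

```cpp
#include <cassert>
#include <string>
#include <vector>

// What the iteration would record for each region.
struct RegionInfo {
  unsigned preferred_node;  // Node the region is expected to be on
  unsigned actual_node;     // Node reported for the region's memory
};

struct NodeCounts {
  unsigned expected = 0;
  unsigned actual = 0;
};

// One pass over all regions, counting expected and actual regions per node.
std::vector<NodeCounts> tally_nodes(const std::vector<RegionInfo>& regions,
                                    unsigned num_nodes) {
  std::vector<NodeCounts> counts(num_nodes);
  for (const RegionInfo& r : regions) {
    assert(r.preferred_node < num_nodes && r.actual_node < num_nodes);
    counts[r.preferred_node].expected++;
    counts[r.actual_node].actual++;
  }
  return counts;
}

// Formats one summary line per node in the style shown above.
std::string summary_line(unsigned node, const NodeCounts& c) {
  return "NUMA Node " + std::to_string(node) +
         ": expected=" + std::to_string(c.expected) +
         ", actual=" + std::to_string(c.actual);
}
```

The per-region lines on trace level would come out of the same iteration, so
no system call is needed while holding the free list lock.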
---
  216 static void print_node_id_of_regions(uint start, uint num_regions){
  217   LogTarget(Trace, gc, heap, numa) lt;

I understand that it might make the test a bit more complicated, but 
have you thought about instead adding the node index to the heap 
printing done when <gc, heap, region> is enabled on trace level?
---
  235 static void set_heapregion_node_index(HeapRegion* hr) {

I don't think we should special case for when AlwaysPreTouch is on and 
instead always just call hr->set_node_index(preferred_index) directly in 
make_regions_available. The reason is that I think it will make the NUMA 
support harder to understand and explain and it can potentially also 
hide problems with a system's configuration. It might also actually be 
worse than using the preferred id, because the OS might decide to move 
the pages back to the preferred node right after we checked this (not 
sure it will happen, but in theory).

Another problem with this code is the call to:
verify_actual_node_index(hr->bottom(), node_index)

This function will only return the "actual" node index if logging for 
<gc, heap, numa, verification> is enabled on debug level.
---

  346  bool HeapRegionManager::is_on_preferred_index(uint region_index, 
uint preferred_node_index) {
  347    uint region_node_index = 
G1MemoryNodeManager::mgr()->preferred_index_for_address(
  348 
G1CollectedHeap::heap()->bottom_addr_for_region(region_index));
  349   return region_node_index == preferred_node_index ||
  350          preferred_node_index == G1MemoryNodeManager::AnyNodeIndex;

I guess adding the AnyNodeIndex case here is because in this patch 
nobody is expanding on a preferred node, right? To me this is just 
another argument to not do any changes to the expand code in this patch. 
I know I suggested adding expand_on_preferred_node(), but I should have 
been clearer about when I think we should add it.
---

src/hotspot/share/gc/g1/g1MemoryNodeManager.hpp
---
   56   // Returns memory node ids
   57   virtual const int* node_ids() const;

Doesn't seem to be used, remove.
---

src/hotspot/share/gc/g1/g1MemoryNodeManager.cpp
---
  67   LINUX_ONLY(if (UseNUMA) {
...
  79     delete numa;
  80   })

A bit confusing with a multi-line LINUX_ONLY, I would prefer to hide 
this in a private helper, something like:
   if (UseNUMA) {
      LINUX_ONLY(create_numa_manager());
   }

   if (_inst == NULL) {
     _inst = new G1MemoryNodeManager();
   }

Not really happy about this either, but we can look at simplifying the 
NUMA initialization as a follow up.
---

src/hotspot/share/gc/g1/g1NUMA.hpp
---
   87   // Returns numa id of the given numa index.
   88   inline int numa_id_of_index(uint numa_index) const;

Currently unused, either remove or make use of it when calling 
numa_make_local.
---
   94   // Returns current active numa ids.
   95   const int* numa_ids() const { return _numa_ids; }

Only used by memory manager above, which in turn is unused, remove.
---

src/hotspot/share/gc/g1/g1NUMA.hpp
---
   55 // Request the given memory to locate on preferred node.
   56 // There are 2 things to consider.
   57 // First, size comparison for G1HeapRegionSize and page size.
  ...
   62 // Examples of 4 numa ids with non-preferred numa id.

What do you think about this instead:
// Request to spread the given memory evenly across the available NUMA
// nodes. Which node to request for a given address is given by the
// region size and the page size. Below are two examples:

I would also like a "NUMA node" row for each example showing which numa 
node the pages and regions end up on.
---

Thanks,
Stefan

> Testing: hs-tier1 ~ 5 +-UseNUMA
> 
> Thanks,
> Sangheon
> 
> 
>> ------------------------------------------------------------------------------ 
>>
>>
> 


From thomas.schatzl at oracle.com  Fri Oct  4 13:34:20 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Fri, 4 Oct 2019 15:34:20 +0200
Subject: RFR(S): 8215893: Add better abstraction for pinning G1 concurrent
 marking bitmaps.
In-Reply-To: <F89640DCD01A85489FCBA68183A6A0F3CB569D68@ORSMSX116.amr.corp.intel.com>
References: <F89640DCD01A85489FCBA68183A6A0F3CB569D68@ORSMSX116.amr.corp.intel.com>
Message-ID: <c35eecd7-a155-eea1-0eec-3124abe96af8@oracle.com>

Hi Kishor,

On 04.10.19 03:00, Kharbas, Kishor wrote:
> Hi,
> When I worked on JDK-8211425<https://bugs.openjdk.java.net/browse/JDK-8211425>, there was a request for better abstraction for pinning G1's CM bitmaps. RFE for the request is here - JDK-8215893<https://bugs.openjdk.java.net/browse/JDK-8215893>.
> 
> Here is a proposal : http://cr.openjdk.java.net/~kkharbas/8215893/webrev.00/
> 
> Here G1PageBasedVirtualSpace pins the entire reserved memory to memory during construction. The constructor takes an additional bool flag which says "does it need to pin the memory".
> If the memory is pinned, '_special' flag is set to true. I piggy back on _special flag's behavior which is to not do actual OS (un-)commits on calls to (un)commit().
> Rest of the changes is the mechanism to pass this flag from CM bitmaps creation in G1CollectedHeap all the way to G1PageBasedVirtualSpace.
> 
> Let me know if this is a good abstraction and if there is any better way.
> 
> Thanks
> Kishor
> 

Some comments:

- in the parameter lists, if the parameters are already laid out 
line-by-line, if adding a new one, please put it on a new line as well.

- this code

   if (_special) {
     if (!rs.special()) {
       commit_internal(addr_to_page_index(_low_boundary), 
addr_to_page_index(_high_boundary));
     }

in g1PageBasedVirtualSpace looks very incomprehensible.  :)

I would prefer (pending the second reviewer's comment) to either use the 
"pinned" flag here, or even better, move the necessary commit calls into 
the (now removed) HeterogeneousHeapRegionManager::initialize().

- I would just purely from feeling prefer if the "pinned" flag parameter 
would be listed after the "type" parameter in the G1RegionToSpaceMapper. 
But that's probably just me.

Also, finally one parameter per line for the declaration/definition of 
the constructor would improve readability.

Thanks,
   Thomas


From zgu at redhat.com  Fri Oct  4 14:51:33 2019
From: zgu at redhat.com (Zhengyu Gu)
Date: Fri, 4 Oct 2019 10:51:33 -0400
Subject: RFR 8231324: Shenandoah: avoid duplicated weak root works during
 final traversal
Message-ID: <d94f8b07-5cb6-9df2-d312-576ef1b9b99a@redhat.com>

Please review this patch, which keeps traversal GC from walking weak 
roots twice during final traversal.

Also, it should process weak roots first, so that the fixup phase does 
not visit dead CLDs, code, etc.

Bug: https://bugs.openjdk.java.net/browse/JDK-8231324
Webrev: http://cr.openjdk.java.net/~zgu/JDK-8231324/webrev.00/

Test:
   hotspot_gc_shenandoah (fastdebug and release) on Linux x86_64

Thanks,

-Zhengyu


From mark.reinhold at oracle.com  Fri Oct  4 17:16:15 2019
From: mark.reinhold at oracle.com (mark.reinhold at oracle.com)
Date: Fri,  4 Oct 2019 10:16:15 -0700 (PDT)
Subject: New candidate JEP: 364: ZGC on macOS
Message-ID: <20191004171615.2E34130971A@eggemoggin.niobe.net>

https://openjdk.java.net/jeps/364

- Mark


From kishor.kharbas at intel.com  Fri Oct  4 23:15:50 2019
From: kishor.kharbas at intel.com (Kharbas, Kishor)
Date: Fri, 4 Oct 2019 23:15:50 +0000
Subject: RFR(S): 8215893: Add better abstraction for pinning G1
 concurrent marking bitmaps.
In-Reply-To: <F89640DCD01A85489FCBA68183A6A0F3CB569D68@ORSMSX116.amr.corp.intel.com>
References: <F89640DCD01A85489FCBA68183A6A0F3CB569D68@ORSMSX116.amr.corp.intel.com>
Message-ID: <F89640DCD01A85489FCBA68183A6A0F3CB56A055@ORSMSX116.amr.corp.intel.com>

Hi Thomas,

Thanks for the review. Some comments inline.

New webrev: http://cr.openjdk.java.net/~kkharbas/8215893/webrev.00_to_01/
            http://cr.openjdk.java.net/~kkharbas/8215893/webrev.01/


> Hi Kishor,
>
> On 04.10.19 03:00, Kharbas, Kishor wrote:
>> Hi,
>> When I worked on JDK-8211425<https://bugs.openjdk.java.net/browse/JDK-8211425>, there was a request for better abstraction for pinning G1's CM bitmaps. RFE for the request is here - JDK-8215893<https://bugs.openjdk.java.net/browse/JDK-8215893>.
>>
>> Here is a proposal : http://cr.openjdk.java.net/~kkharbas/8215893/webrev.00/
>>
>> Here G1PageBasedVirtualSpace pins the entire reserved memory to memory during construction. The constructor takes an additional bool flag which says "does it need to pin the memory".
>> If the memory is pinned, '_special' flag is set to true. I piggy back on _special flag's behavior which is to not do actual OS (un-)commits on calls to (un)commit().
>> Rest of the changes is the mechanism to pass this flag from CM bitmaps creation in G1CollectedHeap all the way to G1PageBasedVirtualSpace.
>>
>> Let me know if this is a good abstraction and if there is any better way.
>>
>> Thanks
>> Kishor
>>
>
> Some comments:
>
> - in the parameter lists, if the parameters are already laid out
> line-by-line, if adding a new one, please put it on a new line as well.
>

Fixed in the new webrev.


> - this code

>

>    if (_special) {

>      if (!rs.special()) {

>        commit_internal(addr_to_page_index(_low_boundary),

> addr_to_page_index(_high_boundary));

>      }

>

> in g1PageBasedVirtualSpace looks very incomprehensible.  :)

>

> I would prefer (pending the second reviewer's comment) to either use the

> "pinned" flag here, or even better, move the necessary commit calls into

> the (now removed) HeterogeneousHeapRegionManager::initialize().

>

Made it little more comprehensible. Will see what other reviewers think about moving it somewhere else.


> - I would just purely from feeling prefer if the "pinned" flag parameter

> would be listed after the "type" parameter in the G1RegionToSpaceMapper.

> But that's probably just me.

>

I did it this way to logically group the parameters. MemTracker is a tracker used by the VM everywhere and does not pertain to this class as such, so I kept it in the end.


> Also, finally one parameter per line for the declaration/definition of

> the constructor would improve readability.

>

Done.

Thank you,

Kishor


> Thanks,

>    Thomas


From stefan.johansson at oracle.com  Mon Oct  7 08:25:19 2019
From: stefan.johansson at oracle.com (Stefan Johansson)
Date: Mon, 7 Oct 2019 10:25:19 +0200
Subject: RFR(XL): 8220310: Implementation: NUMA-Aware Memory Allocation
 for G1, Mutator (1/3)
In-Reply-To: <43f26130-f340-f23a-11fd-773f696998a9@oracle.com>
References: <e4c60a5c-cb08-004a-ce77-e4d20d4d6891@oracle.com>
 <06ACBF87-ADBE-499F-B668-0274E4925B26@oracle.com>
 <2b37edd6-3e0f-013d-1616-9d003f8ac1ed@oracle.com>
 <74ACAF31-8233-482A-892E-0D2E7CA72F4F@oracle.com>
 <4afe9f43-4cfa-9384-f45f-f985399629dd@oracle.com>
 <43f26130-f340-f23a-11fd-773f696998a9@oracle.com>
Message-ID: <523dae88-11d8-1f88-044e-7f722eb3c84d@oracle.com>

Hi Thomas and Sangheon,

I have one comment on Thomas comments =)

On 2019-10-04 14:11, Thomas Schatzl wrote:
> Hi Sangheon,
> 
>    thanks for your hard work on this!
> 
> On 01.10.19 18:43, sangheon.kim at oracle.com wrote:
>> Hi Kim and others,
>>
>> This webrev.2 simplified a bit more after changing 'heap expansion' 
>> approach.
> [...]
>> webrev:
>> http://cr.openjdk.java.net/~sangheki/8220310/webrev.2/
>> http://cr.openjdk.java.net/~sangheki/8220310/webrev.2.inc
>> Testing: hs-tier1 ~ 5 +-UseNUMA
>>
> 
> Comments:
> 
> ...
> 
> - in some discussion we talked about the "node_index" lifecycle, and 
> what I remember is the following:
> 
>    - initially, when we commit/make the region available, we set that 
> HeapRegion's node_index to "Unknown" (with AlwaysPretouch on we can of 
> course immediately set the correct one).
>    - in HeapRegion::node_index() we do something like the following 
> pseudo-code:
> 
>    {
>      if (_node_index == Unknown) {
>        // try to get actual node index from OS, and update _node_index 
>        // if we could get the information
>      }
> 
>      if (_node_index == Unknown) { // Still unknown
>        // return _preferred_ node index *without* updating _node_index
>      }
>      return _node_index;
>    }
> 
>    - now, during the "verification" pass, we use whether 
> HeapRegion::node_index() == preferred_node_index to determine if the 
> region is on the correct node.
> 
> The change only sets the node index during making the region available, 
> and immediately to the preferred node index.
> 
> I.e. we eventually end up with the actual node index reported by the OS 
> in HeapRegion::_node_index.
> 
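
The node index lifecycle in the quoted pseudo-code could be sketched roughly as follows. This is only an illustration under assumptions: Region, UnknownNodeIndex and the query_os callback are hypothetical stand-ins, not the HeapRegion or os:: code from the webrev.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical sentinel; the thread calls this state "Unknown".
constexpr unsigned UnknownNodeIndex = ~0u;

// Hypothetical stand-in for a heap region, not HotSpot's HeapRegion.
class Region {
  unsigned _node_index = UnknownNodeIndex;  // actual node, once the OS reports it
  unsigned _preferred;                      // node requested when the region was made available
  std::function<unsigned()> _query_os;      // stand-in for the OS lookup; may return UnknownNodeIndex

public:
  Region(unsigned preferred, std::function<unsigned()> query_os)
    : _preferred(preferred), _query_os(std::move(query_os)) {}

  unsigned node_index() {
    if (_node_index == UnknownNodeIndex) {
      // Try to get the actual node from the OS and cache whatever it reports.
      _node_index = _query_os();
    }
    if (_node_index == UnknownNodeIndex) {
      // Still unknown: report the preferred node without caching it.
      return _preferred;
    }
    return _node_index;
  }

  // The "verification" check: is the region where we asked it to be?
  bool on_preferred_node() { return node_index() == _preferred; }
};
```

In this sketch the lookup happens inside node_index() itself, which is exactly the point the reply below raises: a caller holding a lock would pay for the OS query.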

I like the idea of being able to get the correct node index from 
HeapRegion, but I have two concerns about the above idea. First, this 
will cause us to do a syscall while holding the lock to get a new 
region. This might not be a big deal, but I would prefer to do this 
update during a safepoint. The second thing is that if pages get 
migrated by the OS we would not see this if we only request the actual 
node index one time.

It's possible that both those concerns can be ignored, but I wanted to 
bring them up to hear others opinions.

Thanks,
Stefan


From thomas.schatzl at oracle.com  Mon Oct  7 08:45:38 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Mon, 7 Oct 2019 10:45:38 +0200
Subject: RFR(XL): 8220310: Implementation: NUMA-Aware Memory Allocation
 for G1, Mutator (1/3)
In-Reply-To: <523dae88-11d8-1f88-044e-7f722eb3c84d@oracle.com>
References: <e4c60a5c-cb08-004a-ce77-e4d20d4d6891@oracle.com>
 <06ACBF87-ADBE-499F-B668-0274E4925B26@oracle.com>
 <2b37edd6-3e0f-013d-1616-9d003f8ac1ed@oracle.com>
 <74ACAF31-8233-482A-892E-0D2E7CA72F4F@oracle.com>
 <4afe9f43-4cfa-9384-f45f-f985399629dd@oracle.com>
 <43f26130-f340-f23a-11fd-773f696998a9@oracle.com>
 <523dae88-11d8-1f88-044e-7f722eb3c84d@oracle.com>
Message-ID: <dc1ba8ec-f0bf-a8f7-7005-4791e2fd6d4b@oracle.com>

Hi,

On 07.10.19 10:25, Stefan Johansson wrote:
> Hi Thomas and Sangheon,
> 
> I have one comment on Thomas comments =)
> 
> On 2019-10-04 14:11, Thomas Schatzl wrote:
>> Hi Sangheon,
[...]
>>
>> I.e. we eventually end up with the actual node index reported by the 
>> OS in HeapRegion::_node_index.
>>
> 
> I like the idea of being able to get the correct node index from 
> HeapRegion, but I have two concerns about the above idea. First, this 
> will cause us to do a syscall while holding the lock to get a new 
> region. This might not be a big deal, but I would prefer to do this
I am not completely into the actual code flow right now, but I do not 
think there is a need to get the node index in this code path from the 
HeapRegion. Maybe when allocating from the free list later?

> update during a safepoint. The second thing is that if pages get 

Fine with me too to piggyback it on some existing region iteration to be 
100% sure.

> migrated by the OS we would not see this if we only request the actual 
> node index one time.

This is what the logging/verification is for I guess at this time. If 
the migration is significant, we need to handle this and update the node 
index - but I think we can do this node index update as RFE.

Above update of the actual node index values during safepoint could also 
"always" do the summary logging then (with gc+numa=debug or something) 
if NUMA is enabled.

Overall I would agree with that too.

> It's possible that both those concerns can be ignored, but I wanted to 
> bring them up to hear others opinions.

Thanks,
   Thomas


From stefan.johansson at oracle.com  Mon Oct  7 08:58:04 2019
From: stefan.johansson at oracle.com (Stefan Johansson)
Date: Mon, 7 Oct 2019 10:58:04 +0200
Subject: RFR(XL): 8220310: Implementation: NUMA-Aware Memory Allocation
 for G1, Mutator (1/3)
In-Reply-To: <dc1ba8ec-f0bf-a8f7-7005-4791e2fd6d4b@oracle.com>
References: <e4c60a5c-cb08-004a-ce77-e4d20d4d6891@oracle.com>
 <06ACBF87-ADBE-499F-B668-0274E4925B26@oracle.com>
 <2b37edd6-3e0f-013d-1616-9d003f8ac1ed@oracle.com>
 <74ACAF31-8233-482A-892E-0D2E7CA72F4F@oracle.com>
 <4afe9f43-4cfa-9384-f45f-f985399629dd@oracle.com>
 <43f26130-f340-f23a-11fd-773f696998a9@oracle.com>
 <523dae88-11d8-1f88-044e-7f722eb3c84d@oracle.com>
 <dc1ba8ec-f0bf-a8f7-7005-4791e2fd6d4b@oracle.com>
Message-ID: <d2c93173-843e-fe4f-c9c6-e454e224fcd3@oracle.com>


On 2019-10-07 10:45, Thomas Schatzl wrote:
> Hi,
> 
> On 07.10.19 10:25, Stefan Johansson wrote:
>> Hi Thomas and Sangheon,
>>
>> I have one comment on Thomas comments =)
>>
>> On 2019-10-04 14:11, Thomas Schatzl wrote:
>>> Hi Sangheon,
> [...]
>>>
>>> I.e. we eventually end up with the actual node index reported by the 
>>> OS in HeapRegion::_node_index.
>>>
>>
>> I like the idea of being able to get the correct node index from 
>> HeapRegion, but I have two concerns about the above idea. First, this 
>> will cause us to do a syscall while holding the lock to get a new 
>> region. This might not be a big deal, but I would prefer to do this
> I am not completely into the actual code flow right now, but I do not 
> think there is a need to get the node index in this code path from the 
> HeapRegion. Maybe when allocating from the free list later?

Allocating from the free list is also under the lock, but I think we are 
on the same page, just asking for the hr->node_index() should not cause 
a syscall.

> 
>> update during a safepoint. The second thing is that if pages get 
> 
> Fine with me too to piggyback it on some existing region iteration to be 
> 100% sure.
Yes, that should make the cost fairly low.

> 
>> migrated by the OS we would not see this if we only request the actual 
>> node index one time.
> 
> This is what the logging/verification is for I guess at this time. If 
> the migration is significant, we need to handle this and update the node 
> index - but I think we can do this node index update as RFE.

Yes, I have no idea if migration is a real problem, so separate RFE is ok.

> 
> Above update of the actual node index values during safepoint could also 
> "always" do the summary logging then (with gc+numa=debug or something) 
> if NUMA is enabled.
Sounds reasonable.

Thanks,
Stefan
> 
> Overall I would agree with that too.
> 
>> It's possible that both those concerns can be ignored, but I wanted to 
>> bring them up to hear others opinions.
> 
> Thanks,
>    Thomas


From per.liden at oracle.com  Mon Oct  7 11:36:31 2019
From: per.liden at oracle.com (Per Liden)
Date: Mon, 7 Oct 2019 13:36:31 +0200
Subject: RFR: 8231940: ZGC: Print correct low/high capacity
Message-ID: <2742b8ba-7fa0-b789-a250-4c9de40e1fc0@oracle.com>

After JDK-8222480, heap capacity can go down, not just up. The heap 
logging should take that into account when printing capacity 
high/low numbers.
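
A minimal sketch of the idea, assuming a hypothetical CapacityStats tracker (the names are illustrative and not taken from the ZGC sources): once capacity can shrink, the low watermark has to be updated on every sample as well, rather than being fixed at the starting value.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical tracker for the capacity range seen during one GC cycle.
class CapacityStats {
  std::size_t _low;
  std::size_t _high;

public:
  explicit CapacityStats(std::size_t initial) : _low(initial), _high(initial) {}

  // Start a new cycle from the current capacity.
  void reset(std::size_t current) { _low = _high = current; }

  // Capacity may have grown or shrunk, so both watermarks are updated.
  void update(std::size_t current) {
    _low = std::min(_low, current);
    _high = std::max(_high, current);
  }

  std::size_t low() const { return _low; }
  std::size_t high() const { return _high; }
};
```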

Bug: https://bugs.openjdk.java.net/browse/JDK-8231940
Webrev: http://cr.openjdk.java.net/~pliden/8231940/webrev.0

/Per


From per.liden at oracle.com  Mon Oct  7 12:38:05 2019
From: per.liden at oracle.com (Per Liden)
Date: Mon, 7 Oct 2019 14:38:05 +0200
Subject: RFR: 8231943: ZGC: Enable serviceability/dcmd/gc/RunGCTest
Message-ID: <ebe51564-a8c5-1255-173f-e2038d5bd602@oracle.com>

This test is currently disabled for ZGC, but it can easily be enabled by 
adjusting the expected log string. ZGC doesn't print "Pause Full", but 
it still prints the "(Diagnostic Command)" part.

Also, the test enables gc=debug logging, which is unnecessary since this 
is always printed on the gc=info level.

Bug: https://bugs.openjdk.java.net/browse/JDK-8231943
Webrev: http://cr.openjdk.java.net/~pliden/8231943/webrev.0

Testing: Manually ran test with all GCs (except Epsilon)

/Per


From shade at redhat.com  Mon Oct  7 12:51:19 2019
From: shade at redhat.com (Aleksey Shipilev)
Date: Mon, 7 Oct 2019 14:51:19 +0200
Subject: RFR (S) 8231932: Shenandoah: conc/par GC threads ergonomics overrides
 user settings
Message-ID: <4e63d1e7-0c98-a491-f954-7c6b7048602c@redhat.com>

Bug:
  https://bugs.openjdk.java.net/browse/JDK-8231932

Fix:
  https://cr.openjdk.java.net/~shade/8231932/webrev.01/

This manifests in tier1 tests, and that is actually the UX problem. New test captures it directly.
Patched code favors adjusting the setting that was selected ergonomically, which leaves the user
setting alone.

Also, it is awkward to adjust the GC threads settings silently (which is why test failed without
proper message), and we should fail on misconfiguration right away, which explains the adjustments
in existing tests.

Testing: new test, hotspot_gc (with Shenandoah), tier1 (with Shenandoah), hotspot_gc_shenandoah

-- 
Thanks,
-Aleksey


From stefan.johansson at oracle.com  Mon Oct  7 12:57:39 2019
From: stefan.johansson at oracle.com (Stefan Johansson)
Date: Mon, 7 Oct 2019 14:57:39 +0200
Subject: RFR: 8231153: Improve concurrent refinement statistics
In-Reply-To: <BACFB18B-7CA1-4C58-8597-016D197CDCDF@oracle.com>
References: <1DADC595-3106-4CE7-BA5D-7B6C7EE0E81E@oracle.com>
 <4a851a19-0979-c696-0c80-1165bd755834@oracle.com>
 <BACFB18B-7CA1-4C58-8597-016D197CDCDF@oracle.com>
Message-ID: <3bafef2f-1380-3105-5c54-5e8095c42409@oracle.com>

Hi Kim,

On 2019-10-02 03:08, Kim Barrett wrote:
> New webrevs:
> full: https://cr.openjdk.java.net/~kbarrett/8231153/open.01/
> incr: https://cr.openjdk.java.net/~kbarrett/8231153/open.01.inc/

The changes look good, just one question around the calculation of 
total time and size.

src/hotspot/share/gc/g1/g1ConcurrentRefine.cpp
---
  415 Tickspan G1ConcurrentRefine::total_refinement_time() const {
  ...
  425   const_cast<G1ConcurrentRefine*>(this)->threads_do(&closure);
  426   return closure._total_time;
  427 }
  428
  429 size_t G1ConcurrentRefine::total_refined_cards() const {
  ...
  439   const_cast<G1ConcurrentRefine*>(this)->threads_do(&closure);
  440   return closure._total_cards;
  441 }

Did you consider grouping these two functions into one, to avoid 
iterating the threads twice? Not sure this is a big deal, and it might 
only make the code more complicated, but it feels a bit unnecessary to 
do two iterations right after each other.
---
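
The grouping asked about above might look something like this sketch. RefineThreadStats, RefinementTotals and total_refinement_stats are hypothetical names, and std::chrono stands in for HotSpot's Tickspan; the point is only that one pass can accumulate both totals.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical per-thread counters, standing in for a refinement thread.
struct RefineThreadStats {
  std::chrono::nanoseconds time;
  std::size_t cards;
};

// Both totals returned together.
struct RefinementTotals {
  std::chrono::nanoseconds total_time{0};
  std::size_t total_cards = 0;
};

// One iteration over the threads instead of one per statistic.
inline RefinementTotals total_refinement_stats(const std::vector<RefineThreadStats>& threads) {
  RefinementTotals totals;
  for (const RefineThreadStats& t : threads) {
    totals.total_time += t.time;
    totals.total_cards += t.cards;
  }
  return totals;
}
```

The cost is a small result struct and a caller that has to unpack it, which is the extra complication weighed in the question.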

Thanks,
Stefan

> 
> Testing:
> mach5 tier1-5
> some local by-hand testing to look at the rate tracking.
> 


From rkennke at redhat.com  Mon Oct  7 13:36:38 2019
From: rkennke at redhat.com (Roman Kennke)
Date: Mon, 07 Oct 2019 15:36:38 +0200
Subject: RFR (S) 8231932: Shenandoah: conc/par GC threads ergonomics
 overrides user settings
In-Reply-To: <4e63d1e7-0c98-a491-f954-7c6b7048602c@redhat.com>
References: <4e63d1e7-0c98-a491-f954-7c6b7048602c@redhat.com>
Message-ID: <155246FC-4947-48BA-93BF-29B895F49B19@redhat.com>

Looks OK to me.

Thanks!
Roman


On 7 October 2019 14:51:19 CEST, Aleksey Shipilev <shade at redhat.com> wrote:
>Bug:
>  https://bugs.openjdk.java.net/browse/JDK-8231932
>
>Fix:
>  https://cr.openjdk.java.net/~shade/8231932/webrev.01/
>
>This manifests in tier1 tests, and that is actually the UX problem. New test captures it directly.
>Patched code favors adjusting the setting that was selected ergonomically, which leaves the user
>setting alone.
>
>Also, it is awkward to adjust the GC threads settings silently (which is why test failed without
>proper message), and we should fail on misconfiguration right away, which explains the adjustments
>in existing tests.
>
>Testing: new test, hotspot_gc (with Shenandoah), tier1 (with Shenandoah), hotspot_gc_shenandoah
>
>-- 
>Thanks,
>-Aleksey

-- 
This message was sent from my Android device with K-9 Mail.

From stefan.johansson at oracle.com  Mon Oct  7 13:41:21 2019
From: stefan.johansson at oracle.com (Stefan Johansson)
Date: Mon, 7 Oct 2019 15:41:21 +0200
Subject: RFR(S): 8215893: Add better abstraction for pinning G1 concurrent
 marking bitmaps.
In-Reply-To: <F89640DCD01A85489FCBA68183A6A0F3CB569D68@ORSMSX116.amr.corp.intel.com>
References: <F89640DCD01A85489FCBA68183A6A0F3CB569D68@ORSMSX116.amr.corp.intel.com>
Message-ID: <09ff53d7-cbc2-37e9-81b5-b3de6bc6ea16@oracle.com>

Hi Kishor,

On 2019-10-04 03:00, Kharbas, Kishor wrote:
> Hi,
> 
> When I worked on JDK-8211425 
> <https://bugs.openjdk.java.net/browse/JDK-8211425>, there was a request 
> for better abstraction for pinning G1's CM bitmaps. RFE for the request 
> is here - JDK-8215893 <https://bugs.openjdk.java.net/browse/JDK-8215893>.
> 
> Here is a proposal : http://cr.openjdk.java.net/~kkharbas/8215893/webrev.00/
> 
> Here G1PageBasedVirtualSpace pins the entire reserved memory to memory 
> during construction. The constructor takes an additional bool flag which 
> says "does it need to pin the memory".
> 
> If the memory is pinned, '_special' flag is set to true. I piggy back on 
> _special flag's behavior which is to not do actual OS (un-)commits on 
> calls to (un)commit().
> 
> Rest of the changes is the mechanism to pass this flag from CM bitmaps 
> creation in G1CollectedHeap all the way to G1PageBasedVirtualSpace.
> 
> Let me know if this is a good abstraction and if there is any better way.
> 

I'm not sure I like this approach better, and even though I'm not super 
fond of the commit_and_set_special function either, at least the old way 
kept the pinning code quite isolated. Moving the commit_internal() call 
into initialize_with_page_size() feels like a move in the wrong 
direction. I'm not sure I have a much better idea, but one thing to try 
would be to tell the underlying ReservedSpace that it should be 
special/pinned even if it is not mapped with large pages. That way the 
upper layers should just work.
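
To make the behavior under discussion concrete, here is a toy model of a "special" (pinned) space. PageSpace is hypothetical and only counts would-be OS calls; it is not G1PageBasedVirtualSpace or ReservedSpace, and it assumes pinning means one up-front commit followed by no-op commit/uncommit requests.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical model: counts the OS commit/uncommit calls that would be made.
class PageSpace {
  bool _special;              // true: committed up front and never uncommitted
  std::size_t _os_calls = 0;

public:
  explicit PageSpace(bool pinned) : _special(pinned) {
    if (_special) {
      ++_os_calls;  // single up-front commit of the whole reserved range
    }
  }

  // For a special space these are bookkeeping only; no OS call is made.
  void commit()   { if (!_special) { ++_os_calls; } }
  void uncommit() { if (!_special) { ++_os_calls; } }

  std::size_t os_calls() const { return _os_calls; }
};
```

If the lower layer already reported itself as special, the constructor branch would disappear and the upper layer would need no pinning logic at all, which appears to be the direction suggested above.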

Another thing, can you remind me why we need the bitmaps to be pinned 
but not other structures such as the card table?

Thanks,
Stefan


> Thanks
> 
> Kishor
> 


From shade at redhat.com  Mon Oct  7 14:08:18 2019
From: shade at redhat.com (Aleksey Shipilev)
Date: Mon, 7 Oct 2019 16:08:18 +0200
Subject: RFR (XS/T) 8231946: Remove obsolete and unused
 ShenandoahVerifyObjectEquals flag
Message-ID: <2ac7e61d-4f52-79aa-239f-80b4a1bbd019@redhat.com>

RFE:
  https://bugs.openjdk.java.net/browse/JDK-8231946

This flag was obsoleted and not used for a while. Let's remove it:

diff -r de43643147c6 src/hotspot/share/gc/shenandoah/shenandoah_globals.hpp
--- a/src/hotspot/share/gc/shenandoah/shenandoah_globals.hpp    Mon Oct 07 15:30:29 2019 +0200
+++ b/src/hotspot/share/gc/shenandoah/shenandoah_globals.hpp    Mon Oct 07 16:07:09 2019 +0200
@@ -315,7 +315,4 @@
           "Tracing task termination timings")                               \
                                                                             \
-  develop(bool, ShenandoahVerifyObjectEquals, false,                        \
-          "Verify that == and != are not used on oops. Only in fastdebug")  \
-                                                                            \
   diagnostic(bool, ShenandoahAlwaysPreTouch, false,                         \
           "Pre-touch heap memory, overrides global AlwaysPreTouch")         \

Testing: x86_64 build, hotspot_gc_shenandoah (running)

-- 
Thanks,
-Aleksey


From rkennke at redhat.com  Mon Oct  7 14:19:11 2019
From: rkennke at redhat.com (Roman Kennke)
Date: Mon, 07 Oct 2019 16:19:11 +0200
Subject: RFR (XS/T) 8231946: Remove obsolete and unused
 ShenandoahVerifyObjectEquals flag
In-Reply-To: <2ac7e61d-4f52-79aa-239f-80b4a1bbd019@redhat.com>
References: <2ac7e61d-4f52-79aa-239f-80b4a1bbd019@redhat.com>
Message-ID: <1D29E0A2-0EF2-4C52-80B9-9EB7229E202D@redhat.com>

Yup. (I believe we have some more unused flags like ShStoreCheck)

On 7 October 2019 16:08:18 CEST, Aleksey Shipilev <shade at redhat.com> wrote:
>RFE:
>  https://bugs.openjdk.java.net/browse/JDK-8231946
>
>This flag was obsoleted and not used for a while. Let's remove it:
>
>diff -r de43643147c6 src/hotspot/share/gc/shenandoah/shenandoah_globals.hpp
>--- a/src/hotspot/share/gc/shenandoah/shenandoah_globals.hpp    Mon Oct 07 15:30:29 2019 +0200
>+++ b/src/hotspot/share/gc/shenandoah/shenandoah_globals.hpp    Mon Oct 07 16:07:09 2019 +0200
>@@ -315,7 +315,4 @@
>           "Tracing task termination timings")                               \
>                                                                             \
>-  develop(bool, ShenandoahVerifyObjectEquals, false,                        \
>-          "Verify that == and != are not used on oops. Only in fastdebug")  \
>-                                                                            \
>   diagnostic(bool, ShenandoahAlwaysPreTouch, false,                         \
>           "Pre-touch heap memory, overrides global AlwaysPreTouch")         \
>
>Testing: x86_64 build, hotspot_gc_shenandoah (running)
>
>-- 
>Thanks,
>-Aleksey

-- 
This message was sent from my Android device with K-9 Mail.

From kim.barrett at oracle.com  Mon Oct  7 18:10:42 2019
From: kim.barrett at oracle.com (Kim Barrett)
Date: Mon, 7 Oct 2019 14:10:42 -0400
Subject: RFR(XL): 8220310: Implementation: NUMA-Aware Memory Allocation
 for G1, Mutator (1/3)
In-Reply-To: <4afe9f43-4cfa-9384-f45f-f985399629dd@oracle.com>
References: <e4c60a5c-cb08-004a-ce77-e4d20d4d6891@oracle.com>
 <06ACBF87-ADBE-499F-B668-0274E4925B26@oracle.com>
 <2b37edd6-3e0f-013d-1616-9d003f8ac1ed@oracle.com>
 <74ACAF31-8233-482A-892E-0D2E7CA72F4F@oracle.com>
 <4afe9f43-4cfa-9384-f45f-f985399629dd@oracle.com>
Message-ID: <CD6BDB8C-7777-4872-BCC0-CDFB4978F876@oracle.com>

> On Oct 1, 2019, at 12:43 PM, sangheon.kim at oracle.com wrote:
> webrev:
> http://cr.openjdk.java.net/~sangheki/8220310/webrev.2/
> http://cr.openjdk.java.net/~sangheki/8220310/webrev.2.inc
> Testing: hs-tier1 ~ 5 +-UseNUMA

I like the direction of this.  I think there are some additional simplifications possible
around G1NUMA, which are discussed below.

I still need to respond to your earlier individual responses.  That will be in another email.

------------------------------------------------------------------------------
src/hotspot/share/gc/g1/g1MemoryNodeManager.cpp
  67   LINUX_ONLY(if (UseNUMA) {

Maybe instead use #ifdef LINUX.  Either way, add a trailing comment at
the end of the conditional block.

------------------------------------------------------------------------------
src/hotspot/share/gc/g1/g1NUMA.cpp 
  79   // If we don't have preferred numa id, touch the given area with round-robin manner.

This comment seems out of place / obsolete.

------------------------------------------------------------------------------
src/hotspot/share/gc/g1/g1NUMA.cpp
 138   uint region_index = G1CollectedHeap::heap()->addr_to_region(address);

This requires the address be in the range reserved for the heap.
That's okay; that's what we decided we want to do.  But that should be
part of the function's description, e.g. it should be mentioned as a
precondition for prefered_index_for_address.

------------------------------------------------------------------------------
src/hotspot/share/gc/g1/g1NUMA.hpp
  87   // Returns numa id of the given numa index.
  88   inline int numa_id_of_index(uint numa_index) const;

Unused function.

------------------------------------------------------------------------------
src/hotspot/share/gc/g1/g1NUMA.hpp
  83   inline uint index_of_numa_id(int numa_id) const;

This function should be private.  It is only needed in the
implementation of index_of_current_thread and index_of_address.
It should have a precondition that the argument is an active numa id,
e.g. a definition something like

uint G1NUMA::index_of_numa_id(int numa_id) const {
  assert(numa_id >= 0, "invalid numa id %d", numa_id);
  assert(numa_id < _len_numa_id_to_index_map, "invalid numa id %d", numa_id);
  uint numa_index = _numa_id_to_index_map[numa_id];
  assert(numa_index != G1MemoryNodeManager::InvalidNodeIndex,
         "invalid numa id %d", numa_id);
  return numa_index;
}

To make this work, index_of_address should also be changed, to
something like:

uint G1NUMA::index_of_address(HeapWord* address) const {
  int numa_id = os::numa_get_address_id((uintptr_t)address);
  if (numa_id == os::InvalidId) {
    return G1MemoryNodeManager::InvalidNodeIndex;
  } else {
    return index_of_numa_id(numa_id);
  }
}

------------------------------------------------------------------------------ 
src/hotspot/share/gc/g1/g1NUMA.cpp
  31 void G1NUMA::init_numa_id_to_index_map(const int* numa_ids, uint num_numa_ids) {

This function is only called from one place, G1NUMA::initialize.  The
code would be simpler and more clear if the body of this function were
just directly inlined into initialize and this function eliminated.

And once that's done it becomes apparent that initialize could be
hoisted into the (moved out of line) constructor.

This also lets num_active_numa_ids just be a trivial accessor function
in the header; there's no possibility of finding it uninitialized
after the constructor returns, so no need for the assert that it has
been set.

------------------------------------------------------------------------------  
src/hotspot/share/gc/g1/g1NUMA.inline.hpp
  32 inline bool G1NUMA::is_valid_numa_id(int numa_id) {

Only called by init_numa_to_index_map in a guarantee that would be
more obviously vacuous after the earlier suggested merge of that
function into initialize.

------------------------------------------------------------------------------
src/hotspot/share/runtime/os.hpp 
 393   enum NumaIdState {
 394     InvalidId = -1,
 395     AnyId = -2
 396   };

The type NumaIdState is unused.
The AnyId enumerator is unused.

Suggest making InvalidId just a static const int in the class.

------------------------------------------------------------------------------
src/hotspot/share/runtime/os.hpp 
 398   static int numa_get_address_id(uintptr_t address);

Why is the type of address uintptr_t rather than a pointer type?

I see that the underlying Linux syscall (get_mempolicy) wants an
unsigned long, but that detail ought to be isolated to the Linux
implementation layer.  Callers are going to want to pass in addresses
(pointers) and should not need to cast.  That cast should happen at
the point where the syscall is being made.
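
A rough sketch of that layering, under stated assumptions: platform_query_node is a made-up stand-in for the Linux layer (it does not call get_mempolicy and its node mapping is arbitrary), and the sketch namespace is hypothetical. What matters is that callers pass a pointer and the integer cast happens in exactly one place.

```cpp
#include <cassert>
#include <cstdint>

namespace sketch {

constexpr int InvalidId = -1;

// Stand-in for the platform layer, which wants an integer address.
// Hypothetical behavior: low addresses are "unknown", others alternate by 4K page.
inline int platform_query_node(unsigned long addr) {
  if (addr < 0x1000UL) {
    return InvalidId;
  }
  return static_cast<int>((addr >> 12) & 1UL);
}

// Shared-code entry point takes a pointer; the cast is confined to this spot.
inline int numa_get_address_id(const void* address) {
  return platform_query_node(
      static_cast<unsigned long>(reinterpret_cast<std::uintptr_t>(address)));
}

}  // namespace sketch
```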

------------------------------------------------------------------------------
src/hotspot/share/gc/g1/g1Allocator.inline.hpp
  37 inline MutatorAllocRegion* G1Allocator::mutator_alloc_region(uint node_index) {
  38   assert(_g1h->mem_node_mgr()->is_valid_node_index(node_index), "Invariant, index %u", node_index);
  39   return &_mutator_alloc_regions[node_index];
  40 }

I think the assert here should be that node_index < _num_alloc_regions.

is_valid_node_index gives a somewhat indirect (so weak) check of the
validity of the array access.

Such a change would also eliminate one of the two callers of
is_valid_node_index, which I think can be eliminated (see next comment).

------------------------------------------------------------------------------
src/hotspot/share/gc/g1/heapRegionManager.cpp
 126 HeapRegion* HeapRegionManager::allocate_free_region(HeapRegionType type, uint requested_node_index) {
...
 131   if (mgr->num_active_nodes() > 1 && mgr->is_valid_node_index(requested_node_index)) {

I think a better test here would be
  if ((requested_node_index != G1MemoryNodeManager::AnyNodeIndex) &&
      (mgr->num_active_nodes() > 1)) {

This eliminates one of two calls to is_valid_node_index (which I think
can be eliminated, see previous comment).  And callers should not be
passing in actually invalid indices.  I think there are asserts lower
down in the stack (in G1NUMA) to complain about such, but they
shouldn't be getting in here anyway.

------------------------------------------------------------------------------
src/hotspot/share/gc/g1/g1MemoryNodeManager.hpp
  42   static const uint InvalidNodeIndex = UINT_MAX;
  43   static const uint AnyNodeIndex = InvalidNodeIndex - 1;

These seem misplaced to me.  Shouldn't they be in G1NUMA?  Possibly
reexported here for convenience?  (Assuming it actually is convenient.)

------------------------------------------------------------------------------
src/hotspot/share/gc/g1/g1MemoryNodeManager.hpp
  42   static const uint InvalidNodeIndex = UINT_MAX;

I think the only place this arises is as the result of
index_of_address when the numa id for the location isn't known.  Which
suggests the name should be "UnknownNodeIndex" rather than
"InvalidNodeIndex".  And the description of index_of_address should
mention that it can return that value (whatever its name ends up being.)

------------------------------------------------------------------------------
src/hotspot/share/gc/g1/g1MemoryNodeManager.hpp

I'm not sure G1MemoryNodeManager is useful. It seems to be just a thin
wrapper over the G1NUMA API, with a virtual dispatch between a
non-NUMA or single-node implementation and the multi-node
implementation that uses a G1NUMA that is only created for multi-node
support. The virtual dispatch can't be eliminated in most (all or
nearly all?) cases.

But I think most of the single-node implementation would just fall out
as a 1-node boundary case for multi-node G1MemoryNodeManager / G1NUMA.

So I think this might all be collapsed down to a G1NUMA that always
exists.  If there are any places that require actual distinction, that
class can have a private member to select the appropriate behavior.
(Or maybe it's just the number of active nodes.)
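
As an illustration of the 1-node boundary case, here is a sketch of a topology object that always exists. NumaTopology and its round-robin placement policy are assumptions made for this example, not the G1NUMA interface from the webrev.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical always-present topology; non-NUMA is simply the one-node case.
class NumaTopology {
  std::vector<int> _node_ids;  // active OS node ids, never empty

public:
  // An empty list models "NUMA off or unsupported" and collapses to one node.
  explicit NumaTopology(std::vector<int> os_node_ids)
    : _node_ids(os_node_ids.empty() ? std::vector<int>{0} : std::move(os_node_ids)) {}

  unsigned num_active_nodes() const { return static_cast<unsigned>(_node_ids.size()); }
  bool is_enabled() const { return num_active_nodes() > 1; }

  // Assumed policy: spread regions round-robin; always 0 with a single node.
  unsigned preferred_node_index_for_region(unsigned region_index) const {
    return region_index % num_active_nodes();
  }
};
```

With this shape no virtual dispatch is needed: single-node callers run the same code and always get index 0.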

------------------------------------------------------------------------------
src/hotspot/share/gc/g1/g1NUMA.inline.hpp

With the changes I've proposed above, I think there's not
much left in this file, and it might not be worth having it.  Consider
moving any lingering remnants to the .hpp or .cpp file as appropriate.

------------------------------------------------------------------------------
src/hotspot/share/gc/g1/g1NUMA.hpp

Consider adding a page_size() accessor function (private for now) that
asserts the associated data member is > 0 (e.g. initialized), since it
is initialized after construction.  Use that instead of direct uses of
the data member.
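
Such an accessor could look roughly like the following. NumaInfo, set_page_size and pages_for are hypothetical names used only for illustration, and plain assert stands in for HotSpot's assert macro.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical class whose page size is set some time after construction.
class NumaInfo {
  std::size_t _page_size = 0;

  // Private accessor: catches any use before initialization.
  std::size_t page_size() const {
    assert(_page_size > 0 && "page size not initialized");
    return _page_size;
  }

public:
  void set_page_size(std::size_t page_size) { _page_size = page_size; }

  // Internal users go through page_size() rather than the raw member.
  std::size_t pages_for(std::size_t bytes) const {
    return (bytes + page_size() - 1) / page_size();
  }
};
```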

------------------------------------------------------------------------------
src/hotspot/share/runtime/arguments.cpp
4108     // such as Parallel GC for Linux and Solaris or G1 GC for Linux will
...
4111     // Non NUMA-aware collectors such as CMS and Serial-GC on
4112     // all platforms and ParallelGC on Windows will interleave all

I think that these comments about which configurations do or don't
support NUMA are just a maintenance headache. I think it would be
better here to just say

  NUMA-aware collectors will interleave ...
  Non NUMA-aware collectors will interleave ...

And leave out mentions of configurations that may change (as is being
done here) or be removed (as is soon expected for CMS).

------------------------------------------------------------------------------


From kim.barrett at oracle.com  Mon Oct  7 18:35:56 2019
From: kim.barrett at oracle.com (Kim Barrett)
Date: Mon, 7 Oct 2019 14:35:56 -0400
Subject: RFR(XL): 8220310: Implementation: NUMA-Aware Memory Allocation
 for G1, Mutator (1/3)
In-Reply-To: <CD6BDB8C-7777-4872-BCC0-CDFB4978F876@oracle.com>
References: <e4c60a5c-cb08-004a-ce77-e4d20d4d6891@oracle.com>
 <06ACBF87-ADBE-499F-B668-0274E4925B26@oracle.com>
 <2b37edd6-3e0f-013d-1616-9d003f8ac1ed@oracle.com>
 <74ACAF31-8233-482A-892E-0D2E7CA72F4F@oracle.com>
 <4afe9f43-4cfa-9384-f45f-f985399629dd@oracle.com>
 <CD6BDB8C-7777-4872-BCC0-CDFB4978F876@oracle.com>
Message-ID: <B69076FD-7DAD-48EA-A088-B6F634B2A1D2@oracle.com>

> On Oct 7, 2019, at 2:10 PM, Kim Barrett <kim.barrett at oracle.com> wrote:
> 
>> On Oct 1, 2019, at 12:43 PM, sangheon.kim at oracle.com wrote:
>> webrev:
>> http://cr.openjdk.java.net/~sangheki/8220310/webrev.2/
>> http://cr.openjdk.java.net/~sangheki/8220310/webrev.2.inc
>> Testing: hs-tier1 ~ 5 +-UseNUMA
> 
> I like the direction of this.  I think there are some additional simplifications possible
> around G1NUMA, which are discussed below.

I forgot to mention: Some of my comments were a bit intertwined, so
that I ended up making a couple of patches to help me keep track.
Here are webrevs for those patches, which might be of some help to
you; use any parts you find useful.

https://cr.openjdk.java.net/~kbarrett/8220310/kab_g1numa/
https://cr.openjdk.java.net/~kbarrett/8220310/is_valid_numa_index/


From kim.barrett at oracle.com  Mon Oct  7 18:48:21 2019
From: kim.barrett at oracle.com (Kim Barrett)
Date: Mon, 7 Oct 2019 14:48:21 -0400
Subject: RFR(XL): 8220310: Implementation: NUMA-Aware Memory Allocation
 for G1, Mutator (1/3)
In-Reply-To: <4afe9f43-4cfa-9384-f45f-f985399629dd@oracle.com>
References: <e4c60a5c-cb08-004a-ce77-e4d20d4d6891@oracle.com>
 <06ACBF87-ADBE-499F-B668-0274E4925B26@oracle.com>
 <2b37edd6-3e0f-013d-1616-9d003f8ac1ed@oracle.com>
 <74ACAF31-8233-482A-892E-0D2E7CA72F4F@oracle.com>
 <4afe9f43-4cfa-9384-f45f-f985399629dd@oracle.com>
Message-ID: <AF25E9CE-4DB5-48EE-BEB7-502D81DE514A@oracle.com>

> On Oct 1, 2019, at 12:43 PM, sangheon.kim at oracle.com wrote:
> 
Here are my inline responses to yours.

> 
> On 9/24/19 6:44 PM, Kim Barrett wrote:
>>> On Sep 21, 2019, at 1:19 AM, sangheon.kim at oracle.com
>>>  wrote:
>>> 
>>> webrev:
>>> 
>>> http://cr.openjdk.java.net/~sangheki/8220310/webrev.1
>>> http://cr.openjdk.java.net/~sangheki/8220310/webrev.1.inc
>>>  (this may not help much! :) )
>>> Testing: hs-tier 1 ~ 5 (with/without UseNUMA)
>>> 
>> ------------------------------------------------------------------------------
>> src/hotspot/share/gc/g1/g1AllocRegion.hpp
>>   96   uint _node_index;
>> 
>> Protected; should be private.
>> 
> _node_index is used from derived classes.
> Are you suggesting to add a getter?

Oops, missed that it was used in derived classes.

I usually try to avoid non-private data members, and would add a getter here,
but that's not a universal style in our code.

>> src/hotspot/share/gc/g1/g1Allocator.cpp
>>   53 G1Allocator::~G1Allocator() {
>>   54   for (uint i = 0; i < _num_alloc_region; i++) {
>>   55     _mutator_alloc_region[i].~MutatorAllocRegion();
>>   56   }
>>   57   FREE_C_HEAP_ARRAY(MutatorAllocRegion, _mutator_alloc_region);
>>   58 }
>> 
>> --- should also be calling _mutator_alloc_region[i].release() ??
>> --- or does destructor do that?
>> 
> No, release() is never called.
> release() is not actually releasing allocated resources but sets null to pointers and inc/dec some numbers such as used bytes. So I was thinking we don't need to call release().

Thanks for clarifying that.  That was a reminder for me to go figure that out,
but I forgot to do so before sending off that round of comments.

>> src/hotspot/share/gc/g1/g1PageBasedVirtualSpace.cpp
>>   83 G1PageBasedVirtualSpace::~G1PageBasedVirtualSpace() {
>> ...
>>   92   _numa                   = NULL;
>>   93 }
>> 
>> [pre-existing] Destructors are for resource management. Nulling out /
>> zeroing out members in a destructor generally isn't useful. This is
>> really a comment on the existing code rather than a request to change
>> anything. The addition of line 92 is okay in context, just the context
>> is not good.
>> 
> Agreed on pre-existing.
> The intent here is to align with existing context, so leave as is?

You can leave as is.

>> src/hotspot/share/gc/g1/g1NUMA.cpp
>>   42   memset(_numa_id_to_index_map, 
>>   43          G1MemoryNodeManager::InvalidNodeIndex,
>>   44          sizeof(uint) * _len_numa_id_to_index_map);
>> 
>> memset only works here because all bytes of InvalidNodeIndex happen to
>> have the same value.  I would prefer an explicit fill loop rather than
>> memset here.  Or a static assert on the value, but that's probably
>> more code.
>> 
> Changed to fill during loop.
> I'm aware of this and the only reason of changing InvalidNodeIndex from 0xfffe to 0xffff was to use memset here.
> I was thinking you are okay with memset as you commented to use memset from your previous email. :) 

Seems like my earlier suggestion to use memset was a bad idea?
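
For reference, a minimal standalone sketch of the difference between the two fills discussed above. The names (kInvalidNodeIndex, fill_invalid, fill_with_memset) are hypothetical stand-ins and not the identifiers in the webrev. memset replicates a single byte into every byte of the buffer, so it only produces the intended value when all bytes of that value are identical; the explicit loop has no such restriction.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for the invalid node index constant.
// Assumption: an arbitrary 32-bit value whose bytes are NOT all equal.
static const uint32_t kInvalidNodeIndex = 0xfffe;

// Explicit fill loop: correct for any value of kInvalidNodeIndex.
inline void fill_invalid(uint32_t* map, size_t len) {
  assert(map != nullptr || len == 0);
  for (size_t i = 0; i < len; i++) {
    map[i] = kInvalidNodeIndex;
  }
}

// memset-based fill: memset converts 'value' to unsigned char, so only
// its low byte is replicated into every byte of the array.
inline void fill_with_memset(uint32_t* map, size_t len, uint32_t value) {
  assert(map != nullptr || len == 0);
  memset(map, (int)value, sizeof(uint32_t) * len);
}
```

With the value 0xfffe, the memset variant stores 0xfefefefe in each element, which is why a static assert on the constant (or the loop) would be needed to make that approach safe.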


From kim.barrett at oracle.com  Mon Oct  7 22:38:49 2019
From: kim.barrett at oracle.com (Kim Barrett)
Date: Mon, 7 Oct 2019 18:38:49 -0400
Subject: RFR: 8231153: Improve concurrent refinement statistics
In-Reply-To: <3bafef2f-1380-3105-5c54-5e8095c42409@oracle.com>
References: <1DADC595-3106-4CE7-BA5D-7B6C7EE0E81E@oracle.com>
 <4a851a19-0979-c696-0c80-1165bd755834@oracle.com>
 <BACFB18B-7CA1-4C58-8597-016D197CDCDF@oracle.com>
 <3bafef2f-1380-3105-5c54-5e8095c42409@oracle.com>
Message-ID: <0280B88E-45D7-46C0-A0E5-2E708B0132ED@oracle.com>

> On Oct 7, 2019, at 8:57 AM, Stefan Johansson <stefan.johansson at oracle.com> wrote:
> 
> Hi Kim,
> 
> On 2019-10-02 03:08, Kim Barrett wrote:
>> New webrevs:
>> full: https://cr.openjdk.java.net/~kbarrett/8231153/open.01/
>> incr: https://cr.openjdk.java.net/~kbarrett/8231153/open.01.inc/
> 
> The changes look good, just one question around the calculation of total time and size.
> 
> src/hotspot/share/gc/g1/g1ConcurrentRefine.cpp
> ---
> 415 Tickspan G1ConcurrentRefine::total_refinement_time() const {
> ...
> 425   const_cast<G1ConcurrentRefine*>(this)->threads_do(&closure);
> 426   return closure._total_time;
> 427 }
> 428
> 429 size_t G1ConcurrentRefine::total_refined_cards() const {
> ...
> 439   const_cast<G1ConcurrentRefine*>(this)->threads_do(&closure);
> 440   return closure._total_cards;
> 441 }
> 
> Did you consider grouping these two functions into one, to avoid iterating the threads twice? Not sure this is a big deal, and it might only make the code more complicated, but it feels a bit unnecessary to do two iterations right after each other.

Thanks for the suggestion. I tried doing something like that in an
earlier version of this change, but I didn't like how it turned out.
But enough code has changed since then that I decided to try again.
This time seems okay.

So G1ConcurrentRefine now provides a RefinementStats class that
packages up the time and card counts, and a new function
total_refinement_stats() that returns one of those. Also removed
total_refinement_time() and total_refined_cards(), which are no longer
used. (If that were to change they are easily reinstated as wrappers
over total_refinement_stats().)
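
The shape of that change can be sketched roughly as follows. This is a simplified standalone illustration, not the code in the webrev: ThreadSample is a hypothetical stand-in for a refinement thread's counters, a plain long replaces Tickspan, and a vector replaces threads_do() with a closure.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical per-thread counters (the real code reads these from the
// refinement threads via a closure).
struct ThreadSample {
  long   time_ticks;     // assumed unit; the real code uses Tickspan
  size_t refined_cards;
};

// Packages both totals so that callers need only one pass over the threads.
struct RefinementStats {
  long   total_time;
  size_t total_cards;
  RefinementStats() : total_time(0), total_cards(0) {}
};

// Single iteration accumulating both values.
inline RefinementStats total_refinement_stats(const std::vector<ThreadSample>& threads) {
  RefinementStats stats;
  for (size_t i = 0; i < threads.size(); i++) {
    stats.total_time  += threads[i].time_ticks;
    stats.total_cards += threads[i].refined_cards;
  }
  return stats;
}

// The removed accessors could be reinstated as thin wrappers.
inline long total_refinement_time(const std::vector<ThreadSample>& threads) {
  return total_refinement_stats(threads).total_time;
}

inline size_t total_refined_cards(const std::vector<ThreadSample>& threads) {
  return total_refinement_stats(threads).total_cards;
}
```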

New webrevs:
full: https://cr.openjdk.java.net/~kbarrett/8231153/open.02/
incr: https://cr.openjdk.java.net/~kbarrett/8231153/open.02.inc/

Testing:
mach5 tier1


From sangheon.kim at oracle.com  Tue Oct  8 04:13:00 2019
From: sangheon.kim at oracle.com (sangheon.kim at oracle.com)
Date: Mon, 7 Oct 2019 21:13:00 -0700
Subject: RFR(XL): 8220310: Implementation: NUMA-Aware Memory Allocation
 for G1, Mutator (1/3)
In-Reply-To: <43f26130-f340-f23a-11fd-773f696998a9@oracle.com>
References: <e4c60a5c-cb08-004a-ce77-e4d20d4d6891@oracle.com>
 <06ACBF87-ADBE-499F-B668-0274E4925B26@oracle.com>
 <2b37edd6-3e0f-013d-1616-9d003f8ac1ed@oracle.com>
 <74ACAF31-8233-482A-892E-0D2E7CA72F4F@oracle.com>
 <4afe9f43-4cfa-9384-f45f-f985399629dd@oracle.com>
 <43f26130-f340-f23a-11fd-773f696998a9@oracle.com>
Message-ID: <2cb3774d-3ea9-19ef-1eaa-a224129bed93@oracle.com>

Hi Thomas,

Many thanks for this thorough review!

On 10/4/19 5:11 AM, Thomas Schatzl wrote:
> Hi Sangheon,
>
>   thanks for your hard work on this!
>
> On 01.10.19 18:43, sangheon.kim at oracle.com wrote:
>> Hi Kim and others,
>>
>> This webrev.2 simplified a bit more after changing 'heap expansion' 
>> approach.
> [...]
>> webrev:
>> http://cr.openjdk.java.net/~sangheki/8220310/webrev.2/
>> http://cr.openjdk.java.net/~sangheki/8220310/webrev.2.inc
>> Testing: hs-tier1 ~ 5 +-UseNUMA
>>
>
> Comments:
>
> - os_solaris.cpp:2236: indentation addition
I don't see any changes at line 2236?

>
> - os_windows.cpp: os::get_address_id(): the "return 0" is in the same 
> line as the method declaration, while the change uses extra lines in 
> os_bsd. Please make this uniform.
Done.

>
> g1_globals.cpp: unnecessary whitespace change
Done.

>
> - G1Allocator::unsafe_max_alloc(): I need to think some more if this 
> is correct - should that really be node specific? (and not the max of 
> all nodes).
>
> Otoh I think this is fine.
Yes, according to the comment at the first line of the method:
   // Return the remaining space in the cur alloc region, but not less than
   // the min TLAB size.

'cur alloc region' differs per numa node, so it reflects node index.

>
> - g1Allocator::used_in_alloc_regions(): not sure why the assert has 
> been removed
Probably removed during rebasing the patch.

>
> - os_linux.cpp: os::numa_get_address_id(): I think "id" should be an 
> "int", not uint32_t according to 
> http://man7.org/linux/man-pages/man2/get_mempolicy.2.html
>
> And I think you can initialize it with os::InvalidId;
All done, change to 'int' and 'InvalidId'.

>
> - in G1Allocator::current_node_index() retrieving the current node 
> index is part of the G1MemoryNodeManager; I would really prefer if 
> this were some property of a Thread. Not sure what others think.
>
> That value could be put into the G1ThreadLocalData.
>
> In any case, G1Allocator should probably cache the reference to the 
> G1MemoryNodeManager for faster access.
Added new member of G1MemoryNodeManager* at G1Allocator.

>
> - G1CollectedHeap::expand_single_region(): the log output in the first 
> line looks more like some debug code than generally interesting 
> information.
Removed the first line log.

>
> - G1CollectedHeap::expand_single_region(): pre-existing: it should add 
> to the in-safepoint expansion time like expand(); okay to just file a CR.
Filed
>
> - instead of the "late initialization method" set_page_size() I would 
> prefer to have this value passed in the constructor. It is not 
> required to me to have the create() call in the initialization list of 
> g1CollectedHeap at all costs... it could be put right after we 
> determine the page size in the body of the G1CollectedHeap constructor.
We do need G1MemoryNodeManager instance to get the number of active numa 
nodes when we construct G1Allocator. i.e. we create per numa node 
G1AllocRegion at G1Allocator.
And we also need page size at G1MemoryNodeManager after G1CollectedHeap 
is initialized.

I tried to add comment at G1NUMA::initialize().

>
> - g1CollectedHeap.hpp:940: no need to delete the newline.
Reverted the newline.

>
> - g1MemoryNodeManager.cpp:41: that comment does not add information imho
:)
Removed the comment.

>
> - G1NUMA::index_of_current_thread() needs a comment
Added:
// Returns numa index of current calling thread.

Do you have any suggestions?

I was thinking the method name is more than enough to explain itself. :)

>
> - G1NUMA::index_of_num_id/is_valid_numa_id/ should be private
Done.
Actually got same comment from Kim as well during private discussion.

>
> - not sure why G1NUMA::initialize()/set_numa() are needed. Its only 
> call is right after instantiating a G1NUMA instance
Probably you are pointing G1NUMA::initialize() and set_page_size()?
I tried to explain above why we need 2 calls.

>
> - G1NUMA::request_memory_on_node needs a comment.
Added:
   // Request the given range of memory to be located at a specific numa node.
   // But OS doesn't guarantee to reside on the node.
   // The numa node is decided by preferred_index_for_address().

>
> - I observed that a *lot* of G1NUMA methods are only used by 
> G1MemoryNodeManager; and G1MemoryNodeManager just forwards to G1NUMA a 
> lot. Maybe these two can be merged?
Done.
I agree since G1NUMA is getting smaller and smaller.
Since this is relatively large change, I had to revisit all addressed 
comments above. :)

>
> - G1NUMA::preferred_index_for_address/request_memory_on_node: I would 
> prefer if these methods were not hardcoded with HeapRegion metrics as 
> example.
>
> I.e. for preferred_index_for_address(), instead of the address it is 
> probably better to pass it the zero-based index directly, that is used 
> for calculating the node index. I.e. all callers know the HeapRegion's 
> index anyway *and* this would make the method independent of 
> G1CollectedHeap.
>
> I.e. something like preferred_node_index_for_index(<region-index>), 
> because then the same method can be reused for other data structures 
> than the heap/heap region.
Done. Removed all dependency with G1CollectedHeap and HeapRegion at 
G1MemoryNodeManager(previously G1NUMA).
Added 'size_t _region_size' at G1NUMA and then 
G1NUMA::preferred_node_index_for_index(uint heap_region_index).
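
A rough standalone sketch of how such an index-based lookup could work with evenly interleaved regions. This is an assumption about the mapping, not the code in the webrev: the free function, its parameter list, and the handling of pages larger than a region are all illustrative.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed mapping: memory is spread round-robin across the active nodes
// in units of max(region_size, page_size).
inline uint32_t preferred_node_index_for_index(uint32_t region_index,
                                               size_t region_size,
                                               size_t page_size,
                                               uint32_t num_active_nodes) {
  assert(region_size > 0 && page_size > 0 && num_active_nodes > 0);
  if (region_size >= page_size) {
    // One or more pages per region: regions alternate 0,1,2,3,0,1,2,3,...
    return region_index % num_active_nodes;
  }
  // Several regions share one page, so they share that page's node.
  size_t regions_per_page = page_size / region_size;
  return (uint32_t)((region_index / regions_per_page) % num_active_nodes);
}
```

Because the function takes only an index and sizes, it has no dependency on G1CollectedHeap or HeapRegion and could be reused for other data structures.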

> G1NUMA::request_memory_on_node() could also be moved to 
> G1PageBasedVirtualSpace, using the chunk sizes of page based virtual 
> space instead of hardcoding HeapRegion::GrainBytes (i.e. hardcode the 
> method to HeapRegion) - or pass in the "chunk size" calculated there 
> from G1PageBasedVirtualSize.
>
> I think this would increase the generality and usefulness of 
> G1NUMA/G1MemoryNodeManager a lot without "passing in too many node 
> indices everywhere".
>
> - G1PageBasedVirtualSpace: the _numa member seems to be used exactly 
> in one method where performance does not look critical. Maybe it is 
> better to reference it directly there via G1CollectedHeap.
With the above comment of 'G1NUMA::request_memory_on_node() to be moved 
to G1PageBasedVirtualSpace, I tried to change a bit.
_numa is removed from G1PageBasedVS.
The intent of new implementation is to avoid HeapRegion dependency as 
you mentioned.

>
> - heapRegionManager.cpp:verify_actual_node_index(): not sure if that 
> should be debug level.
Are you suggesting 'trace' level?

>
> I would also kind of prefer a method that iterates over all regions 
> and prints a summary status (and potentially drop this per-region 
> checking at least when allocating a new free region). 
Probably I'm missing something but what I tried to mention during 
internal discussion was this. But someone told me adding different log 
level should be sufficient and I agree on that.

I don't have strong opinion verifying at HRM::allocate_free_region() so 
I will remove if there's no other opinion on this. Currently we can 
check node index when the region is being used on specific log level/tag.

Stefan (and probably you as well) mentioned verifying at safepoint, but 
it would be better to address at JDK-8220312 (3/3 which is part of this 
JEP) since I would like to go forward. :)

> It is sufficient to print a summary of expected/actual values, at most 
> a summary per node. I.e.
>
> "NUMA Node index verification: Nodes: X_0/Y_0 X_1/Y_1 ... X_N/Y_N 
> Unknown: Z Total: X/Y"
>
> where X(_i) is the number of matching indexes (for node index i) and 
> Y(_i) the number of expected (for node index i).
I'm okay with adding additional log which is simpler version than 
existing one.
I would say the new one at Debug level while existing one remains Trace 
level.
The benefit of printing current way is that we can see how heap region 
(and page) is consisted. So current jtreg test is utilizing it.

Do you have any recommendation for print timing for the new log?

>
> Also, the correct word to use here is "mismatch" not "different" (to 
> what?)
Changed to 'mismatch'.
The logging was printing both actual value and preferred value, so I'm 
thinking those 2 values are different. :)

>
> - in some discussion we talked about the "node_index" lifecycle, and 
> what I remember is the following:
>
>   - initially, when we commit/make the region available, we set that 
> HeapRegion's node_index to "Unknown" (with AlwaysPretouch on we can of 
> course immediately set the correct one).
>   - in HeapRegion::node_index() we do something like the following 
> pseudo-code:
>
>   {
>     if (_node_index == Unknown) {
>       // try to get actual node index from OS, and update _node_index
>       // if we could get the information
>     }
>
>     if (_node_index == Unknown) { // Still unknown
>       // return _preferred_ node index *without* updating _node_index
>     }
>     return _node_index;
>   }
>
>   - now, during the "verification" pass, we use whether 
> HeapRegion::node_index() == preferred_node_index to determine if the 
> region is on the correct node.
>
> The change only sets the node index during making the region 
> available, and immediately to the preferred node index.
>
> I.e. we eventually end up with the actual node index reported by the 
> OS in HeapRegion::_node_index.
Above is what I tried to implement what we discussed internally. :)

I was aware of the bug that set_heapregion_node_index() is using 
verify_actual_node_index(), but I forgot to fix it before posting the 
webrev. :(

Changed like below:
static void set_heapregion_node_index(HeapRegion* hr) {
  uint node_index;
  if (AlwaysPreTouch) {
    // If we already pretouched, we can check actual node index here.
    node_index = G1MemoryNodeManager::mgr()->index_of_address(hr->bottom());
  } else {
    node_index = G1MemoryNodeManager::mgr()->preferred_node_index_for_index(hr->hrm_index());
  }
  hr->set_node_index(node_index);
}

BTW, setting node index at make_regions_available() is intentional since 
it is the only place HeapRegion is initialized so HeapRegion change is 
limited. Am I missing something?
I would prefer to remain HeapRegion::node_index() simple getter.
Another idea that I didn't choose is let HeapRegion::initialize() do the 
setting node index work and change HeapRegion::set_node_index() to 
clear_node_index(which sets to InvalidIndex).

>
>
> - for the expression "G1MemoryNodeManager::num_active_nodes() > 1" it 
> would be nice to have an extra method in G1MemoryNodeManager instead 
> of repeating it over and over.
Done.
FYI, currently there are 2 locations but following patches may use more.

>
> - heapRegionManager.cpp:print_node_id_of_regions: that method will 
> print a huge amount of lines. Better to print the summary I sketched 
> out above.
This is why I added at 'trace' level and it is okay to me.

>
> - in FreeRegionList::remove_region_with_node_index(), the maximum 
> search depth must take into account how many regions are there per page.
>
> Consider 1GB pages, 32M region size, meaning that we get 32 
> consecutive regions/page.
> Now with a node amount of 2, the maximum search depth will be 6 - 
> which is too low :)
> The intention is probably 3 * MAX(page_size / region size, 1) * 
> numa->num_active_numa_ids().
>
> I think it is useful to put that expression into 
> G1NUMA/G1NodeMemoryManager (or somewhere else appropriate - 
> HeapRegionManager?) to avoid that part having too much info about page 
> size.
Nice catch!
Added G1MemoryNodeManager::max_search_depth() which addresses your comment.
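
For illustration, the expression Thomas suggested can be written as a small standalone function. This is only a sketch of that formula with hypothetical parameters; the actual G1MemoryNodeManager::max_search_depth() in the next webrev may differ.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// 3 * MAX(page_size / region_size, 1) * number of active nodes.
// With large pages, consecutive regions sharing a page sit on the same
// node, so the search has to look past a whole page worth of regions.
inline size_t max_search_depth(size_t page_size,
                               size_t region_size,
                               size_t num_active_nodes) {
  assert(region_size > 0);
  size_t regions_per_page = std::max(page_size / region_size, (size_t)1);
  return 3 * regions_per_page * num_active_nodes;
}
```

For the case quoted above (1GB pages, 32M regions, 2 nodes) this gives 3 * 32 * 2 = 192 instead of 6; with 4K pages it stays at 6.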

>
> - os.hpp: the new Enum values might or might not need some description.
enum will be replaced with static const int as AnyId will be removed.

>
> Btw, there is no regression in performance from the .0/.1 versions of 
> this code in our benchmarks.
Great!
Many thanks for doing benchmark tests, Thomas!

Let me post next webrev after addressing all Stefan and Kim's comments 
as well.

Thanks,
Sangheon


>
> Thanks,
>   Thomas


From sangheon.kim at oracle.com  Tue Oct  8 05:44:02 2019
From: sangheon.kim at oracle.com (sangheon.kim at oracle.com)
Date: Mon, 7 Oct 2019 22:44:02 -0700
Subject: RFR(XL): 8220310: Implementation: NUMA-Aware Memory Allocation
 for G1, Mutator (1/3)
In-Reply-To: <25fe4295-7de8-bbc3-3dd3-6b750d4982f2@oracle.com>
References: <e4c60a5c-cb08-004a-ce77-e4d20d4d6891@oracle.com>
 <06ACBF87-ADBE-499F-B668-0274E4925B26@oracle.com>
 <2b37edd6-3e0f-013d-1616-9d003f8ac1ed@oracle.com>
 <74ACAF31-8233-482A-892E-0D2E7CA72F4F@oracle.com>
 <4afe9f43-4cfa-9384-f45f-f985399629dd@oracle.com>
 <25fe4295-7de8-bbc3-3dd3-6b750d4982f2@oracle.com>
Message-ID: <a789f302-e83f-0788-8bf1-a303287b4e9f@oracle.com>

Hi Stefan,

On 10/4/19 5:23 AM, Stefan Johansson wrote:
> Hi Sangheon,
>
> First of all, thanks for this updated version incorporating a lot of 
> our comments. I think we are getting closer to the goal, but I still 
> have some more comments :)
Thanks for the nice suggestions!

>
> On 2019-10-01 18:43, sangheon.kim at oracle.com wrote:
>> Hi Kim and others,
>>
>> This webrev.2 simplified a bit more after changing 'heap expansion' 
>> approach.
>> Previously heap may expand with preferred numa id which means 
>> contiguous same numa id heap regions may exist but current version is 
>> assuming to have evenly split heap regions. i.e. 4 numa node system, 
>> heap regions will be 012301230123, so if we know address or heap 
>> region index, we can know preferred numa id.
>>
>> Many codes related to support previous style expansion were removed.
>>
>> ...
>>
>> webrev:
>> http://cr.openjdk.java.net/~sangheki/8220310/webrev.2/
>> http://cr.openjdk.java.net/~sangheki/8220310/webrev.2.inc
>
> src/hotspot/share/gc/g1/g1Allocator.cpp
> ---
>   31 #include "gc/g1/g1NUMA.hpp"
> I don't see why this include is needed, but you might want to include 
> gc/g1/g1MemoryNodeManager.hpp instead.
You're right.
Done.

> ---
>
> hotspot/share/gc/g1/g1CollectedHeap.cpp
> ---
> 1518   _mem_node_mgr(G1MemoryNodeManager::create()),
>
> I saw your response to Kim regarding G1Allocator needing it to be 
> initialized and I get that, but have you looked at moving the creation 
> of G1Allocator to initialize() as well, I think its first use is 
> actually below:
> 1802   _mem_node_mgr->set_page_size(page_size);
> here:
> 1851   _allocator->init_mutator_alloc_regions();
>
> I might be missing some other place where it gets called, but I think 
> it should be safe to create both the node manager and the allocator 
> early in initialize().
Yeah, we can consider this as well. But there are some other followup 
enhancements which may affect to this initialization order, so I would 
like to leave as is. And then file a separate CR.
One of the example is separating free list, so HeapRegionManager also 
needs G1MemoryNodeManager instance to initialize free list.

> ---
>
> src/hotspot/share/gc/g1/g1RegionToSpaceMapper.hpp
> ---
> 28 #include "gc/g1/g1MemoryNodeManager.hpp"
>
> Remove this include.
Done.

> ---
>
> src/hotspot/share/gc/g1/g1_globals.hpp
> ---
> 326                range(0, 100)
>
> Remove the backslash and add back the removed line to leave the file 
> unchanged.
Done.

> ---
>
> src/hotspot/share/gc/g1/heapRegionManager.cpp
> ---
>  142   if (hr != NULL) {
>  143     assert(hr->next() == NULL, "Single region should not have next");
>  144     assert(is_available(hr->hrm_index()), "Must be committed");
>  145
>  146     verify_actual_node_index(hr->bottom(), hr->node_index());
>  147   }
>
> I don't think this is a good place to do the verification, we allocate 
> the free region while holding a lock and I think we should avoid doing 
> a system call there. I would rather see this done during a safepoint, 
> having a closure that iterates the heap and verify all regions.
I tried to point out this during the discussion but probably not enough. :(
My understanding of the result is okay as the logs are protected by log 
level+tag. But as replied to Thomas, I will remove the verification at 
HRM::allocate_free_region() if there's no more opinions.

Any opinions? Thomas or Kim?

>
> I also think it would be nice to have two levels of the output, the 
> one line for each region on trace level and on debug we can have a 
> summary, something like:
> NUMA Node 1: expected=25, actual=23
> NUMA Node 2: expected=25, actual=27
>
> What do you (and others) think about that?
Having 2 level log print seems good to me.
And your suggestion is similar to Thomas' one and I would like to 
address it at the later patch #3 (JDK-8220312 which is also part of the JEP)

> ---
>  216 static void print_node_id_of_regions(uint start, uint num_regions){
>  217   LogTarget(Trace, gc, heap, numa) lt;
>
> I understand that it might make the test a bit more complicated, but 
> have you thought about instead adding the node index to the heap 
> printing done when <gc, heap, region> is enabled on trace level?
So you are suggesting the log tag from gc+heap+numa to gc+heap+region?

> ---
>  235 static void set_heapregion_node_index(HeapRegion* hr) {
>
> I don't think we should special case for when AlwaysPreTouch is on and 
> instead always just call hr->set_node_index(preferred_index) directly 
> in make_regions_available. The reason is that I think it will make the 
> NUMA support harder to understand and explain and it can potentially 
> also hide problems with a system's configuration. It might also 
> actually be worse than using the preferred id, because the OS might 
> decide to move the pages back to the preferred node right after we 
> checked this (not sure it will happen, but in theory).
I have a different opinion, sorry.
I do believe that when AlwaysPreTouch is enabled, we should check the actual 
node and then use it, because:
1. If we don't check the actual node id when AlwaysPreTouch is enabled, we 
will lose a chance of improvement if the actual node is different 
from the preferred node. (I know this will not happen frequently, but in 
theory...)
2. I don't think acting differently with AlwaysPreTouch is a problem. I 
think the opposite: it is a good chance to analyze the behavior of the 
VM earlier. Earlier means we are planning to add verification code at a 
safepoint (not yet decided when, so please give me a good suggestion), which 
is later than make_regions_available(). In addition, the default value 
of AlwaysPreTouch is false, so enabling it means the user requested pages to 
be faulted in.
3. We are already assuming we cannot immediately react when the OS migrates 
the memory. So if the OS migrates after checking, we are still consistent 
with that assumption.

>
> Another problem with this code is the call to:
> verify_actual_node_index(hr->bottom(), node_index)
>
> This function will only return the "actual" node index if logging for 
> <gc, heap, numa, verification> is enabled on debug level.
Yes, I'm aware of this problem so planned to fix before posting the 
webrev but completely forgot about it. My bad.
Replaced to work as expected.

> ---
>
>  346 bool HeapRegionManager::is_on_preferred_index(uint region_index, uint preferred_node_index) {
>  347   uint region_node_index = G1MemoryNodeManager::mgr()->preferred_index_for_address(
>  348     G1CollectedHeap::heap()->bottom_addr_for_region(region_index));
>  349   return region_node_index == preferred_node_index ||
>  350          preferred_node_index == G1MemoryNodeManager::AnyNodeIndex;
>
> I guess adding the AnyNodeIndex case here is because in this patch 
> nobody is expanding on a preferred node, right? To me this is just 
> another argument to not do any changes to the expand code in this 
> patch. I know I suggested adding expand_on_preferred_node(), but I 
> should have been clearer about when I think we should add it.
Got it.
Removed AnyNodeIndex.

> ---
>
> src/hotspot/share/gc/g1/g1MemoryNodeManager.hpp
> ---
>   56   // Returns memory node ids
>   57   virtual const int* node_ids() const;
>
> Doesn't seem to be used, remove.
It will be used at patch 3/3, JDK-8220312.

> ---
>
> src/hotspot/share/gc/g1/g1MemoryNodeManager.cpp
> ---
>  67   LINUX_ONLY(if (UseNUMA) {
> ...
>  79     delete numa;
>  80   })
>
> A bit confusing with a multi-line LINUX_ONLY, I would prefer to hide 
> this in a private helper, something like:
>   if (UseNUMA) {
>     LINUX_ONLY(create_numa_manager());
>   }
>
>   if (_inst == NULL) {
>     _inst = new G1MemoryNodeManager();
>   }
>
> Not really happy about this either, but we can look at simplifying the 
> NUMA initialization as a follow up.
Changed as Kim suggested, hope you are okay with this.

#ifdef LINUX


> ---
>
> src/hotspot/share/gc/g1/g1NUMA.hpp
> ---
>   87   // Returns numa id of the given numa index.
>   88   inline int numa_id_of_index(uint numa_index) const;
>
> Currently unused, either remove or make use of it when calling 
> numa_make_local.
Done.

> ---
>   94   // Returns current active numa ids.
>   95   const int* numa_ids() const { return _numa_ids; }
>
> Only used by memory manager above, which in turn is unused, remove.
It will be used at patch 3/3, JDK-8220312.

> ---
>
> src/hotspot/share/gc/g1/g1NUMA.hpp
> ---
>   55 // Request the given memory to locate on preferred node.
>   56 // There are 2 things to consider.
>   57 // First, size comparison for G1HeapRegionSize and page size.
>  ...
>   62 // Examples of 4 numa ids with non-preferred numa id.
>
> What do you think about this instead:
> // Request to spread the given memory evenly across the available NUMA
> // nodes. Which node to request for a given address is given by the
> // region size and the page size. Below are two examples:
>
> I would also like a "NUMA node" row for each example showing which 
> numa node the pages and regions end up on.
Changed / added as you suggested.

Will post the webrev.3 after addressing Kim's comments and tests finished.

Thanks,
Sangheon


> ---
>
> Thanks,
> Stefan
>
>> Testing: hs-tier1 ~ 5 +-UseNUMA
>>
>> Thanks,
>> Sangheon
>>
>>
>>> ------------------------------------------------------------------------------ 
>>>
>>>
>>


From thomas.schatzl at oracle.com  Tue Oct  8 07:45:45 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Tue, 8 Oct 2019 09:45:45 +0200
Subject: RFR: 8231153: Improve concurrent refinement statistics
In-Reply-To: <0280B88E-45D7-46C0-A0E5-2E708B0132ED@oracle.com>
References: <1DADC595-3106-4CE7-BA5D-7B6C7EE0E81E@oracle.com>
 <4a851a19-0979-c696-0c80-1165bd755834@oracle.com>
 <BACFB18B-7CA1-4C58-8597-016D197CDCDF@oracle.com>
 <3bafef2f-1380-3105-5c54-5e8095c42409@oracle.com>
 <0280B88E-45D7-46C0-A0E5-2E708B0132ED@oracle.com>
Message-ID: <a404b808-b0d7-3085-491e-57523eebcf91@oracle.com>

Hi,

On 08.10.19 00:38, Kim Barrett wrote:
>> On Oct 7, 2019, at 8:57 AM, Stefan Johansson <stefan.johansson at oracle.com> wrote:
>>
>> Hi Kim,
>>
>> On 2019-10-02 03:08, Kim Barrett wrote:
>>> New webrevs:
>>> full: https://cr.openjdk.java.net/~kbarrett/8231153/open.01/
>>> incr: https://cr.openjdk.java.net/~kbarrett/8231153/open.01.inc/
>>
>> The changes look good, just one question around the calculation of total time and size.
>>
[...]
>>
>> Did you consider grouping these two functions into one, to avoid iterating the threads twice? Not sure this is a big deal, and it might only make the code more complicated, but it feels a bit unnecessary to do two iterations right after each other.
> 
> Thanks for the suggestion. I tried doing something like that in an
> earlier version of this change, but I didn't like how it turned out.
> But enough code has changed since then that I decided to try again.
> This time seems okay.
> 
> So G1ConcurrentRefine now provides a RefinementStats class that
> packages up the time and card counts, and a new function
> total_refinement_stats() that returns one of those. Also removed
> total_refinement_time() and total_refined_cards(), which are no longer
> used. (If that were to change they are easily reinstated as wrappers
> over total_refinement_stats().)
> 
> New webrevs:
> full: https://cr.openjdk.java.net/~kbarrett/8231153/open.02/
> incr: https://cr.openjdk.java.net/~kbarrett/8231153/open.02.inc/
> 

  still good.

Thomas


From thomas.schatzl at oracle.com  Tue Oct  8 07:50:38 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Tue, 8 Oct 2019 09:50:38 +0200
Subject: RFR (S): 8231956: Remove seq_add_card/reference from PerRegionTable
 class
Message-ID: <5a94fbaa-08f1-5fca-62cc-030709b6ba13@oracle.com>

Hi all,

   can I have reviews for this small change that removes some unused 
methods and performs associated cleanup of unnecessary parameters?

There is one related cleanup that might raise some questions:

   38 inline void PerRegionTable::add_card_work(CardIdx_t from_card, bool par) {
   39   if (!_bm.at(from_card)) {
   40     if (par) {
   41       if (_bm.par_set_bit(from_card)) {
   42         Atomic::inc(&_occupied);

changed to

   38 inline void PerRegionTable::add_card(CardIdx_t from_card_index) {
   39   if (_bm.par_set_bit(from_card_index)) {


The reason for this change is that BitMap::par_set_bit() implicitly 
performs the BitMap::at() check even without doing a cmpxchg, 
duplicating this functionality.
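
To illustrate the point, here is a minimal standalone sketch of a parallel set-bit operation that already contains the "is it set?" check. TinyBitMap and Table are hypothetical simplifications built on std::atomic, not HotSpot's BitMap or PerRegionTable, and only bits 0..63 are supported.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>

// Hypothetical one-word bitmap (valid bit indices: 0..63).
class TinyBitMap {
  std::atomic<uint64_t> _word;
public:
  TinyBitMap() : _word(0) {}

  bool at(size_t bit) const {
    return ((_word.load(std::memory_order_relaxed) >> bit) & 1u) != 0;
  }

  // Returns true only if this call changed the bit from 0 to 1.
  bool par_set_bit(size_t bit) {
    const uint64_t mask = uint64_t(1) << bit;
    uint64_t old_val = _word.load(std::memory_order_relaxed);
    // If the bit is already set we fall through without any cmpxchg,
    // which is the same test a preceding at() call would perform.
    while ((old_val & mask) == 0) {
      // On failure, compare_exchange_weak reloads old_val for the retry.
      if (_word.compare_exchange_weak(old_val, old_val | mask)) {
        return true;
      }
    }
    return false;
  }
};

// Simplified caller in the style of the cleaned-up add_card().
struct Table {
  TinyBitMap _bm;
  std::atomic<size_t> _occupied;
  Table() : _occupied(0) {}

  void add_card(size_t from_card_index) {
    if (_bm.par_set_bit(from_card_index)) {
      _occupied.fetch_add(1);
    }
  }
};
```

Adding the same card twice increments the occupancy counter only once, without an explicit at() check in the caller.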

CR:
https://bugs.openjdk.java.net/browse/JDK-8231956
Webrev:
http://cr.openjdk.java.net/~tschatzl/8231956/webrev/
Testing:
hs-tier1-5

Thanks,
   Thomas


From stefan.johansson at oracle.com  Tue Oct  8 08:23:19 2019
From: stefan.johansson at oracle.com (Stefan Johansson)
Date: Tue, 8 Oct 2019 10:23:19 +0200
Subject: RFR: 8231153: Improve concurrent refinement statistics
In-Reply-To: <0280B88E-45D7-46C0-A0E5-2E708B0132ED@oracle.com>
References: <1DADC595-3106-4CE7-BA5D-7B6C7EE0E81E@oracle.com>
 <4a851a19-0979-c696-0c80-1165bd755834@oracle.com>
 <BACFB18B-7CA1-4C58-8597-016D197CDCDF@oracle.com>
 <3bafef2f-1380-3105-5c54-5e8095c42409@oracle.com>
 <0280B88E-45D7-46C0-A0E5-2E708B0132ED@oracle.com>
Message-ID: <1c77cb41-2a70-c9d6-a600-2a87df24ae9c@oracle.com>


On 2019-10-08 00:38, Kim Barrett wrote:
>> On Oct 7, 2019, at 8:57 AM, Stefan Johansson <stefan.johansson at oracle.com> wrote:
>>
>> Hi Kim,
>>
>> On 2019-10-02 03:08, Kim Barrett wrote:
>>> New webrevs:
>>> full: https://cr.openjdk.java.net/~kbarrett/8231153/open.01/
>>> incr: https://cr.openjdk.java.net/~kbarrett/8231153/open.01.inc/
>>
>> The changes look good, just one question around the calculation of total time and size.
>>
>> src/hotspot/share/gc/g1/g1ConcurrentRefine.cpp
>> ---
>> 415 Tickspan G1ConcurrentRefine::total_refinement_time() const {
>> ...
>> 425   const_cast<G1ConcurrentRefine*>(this)->threads_do(&closure);
>> 426   return closure._total_time;
>> 427 }
>> 428
>> 429 size_t G1ConcurrentRefine::total_refined_cards() const {
>> ...
>> 439   const_cast<G1ConcurrentRefine*>(this)->threads_do(&closure);
>> 440   return closure._total_cards;
>> 441 }
>>
>> Did you consider grouping these two functions into one, to avoid iterating the threads twice? Not sure this is a big deal, and it might only make the code more complicated, but it feels a bit unnecessary to do two iterations right after each other.
> 
> Thanks for the suggestion. I tried doing something like that in an
> earlier version of this change, but I didn't like how it turned out.
> But enough code has changed since then that I decided to try again.
> This time seems okay.
> 
> So G1ConcurrentRefine now provides a RefinementStats class that
> packages up the time and card counts, and a new function
> total_refinement_stats() that returns one of those. Also removed
> total_refinement_time() and total_refined_cards(), which are no longer
> used. (If that were to change they are easily reinstated as wrappers
> over total_refinement_stats().)
> 
> New webrevs:
> full: https://cr.openjdk.java.net/~kbarrett/8231153/open.02/
> incr: https://cr.openjdk.java.net/~kbarrett/8231153/open.02.inc/
>
Thanks for trying it out a second time, this is more or less exactly 
what I had in mind.
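
For readers following along, a minimal sketch of the single-pass idea,
assuming per-thread counters held in a plain vector. ThreadCounters and the
nanosecond fields are hypothetical stand-ins for this example; the actual
change iterates the refinement threads with a closure and uses Tickspan.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical per-thread counters; the real threads keep their own state.
struct ThreadCounters {
  int64_t refinement_time_ns;
  size_t refined_cards;
};

// Packages both totals so that one iteration produces both values.
struct RefinementStats {
  int64_t total_time_ns = 0;
  size_t total_cards = 0;
};

RefinementStats total_refinement_stats(const std::vector<ThreadCounters>& threads) {
  RefinementStats stats;
  for (const ThreadCounters& t : threads) {
    assert(t.refinement_time_ns >= 0);  // Durations are never negative.
    stats.total_time_ns += t.refinement_time_ns;
    stats.total_cards += t.refined_cards;
  }
  return stats;
}
```

If the separate totals are ever needed again, they can be thin wrappers
that call total_refinement_stats() and return one field.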

Looks good,
Stefan

> Testing:
> mach5 tier1
> 
> 


From thomas.schatzl at oracle.com  Tue Oct  8 08:39:04 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Tue, 8 Oct 2019 10:39:04 +0200
Subject: RFR: 8231489: GC watermark_0_1 failed due to "metaspace.gc.Fault:
 GC has happened too rare"
In-Reply-To: <2aa33fe4-5d04-cd58-f786-d7a12b977ccf@oracle.com>
References: <2aa33fe4-5d04-cd58-f786-d7a12b977ccf@oracle.com>
Message-ID: <bdc9603c-d441-fbcd-91ca-6060898f0300@oracle.com>

Hi,

On 03.10.19 10:47, Per Liden wrote:
> vmTestbase/metaspace/gc/HighWaterMarkTest relies on timing and fails 
> when "Metaspace GC Threshold" isn't handled in a STW pause.
> 
> The problem can be reproduced on both G1 and ZGC, but it's hard, as the 
> window is small. However, it reproduces every time when injecting a 
> 100ms delay to prolong the GC cycle a bit. This test used to be disabled 
> for G1 with ClassUnloadingWithConcurrentMark, but JDK-8204163 enabled it 
> about a year ago.
> 
> Fixing the test properly is tricky. As far as I can see, we can either:
> 1) Disable this test for G1+ClassUnloadingWithConcurrentMark and ZGC, or
> 2) Add a sleep in the test loop, to make the race less likely to happen, or
> 3) Remove the test completely, with the rationale that it's a buggy, 
> low-value test.
> 
> I've gone with 1) here. The test is already disabled for CMS today, with 
> code in the test itself (i.e. not using @requires), so I did two 
> alternative patches:
> 
> A) Follows the existing style to disable the other GCs:
> http://cr.openjdk.java.net/~pliden/8231489/webrev.0-alt1
> 
> B) Adds @requires to the tests using the HighWaterMarkTest class, and 
> removes the old check to disable CMS:
> http://cr.openjdk.java.net/~pliden/8231489/webrev.0-alt2
> 
> I prefer B, but I don't have a strong opinion on which way to go.
> 

B is fine with me.

Looks good.

Thomas


From per.liden at oracle.com  Tue Oct  8 09:12:09 2019
From: per.liden at oracle.com (Per Liden)
Date: Tue, 8 Oct 2019 11:12:09 +0200
Subject: RFR: 8231489: GC watermark_0_1 failed due to "metaspace.gc.Fault:
 GC has happened too rare"
In-Reply-To: <bdc9603c-d441-fbcd-91ca-6060898f0300@oracle.com>
References: <2aa33fe4-5d04-cd58-f786-d7a12b977ccf@oracle.com>
 <bdc9603c-d441-fbcd-91ca-6060898f0300@oracle.com>
Message-ID: <995df990-30ce-b2eb-e8c1-9199c8a6806d@oracle.com>

Thanks for reviewing, Thomas!

/Per

On 10/8/19 10:39 AM, Thomas Schatzl wrote:
> Hi,
> 
> On 03.10.19 10:47, Per Liden wrote:
>> vmTestbase/metaspace/gc/HighWaterMarkTest relies on timing and fails 
>> when "Metaspace GC Threshold" isn't handled in a STW pause.
>>
>> The problem can be reproduced on both G1 and ZGC, but it's hard, as 
>> the window is small. However, it reproduces every time when injecting 
>> a 100ms delay to prolong the GC cycle a bit. This test used to be 
>> disabled for G1 with ClassUnloadingWithConcurrentMark, but JDK-8204163 
>> enabled it about a year ago.
>>
>> Fixing the test properly is tricky. As far as I can see, we can either:
>> 1) Disable this test for G1+ClassUnloadingWithConcurrentMark and ZGC, or
>> 2) Add a sleep in the test loop, to make the race less likely to 
>> happen, or
>> 3) Remove the test completely, with the rationale that it's a buggy, 
>> low-value test.
>>
>> I've gone with 1) here. The test is already disabled for CMS today, 
>> with code in the test itself (i.e. not using @requires), so I did two 
>> alternative patches:
>>
>> A) Follows the existing style to disable the other GCs:
>> http://cr.openjdk.java.net/~pliden/8231489/webrev.0-alt1
>>
>> B) Adds @requires to the tests using the HighWaterMarkTest class, and 
>> removes the old check to disable CMS:
>> http://cr.openjdk.java.net/~pliden/8231489/webrev.0-alt2
>>
>> I prefer B, but I don't have a strong opinion on which way to go.
>>
> 
> B is fine with me.
> 
> Looks good.
> 
> Thomas
> 


From per.liden at oracle.com  Tue Oct  8 09:15:45 2019
From: per.liden at oracle.com (Per Liden)
Date: Tue, 8 Oct 2019 11:15:45 +0200
Subject: RFR (S): 8231956: Remove seq_add_card/reference from
 PerRegionTable class
In-Reply-To: <5a94fbaa-08f1-5fca-62cc-030709b6ba13@oracle.com>
References: <5a94fbaa-08f1-5fca-62cc-030709b6ba13@oracle.com>
Message-ID: <785ee316-a389-2218-0e4c-d53db0120088@oracle.com>

Looks good!

/Per

On 10/8/19 9:50 AM, Thomas Schatzl wrote:
> Hi all,
> 
>    can I have reviews for this small change that removes some unused 
> methods and performs associated cleanup of unnecessary parameters?
> 
> There is one related cleanup that might raise some questions:
> 
>    38 inline void PerRegionTable::add_card_work(CardIdx_t from_card, 
> bool par) {
>    39   if (!_bm.at(from_card)) {
>    40     if (par) {
>    41       if (_bm.par_set_bit(from_card)) {
>    42         Atomic::inc(&_occupied);
> 
> changed to
> 
>    38 inline void PerRegionTable::add_card(CardIdx_t from_card_index) {
>    39   if (_bm.par_set_bit(from_card_index)) {
> 
> 
> The reason for this change is that BitMap::par_set_bit() implicitly 
> performs the BitMap::at() check even without doing a cmpxchg, 
> duplicating this functionality.
> 
> CR:
> https://bugs.openjdk.java.net/browse/JDK-8231956
> Webrev:
> http://cr.openjdk.java.net/~tschatzl/8231956/webrev/
> Testing:
> hs-tier1-5
> 
> Thanks,
>    Thomas


From stefan.johansson at oracle.com  Tue Oct  8 09:25:52 2019
From: stefan.johansson at oracle.com (Stefan Johansson)
Date: Tue, 8 Oct 2019 11:25:52 +0200
Subject: RFR(XL): 8220310: Implementation: NUMA-Aware Memory Allocation
 for G1, Mutator (1/3)
In-Reply-To: <a789f302-e83f-0788-8bf1-a303287b4e9f@oracle.com>
References: <e4c60a5c-cb08-004a-ce77-e4d20d4d6891@oracle.com>
 <06ACBF87-ADBE-499F-B668-0274E4925B26@oracle.com>
 <2b37edd6-3e0f-013d-1616-9d003f8ac1ed@oracle.com>
 <74ACAF31-8233-482A-892E-0D2E7CA72F4F@oracle.com>
 <4afe9f43-4cfa-9384-f45f-f985399629dd@oracle.com>
 <25fe4295-7de8-bbc3-3dd3-6b750d4982f2@oracle.com>
 <a789f302-e83f-0788-8bf1-a303287b4e9f@oracle.com>
Message-ID: <aade2978-cfcc-0763-88c4-8f73a9195010@oracle.com>

Hi Sangheon,

Thanks for addressing all our comments. Just some quick replies below.

On 2019-10-08 07:44, sangheon.kim at oracle.com wrote:
> Hi Stefan,
> 
> On 10/4/19 5:23 AM, Stefan Johansson wrote:
>> Hi Sangheon,
>>
>> First of all, thanks for this updated version incorporating a lot of 
>> our comments. I think we are getting closer to the goal, but I still 
>> have some more comments :)
> Thanks for the nice suggestions!
> 
>>
>> On 2019-10-01 18:43, sangheon.kim at oracle.com wrote:
>>> Hi Kim and others,
>>>
>>> This webrev.2 is simplified a bit more after changing the 'heap 
>>> expansion' approach.
>>> Previously the heap could expand with a preferred numa id, which 
>>> meant that contiguous heap regions with the same numa id could exist. 
>>> The current version assumes evenly split heap regions, i.e. on a 
>>> 4 numa node system the heap regions will be 012301230123, so if we 
>>> know the address or heap region index, we know the preferred numa id.
>>>
>>> Much of the code supporting the previous style of expansion was removed.
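
The 012301230123 layout described above can be sketched as follows. This is
a minimal illustration that assumes regions are interleaved one region at a
time and ignores the case where the page size is larger than the region
size. InterleavedHeap and its fields are hypothetical names for this example
and are not taken from the webrev.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical description of an evenly interleaved heap.
struct InterleavedHeap {
  uintptr_t base;       // Address of the first heap region.
  size_t region_size;   // Size of one heap region in bytes.
  unsigned num_nodes;   // Number of active NUMA nodes.

  // Region i prefers node i % num_nodes, giving 0123 0123 0123 ...
  unsigned preferred_index_for_region(size_t region_index) const {
    return static_cast<unsigned>(region_index % num_nodes);
  }

  // An address maps to its region first, then to that region's node.
  unsigned preferred_index_for_address(uintptr_t addr) const {
    assert(addr >= base);
    return preferred_index_for_region((addr - base) / region_size);
  }
};
```

With this layout no per-region bookkeeping is needed to answer the
"which node is preferred" question; the index or address is enough.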
>>>
>>> ...
>>>
>>> webrev:
>>> http://cr.openjdk.java.net/~sangheki/8220310/webrev.2/
>>> http://cr.openjdk.java.net/~sangheki/8220310/webrev.2.inc
>>
>> src/hotspot/share/gc/g1/g1Allocator.cpp
>> ---
>>   31 #include "gc/g1/g1NUMA.hpp"
>> I don't see why this include is needed, but you might want to include 
>> gc/g1/g1MemoryNodeManager.hpp instead.
> You're right.
> Done.
> 
>> ---
>>
>> hotspot/share/gc/g1/g1CollectedHeap.cpp
>> ---
>> 1518   _mem_node_mgr(G1MemoryNodeManager::create()),
>>
>> I saw your response to Kim regarding G1Allocator needing it to be 
>> initialized and I get that, but have you looked at moving the creation 
>> of G1Allocator to initialize() as well? I think its first use is 
>> actually below:
>> 1802   _mem_node_mgr->set_page_size(page_size);
>> here:
>> 1851   _allocator->init_mutator_alloc_regions();
>>
>> I might be missing some other place where it gets called, but I think 
>> it should be safe to create both the node manager and the allocator 
>> early in initialize().
> Yeah, we can consider this as well. But there are some other follow-up 
> enhancements which may affect this initialization order, so I would 
> like to leave it as is and file a separate CR.
> One example is separating the free list, so HeapRegionManager also 
> needs a G1MemoryNodeManager instance to initialize the free list.
> 
>> ---
>>
>> src/hotspot/share/gc/g1/g1RegionToSpaceMapper.hpp
>> ---
>> 28 #include "gc/g1/g1MemoryNodeManager.hpp"
>>
>> Remove this include.
> Done.
> 
>> ---
>>
>> src/hotspot/share/gc/g1/g1_globals.hpp
>> ---
>> 326                range(0, 100)
>>
>> Remove the backslash and add back the removed line to leave the file 
>> unchanged.
> Done.
> 
>> ---
>>
>> src/hotspot/share/gc/g1/heapRegionManager.cpp
>> ---
>>  142   if (hr != NULL) {
>>  143     assert(hr->next() == NULL, "Single region should not have next");
>>  144     assert(is_available(hr->hrm_index()), "Must be committed");
>>  145
>>  146     verify_actual_node_index(hr->bottom(), hr->node_index());
>>  147   }
>>
>> I don't think this is a good place to do the verification, we allocate 
>> the free region while holding a lock and I think we should avoid doing 
>> a system call there. I would rather see this done during a safepoint, 
>> having a closure that iterates the heap and verify all regions.
> I tried to point this out during the discussion, but probably not 
> clearly enough. :(
> My understanding of the outcome was that this is okay, as the logs are 
> protected by log level+tag. But as I replied to Thomas, I will remove the 
> verification at HRM::allocate_free_region() if there are no more opinions.
> 
> Any opinions? Thomas or Kim?
> 
>>
>> I also think it would be nice to have two levels of the output, the 
>> one line for each region on trace level and on debug we can have a 
>> summary, something like:
>> NUMA Node 1: expected=25, actual=23
>> NUMA Node 2: expected=25, actual=27
>>
>> What do you (and others) think about that?
> Having two levels of log output seems good to me.
> Your suggestion is similar to Thomas' one, and I would like to 
> address it in the later patch #3 (JDK-8220312, which is also part of the 
> JEP).
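
A minimal sketch of how the debug-level summary proposed above could be
computed, assuming the expected and actual node of each region are already
known. RegionPlacement, NodeSummary, summarize() and print_summary() are
hypothetical names for this example; the real code would go through the
unified logging framework instead of printf, and nodes are numbered from 0
here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <vector>

// Hypothetical record of where a region should be and where it really is.
struct RegionPlacement {
  unsigned expected_node;
  unsigned actual_node;
};

struct NodeSummary {
  unsigned expected = 0;
  unsigned actual = 0;
};

// Counts, per node, how many regions were expected there and how many are.
std::vector<NodeSummary> summarize(const std::vector<RegionPlacement>& regions,
                                   unsigned num_nodes) {
  std::vector<NodeSummary> summary(num_nodes);
  for (const RegionPlacement& r : regions) {
    assert(r.expected_node < num_nodes && r.actual_node < num_nodes);
    summary[r.expected_node].expected++;
    summary[r.actual_node].actual++;
  }
  return summary;
}

// One line per node, in the format suggested in the review.
void print_summary(const std::vector<NodeSummary>& summary) {
  for (size_t i = 0; i < summary.size(); ++i) {
    std::printf("NUMA Node %zu: expected=%u, actual=%u\n",
                i, summary[i].expected, summary[i].actual);
  }
}
```

The per-region lines would stay on trace level, while this summary is
short enough to print on debug level.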
> 
>> ---
>>  216 static void print_node_id_of_regions(uint start, uint num_regions){
>>  217   LogTarget(Trace, gc, heap, numa) lt;
>>
>> I understand that it might make the test a bit more complicated, but 
>> have you thought about instead adding the node index to the heap 
>> printing done when <gc, heap, region> is enabled on trace level?
> So you are suggesting changing the log tag from gc+heap+numa to gc+heap+region?
No, my suggestion is to add it to HeapRegion::print_on(outputStream* 
st), if NUMA is enabled. Adding a new column for the NUMA node id could be 
nice to have, and not only for testing. This would require the test to change 
a bit and possibly even add a WhiteBox method that prints all region 
information. This would be nice since it both gives useful output in the 
region printing and lets you control when it is printed from the test. 
But it would make the parsing of the information a little bit harder.

> 
>> ---
>>  235 static void set_heapregion_node_index(HeapRegion* hr) {
>>
>> I don't think we should special case for when AlwaysPreTouch is on and 
>> instead always just call hr->set_node_index(preferred_index) directly 
>> in make_regions_available. The reason is that I think it will make the 
>> NUMA support harder to understand and explain and it can potentially 
>> also hide problems with a system's configuration. It might also 
>> actually be worse than using the preferred id, because the OS might 
>> decide to move the pages back to the preferred node right after we 
>> checked this (not sure it will happen, but in theory).
> I have a different opinion, sorry.
> I do believe that when AlwaysPreTouch is enabled, we should check the 
> actual node and then use it, because:
> 1. If we don't check the actual node id when AlwaysPreTouch is enabled, we 
> will lose a chance of an improvement if the actual node is different 
> from the preferred node. (I know this will not happen frequently, but in 
> theory...)
> 2. I don't think acting differently with AlwaysPreTouch is a problem. On 
> the contrary, I think it is a good chance to analyze the behavior of 
> the VM earlier. Earlier means that we are planning to add verification 
> code at a safepoint (not yet decided when, so please give me a good 
> suggestion), which is later than make_regions_available(). In addition, 
> the default value of AlwaysPreTouch is false, so enabling it means the 
> user requested pages to be faulted in.
> 3. We are already assuming we cannot immediately react when the OS 
> migrates the memory. So if the OS migrates after checking, we are still 
> consistent with that assumption.

It's ok that we have different opinions, and I'm fine with this if 
everybody else agrees on it.

> 
>>
>> Another problem with this code is the call to:
>> verify_actual_node_index(hr->bottom(), node_index)
>>
>> This function will only return the "actual" node index if logging for 
>> <gc, heap, numa, verification> is enabled on debug level.
> Yes, I'm aware of this problem so planned to fix before posting the 
> webrev but completely forgot about it. My bad.
> Replaced to work as expected.
> 
>> ---
>>
>>  346 bool HeapRegionManager::is_on_preferred_index(uint region_index, 
>> uint preferred_node_index) {
>>  347   uint region_node_index = 
>> G1MemoryNodeManager::mgr()->preferred_index_for_address(
>>  348 G1CollectedHeap::heap()->bottom_addr_for_region(region_index));
>>  349   return region_node_index == preferred_node_index ||
>>  350          preferred_node_index == G1MemoryNodeManager::AnyNodeIndex;
>>
>> I guess adding the AnyNodeIndex case here is because in this patch 
>> nobody is expanding on a preferred node, right? To me this is just 
>> another argument to not do any changes to the expand code in this 
>> patch. I know I suggested adding expand_on_preferred_node(), but I 
>> should have been clearer about when I think we should add it.
> Got it.
> Removed AnyNodeIndex.
> 
>> ---
>>
>> src/hotspot/share/gc/g1/g1MemoryNodeManager.hpp
>> ---
>>   56   // Returns memory node ids
>>   57   virtual const int* node_ids() const;
>>
>> Doesn't seem to be used, remove.
> It will be used at patch 3/3, JDK-8220312.
> 
>> ---
>>
>> src/hotspot/share/gc/g1/g1MemoryNodeManager.cpp
>> ---
>>  67   LINUX_ONLY(if (UseNUMA) {
>> ...
>>  79     delete numa;
>>  80   })
>>
>> A bit confusing with a multi-line LINUX_ONLY, I would prefer to hide 
>> this in a private helper, something like:
>>   if (UseNUMA) {
>>     LINUX_ONLY(create_numa_manager());
>>   }
>>
>>   if (_inst == NULL) {
>>     _inst = new G1MemoryNodeManager();
>>   }
>>
>> Not really happy about this either, but we can look at simplifying the 
>> NUMA initialization as a follow up.
> Changed as Kim suggested, hope you are okay with this.

Yes, using an #ifdef should be good enough.
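
As a rough illustration of the #ifdef shape, here is a standalone sketch.
It assumes the compiler-defined __linux__ macro in place of HotSpot's LINUX
build macro, and NodeManager, NumaNodeManager and create_node_manager() are
hypothetical stand-ins, not the classes in the webrev.

```cpp
#include <cassert>
#include <memory>

// Hypothetical default manager used when NUMA support is off or unavailable.
struct NodeManager {
  virtual ~NodeManager() = default;
  virtual bool is_numa_aware() const { return false; }
};

#ifdef __linux__
// Hypothetical NUMA-aware manager, only compiled on Linux.
struct NumaNodeManager : NodeManager {
  bool is_numa_aware() const override { return true; }
};
#endif

// Platform-specific creation is kept in one plainly guarded block instead
// of a multi-line macro argument.
std::unique_ptr<NodeManager> create_node_manager(bool use_numa) {
#ifdef __linux__
  if (use_numa) {
    return std::unique_ptr<NodeManager>(new NumaNodeManager());
  }
#else
  (void)use_numa;  // No NUMA-aware implementation on this platform.
#endif
  return std::unique_ptr<NodeManager>(new NodeManager());
}
```

The fallback path is shared by all platforms, so callers never have to
check for a missing manager.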

> 
> #ifdef LINUX
> 
> 
>> ---
>>
>> src/hotspot/share/gc/g1/g1NUMA.hpp
>> ---
>>   87   // Returns numa id of the given numa index.
>>   88   inline int numa_id_of_index(uint numa_index) const;
>>
>> Currently unused, either remove or make use of it when calling 
>> numa_make_local.
> Done.
> 
>> ---
>>   94   // Returns current active numa ids.
>>   95   const int* numa_ids() const { return _numa_ids; }
>>
>> Only used by memory manager above, which in turn is unused, remove.
> It will be used at patch 3/3, JDK-8220312.
> 
>> ---
>>
>> src/hotspot/share/gc/g1/g1NUMA.hpp
>> ---
>>   55 // Request the given memory to locate on preferred node.
>>   56 // There are 2 things to consider.
>>   57 // First, size comparison for G1HeapRegionSize and page size.
>>  ...
>>   62 // Examples of 4 numa ids with non-preferred numa id.
>>
>> What do you think about this instead:
>> // Request to spread the given memory evenly across the available NUMA
>> // nodes. Which node to request for a given address is given by the
>> // region size and the page size. Below are two examples:
>>
>> I would also like a "NUMA node" row for each example showing which 
>> numa node the pages and regions end up on.
> Changed / added as you suggested.
> 
> Will post webrev.3 after addressing Kim's comments and once the tests have finished.
> 
> Thanks,
> Sangheon
> 
> 
>> ---
>>
>> Thanks,
>> Stefan
>>
>>> Testing: hs-tier1 ~ 5 +-UseNUMA
>>>
>>> Thanks,
>>> Sangheon
>>>
>>>
>>>> ------------------------------------------------------------------------------ 
>>>>
>>>>
>>>
> 


From stefan.johansson at oracle.com  Tue Oct  8 09:28:22 2019
From: stefan.johansson at oracle.com (Stefan Johansson)
Date: Tue, 8 Oct 2019 11:28:22 +0200
Subject: RFR: 8231489: GC watermark_0_1 failed due to "metaspace.gc.Fault:
 GC has happened too rare"
In-Reply-To: <bdc9603c-d441-fbcd-91ca-6060898f0300@oracle.com>
References: <2aa33fe4-5d04-cd58-f786-d7a12b977ccf@oracle.com>
 <bdc9603c-d441-fbcd-91ca-6060898f0300@oracle.com>
Message-ID: <a2e92569-720e-c0f3-8325-bf98ec2fa92e@oracle.com>


On 2019-10-08 10:39, Thomas Schatzl wrote:
> Hi,
> 
> On 03.10.19 10:47, Per Liden wrote:
>> vmTestbase/metaspace/gc/HighWaterMarkTest relies on timing and fails 
>> when "Metaspace GC Threshold" isn't handled in a STW pause.
>>
>> The problem can be reproduced on both G1 and ZGC, but it's hard, as 
>> the window is small. However, it reproduces every time when injecting 
>> a 100ms delay to prolong the GC cycle a bit. This test used to be 
>> disabled for G1 with ClassUnloadingWithConcurrentMark, but JDK-8204163 
>> enabled it about a year ago.
>>
>> Fixing the test properly is tricky. As far as I can see, we can either:
>> 1) Disable this test for G1+ClassUnloadingWithConcurrentMark and ZGC, or
>> 2) Add a sleep in the test loop, to make the race less likely to 
>> happen, or
>> 3) Remove the test completely, with the rationale that it's a buggy, 
>> low-value test.
>>
>> I've gone with 1) here. The test is already disabled for CMS today, 
>> with code in the test itself (i.e. not using @requires), so I did two 
>> alternative patches:
>>
>> A) Follows the existing style to disable the other GCs:
>> http://cr.openjdk.java.net/~pliden/8231489/webrev.0-alt1
>>
>> B) Adds @requires to the tests using the HighWaterMarkTest class, and 
>> removes the old check to disable CMS:
>> http://cr.openjdk.java.net/~pliden/8231489/webrev.0-alt2
>>
>> I prefer B, but I don't have a strong opinion on which way to go.
>>
> 
> B is fine with me.
Same here, I think it is good to use @requires even if they are a bit 
complicated in this case.

Looks good,
Stefan

> 
> Looks good.
> 
> Thomas
> 


From per.liden at oracle.com  Tue Oct  8 09:49:38 2019
From: per.liden at oracle.com (Per Liden)
Date: Tue, 8 Oct 2019 11:49:38 +0200
Subject: RFR: 8231489: GC watermark_0_1 failed due to "metaspace.gc.Fault:
 GC has happened too rare"
In-Reply-To: <a2e92569-720e-c0f3-8325-bf98ec2fa92e@oracle.com>
References: <2aa33fe4-5d04-cd58-f786-d7a12b977ccf@oracle.com>
 <bdc9603c-d441-fbcd-91ca-6060898f0300@oracle.com>
 <a2e92569-720e-c0f3-8325-bf98ec2fa92e@oracle.com>
Message-ID: <c59fe2f5-4679-3617-b740-e0cee7ca1a57@oracle.com>

Thanks Stefan!

/Per

On 10/8/19 11:28 AM, Stefan Johansson wrote:
> 
> 
> On 2019-10-08 10:39, Thomas Schatzl wrote:
>> Hi,
>>
>> On 03.10.19 10:47, Per Liden wrote:
>>> vmTestbase/metaspace/gc/HighWaterMarkTest relies on timing and fails 
>>> when "Metaspace GC Threshold" isn't handled in a STW pause.
>>>
>>> The problem can be reproduced on both G1 and ZGC, but it's hard, as 
>>> the window is small. However, it reproduces every time when injecting 
>>> a 100ms delay to prolong the GC cycle a bit. This test used to be 
>>> disabled for G1 with ClassUnloadingWithConcurrentMark, but 
>>> JDK-8204163 enabled it about a year ago.
>>>
>>> Fixing the test properly is tricky. As far as I can see, we can either:
>>> 1) Disable this test for G1+ClassUnloadingWithConcurrentMark and ZGC, or
>>> 2) Add a sleep in the test loop, to make the race less likely to 
>>> happen, or
>>> 3) Remove the test completely, with the rationale that it's a buggy, 
>>> low-value test.
>>>
>>> I've gone with 1) here. The test is already disabled for CMS today, 
>>> with code in the test itself (i.e. not using @requires), so I did two 
>>> alternative patches:
>>>
>>> A) Follows the existing style to disable the other GCs:
>>> http://cr.openjdk.java.net/~pliden/8231489/webrev.0-alt1
>>>
>>> B) Adds @requires to the tests using the HighWaterMarkTest class, and 
>>> removes the old check to disable CMS:
>>> http://cr.openjdk.java.net/~pliden/8231489/webrev.0-alt2
>>>
>>> I prefer B, but I don't have a strong opinion on which way to go.
>>>
>>
>> B is fine with me.
> Same here, I think it is good to use @requires even if they are a bit 
> complicated in this case.
> 
> Looks good,
> Stefan
> 
>>
>> Looks good.
>>
>> Thomas
>>


From thomas.schatzl at oracle.com  Tue Oct  8 09:54:56 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Tue, 8 Oct 2019 11:54:56 +0200
Subject: RFR (S): 8231956: Remove seq_add_card/reference from
 PerRegionTable class
In-Reply-To: <785ee316-a389-2218-0e4c-d53db0120088@oracle.com>
References: <5a94fbaa-08f1-5fca-62cc-030709b6ba13@oracle.com>
 <785ee316-a389-2218-0e4c-d53db0120088@oracle.com>
Message-ID: <1d1f3e1b-c886-155c-75f4-db33edf5b44b@oracle.com>

Hi Per,

On 08.10.19 11:15, Per Liden wrote:
> Looks good!
> 
> /Per

   thanks for your review.

Thomas


From stefan.johansson at oracle.com  Tue Oct  8 10:49:27 2019
From: stefan.johansson at oracle.com (Stefan Johansson)
Date: Tue, 8 Oct 2019 12:49:27 +0200
Subject: [PATCH] Exploit Empty Regions in Young Gen to Enhance PS Full GC
 Performance
In-Reply-To: <CAKSDcxvHwOWtAEm8TPPvO=C8q9LRwJRUQX6nbh-HSZXBKKwvYg@mail.gmail.com>
References: <CAKSDcxsPcvigLaDTRyJALk9O0r-JsYV1CcwFzb3KaM98+LVLcg@mail.gmail.com>
 <B723074C-94DF-450D-9715-497736E9CD27@oracle.com>
 <CAKSDcxvTzfq1eR0DXi8iLQ4bNP8LcjsV5Px59-EMOgR0zaHvtQ@mail.gmail.com>
 <4F02DD53-EA98-4A1A-B871-C6E9D9610B2C@oracle.com>
 <9B69AFD1-7AE2-4B50-BFCF-C9C6E2594240@oracle.com>
 <CAKSDcxs131XdbHHErgbH1UDYX6_+=CSa-4dt4n5LgbyPVgjO_w@mail.gmail.com>
 <E4274448-26B2-46B8-883C-70FC7AFFB23B@oracle.com>
 <CAKSDcxvHwOWtAEm8TPPvO=C8q9LRwJRUQX6nbh-HSZXBKKwvYg@mail.gmail.com>
Message-ID: <fb385d33-c420-4c81-320c-9aa1dad64a44@oracle.com>

Hi Haoyu,

I've done some more testing and I haven't seen any issues with the patch 
so far and the performance looks promising in most cases. For simple 
tests I've seen some regressions, but I'm not really sure why. Will do 
some more digging.

To move forward with this, the first thing we need to do is make sure 
that your being covered by the Oracle Contributor Agreement is enough. 
As far as we can see, it is only you as an individual that has signed 
the OCA, and in that case it is important that this statement from the 
OCA is fulfilled: "no other person or entity, including my employer, has 
or will have rights with respect to my contributions"

Is this the case for this contribution or should we have the university 
sign the OCA as well? For more information regarding the OCA please 
refer to:
https://www.oracle.com/technetwork/oca-faq-405384.pdf

Thanks,
Stefan

On 2019-09-16 16:02, Haoyu Li wrote:
> FYI, the evaluation results on OpenJDK 14 are plotted in the attachment. 
> I compute the full GC throughput by dividing the heap size before full 
> GC by the GC pause time, and the results are arithmetic mean values of 
> ten runs after a warm-up run. The evaluation is conducted on a machine 
> with dual Intel Xeon E5-2618L v3 CPUs (2 sockets, 16 physical cores 
> with SMT enabled) and 64G DRAM.
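
The throughput metric described above can be written down as a small
sketch. It assumes one full GC sample per run and that the mean is taken
over the per-run throughputs (the mail does not say whether it is that or a
ratio of means). FullGcSample and mean_full_gc_throughput() are hypothetical
names for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical measurement of one run: heap size before the full GC and
// the length of the full GC pause.
struct FullGcSample {
  double heap_before_mb;
  double pause_seconds;
};

// Arithmetic mean of heap_before / pause over all runs after the warm-up.
double mean_full_gc_throughput(const std::vector<FullGcSample>& runs,
                               size_t warmup_runs) {
  assert(runs.size() > warmup_runs);
  double sum = 0.0;
  for (size_t i = warmup_runs; i < runs.size(); ++i) {
    assert(runs[i].pause_seconds > 0.0);
    sum += runs[i].heap_before_mb / runs[i].pause_seconds;
  }
  return sum / static_cast<double>(runs.size() - warmup_runs);
}
```

For the setup in the mail this would be called with eleven samples and
warmup_runs set to 1, giving a result in MB per second.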
> 
> Best Regards,
> Haoyu Li,
> Institute of Parallel and Distributed Systems(IPADS),
> School of Software,
> Shanghai Jiao Tong University
> 
> 
> Stefan Johansson <stefan.johansson at oracle.com 
> <mailto:stefan.johansson at oracle.com>> wrote on Sep 12, 2019, at 5:34:
> 
>     Hi Haoyu,
> 
>     I recently came across your patch and I would like to pick up on
>     some of the things Kim mentioned in his mails. I especially want to
>     evaluate and investigate if this is a technique we can use to
>     improve the other GCs as well. To start that work I want to take the
>     patch for a spin in our internal performance testing. The patch
>     doesn't apply cleanly to the latest JDK repository, so if you could
>     provide an updated patch that would be very helpful.
> 
>     It would also be great if you could share some more information
>     around the results presented in the paper. For example, it would be
>     good to get the full command lines for the different benchmarks so
>     we can run them locally and reproduce the results you've seen.
> 
>     Thanks,
>     Stefan
> 
>>     On 12 March 2019 at 03:21, Haoyu Li <leihouyju at gmail.com
>>     <mailto:leihouyju at gmail.com>> wrote:
>>
>>     Hi Kim,
>>
>>     Thanks for reviewing and testing the patch. If there are any
>>     failures or performance degradation relevant to the work, please
>>     let me know and I'll be very happy to keep improving it. Also, any
>>     suggestions about code improvements are well appreciated.
>>
>>     I'm not quite sure if both G1 and Shenandoah have a similar
>>     region dependency issue, since I haven't studied their GC
>>     behaviors before. If they have, I'm also willing to propose a more
>>     general optimization.
>>
>>     As to the memory overhead, I believe it will be low because this
>>     patch exploits empty regions in the young space rather than
>>     off-heap memory to allocate shadow regions, and also reuses the
>>     /_source_region/ field of each /RegionData/ to record the
>>     corresponding shadow region index. We only introduce a new
>>     integer field /_shadow/ in the RegionData class to indicate the
>>     status of a region, a global /GrowableArray _free_shadow/ to store
>>     the indices of shadow regions, and a global /Monitor/ to protect
>>     the array. This information might help if the memory overhead
>>     needs to be evaluated.
>>
>>     Looking forward to your insight.
>>
>>     Best Regards,
>>     Haoyu Li,
>>     Institute of Parallel and Distributed Systems(IPADS),
>>     School of Software,
>>     Shanghai Jiao Tong University
>>
>>
>>     Kim Barrett <kim.barrett at oracle.com
>>     <mailto:kim.barrett at oracle.com>> wrote on Mar 12, 2019, at 6:11:
>>
>>         > On Mar 11, 2019, at 1:45 AM, Kim Barrett
>>         <kim.barrett at oracle.com <mailto:kim.barrett at oracle.com>> wrote:
>>         >
>>         >> On Jan 24, 2019, at 3:58 AM, Haoyu Li <leihouyju at gmail.com
>>         <mailto:leihouyju at gmail.com>> wrote:
>>         >>
>>         >> Hi Kim,
>>         >>
>>         >> I have ported my patch to OpenJDK 13 according to your
>>         instructions in your last mail, and the patch is attached in
>>         this mail. The patch does not change much since PSGC is indeed
>>         pretty stable.
>>         >>
>>         >> Also, I evaluate the correctness and performance of PS full
>>         GC with benchmarks from DaCapo, SPECjvm2008, and JOlden suites
>>         on a machine with dual Intel Xeon E5-2618L v3 CPUs(16 physical
>>         cores), 64G DRAM and linux kernel 4.17. The evaluation result,
>>         indicating 1.9X GC throughput improvement on average, is
>>         attached, too.
>>         >>
>>         >> However, I have no idea how to further test this patch for
>>         both correctness and performance. Can I please get any
>>         guidance from you or some sponsor?
>>         >
>>         > Sorry I missed that you had sent an updated version of the
>>         patch.
>>         >
>>         > I've run the full regression suite across Oracle-supported
>>         platforms.  There are some
>>         > failures, but there are almost always some failures in the
>>         later tiers right now.  I'll start
>>         > looking at them tomorrow to figure out whether any of them
>>         are relevant.
>>         >
>>         > I'm also planning to run some of our performance benchmarks.
>>         >
>>         > I've lightly skimmed the proposed changes.  There might be
>>         some code improvements
>>         > to be made.
>>         >
>>         > I'm also wondering if this technique applies to other
>>         collectors.  It seems like both G1 and
>>         > Shenandoah full gc's might have similar issues?  If so, a
>>         solution that is ParallelGC-specific
>>         > is less interesting than one that has broader
>>         applicability.  Though maybe this optimization
>>         > is less important for G1 and Shenandoah, since they actively
>>         try to avoid full gc's.
>>         >
>>         > I'm also not clear on how much additional memory might be
>>         temporarily allocated by this
>>         > mechanism.
>>
>>         I've created a CR for this:
>>         https://bugs.openjdk.java.net/browse/JDK-8220465
>>
> 


From per.liden at oracle.com  Tue Oct  8 13:02:21 2019
From: per.liden at oracle.com (Per Liden)
Date: Tue, 8 Oct 2019 15:02:21 +0200
Subject: RFR: 8232001: ZGC: Ignore metaspace GC threshold until GC is warm
Message-ID: <bf25c1c7-c8a1-b7d6-85cd-d5ff96c189a7@oracle.com>

As reported here:

https://mail.openjdk.java.net/pipermail/zgc-dev/2019-September/000736.html

The ZDirector heuristics can get off to a bad start if the statistics are 
contaminated by early "Metaspace GC Threshold" GC requests. To avoid 
this, we could simply ignore such requests until the GC is warm, at the 
potential cost of expanding metaspace a bit more during startup.
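
Roughly, the idea is the following. This is a minimal sketch, not the code
in the webrev: TinyDirector, the GCCause values and the warm-up count of
three cycles are all assumptions made for this illustration.

```cpp
#include <cassert>

// Hypothetical subset of GC causes for this example.
enum class GCCause { MetaspaceThreshold, AllocationRate, Other };

class TinyDirector {
  // Assumed warm-up length; the real threshold is not stated in this mail.
  static constexpr unsigned WarmupCycles = 3;
  unsigned _completed_cycles = 0;

public:
  bool is_warm() const { return _completed_cycles >= WarmupCycles; }

  void cycle_completed() { ++_completed_cycles; }

  // Metaspace-triggered requests are dropped until the GC is warm, so they
  // cannot skew the early statistics; metaspace is expanded instead.
  bool should_start_gc(GCCause cause) const {
    if (cause == GCCause::MetaspaceThreshold && !is_warm()) {
      return false;
    }
    return true;
  }
};
```

Other GC causes are unaffected, so only the early metaspace-driven cycles
are traded for a somewhat larger metaspace during startup.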

Bug: https://bugs.openjdk.java.net/browse/JDK-8232001
Webrev: http://cr.openjdk.java.net/~pliden/8232001/webrev.0

/Per


From stefan.johansson at oracle.com  Tue Oct  8 13:08:11 2019
From: stefan.johansson at oracle.com (Stefan Johansson)
Date: Tue, 8 Oct 2019 15:08:11 +0200
Subject: RFR (S): 8231956: Remove seq_add_card/reference from
 PerRegionTable class
In-Reply-To: <785ee316-a389-2218-0e4c-d53db0120088@oracle.com>
References: <5a94fbaa-08f1-5fca-62cc-030709b6ba13@oracle.com>
 <785ee316-a389-2218-0e4c-d53db0120088@oracle.com>
Message-ID: <4c791423-cf93-0261-f9a0-b208b9baf10e@oracle.com>


On 2019-10-08 11:15, Per Liden wrote:
> Looks good!
> 
+1

Stefan
> /Per
> 
> On 10/8/19 9:50 AM, Thomas Schatzl wrote:
>> Hi all,
>>
>>    can I have reviews for this small change that removes some unused 
>> methods and performs associated cleanup of unnecessary parameters?
>>
>> There is one related cleanup that might raise some questions:
>>
>>    38 inline void PerRegionTable::add_card_work(CardIdx_t from_card, bool par) {
>>    39   if (!_bm.at(from_card)) {
>>    40     if (par) {
>>    41       if (_bm.par_set_bit(from_card)) {
>>    42         Atomic::inc(&_occupied);
>>
>> changed to
>>
>>    38 inline void PerRegionTable::add_card(CardIdx_t from_card_index) {
>>    39   if (_bm.par_set_bit(from_card_index)) {
>>
>>
>> The reason for this change is that BitMap::par_set_bit() implicitly 
>> performs the BitMap::at() check even without doing a cmpxchg, 
>> duplicating this functionality.
>>
>> CR:
>> https://bugs.openjdk.java.net/browse/JDK-8231956
>> Webrev:
>> http://cr.openjdk.java.net/~tschatzl/8231956/webrev/
>> Testing:
>> hs-tier1-5
>>
>> Thanks,
>>    Thomas
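
For readers following the par_set_bit() argument above, here is a minimal standalone sketch of why the separate at() check is redundant. It assumes a CAS-loop implementation that tests the loaded word before attempting the exchange. TinyBitMap and TinyTable are hypothetical stand-ins written for this illustration, not HotSpot's BitMap or PerRegionTable.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical one-word bitmap; NOT HotSpot's BitMap class.
struct TinyBitMap {
  std::atomic<uint64_t> word{0};

  bool at(size_t bit) const {
    return (word.load(std::memory_order_relaxed) >> bit) & 1u;
  }

  // Returns true only for the caller that actually flipped the bit.
  bool par_set_bit(size_t bit) {
    const uint64_t mask = uint64_t(1) << bit;
    uint64_t old_val = word.load(std::memory_order_relaxed);
    for (;;) {
      const uint64_t new_val = old_val | mask;
      if (new_val == old_val) {
        // Bit already set: the same answer at() would give, and no CAS is issued.
        return false;
      }
      // On failure compare_exchange_weak refreshes old_val, so we just retry.
      if (word.compare_exchange_weak(old_val, new_val)) {
        return true;
      }
    }
  }
};

// Hypothetical counterpart of the simplified add_card(): no separate at() check.
struct TinyTable {
  TinyBitMap bm;
  std::atomic<size_t> occupied{0};

  void add_card(size_t card_index) {
    if (bm.par_set_bit(card_index)) {
      occupied.fetch_add(1, std::memory_order_relaxed);
    }
  }
};
```

Adding the same card twice bumps the occupancy counter only once, which is the behavior the removed at() check was guarding.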


From thomas.schatzl at oracle.com  Tue Oct  8 13:28:25 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Tue, 8 Oct 2019 15:28:25 +0200
Subject: RFR (S): 8231956: Remove seq_add_card/reference from
 PerRegionTable class
In-Reply-To: <4c791423-cf93-0261-f9a0-b208b9baf10e@oracle.com>
References: <5a94fbaa-08f1-5fca-62cc-030709b6ba13@oracle.com>
 <785ee316-a389-2218-0e4c-d53db0120088@oracle.com>
 <4c791423-cf93-0261-f9a0-b208b9baf10e@oracle.com>
Message-ID: <813ac43a-c5a7-238c-d7bc-404b124f0b90@oracle.com>

Hi Stefan,

On 08.10.19 15:08, Stefan Johansson wrote:
> 
> 
> On 2019-10-08 11:15, Per Liden wrote:
>> Looks good!
>>
> +1

   thanks for your review.

Thomas


From kim.barrett at oracle.com  Tue Oct  8 19:05:44 2019
From: kim.barrett at oracle.com (Kim Barrett)
Date: Tue, 8 Oct 2019 15:05:44 -0400
Subject: RFR: 8231153: Improve concurrent refinement statistics
In-Reply-To: <1c77cb41-2a70-c9d6-a600-2a87df24ae9c@oracle.com>
References: <1DADC595-3106-4CE7-BA5D-7B6C7EE0E81E@oracle.com>
 <4a851a19-0979-c696-0c80-1165bd755834@oracle.com>
 <BACFB18B-7CA1-4C58-8597-016D197CDCDF@oracle.com>
 <3bafef2f-1380-3105-5c54-5e8095c42409@oracle.com>
 <0280B88E-45D7-46C0-A0E5-2E708B0132ED@oracle.com>
 <1c77cb41-2a70-c9d6-a600-2a87df24ae9c@oracle.com>
Message-ID: <56E306A1-45DC-4779-A4AD-62133B0A0D52@oracle.com>

> On Oct 8, 2019, at 4:23 AM, Stefan Johansson <stefan.johansson at oracle.com> wrote:
> 
> 
> 
> On 2019-10-08 00:38, Kim Barrett wrote:
>> New webrevs:
>> full: https://cr.openjdk.java.net/~kbarrett/8231153/open.02/
>> incr: https://cr.openjdk.java.net/~kbarrett/8231153/open.02.inc/
>> 
> Thanks for trying it out a second time, this is more or less exactly what I had in mind.
> 
> Looks good,
> Stefan
> 
>> Testing:
>> mach5 tier1

Thanks.


From kim.barrett at oracle.com  Tue Oct  8 19:06:00 2019
From: kim.barrett at oracle.com (Kim Barrett)
Date: Tue, 8 Oct 2019 15:06:00 -0400
Subject: RFR: 8231153: Improve concurrent refinement statistics
In-Reply-To: <a404b808-b0d7-3085-491e-57523eebcf91@oracle.com>
References: <1DADC595-3106-4CE7-BA5D-7B6C7EE0E81E@oracle.com>
 <4a851a19-0979-c696-0c80-1165bd755834@oracle.com>
 <BACFB18B-7CA1-4C58-8597-016D197CDCDF@oracle.com>
 <3bafef2f-1380-3105-5c54-5e8095c42409@oracle.com>
 <0280B88E-45D7-46C0-A0E5-2E708B0132ED@oracle.com>
 <a404b808-b0d7-3085-491e-57523eebcf91@oracle.com>
Message-ID: <3D3BA259-AE30-460A-9381-B6E67A2207EE@oracle.com>

> On Oct 8, 2019, at 3:45 AM, Thomas Schatzl <thomas.schatzl at oracle.com> wrote:
> 
> Hi,
> 
> On 08.10.19 00:38, Kim Barrett wrote:
>> New webrevs:
>> full: https://cr.openjdk.java.net/~kbarrett/8231153/open.02/
>> incr: https://cr.openjdk.java.net/~kbarrett/8231153/open.02.inc/
> 
> still good.
> 
> Thomas

Thanks.


From kim.barrett at oracle.com  Tue Oct  8 23:48:06 2019
From: kim.barrett at oracle.com (Kim Barrett)
Date: Tue, 8 Oct 2019 19:48:06 -0400
Subject: RFR (L): 8230706: Waiting on completion of strong nmethod
 processing causes long pause times with G1
In-Reply-To: <5c6b06b1-de44-3cb7-7fc8-0b641df5f353@oracle.com>
References: <f19bc69a-9bde-c063-2674-9564721ceede@oracle.com>
 <0F637570-EC97-47C5-B493-B33681133149@oracle.com>
 <5c6b06b1-de44-3cb7-7fc8-0b641df5f353@oracle.com>
Message-ID: <DE7A950D-A877-4093-AFE8-363E6E079A28@oracle.com>

> On Sep 30, 2019, at 7:14 AM, Thomas Schatzl <thomas.schatzl at oracle.com> wrote:
> All fixed in new webrev:
> 
> http://cr.openjdk.java.net/~tschatzl/8230706/webrev.0_to_1 (diff)
> http://cr.openjdk.java.net/~tschatzl/8230706/webrev.1 (full)
> 
> Rerunning hs-tier1-5, almost done
> 
> Thanks,
>  Thomas

Because of NMETHOD_SENTINEL we already have a "lying to the type
system" problem for the nmethod link field, as it doesn't necessarily
contain an nmethod*.  The introduction of the strongly claimed tagging
mechanism just emphasizes that.  I think that should be cleaned up and
the "lying to the type system" should be eliminated.  However, I also
think that can be done as a followup cleanup.

------------------------------------------------------------------------------
src/hotspot/share/code/nmethod.cpp

I initially thought there was a bug in the strong claim.

A weak claim is established by thread1, successfully setting the link
field to NMETHOD_SENTINEL.  Before thread1 continues from there...

Thread2 tries to strongly mark, and sees NMETHOD_SENTINEL in the link
field. NMETHOD_SENTINEL == badAddress == -2, which happens to have the
low bit clear. So this seems to work after all.

Add STATIC_ASSERT(is_aligned_((uintptr_t)NMETHOD_SENTINEL, 2))

------------------------------------------------------------------------------ 
src/hotspot/share/code/nmethod.cpp

[pre-existing]
I think the comment in oops_do_marking_prologue about using cmpxchg makes
no sense.

And why does oops_do_marking_epilogue use cmpxchg at the end?

------------------------------------------------------------------------------ 
src/hotspot/share/code/nmethod.cpp

I think using a self-loop to mark end of list would eliminate the need
for NMETHOD_SENTINEL.

Also eliminates the need for oops_do_marking_prologue.
Requires changing oops_do_marking_epilogue to recognize the self-loop.

[This can be deferred to later cleanup.]

------------------------------------------------------------------------------

oops_do_mark_merge_claim second argument is called "claim" but should
be "strongly_claim" or some such.  Actually the whole new suite of

oops_do_mark_is_claimed
oops_do_mark_strip_claim
oops_do_mark_merge_claim

all seem misnamed.  The link field having a non-NULL value is a
(possibly weak) claim.  The link field having a non-NULL value that is
not 2-byte aligned is a strong claim.  Those functions are all dealing
with strong claims.

is_claimed should use is_aligned
strip_claim should use align_down

------------------------------------------------------------------------------

With the introduction of the strongly claimed tag bit, the link field
ought not be of type nmethod*, because using that type means we're
constructing improperly aligned pointers, which is unspecified behavior.
Should now be char* or void* or some opaque pointer type.

struct nmethod_claim;  // nmethod_claimant ?

[This can be deferred to later cleanup.]

------------------------------------------------------------------------------
src/hotspot/share/code/nmethod.cpp
1884   // On fall through, another racing thread marked this nmethod before we did.

[pre-existing] I think s/marked/claimed/ would be better.

------------------------------------------------------------------------------
src/hotspot/share/code/nmethod.cpp  
1900   while (cur != oops_do_mark_strip_claim(NMETHOD_SENTINEL)) {

Why strip the claim tag from NMETHOD_SENTINEL?  It isn't tagged.
(And must not be, as discussed in an earlier comment.)

------------------------------------------------------------------------------
src/hotspot/share/code/nmethod.hpp
 494   bool test_set_oops_do_strongly_marked();
 495   bool test_set_oops_do_mark(bool strongly = false);

I found the naming and protocol here confusing.  I'd prefer a
"try_claim" style that returns true if the claim attempt is
successful, similar to what we now (since JDK-8210119) do for
SubTasksDone and friends.

------------------------------------------------------------------------------
src/hotspot/share/gc/g1/g1ParScanThreadState.cpp 
 131 bool G1ParScanThreadState::has_remembered_strong_nmethods() const {
 132   return _remembered_strong_nmethods != NULL && _remembered_strong_nmethods->length() > 0;
 133 }

Use !is_empty() rather than length() > 0.

------------------------------------------------------------------------------
src/hotspot/share/gc/g1/g1ParScanThreadState.cpp 
 105   assert(_remembered_strong_nmethods == NULL || _remembered_strong_nmethods->is_empty(), "should be empty at this point.");

Use !has_remembered_strong_nmethods().

------------------------------------------------------------------------------
src/hotspot/share/gc/g1/g1CollectedHeap.cpp
3874   if (collector_state()->in_initial_mark_gc()) {
3875     remark_strong_nmethods(per_thread_states);
3876   }

I think this additional task and the associated pending strong nmethod
sets in the pss can be eliminated by using a 2-bit tag and a more
complex state machine earlier.

I think the additional task is acceptable at least for now, and this
could be examined as a followup.  There's a tradeoff between the cost
of the additional task and the added complexity to remove it.

Below is some (obviously) untested pseudo-code (sort of pythonesque)
for what I have in mind.  The basic idea is that if thread A wants to
strongly process an nmethod while thread B is weakly processing it,
thread A can mark the nmethod as needing strong processing.  When
thread B finishes the weak processing it notices the strong request
and performs the strong processing too.

Note that this code doesn't use NMETHOD_SENTINEL.  The end of the
global list is indicated by having the last element have a self-looped
link value with appropriate tagging.  That avoids both the sentinel
and tagged NULL values (which have their own potential problems).

States, encoded in the link member:
- unclaimed: NULL
- weak: tag 00
- weak done: tag 01
- weak, need strong: tag 10
- strong: tag 11

weak_processor(n):
    if n->link != NULL:
        # already claimed; nothing to do here.
        return
    elif not replace_if_null(tagged(n, 0), &n->link):
        # just claimed by another thread; nothing to do here.
        return
    # successfully claimed for weak processing.
    assert n->link == tagged(n, 0)
    do_weak_processing(n)
    # push onto global list.  self-loop end of list to avoid tagged NULL.
    next = xchg(n, &_list_head) 
    if next == NULL: next = n 
    # try to install end of list + weak done tag.
    if cmpxchg(tagged(next, 1), &n->link, tagged(n, 0)) == tagged(n, 0):
        return
    # failed, which means some other thread added strong request.
    assert n->link == tagged(n, 2)
    # do deferred strong processing.
    n->link = tagged(next, 3)
    do_strong_processing(n)
    
strong_processor(n):
    if replace_if_null(tagged(n, 3), &n->link):
        # successfully claimed for strong processing.
        do_strong_processing(n) 
        # push onto global list.  self-loop end of list to avoid tagged NULL.
        next = xchg(n, &_list_head)
        if next == NULL: next = n
        n->link = tagged(next, 3)
        return
    # claim failed.  figure out why and handle it.
    while true:
        raw_next = n->link
        next = strip_tag(raw_next)
        if raw_next - next >= 2:
            # already claimed for strong processing or requested for such.
            return
        elif cmpxchg(tagged(next, 2), &n->link, tagged(next, 0)) == tagged(next, 0):
            # added deferred strong request, so done.
            return
        elif cmpxchg(tagged(next, 3), &n->link, tagged(next, 1)) == tagged(next, 1):
            # claimed deferred strong request.
            do_strong_processing(n)
            return
        # concurrent changes interfered with us.  try again.
        # number of retries is bounded and small, since the state
        # transitions are few and monotonic.  (I think we cannot
        # reach here more than 2 times.)
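
The same state machine can be sketched as standalone C++ with std::atomic, purely as an illustration of the pseudo-code above. Node, the counters, and the helper names are hypothetical stand-ins (Node plays the role of nmethod), the initial NULL check and replace_if_null are folded into a single compare-exchange, and default sequentially consistent ordering is used rather than anything tuned.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical node standing in for nmethod; counters make the sketch checkable.
struct Node {
  std::atomic<uintptr_t> link{0};  // 0 == unclaimed
  int weak_done = 0;
  int strong_done = 0;
};
static_assert(alignof(Node) >= 4, "need two free low bits for the tag");

enum Tag : uintptr_t { Weak = 0, WeakDone = 1, WeakNeedStrong = 2, Strong = 3 };

static std::atomic<Node*> list_head{nullptr};

static uintptr_t tagged(Node* n, uintptr_t tag) {
  return reinterpret_cast<uintptr_t>(n) | tag;
}
static uintptr_t tag_of(uintptr_t raw) { return raw & 3; }
static Node* strip_tag(uintptr_t raw) {
  return reinterpret_cast<Node*>(raw & ~uintptr_t(3));
}

static void do_weak_processing(Node* n) { n->weak_done++; }
static void do_strong_processing(Node* n) { n->strong_done++; }

// Push n on the global list; a self-loop marks the end of the list.
static Node* push(Node* n) {
  Node* next = list_head.exchange(n);
  return next == nullptr ? n : next;
}

void weak_processor(Node* n) {
  uintptr_t expected = 0;
  if (!n->link.compare_exchange_strong(expected, tagged(n, Weak))) {
    return;  // already claimed by some thread; nothing to do here
  }
  do_weak_processing(n);
  Node* next = push(n);
  expected = tagged(n, Weak);
  if (n->link.compare_exchange_strong(expected, tagged(next, WeakDone))) {
    return;
  }
  // CAS failed, so another thread must have left a strong request behind.
  assert(expected == tagged(n, WeakNeedStrong));
  n->link.store(tagged(next, Strong));
  do_strong_processing(n);
}

void strong_processor(Node* n) {
  uintptr_t expected = 0;
  if (n->link.compare_exchange_strong(expected, tagged(n, Strong))) {
    do_strong_processing(n);
    n->link.store(tagged(push(n), Strong));
    return;
  }
  for (;;) {
    const uintptr_t raw = n->link.load();
    Node* next = strip_tag(raw);
    if (tag_of(raw) >= WeakNeedStrong) {
      return;  // strong processing already claimed or requested
    }
    expected = tagged(next, Weak);
    if (n->link.compare_exchange_strong(expected, tagged(next, WeakNeedStrong))) {
      return;  // deferred to the thread doing the weak processing
    }
    expected = tagged(next, WeakDone);
    if (n->link.compare_exchange_strong(expected, tagged(next, Strong))) {
      do_strong_processing(n);
      return;
    }
    // A concurrent transition interfered; retry.
  }
}
```

Run single-threaded, a weak claim followed by a strong one processes the node once each way, and a strong request against an in-flight weak claim only flips the tag to "weak, need strong". This exercises the transitions but says nothing about the memory-ordering requirements of a real implementation.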

------------------------------------------------------------------------------


From sangheon.kim at oracle.com  Wed Oct  9 04:27:03 2019
From: sangheon.kim at oracle.com (sangheon.kim at oracle.com)
Date: Tue, 8 Oct 2019 21:27:03 -0700
Subject: RFR(XL): 8220310: Implementation: NUMA-Aware Memory Allocation
 for G1, Mutator (1/3)
In-Reply-To: <CD6BDB8C-7777-4872-BCC0-CDFB4978F876@oracle.com>
References: <e4c60a5c-cb08-004a-ce77-e4d20d4d6891@oracle.com>
 <06ACBF87-ADBE-499F-B668-0274E4925B26@oracle.com>
 <2b37edd6-3e0f-013d-1616-9d003f8ac1ed@oracle.com>
 <74ACAF31-8233-482A-892E-0D2E7CA72F4F@oracle.com>
 <4afe9f43-4cfa-9384-f45f-f985399629dd@oracle.com>
 <CD6BDB8C-7777-4872-BCC0-CDFB4978F876@oracle.com>
Message-ID: <d8758f38-4818-e6b7-c158-118974f0ff1c@oracle.com>

Hi Kim,

On 10/7/19 11:10 AM, Kim Barrett wrote:
>> On Oct 1, 2019, at 12:43 PM, sangheon.kim at oracle.com wrote:
>> webrev:
>> http://cr.openjdk.java.net/~sangheki/8220310/webrev.2/
>> http://cr.openjdk.java.net/~sangheki/8220310/webrev.2.inc
>> Testing: hs-tier1 ~ 5 +-UseNUMA
> I like the direction of this.  I think there are some additional simplifications possible
> around G1NUMA, which are discussed below.
>
> I still need to respond to your earlier individual responses.  That will be in another email.
OK!

>
> ------------------------------------------------------------------------------
> src/hotspot/share/gc/g1/g1MemoryNodeManager.cpp
>    67   LINUX_ONLY(if (UseNUMA) {
>
> Maybe instead use #ifdef LINUX.  Either way, add a trailing comment at
> the end of the conditional block.
Changed to use

#ifdef LINUX

>
> ------------------------------------------------------------------------------
> src/hotspot/share/gc/g1/g1NUMA.cpp
>    79   // If we don't have preferred numa id, touch the given area with round-robin manner.
>
> This comment seems out of place / obsolete.
OK

>
> ------------------------------------------------------------------------------
> src/hotspot/share/gc/g1/g1NUMA.cpp
>   138   uint region_index = G1CollectedHeap::heap()->addr_to_region(address);
>
> This requires the address be in the range reserved for the heap.
> That's okay; that's what we decided we want to do.  But that should be
> part of the function's description, e.g. it should be mentioned as a
> precondition for prefered_index_for_address.
I think I addressed your point.

>
> ------------------------------------------------------------------------------
> src/hotspot/share/gc/g1/g1NUMA.hpp
>    87   // Returns numa id of the given numa index.
>    88   inline int numa_id_of_index(uint numa_index) const;
>
> Unused function.
Removed.

> ------------------------------------------------------------------------------
> src/hotspot/share/gc/g1/g1NUMA.hpp
>    83   inline uint index_of_numa_id(int numa_id) const;
>
> This function should be private.  It is only needed in the
> implementation of index_of_current_thread and index_of_address.
> It should have a precondition that the argument is an active numa id,
> e.g. a definition something like
>
> uint G1NUMA::index_of_numa_id(int numa_id) const {
>    assert(numa_id >= 0, "invalid numa id %d", numa_id);
>    assert(numa_id < _len_numa_id_to_index_map, "invalid numa id %d", numa_id);
>    uint numa_index = _numa_id_to_index_map[numa_id];
>    assert(numa_index != G1MemoryNodeManager::InvalidNodeIndex,
>           "invalid numa id %d", numa_id);
>    return numa_index;
> }
>
> To make this work, index_of_address should also be changed, to
> something like:
>
> uint G1NUMA::index_of_address(HeapWord* address) const {
>    int numa_id = os::numa_get_address_id((uintptr_t)address);
>    if (numa_id == os::InvalidId) {
>      return G1MemoryNodeManager::InvalidNodeIndex;
>    } else {
>      return index_of_numa_id(numa_id);
>    }
> }
Changed as your patch.

> ------------------------------------------------------------------------------
> src/hotspot/share/gc/g1/g1NUMA.cpp
>    31 void G1NUMA::init_numa_id_to_index_map(const int* numa_ids, uint num_numa_ids) {
>
> This function is only called from one place, G1NUMA::initialize.  The
> code would be simpler and more clear if the body of this function were
> just directly inlined into initialize and this function eliminated.
>
> And once that's done it becomes apparent that initialize could be
> hoisted into the (moved out of line) constructor.
>
> This also lets num_active_numa_ids just be a trivial accessor function
> in the header; there's no possibility of finding it uninitialized
> after the constructor returns, so no need for the assert that it has
> been set.
Changed as your patch.

>
> ------------------------------------------------------------------------------
> src/hotspot/share/gc/g1/g1NUMA.inline.hpp
>    32 inline bool G1NUMA::is_valid_numa_id(int numa_id) {
>
> Only called by init_numa_to_index_map in a guarantee that would be
> more obviously vacuous after the earlier suggested merge of that
> function into initialize.
Done.

>
> ------------------------------------------------------------------------------
> src/hotspot/share/runtime/os.hpp
>   393   enum NumaIdState {
>   394     InvalidId = -1,
>   395     AnyId = -2
>   396   };
>
> The type NumaIdState is unused.
> The AnyId enumerator is unused.
>
> Suggest making InvalidId just a static const int in the class.
Changed to static const int.

>
> ------------------------------------------------------------------------------
> src/hotspot/share/runtime/os.hpp
>   398   static int numa_get_address_id(uintptr_t address);
>
> Why is the type of address uintptr_t rather than a pointer type?
>
> I see that the underlying Linux syscall (get_mempolicy) wants an
> unsigned long, but that detail ought to be isolated to the Linux
> implementation layer.  Callers are going to want to pass in addresses
> (pointers) and should not need to cast.  That cast should happen at
> the point where the syscall is being made.
Changed to void*.

>
> ------------------------------------------------------------------------------
> src/hotspot/share/gc/g1/g1Allocator.inline.hpp
>    37 inline MutatorAllocRegion* G1Allocator::mutator_alloc_region(uint node_index) {
>    38   assert(_g1h->mem_node_mgr()->is_valid_node_index(node_index), "Invariant, index %u", node_index);
>    39   return &_mutator_alloc_regions[node_index];
>    40 }
>
> I think the assert here should be that node_index < _num_alloc_regions.
>
> is_valid_node_index gives a somewhat indirect (so weak) check of the
> validity of the array access.
>
> Such a change would also eliminate one of the two callers of
> is_valid_node_index, which I think can be eliminated (see next comment).
Done.

>
> ------------------------------------------------------------------------------
> src/hotspot/share/gc/g1/heapRegionManager.cpp
>   126 HeapRegion* HeapRegionManager::allocate_free_region(HeapRegionType type, uint requested_node_index) {
> ...
>   131   if (mgr->num_active_nodes() > 1 && mgr->is_valid_node_index(requested_node_index)) {
>
> I think a better test here would be
>    if ((requested_node_index != G1MemoryNodeManager::AnyNodeIndex) &&
>        (mgr->num_active_nodes() > 1)) {
>
> This eliminates one of two calls to is_valid_node_index (which I think
> can be eliminated, see previous comment).  And callers should not be
> passing in actually invalid indices.  I think there are asserts lower
> down in the stack (in G1NUMA) to complain about such, but they
> shouldn't be getting in here anyway.
Done, but I introduced G1MemoryNodeManager::has_multi_node(), which was 
Thomas' suggestion.

>
> ------------------------------------------------------------------------------
> src/hotspot/share/gc/g1/g1MemoryNodeManager.hpp
>    42   static const uint InvalidNodeIndex = UINT_MAX;
>    43   static const uint AnyNodeIndex = InvalidNodeIndex - 1;
>
> These seem misplaced to me.  Shouldn't they be in G1NUMA?  Possibly
> reexported here for convenience?  (Assuming it actually is convenient.)
Yes, for convenience.
But G1NUMA is now merged into G1MemoryNodeManager, so there is nothing left to argue here.

>
> ------------------------------------------------------------------------------
> src/hotspot/share/gc/g1/g1MemoryNodeManager.hpp
>    42   static const uint InvalidNodeIndex = UINT_MAX;
>
> I think the only place this arises is as the result of
> index_of_address when the numa id for the location isn't known.  Which
> suggests the name should be "UnknownNodeIndex" rather than
> "InvalidNodeIndex".  And the description of index_of_address should
> mention that it can return that value (whatever its name ends up being.)
Good idea.
Changed to UnknownNodeIndex.

>
> ------------------------------------------------------------------------------
> src/hotspot/share/gc/g1/g1MemoryNodeManager.hpp
>
> I'm not sure G1MemoryNodeManager is useful. It seems to be just a thin
> wrapper over the G1NUMA API, with a virtual dispatch between a
> non-NUMA or single-node implementation and the multi-node
> implementation that uses a G1NUMA that is only created for multi-node
> support. The virtual dispatch can't be eliminated in most (all or
> nearly all?) cases.
>
> But I think most of the single-node implementation would just fall out
> as a 1-node boundary case for multi-node G1MemoryNodeManager / G1NUMA.
>
> So I think this might all be collapsed down to a G1NUMA that always
> exists.  If there are any places that require actual distinction, that
> class can have a private member to select the appropriate behavior.
> (Or maybe it's just the number of active nodes.)
G1NUMA is merged to G1MemoryNodeManager.
Previously G1MNM owned G1NUMA so I tried to keep this relation. Now 
G1MultiMemoryNodeManager has NUMA related implementations.
Thomas also suggested merging these two.

We discussed the virtual dispatch, but I couldn't find anything 
better than the current approach.
Suggestions are more than welcome. Otherwise we can keep this for later 
enhancements.

>
> ------------------------------------------------------------------------------
> src/hotspot/share/gc/g1/g1NUMA.inline.hpp
>
> I think that with the changes I've proposed above, there's not
> much left in this file, and it might not be worth having it.  Consider
> moving any lingering remnants to the .hpp or .cpp file as appropriate.
Removed g1NUMA.inline.hpp

>
> ------------------------------------------------------------------------------
> src/hotspot/share/gc/g1/g1NUMA.hpp
>
> Consider adding a page_size() accessor function (private for now) that
> asserts the associated data member is > 0 (e.g. initialized), since it
> is initialized after construction.  Use that instead of direct uses of
> the data member.
Added page_size().

>
> ------------------------------------------------------------------------------
> src/hotspot/share/runtime/arguments.cpp
> 4108     // such as Parallel GC for Linux and Solaris or G1 GC for Linux will
> ...
> 4111     // Non NUMA-aware collectors such as CMS and Serial-GC on
> 4112     // all platforms and ParallelGC on Windows will interleave all
>
> I think that these comments about which configurations do or don't
> support NUMA are just a maintenance headache. I think it would be
> better here to just say
>
>    NUMA-aware collectors will interleave ...
>    Non NUMA-aware collectors will interleave ...
>
> And leave out mentions of configurations that may change (as is being
> done here) or be removed (as soon expected for CMS).
I just removed the mentions of specific configurations.
     // UseNUMAInterleaving is set to ON for all collectors and platforms when
     // UseNUMA is set to ON. NUMA-aware collectors will interleave old gen and
     // survivor spaces on top of NUMA allocation policy for the eden space.
     // Non NUMA-aware collectors will interleave all of the heap spaces across
     // NUMA nodes.

Here's the list of major changes in the webrev. Or the arguable list :)
1) Verification at HRM::allocate_free_region() is removed; it will be 
added somewhere at a safepoint by JDK-8220312 (3/3, which is part of this 
JEP). Probably at the end of young gc?
2) Node id printing is changed. I removed the old one and added a new 
column at HeapRegion::print_on(). The node id is only printed when 
UseNUMA is enabled and gc+heap+region=trace. If there is a single active 
node, it will still print the node id, and this is intentional. Another 
approach would be printing only if there are multiple nodes.
3) If AlwaysPreTouch is enabled, HeapRegion will have the actual node 
index instead of the preferred node index.
4) HeapRegion::_node_index is set at HRM::make_regions_available(), as 
that is the only place that initializes HeapRegion. Another approach would 
be setting the index at HeapRegion::initialize() (we would have to pollute 
HR with G1MNM stuff) or conditionally(*) setting the index at 
HeapRegion::node_index(). (*) if the index is unknown etc.
5) The G1NUMA class is merged into G1MemoryNodeManager.

Webrev:
http://cr.openjdk.java.net/~sangheki/8220310/webrev.3
http://cr.openjdk.java.net/~sangheki/8220310/webrev.3.inc
Testing: hs-tier 1~5, with/without UseNUMA

Thanks,
Sangheon


>
> ------------------------------------------------------------------------------
>


From zgu at redhat.com  Wed Oct  9 11:47:59 2019
From: zgu at redhat.com (Zhengyu Gu)
Date: Wed, 9 Oct 2019 07:47:59 -0400
Subject: RFR 8232008: Shenandoah: C1 load barrier does not match interpreter
 version
Message-ID: <d531e080-00bd-ea38-3ab8-3aae46a1d960@redhat.com>


Bug: https://bugs.openjdk.java.net/browse/JDK-8232008
Webrev: http://cr.openjdk.java.net/~zgu/JDK-8232008/webrev.00/

Test:
   hotspot_gc_shenandoah (fastdebug and release) with x86_64 and x86-32 
JVM on Linux.

Thanks,

-Zhengyu


From thomas.schatzl at oracle.com  Wed Oct  9 14:10:13 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Wed, 9 Oct 2019 16:10:13 +0200
Subject: G1 patch of elastic Java heap
In-Reply-To: <a201e27d-d231-4787-8bba-55f5266206d1.maoliang.ml@alibaba-inc.com>
References: <6270ce59-4a8e-431e-9ccf-f6d2c0f927eb.maoliang.ml@alibaba-inc.com>
 <d82e704831f0afbc61f8a3fb6b69bb1463b7ede8.camel@oracle.com>
 <e4ffd4d9-3ec0-4592-ac8c-d5a77c6b2e75.maoliang.ml@alibaba-inc.com>
 <1267a5dd2cf6cc1d03df64d07a06ba0f45195951.camel@oracle.com>
 <3140197d-8cab-4a86-af92-58431c74cb6b.maoliang.ml@alibaba-inc.com>
 <a201e27d-d231-4787-8bba-55f5266206d1.maoliang.ml@alibaba-inc.com>
Message-ID: <b691e0bd-dd9a-32e2-c950-9c84de29101c@oracle.com>

Hi,

   sorry for the late reply.

First, I have a more general question: lots of the changes deal with 
providing options to change properties of the generations separately at 
runtime, as if there were separate pools of young and old gen memory.

G1 is kind of built upon the idea that you pass a pause time goal and 
it then modifies generation sizes, taking memory for the generations from 
a single memory pool as needed.

To me this indicates that automatic sizing is not working correctly, i.e. 
that there are many(?) use cases where it does not work as expected, and 
that these require manual tuning of generation sizes for whatever reason.

Can you share your thoughts about this? There seems to be some bit of 
information missing to me - this is probably the reason for some of the 
dumb questions about the flags, and me being not too fond of them.

On 26.09.19 08:49, Liang Mao wrote:
> 
> Hi All,
> 
> Here is the user guide of G1ElasticHeap patch. Hope it will help to 
> understand.
> 
> G1ElasticHeap
> G1ElasticHeap is a GC feature to return memory of Java heap to OS to reduce the
> memory footprint of Java process. To enable this feature, you need to use G1 GC
> by options: -XX:+UseG1GC -XX:+G1ElasticHeap.
> 
> ## Usage
> There are 3 modes which can be enabled in G1ElasticHeap.
> ### 1. Periodic uncommit
> Memory will be uncommitted by periodic GC. To enable periodic uncommit, use option
> -XX:+ElasticHeapPeriodicUncommit or dynamically enable the option via jinfo:
> 
> `jinfo -flag +ElasticHeapPeriodicUncommit PID`

As far as I can tell, this setting periodically scans the heap for (too 
many?) free but still committed regions and, well, uncommits them.

Not completely sure if that is better than doing periodic gcs - as we do 
not expect to gain memory outside of a GC; in JDK12+ (I think) G1 always 
uncommits at the remark pause which should give most of the benefits.

There *may* be reason to also try to uncommit after the last mixed GC, 
but not sure if uncommit is that urgent - to some degree the existing 
JEP 346: Promptly return unused committed memory from G1 
(https://openjdk.java.net/jeps/346) should cover some of the use cases. 
I.e. after some delay (and inactivity) there will be another Remark 
pause anyway.

The main reason why Remark has been chosen to uncommit memory is because 
we assume that the heap size at Remark (this is what adaptive IHOP 
shoots for) is the "target heap size".


> Related options:
> 
>> ElasticHeapPeriodicYGCIntervalMillis, 15000 \
> (target young GC interval, 15 seconds by default) \
> (e.g., if Java runs with MaxNewSize=4g and a young GC every 30 seconds, G1ElasticHeap will keep a 15s
>   GC interval and make a max 2g young generation to uncommit 2g memory)
> 
>> ElasticHeapPeriodicInitialMarkIntervalMillis, 3600000 \
> (Target initial mark interval, 1 hour by default. Unused memory of old generation will be uncommitted
>   after last mixed GC.)

This seems to implement an unconditional concurrent cycle like with the 
CMSTriggerInterval flag for CMS.

Maybe there is a cleverer alternative for triggering concurrent cycles, 
like ZGC does, based on the ratio between time spent by the mutator and 
the gc.

> 
>> ElasticHeapPeriodicUncommitStartupDelay, 300 \
> (Delay after startup to do memory uncommit, 300 seconds by default)
> 
>> ElasticHeapPeriodicMinYoungCommitPercent, 50 \
> (Percentage of young generation to keep; by default 50% of the young generation will not be uncommitted)

See above about separating young/old.

> 
> ### 2. Generation limit
> To limit the young/old generation separately. Use jcmd or MXBean to enable.

I do not understand the reason for those, see above.

[...]
> 
> ### 3. Softmx mode
> Dynamically limit the heap to a percentage of the original Xmx.
> 
> Use jcmd:
> 
> `jcmd PID ElasticHeap softmx_percent=60`
> 
> Use MXBean:
> 
> `elasticHeapMXBean.setSoftmxPercent(70);`

That one sounds good, and actually there is a flag SoftMaxHeapSize 
already in the VM. Only ZGC implements it though.

I think this idea matches the specifications in 
https://bugs.openjdk.java.net/browse/JDK-8222145 (i.e. as far as I can 
tell, the softmxpercent is a "soft"/target heap size), so I think this 
could be implemented under the SoftMaxHeapSize flag.

SoftMaxHeapSize is already a manageable flag too, so it can already be 
modified at runtime. Only the implementation is missing in G1 :)

> 
> ### Other G1ElasticHeap advanced options:
>> ElasticHeapMinYoungCommitPercent, 10 \
>   (Minimum percentage of young generation)
> 
>> ElasticHeapYGCIntervalMinMillis, 5000 \
>   (Minimum young GC interval)
> 
>> ElasticHeapInitialMarkIntervalMinMillis, 60000 \
> (Minimum initial mark interval)
> 
>> ElasticHeapEagerMixedGCIntervalMillis, 15000 \
> (Guaranteed mixed GC interval, to make sure the mixed GC will happen in time to uncommit memory after last mixed GC)

These options seem to be mostly useful for when the allocation rate of 
the mutator is not high enough to advance the collection cycle.

Would the existing JEP 346 mechanism mentioned above provide the requested 
functionality? Maybe it needs some minor improvement, but to me it seems 
very burdensome to specify so many options...

> 
>> ElasticHeapOldGenReservePercent, 5 \
> (To keep a minimum percentage of Xmx for old generation in the uncommitment after last mixed GC)

That seems to be related to some strict separation of young/old again.

> 
>> ElasticHeapPeriodicYGCIntervalCeilingPercent, 25 \
> ElasticHeapPeriodicYGCIntervalFloorPercent, 25 \
> (The actual young GC interval will fluctuate between \
> ElasticHeapPeriodicYGCIntervalMillis * (100 - ElasticHeapPeriodicYGCIntervalFloorPercent) / 100 and \
> ElasticHeapPeriodicYGCIntervalMillis * (100 + ElasticHeapPeriodicYGCIntervalCeilingPercent) / 100 )
> 
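
As a worked example of the floor/ceiling formula quoted above, here is a minimal sketch. The struct and function names are illustrative only and are not taken from the patch; the parameters simply mirror the three quoted flags.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical holder for the window around the target young GC interval.
struct IntervalRange {
  std::int64_t lower_millis;
  std::int64_t upper_millis;
};

// Illustrative only: applies the quoted formula
//   interval * (100 - floor) / 100  ..  interval * (100 + ceiling) / 100
inline IntervalRange ygc_interval_range(std::int64_t interval_millis,
                                        std::int64_t floor_percent,
                                        std::int64_t ceiling_percent) {
  assert(floor_percent >= 0 && floor_percent <= 100);
  assert(ceiling_percent >= 0);
  return IntervalRange{
      interval_millis * (100 - floor_percent) / 100,
      interval_millis * (100 + ceiling_percent) / 100};
}
```

With the defaults listed in the user guide (interval 15000 ms, floor 25, ceiling 25) this gives a window of 11250 ms to 18750 ms.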

Thanks,
   Thomas


From shade at redhat.com  Wed Oct  9 14:15:06 2019
From: shade at redhat.com (Aleksey Shipilev)
Date: Wed, 9 Oct 2019 16:15:06 +0200
Subject: RFR (S) 8232051: Epsilon should warn about Xms/Xmx/AlwaysPreTouch
 configuration
Message-ID: <dac4aee1-a4df-5ae1-ea95-57c4913c22b5@redhat.com>

RFE:
  https://bugs.openjdk.java.net/browse/JDK-8232051

This is arguably a UX bug: users expect low latency, but may not be aware that additional
configuration is needed for GCs to perform well in those conditions. Epsilon already enables LSM,
and should warn about Xms/Xmx/AlwaysPreTouch config too. It cannot adjust these settings, though,
because it would affect startup time -- users would have to opt-in.

Fix:
  https://cr.openjdk.java.net/~shade/8232051/webrev.01/

Testing: Linux x86_64 {fastdebug, release} gc/epsilon; jdk-submit (running)

-- 
Thanks,
-Aleksey


From thomas.schatzl at oracle.com  Wed Oct  9 20:32:58 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Wed, 09 Oct 2019 22:32:58 +0200
Subject: G1 patch of elastic Java heap
In-Reply-To: <b691e0bd-dd9a-32e2-c950-9c84de29101c@oracle.com>
References: <6270ce59-4a8e-431e-9ccf-f6d2c0f927eb.maoliang.ml@alibaba-inc.com>
 <d82e704831f0afbc61f8a3fb6b69bb1463b7ede8.camel@oracle.com>
 <e4ffd4d9-3ec0-4592-ac8c-d5a77c6b2e75.maoliang.ml@alibaba-inc.com>
 <1267a5dd2cf6cc1d03df64d07a06ba0f45195951.camel@oracle.com>
 <3140197d-8cab-4a86-af92-58431c74cb6b.maoliang.ml@alibaba-inc.com>
 <a201e27d-d231-4787-8bba-55f5266206d1.maoliang.ml@alibaba-inc.com>
 <b691e0bd-dd9a-32e2-c950-9c84de29101c@oracle.com>
Message-ID: <632f30814d5028cbc957d3ba2c04537aaa21bd41.camel@oracle.com>

Hi,

On Wed, 2019-10-09 at 16:10 +0200, Thomas Schatzl wrote:
> Hi,
> 
>    sorry for the late reply.
> 
> First, I have a more general question: lots of changes deal with 
> providing options to separately change properties of the generations
> at runtime. Like if there were separate pools of young and old gen
> memory.
> 
> G1 is kind of built upon the idea that you pass a pause time goal
> and it then modifies generation sizes and takes memory for the
> generations from a single memory pool as needed.
> 
> To me this indicates that automatic sizing is not working correctly,
> but there are many(?) use cases where it does not work as expected.
> This requires manual tuning in generation sizes for whatever reason.
> 
> Can you share your thoughts about this? There seems to be some bit
> of information missing to me - this is probably the reason for some
> of the dumb questions about the flags, and me being not too fond of
> them.
> 
> On 26.09.19 08:49, Liang Mao wrote:
> > 
> > Hi All,
> > 
> > Here is the user guide of G1ElasticHeap patch. Hope it will help
> > to understand.
> > 
> > G1ElasticHeap
> > G1ElasticHeap is a GC feature to return memory of Java heap to OS
> > to reduce the memory footprint of Java process. To enable this
> > feature, you need to use G1 GC by options:
> > -XX:+UseG1GC -XX:+G1ElasticHeap.
> > 
> > ## Usage
> > There are 3 modes which can be enabled in G1ElasticHeap.
> > ### 1. Periodic uncommit
> > Memory will be uncommitted by periodic GC. To enable periodic
> > uncommit, use option -XX:+ElasticHeapPeriodicUncommit or
> > dynamically enable the option via jinfo:
> > 
> > `jinfo -flag +ElasticHeapPeriodicUncommit PID`
> 
> As far as I can tell, this setting periodically scans the heap for
> (too many?) uncommitted regions and, well, uncommits them.
> 
> Not completely sure if that is better than doing periodic gcs - as we
> do not expect to gain memory outside of a GC; in JDK12+ (I think) G1
> always uncommits at the remark pause which should give most of the
> benefits.
> 
> There *may* be reason to also try to uncommit after the last mixed
> GC, but not sure if uncommit is that urgent - to some degree the
> existing JEP 346: Promptly return unused committed memory from G1 
> (https://openjdk.java.net/jeps/346) should cover some of the use
> cases. 
> I.e. after some delay (and inactivity) there will be another Remark 
> pause anyway.
> 
> The main reason why Remark has been chosen to uncommit memory is
> because we assume that the heap size at Remark (this is what adaptive
> IHOP shoots for) is the "target heap size".
> 
> 
> > Related options:
> > 
> > >  ElasticHeapPeriodicYGCIntervalMillis, 15000 \
> > 
> > (target young GC interval 15 seconds in default) \
> > (eg, if Java runs with MaxNewSize=4g, young GC every 30 seconds,
> > G1ElasticHeap will keep 15s GC interval and make a max 2g young
> > generation to uncommit 2g memory)
> > 
> > >  ElasticHeapPeriodicInitialMarkIntervalMillis, 3600000 \
> > 
> > (Target initial mark interval, 1 hour in default. Unused memory of 
> > old generation will be uncommitted
> >   after last mixed GC.)
> 
> This seems to implement an unconditional concurrent cycle like with
> the CMSTriggerInterval flag for CMS.
> 
> Maybe there is a more clever alternative on triggering concurrent
> cycles like ZGC does based on the ratio between time spent by the
> mutator and the gc.
> 
> > 
> > >  ElasticHeapPeriodicUncommitStartupDelay, 300 \
> > 
> > (Delay after startup to do memory uncommit, 300 seconds in default)
> > 
> > >  ElasticHeapPeriodicMinYoungCommitPercent, 50 \
> > 
> > (Percentage of young generation to keep, default 50% of the young
> > generation will not be uncommitted)
> 
> See above about separating young/old.
> 
> > 
> > ### 2. Generation limit
> > To limit the young/old generation separately. Use jcmd or MXBean to
> >  enable.
> 
> I do not understand the reason for those, see above.
> 
> [...]
> > 
> > ### 3. Softmx mode
> > Dynamically to limit the heap as a percentage of origin Xmx.
> > 
> > Use jcmd:
> > 
> > `jcmd PID ElasticHeap softmx_percent=60`
> > 
> > Use MXBean:
> > 
> > `elasticHeapMXBean.setSoftmxPercent(70);`
> 
> That one sounds good, and actually there is a flag SoftMaxHeapSize 
> already in the VM. Only ZGC implements it though.
> 
> I think this idea matches the specifications in 
> https://bugs.openjdk.java.net/browse/JDK-8222145 (i.e. as far as I
> can tell, the softmxpercent is a "soft"/target heap size), so I think
> this could be implemented under the SoftMaxHeapSize flag.
> 
> SoftMaxHeapSize is already manageable too, so could be modified
> already. Only the implementation is missing in G1 :)
> 
> > 
> > ### Other G1ElasticHeap advanced options:
> > >  ElasticHeapMinYoungCommitPercent, 10 \
> > 
> >   (Minimum percentage of young generation)
> > 
> > >  ElasticHeapYGCIntervalMinMillis, 5000 \
> > 
> >   (Minimum young GC interval)
> > 
> > >  ElasticHeapInitialMarkIntervalMinMillis, 60000 \
> > 
> > (Minimum initial mark interval)
> > 
> > >  ElasticHeapEagerMixedGCIntervalMillis, 15000 \
> > 
> > (Guaranteed mixed GC interval, to make sure the mixed will happen
> > in time to uncommit memory after last mixed GC)
> 
> These options seem to be mostly useful for when the allocation rate
> of the mutator is not high enough to advance the collection cycle.
> 
> Would the feature provide the requested feature? Maybe it needs
> some minor improvement, but to me it seems very burdensome to specify
> so many options...
> 

The first sentence got mangled somewhere: A guaranteed concurrent cycle
and/or the existing "Promptly return unused memory" feature would imho
implicitly provide "guaranteed" advancement in the garbage collection
cycle.

Starting a particular kind of collection seems to be almost only useful
for debugging; also while in jdk11+ triggering a mixed gc is still
possible at any time, it may not yield the expected benefit as G1 does
not maintain remembered sets all the time - i.e. most of the time there
are no old regions with remembered sets around.

Maybe the "Promptly return unused memory" feature could be adapted a
bit in cases when there is "some but still not significant" activity to
not trigger at all to cover such cases.

Thanks,
  Thomas


From per.liden at oracle.com  Wed Oct  9 21:04:57 2019
From: per.liden at oracle.com (Per Liden)
Date: Wed, 9 Oct 2019 23:04:57 +0200
Subject: RFR: 8232070: ZGC: Remove unused ZVerifyLoadBarriers
Message-ID: <1df387c3-bae2-45b3-7930-1baf56dea03c@oracle.com>

After JDK-8230565, we left the develop flag ZVerifyLoadBarriers around, 
which is no longer used and can be removed.

Bug: https://bugs.openjdk.java.net/browse/JDK-8232070
Webrev: http://cr.openjdk.java.net/~pliden/8232070/webrev.0

/Per


From kim.barrett at oracle.com  Wed Oct  9 21:23:16 2019
From: kim.barrett at oracle.com (Kim Barrett)
Date: Wed, 9 Oct 2019 17:23:16 -0400
Subject: RFR (L): 8230706: Waiting on completion of strong nmethod
 processing causes long pause times with G1
In-Reply-To: <DE7A950D-A877-4093-AFE8-363E6E079A28@oracle.com>
References: <f19bc69a-9bde-c063-2674-9564721ceede@oracle.com>
 <0F637570-EC97-47C5-B493-B33681133149@oracle.com>
 <5c6b06b1-de44-3cb7-7fc8-0b641df5f353@oracle.com>
 <DE7A950D-A877-4093-AFE8-363E6E079A28@oracle.com>
Message-ID: <80DA3FD5-C2FA-44BF-83C5-AE0EA6AA3684@oracle.com>

> On Oct 8, 2019, at 7:48 PM, Kim Barrett <kim.barrett at oracle.com> wrote:
> src/hotspot/share/gc/g1/g1CollectedHeap.cpp
> 3874   if (collector_state()->in_initial_mark_gc()) {
> 3875     remark_strong_nmethods(per_thread_states);
> 3876   }
> 
> I think this additional task and the associated pending strong nmethod
> sets in the pss can be eliminated by using a 2-bit tag and a more
> complex state machine earlier.

I thought about this some more and have some improvements to the
previous pseudo-code, including eliminating the loop in
strong_processor.  More careful consideration of the possible states
showed them to be more limited than I'd previously thought they were.
I hadn't noticed the benefit from delaying weak_processor's push onto
the global list and combining it with the transition to the "weak
done" state.

States, encoded in the link member of nmethod N:
- unclaimed: NULL
- weak: N, tag 00
- weak done: NEXT, tag 01
- weak, need strong: N, tag 10
- strong: NEXT, tag 11

where NEXT is the next nmethod in the global list, or N if it is the
last entry, e.g. self-loop indicates end of list.

weak_processor(n):
    if n->link != NULL:
        # already claimed; nothing to do here.
        return
    elif not replace_if_null(tagged(n, 0), &n->link):
        # just claimed by another thread; nothing to do here.
        return
    # successfully claimed for weak processing.
    assert n->link == tagged(n, 0)
    do_weak_processing(n)
    # push onto global list.  self-loop end of list to avoid tagged NULL.
    # not pushing onto global list until ready to mark weak processing
    # done significantly simplifies the set of states.
    next = xchg(n, &_list_head) 
    if next == NULL: next = n 
    # try to install end of list + weak done tag.
    if cmpxchg(tagged(next, 1), &n->link, tagged(n, 0)) == tagged(n, 0):
        return
    # failed, which means some other thread added strong request.
    assert n->link == tagged(n, 2)
    # do deferred strong processing.
    n->link = tagged(next, 3)
    do_strong_processing(n)

strong_processor(n):
    raw_next = cmpxchg(tagged(n, 3), &n->link, NULL)
    if raw_next == NULL:
        # successfully claimed for strong processing.
        do_strong_processing(n)
        # push onto global list.  self-loop end of list to avoid tagged NULL.
        next = xchg(n, &_list_head)
        if next == NULL: next = n
        n->link = tagged(next, 3)
        return
    # claim failed.  figure out why and handle it.
    next = strip_tag(raw_next)
    if raw_next == next:          # (raw_next - next) == 0
        # claim failed because being weak processed (state == "weak").
        # try to request deferred strong processing.
        assert next == tagged(n, 0)
        raw_next = cmpxchg(tagged(n, 2), &n->link, next)
        if (raw_next == next):
            # successfully requested deferred strong processing.
            return
        # failed because of a concurrent transition.
        # no longer in "weak" state.
        next = strip_tag(raw_next)
    if (raw_next - next) >= 2:
        # already claimed for strong processing or requested for such.
        return
    # weak processing is complete.
    # raw_next: tag == 1, next == next list entry or n
    if cmpxchg(tagged(next, 3), &n->link, raw_next) == raw_next:
        # claimed "weak done" to "strong".
        do_strong_processing(n)
    # if claim failed then some other thread got it.
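
The tagged() and strip_tag() helpers used in the pseudo-code above are not spelled out. A minimal sketch of one possible encoding follows, assuming the link target is at least 4-byte aligned so that the low two bits of the pointer are free. Node is a hypothetical stand-in for nmethod, and tag_of() is an extra helper added for illustration; none of this is HotSpot code.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for nmethod; only the link field matters here.
struct Node {
  Node* link = nullptr;
};

// The encoding relies on the low two bits of a Node* always being zero.
static_assert(alignof(Node) >= 4, "need two free low bits for the tag");

constexpr std::uintptr_t kTagMask = 3;

// Combine an untagged pointer with a 2-bit tag (0..3).
inline Node* tagged(Node* p, std::uintptr_t tag) {
  assert(tag <= kTagMask);
  assert((reinterpret_cast<std::uintptr_t>(p) & kTagMask) == 0);
  return reinterpret_cast<Node*>(reinterpret_cast<std::uintptr_t>(p) | tag);
}

// Recover the plain pointer by clearing the tag bits.
inline Node* strip_tag(Node* p) {
  return reinterpret_cast<Node*>(reinterpret_cast<std::uintptr_t>(p) & ~kTagMask);
}

// Extract the 2-bit tag; equivalent to (raw_next - next) in the pseudo-code.
inline std::uintptr_t tag_of(Node* p) {
  return reinterpret_cast<std::uintptr_t>(p) & kTagMask;
}
```

With this encoding tagged(n, 0) is bit-identical to n, which is why the "weak" state can be detected by comparing raw_next with strip_tag(raw_next), and why a self-loop rather than a tagged NULL is used to mark the end of the list.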


From stefan.johansson at oracle.com  Wed Oct  9 21:40:37 2019
From: stefan.johansson at oracle.com (Stefan Johansson)
Date: Wed, 9 Oct 2019 23:40:37 +0200
Subject: RFR(XL): 8220310: Implementation: NUMA-Aware Memory Allocation
 for G1, Mutator (1/3)
In-Reply-To: <d8758f38-4818-e6b7-c158-118974f0ff1c@oracle.com>
References: <e4c60a5c-cb08-004a-ce77-e4d20d4d6891@oracle.com>
 <06ACBF87-ADBE-499F-B668-0274E4925B26@oracle.com>
 <2b37edd6-3e0f-013d-1616-9d003f8ac1ed@oracle.com>
 <74ACAF31-8233-482A-892E-0D2E7CA72F4F@oracle.com>
 <4afe9f43-4cfa-9384-f45f-f985399629dd@oracle.com>
 <CD6BDB8C-7777-4872-BCC0-CDFB4978F876@oracle.com>
 <d8758f38-4818-e6b7-c158-118974f0ff1c@oracle.com>
Message-ID: <93592401-FC69-4B7F-95BE-DE9A0F070F3A@oracle.com>

Hi Sangheon,

Thanks again for a much improved version. Some comments below.

> On 9 Oct 2019, at 06:27, sangheon.kim at oracle.com wrote:
> 
> ...
> 
> Here's the major change list at the webrev. Or arguable list :)
> 1) Verification at HRM::allocate_free_region() is removed and it will be added somewhere at safepoint by JDK-8220312 (3/3 which is part of this JEP). Probably at the end of young gc?
> 2) Node id printing is changed. Removed old one and added at HeapRegion::print_on() with new column. Node id is only printed when UseNUMA is enabled and gc+heap+region=trace. If there's single active node, it will print the node id and this is intentional. Another approach would be printing only if there are multiple nodes.
> 3) If AlwaysPreTouch is enabled, HeapRegion will have actual node index instead of preferred node index.
> 4) HeapRegion::_node_index is set at HRM::make_regions_available() as there is the only place initializing HeapRegion. Another approach would be setting the index at HeapRegion::initialize(we have to pollute HR with G1MNM stuff) or conditionally(*) setting the index at HeapRegion::node_index(). (*) if the index is unknown etc..
> 5) G1NUMA class is merged into G1MemoryNodeManager.

I saw your comment above about suggestions around this area and I can try out one thought I had, something I think Thomas mentioned as well: making the non-NUMA case work exactly like the NUMA case with one node. I'll need some more time for that, but below are my comments on the current patch.

> 
> Webrev:
> http://cr.openjdk.java.net/~sangheki/8220310/webrev.3
> http://cr.openjdk.java.net/~sangheki/8220310/webrev.3.inc

src/hotspot/os/linux/os_linux.cpp
...
3026     warning("Failed to get numa id at " PTR_FORMAT " with errno=%d", p2i((void*)address), errno);

The cast here is no longer needed.
...

src/hotspot/share/gc/g1/g1Allocator.hpp
...
 44   G1MemoryNodeManager* _mnm;

I would prefer a more descriptive name like _memory_node_manager.
...

src/hotspot/share/gc/g1/g1CollectedHeap.hpp
...
 196   // Manages single or multi node memory.
 197   G1MemoryNodeManager* _mem_node_mgr;
 ...
 558   G1MemoryNodeManager* mem_node_mgr() const { return _mem_node_mgr; }

As above, I would prefer spelling out the names to memory_node_manager().
...

src/hotspot/share/gc/g1/g1_globals.hpp
...
Last line still removed a '\', please revert this change.
...

src/hotspot/share/gc/g1/heapRegion.cpp
...
 462   if (UseNUMA) {
 463     const int* node_ids = G1MemoryNodeManager::mgr()->node_ids();
 464     st->print("|Node ID %02d", node_ids[this->node_index()]);
 465   }
 466   st->print_cr("");

I would prefer having a function that returns the node id given the index. Like the inverse of index_of_node_id(). 

I also think it would be more informative to say "NUMA id" or "NUMA node".
...
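
The suggested accessor could look roughly like the following sketch. NodeIdTable is a hypothetical, simplified stand-in for G1MemoryNodeManager; node_id_of_index() is an assumed name, and the signature and the -1 "not found" result of index_of_node_id() are assumptions as well, since the real declarations are not shown in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-in for G1MemoryNodeManager.
class NodeIdTable {
  std::vector<int> _node_ids;  // position = node index, value = OS NUMA node id

public:
  explicit NodeIdTable(std::vector<int> node_ids)
      : _node_ids(std::move(node_ids)) {}

  // Node id -> node index (assumed shape; returns -1 if the id is unknown).
  int index_of_node_id(int node_id) const {
    for (std::size_t i = 0; i < _node_ids.size(); i++) {
      if (_node_ids[i] == node_id) return static_cast<int>(i);
    }
    return -1;
  }

  // Suggested inverse: node index -> node id.
  int node_id_of_index(std::size_t node_index) const {
    assert(node_index < _node_ids.size());
    return _node_ids[node_index];
  }
};
```

The printing code would then ask the manager for the id of this->node_index() instead of indexing into the raw node_ids() array itself.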

src/hotspot/share/gc/g1/heapRegionManager.cpp
...
 195 // Set node index of the given HeapRegion.
 196 // If AlwaysPreTouch is enabled, set with actual node index.
 197 // If it is disabled, set with preferred node index which is already decided.
 198 static void set_heapregion_node_index(HeapRegion* hr) {
 199   uint node_index;
 200   if(AlwaysPreTouch) {
 201     // If we already pretouched, we can check actual node index here.
 202     node_index = G1MemoryNodeManager::mgr()->index_of_address(hr->bottom());
 203   } else {
 204     node_index = G1MemoryNodeManager::mgr()->preferred_node_index_for_index(hr->hrm_index());
 205   }
 206   hr->set_node_index(node_index);
 207 }

I would prefer to have a helper for calculating the index to set, not a helper for setting the index. If you agree, you could move this logic to G1MemoryNodeManager::index_for_region() and then you can change:
 233     // Set node index of the heap region after initialization but before inserting
 234     // to free list.
 235     set_heapregion_node_index(hr);

To just:
 235     hr->set_node_index(G1MemoryNodeManager::mgr()->index_for_region(hr));
...
 309  bool HeapRegionManager::is_on_preferred_index(uint region_index, uint preferred_node_index) {
 310    uint region_node_index = G1MemoryNodeManager::mgr()->preferred_node_index_for_index(region_index);
 311   return region_node_index == preferred_node_index;
 312  }

Indentation on row 311.
...

src/hotspot/share/gc/g1/g1MemoryNodeManager.hpp
...
 44   static G1MemoryNodeManager* mgr() { return _inst; }

I think we should change the name of this getter to manager(), to avoid unnecessary shortenings. 
...
 57   virtual bool has_multi_nodes() const { return false; }

Same as above, I would prefer has_multiple_nodes().
...

Thanks,
Stefan

> Testing: hs-tier 1~5, with/without UseNUMA
> 
> Thanks,
> Sangheon


From sangheon.kim at oracle.com  Wed Oct  9 21:41:33 2019
From: sangheon.kim at oracle.com (sangheon.kim at oracle.com)
Date: Wed, 9 Oct 2019 14:41:33 -0700
Subject: RFR(S): 8215893: Add better abstraction for pinning G1 concurrent
 marking bitmaps.
In-Reply-To: <F89640DCD01A85489FCBA68183A6A0F3CB56A055@ORSMSX116.amr.corp.intel.com>
References: <F89640DCD01A85489FCBA68183A6A0F3CB569D68@ORSMSX116.amr.corp.intel.com>
 <F89640DCD01A85489FCBA68183A6A0F3CB56A055@ORSMSX116.amr.corp.intel.com>
Message-ID: <b6b879d9-fc88-c719-d939-6d64070ae13f@oracle.com>

Hi Kishor,

On 10/4/19 4:15 PM, Kharbas, Kishor wrote:
>
> Hi Stefan,
>
> Thanks for the review. Some comments inline.
>
> New webrev : http://cr.openjdk.java.net/~kkharbas/8215893/webrev.00_to_01/
>
> http://cr.openjdk.java.net/~kkharbas/8215893/webrev.01/ 
> <http://cr.openjdk.java.net/~kkharbas/8215893/webrev.01/>
>
I am reviewing the patch but have a question on top of Stefan's question[1].
Why are the bitmap mappers committed? I think all the trouble started from 
'committing but treating as special' here. Couldn't we just treat the bitmap 
mappers as 'special' without committing?
If 'not committing' is doable, couldn't we simply create the ReservedSpace with 
'special' enabled (independent of the large page setting, which is the same as 
Stefan's comment)? Or add a PinnedReservedSpace to force 'special' enabled.

[1]: Another thing, can you remind me why we need the bitmaps to be 
pinned but not other structures such as the card table?

+HeterogeneousHeapRegionManager::initialize() ...

+ // We commit bitmap for all regions during initialization and mark the 
bitmap space as special.
+ // This allows regions to be un-committed while concurrent-marking 
threads are accesing the bitmap concurrently.

Thanks,
Sangheon


> > Hi Kishor,
>
> >
>
> > On 04.10.19 03:00, Kharbas, Kishor wrote:
>
> >> Hi,
>
> >> When I worked on 
> JDK-8211425<https://bugs.openjdk.java.net/browse/JDK-8211425>, there 
> was a request for better abstraction for pinning G1's CM bitmaps. RFE 
> for the request is here - 
> JDK-8215893<https://bugs.openjdk.java.net/browse/JDK-8215893>.
>
> >>
>
> >> Here is a proposal : 
> http://cr.openjdk.java.net/~kkharbas/8215893/webrev.00/
>
> >>
>
> >> Here G1PageBasedVirtualSpace pins the entire reserved memory to 
> memory during construction. The constructor takes an additional bool 
> flag which says "does it need to pin the memory".
>
> >> If the memory is pinned, '_special' flag is set to true. I piggy 
> back on _special flag's behavior which is to not do actual OS 
> (un-)commits on calls to (un)commit().
>
> >> Rest of the changes is the mechanism to pass this flag from CM 
> bitmaps creation in G1CollectedHeap all the way to 
> G1PageBasedVirtualSpace.
>
> >>
>
> >> Let me know if this is a good abstraction and if there is any 
> better way.
>
> >>
>
> >> Thanks
>
> >> Kishor
>
> >>
>
> >
>
> > Some comments:
>
> >
>
> > - in the parameter lists, if the parameters are already laid out
>
> > line-by-line, if adding a new one, please put it on a new line as well.
>
> >
>
> Fixed in the new webrev.
>
> > - this code
>
> >
>
> >    if (_special) {
>
> >      if (!rs.special()) {
>
> > commit_internal(addr_to_page_index(_low_boundary),
>
> > addr_to_page_index(_high_boundary));
>
> >      }
>
> >
>
> > in g1PageBasedVirtualSpace looks very incomprehensible. :)
>
> >
>
> > I would prefer (pending the second reviewer's comment) to either use 
> the
>
> > "pinned" flag here, or even better, move the necessary commit calls 
> into
>
> > the (now removed) HeterogeneousHeapRegionManager::initialize().
>
> >
>
> Made it a little more comprehensible. Will see what other reviewers 
> think about moving it somewhere else.
>
> > - I would just purely from feeling prefer if the "pinned" flag 
> parameter
>
> > would be listed after the "type" parameter in the 
> G1RegionToSpaceMapper.
>
> > But that's probably just me.
>
> >
>
> I did it this way to logically group the parameters. MemTracker is a 
> tracker used by the VM everywhere and does not pertain to this class 
> as such, so I kept it in the end.
>
> > Also, finally one parameter per line for the declaration/definition of
>
> > the constructor would improve readability.
>
> >
>
> Done.
>
> Thank you,
>
> Kishor
>
> > Thanks,
>
> >    Thomas
>


From per.liden at oracle.com  Thu Oct 10 08:55:45 2019
From: per.liden at oracle.com (Per Liden)
Date: Thu, 10 Oct 2019 10:55:45 +0200
Subject: RFR: 8231943: ZGC: Enable serviceability/dcmd/gc/RunGCTest
In-Reply-To: <ebe51564-a8c5-1255-173f-e2038d5bd602@oracle.com>
References: <ebe51564-a8c5-1255-173f-e2038d5bd602@oracle.com>
Message-ID: <0a2eee49-9bb4-a1be-f8fc-b2efcc01fd59@oracle.com>

(CC:ing serviceability-dev)

On 10/7/19 2:38 PM, Per Liden wrote:
> This test is currently disabled for ZGC, but it can easily be enabled by 
> adjusting the expected log string. ZGC doesn't print "Pause Full", but 
> it still prints the "(Diagnostic Command)" part.
> 
> Also, the test enables gc=debug logging, which is unnecessary since this 
> is always printed on the gc=info level.
> 
> Bug: https://bugs.openjdk.java.net/browse/JDK-8231943
> Webrev: http://cr.openjdk.java.net/~pliden/8231943/webrev.0
> 
> Testing: Manually ran test with all GCs (except Epsilon)
> 
> /Per


From per.liden at oracle.com  Thu Oct 10 09:04:13 2019
From: per.liden at oracle.com (Per Liden)
Date: Thu, 10 Oct 2019 11:04:13 +0200
Subject: RFR: 8231996: ZGC: Replace ZStatisticsForceTrace with check if JFR
 event is enabled
Message-ID: <b3f50d5d-002a-02c2-39d5-d07d4e9a0c27@oracle.com>

Remove and replace the diagnostic flag ZStatisticsForceTrace with a 
check if JFR event is enabled. This flag was introduced as a safety 
measure back when sending JFR events was problematic in some contexts. 
This is no longer the case, so we can just let the 
default.jfc/profile.jfc control when those events should be sent.

Bug: https://bugs.openjdk.java.net/browse/JDK-8231996
Webrev: http://cr.openjdk.java.net/~pliden/8231996/webrev.0

/Per


From thomas.schatzl at oracle.com  Thu Oct 10 09:29:45 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Thu, 10 Oct 2019 11:29:45 +0200
Subject: RFR(S): 8215893: Add better abstraction for pinning G1 concurrent
 marking bitmaps.
In-Reply-To: <b6b879d9-fc88-c719-d939-6d64070ae13f@oracle.com>
References: <F89640DCD01A85489FCBA68183A6A0F3CB569D68@ORSMSX116.amr.corp.intel.com>
 <F89640DCD01A85489FCBA68183A6A0F3CB56A055@ORSMSX116.amr.corp.intel.com>
 <b6b879d9-fc88-c719-d939-6d64070ae13f@oracle.com>
Message-ID: <ace26be7-cbe7-5f26-c4a4-5d9777a5201c@oracle.com>

Hi,

On 09.10.19 23:41, sangheon.kim at oracle.com wrote:
> Hi Kishor,
> 
> On 10/4/19 4:15 PM, Kharbas, Kishor wrote:
>>
>> Hi Stefan,
>>
>> Thanks for the review. Some comments inline.
>>
>> New webrev : 
>> http://cr.openjdk.java.net/~kkharbas/8215893/webrev.00_to_01/
>>
>> http://cr.openjdk.java.net/~kkharbas/8215893/webrev.01/ 
>> <http://cr.openjdk.java.net/~kkharbas/8215893/webrev.01/>
>>
> I am reviewing the patch but have a question on top of Stefan's 
> question[1].
> Why are the bitmap mappers committed? I think all the trouble started from 
> 'committing but treating as special' here. Couldn't we just treat the bitmap 
> mappers as 'special' without committing?
> If 'not committing' is doable, couldn't we simply create the ReservedSpace with 
> 'special' enabled (independent of the large page setting, which is the same as 
> Stefan's comment)? Or add a PinnedReservedSpace to force 'special' enabled.
> 
> [1]: Another thing, can you remind me why we need the bitmaps to be 
> pinned but not other structures such as the card table?
> 
> +HeterogeneousHeapRegionManager::initialize() ...
> 
> + // We commit bitmap for all regions during initialization and mark the 
> bitmap space as special.
> + // This allows regions to be un-committed while concurrent-marking 
> threads are accesing the bitmap concurrently.

   what is the situation where G1 would uncommit parts of the heap while 
concurrent marking is running? Stale entries in the mark task queues?

Regular G1 limits uncommitting of regions (and associated data 
structures) to after concurrent marking.

Note that if never releasing mark bitmaps is really necessary, then 
never releasing card/offset table is probably required as well.

Thanks,
   Thomas


From thomas.schatzl at oracle.com  Thu Oct 10 09:48:24 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Thu, 10 Oct 2019 11:48:24 +0200
Subject: RFR(XL): 8220310: Implementation: NUMA-Aware Memory Allocation
 for G1, Mutator (1/3)
In-Reply-To: <93592401-FC69-4B7F-95BE-DE9A0F070F3A@oracle.com>
References: <e4c60a5c-cb08-004a-ce77-e4d20d4d6891@oracle.com>
 <06ACBF87-ADBE-499F-B668-0274E4925B26@oracle.com>
 <2b37edd6-3e0f-013d-1616-9d003f8ac1ed@oracle.com>
 <74ACAF31-8233-482A-892E-0D2E7CA72F4F@oracle.com>
 <4afe9f43-4cfa-9384-f45f-f985399629dd@oracle.com>
 <CD6BDB8C-7777-4872-BCC0-CDFB4978F876@oracle.com>
 <d8758f38-4818-e6b7-c158-118974f0ff1c@oracle.com>
 <93592401-FC69-4B7F-95BE-DE9A0F070F3A@oracle.com>
Message-ID: <ad57d17b-ae69-1319-25f9-32e48b40cbe1@oracle.com>

Hi,

On 09.10.19 23:40, Stefan Johansson wrote:
> Hi Sangheon,
> 
> Thanks again for a much improved version. Some comments below.

   agree, it looks quite nice now.

[...]
> 
>> Webrev:
>> http://cr.openjdk.java.net/~sangheki/8220310/webrev.3
>> http://cr.openjdk.java.net/~sangheki/8220310/webrev.3.inc
[...]
> 
> src/hotspot/share/gc/g1/heapRegion.cpp
> ...
>   462   if (UseNUMA) {
>   463     const int* node_ids = G1MemoryNodeManager::mgr()->node_ids();
>   464     st->print("|Node ID %02d", node_ids[this->node_index()]);
>   465   }
>   466   st->print_cr("");
> 
> I would prefer having a function that returns the node id given the index. Like the inverse of index_of_node_id().
> 
> I also think it would be more informative to say "NUMA id" or "NUMA node".

I would also remove the "Node ID" string here as it does not convey any 
information. Most other columns also do not carry their description.

Thanks,
   Thomas


From shade at redhat.com  Thu Oct 10 10:53:56 2019
From: shade at redhat.com (Aleksey Shipilev)
Date: Thu, 10 Oct 2019 12:53:56 +0200
Subject: RFR (S) 8231947: Shenandoah: cleanup ShenandoahHumongousMoves flag
 treatment
Message-ID: <b124fbb0-6cf2-c3fc-9aff-9ab5d08ee375@redhat.com>

RFE:
  https://bugs.openjdk.java.net/browse/JDK-8231947

Fix:
  https://cr.openjdk.java.net/~shade/8231947/webrev.02/

This was enabled for a while now. Flag is changed to diagnostic, comment updated, the accessors
renamed to make more sense.

Testing: hotspot_gc_shenandoah, new test

-- 
Thanks,
-Aleksey


From leihouyju at gmail.com  Thu Oct 10 11:06:15 2019
From: leihouyju at gmail.com (Haoyu Li)
Date: Thu, 10 Oct 2019 19:06:15 +0800
Subject: [PATCH] Exploit Empty Regions in Young Gen to Enhance PS Full GC
 Performance
In-Reply-To: <fb385d33-c420-4c81-320c-9aa1dad64a44@oracle.com>
References: <CAKSDcxsPcvigLaDTRyJALk9O0r-JsYV1CcwFzb3KaM98+LVLcg@mail.gmail.com>
 <B723074C-94DF-450D-9715-497736E9CD27@oracle.com>
 <CAKSDcxvTzfq1eR0DXi8iLQ4bNP8LcjsV5Px59-EMOgR0zaHvtQ@mail.gmail.com>
 <4F02DD53-EA98-4A1A-B871-C6E9D9610B2C@oracle.com>
 <9B69AFD1-7AE2-4B50-BFCF-C9C6E2594240@oracle.com>
 <CAKSDcxs131XdbHHErgbH1UDYX6_+=CSa-4dt4n5LgbyPVgjO_w@mail.gmail.com>
 <E4274448-26B2-46B8-883C-70FC7AFFB23B@oracle.com>
 <CAKSDcxvHwOWtAEm8TPPvO=C8q9LRwJRUQX6nbh-HSZXBKKwvYg@mail.gmail.com>
 <fb385d33-c420-4c81-320c-9aa1dad64a44@oracle.com>
Message-ID: <CAKSDcxv5eccaV54NThgviLK+84U-Z62U0CYr3s+8ncNR1cna5w@mail.gmail.com>

Hi Stefan,

Thanks for your testing! One possible reason for the regressions in simple
tests is that the region dependencies may not be heavy enough. Because the
locality of shadow regions is lower than that of heap regions, writing to
shadow regions will be slower than to normal regions, and this is a part of
the reason why I reuse shadow regions. Therefore, if only a few shadow
regions are created and not reused, the overhead may not be amortized.

As to the OCA, it is the case that I'm the only person signing the
agreement. Please let me know if you have any further questions. Thanks
again!

Best Regards,
Haoyu Li

Stefan Johansson <stefan.johansson at oracle.com> wrote on 2019-10-08, 6:49:

> Hi Haoyu,
>
> I've done some more testing and I haven't seen any issues with the patch
> so far and the performance looks promising in most cases. For simple
> tests I've seen some regressions, but I'm not really sure why. Will do
> some more digging.
>
> To move forward with this the first thing we need to do is making sure
> that you being covered by the Oracle Contributor Agreement is enough.
>  From what we can see it is only you as an individual that has signed
> the OCA and in that case it is important that this statement from the
> OCA is fulfilled: "no other person or entity, including my employer, has
> or will have rights with respect my contributions"
>
> Is this the case for this contribution or should we have the university
> sign the OCA as well? For more information regarding the OCA please
> refer to:
> https://www.oracle.com/technetwork/oca-faq-405384.pdf
>
> Thanks,
> Stefan
>
> On 2019-09-16 16:02, Haoyu Li wrote:
> > FYI, the evaluation results on OpenJDK 14 are plotted in the attachment.
> > I compute the full GC throughput by dividing the heap size before full
> > GC by the GC pause time, and the results are arithmetic mean values of
> > ten runs after a warm-up run. The evaluation is conducted on a machine
> > with dual Intel Xeon E5-2618L v3 CPUs (2 sockets, 16 physical cores
> > with SMT enabled) and 64G DRAM.
> >
> > Best Regards,
> > Haoyu Li,
> > Institute of Parallel and Distributed Systems(IPADS),
> > School of Software,
> > Shanghai Jiao Tong University
> >
> >
> > Stefan Johansson <stefan.johansson at oracle.com> wrote on 2019-09-12 at 5:34:
> >
> >     Hi Haoyu,
> >
> >     I recently came across your patch and I would like to pick up on
> >     some of the things Kim mentioned in his mails. I especially want to
> >     evaluate and investigate if this is a technique we can use to
> >     improve the other GCs as well. To start that work I want to take the
> >     patch for a spin in our internal performance testing. The patch
> >     doesn't apply cleanly to the latest JDK repository, so if you could
> >     provide an updated patch that would be very helpful.
> >
> >     It would also be great if you could share some more information
> >     around the results presented in the paper. For example, it would be
> >     good to get the full command lines for the different benchmarks so
> >     we can run them locally and reproduce the results you've seen.
> >
> >     Thanks,
> >     Stefan
> >
> >>     On 12 March 2019 at 03:21, Haoyu Li <leihouyju at gmail.com> wrote:
> >>
> >>     Hi Kim,
> >>
> >>     Thanks for reviewing and testing the patch. If there are any
> >>     failures or performance degradation relevant to the work, please
> >>     let me know and I'll be very happy to keep improving it. Also, any
> >>     suggestions about code improvements are much appreciated.
> >>
> >>     I'm not sure whether G1 and Shenandoah have a similar
> >>     region dependency issue, since I haven't studied their GC
> >>     behaviors before. If they do, I'm also willing to propose a more
> >>     general optimization.
> >>
> >>     As to the memory overhead, I believe it will be low because this
> >>     patch uses empty regions in the young space rather than
> >>     off-heap memory to allocate shadow regions, and also reuses the
> >>     /_source_region/ field of each /RegionData/ to record the
> >>     corresponding shadow region index. We only introduce a new
> >>     integer field /_shadow/ in the RegionData class to indicate the
> >>     status of a region, a global /GrowableArray _free_shadow/ to store
> >>     the indices of shadow regions, and a global /Monitor/ to protect
> >>     the array. This information might help if the memory overhead
> >>     needs to be evaluated.
> >>
> >>     Looking forward to your insight.
> >>
> >>     Best Regards,
> >>     Haoyu Li,
> >>     Institute of Parallel and Distributed Systems(IPADS),
> >>     School of Software,
> >>     Shanghai Jiao Tong University
> >>
> >>
> >>     Kim Barrett <kim.barrett at oracle.com> wrote on 2019-03-12 at 6:11:
> >>
> >>         > On Mar 11, 2019, at 1:45 AM, Kim Barrett
> >>         <kim.barrett at oracle.com <mailto:kim.barrett at oracle.com>> wrote:
> >>         >
> >>         >> On Jan 24, 2019, at 3:58 AM, Haoyu Li <leihouyju at gmail.com
> >>         <mailto:leihouyju at gmail.com>> wrote:
> >>         >>
> >>         >> Hi Kim,
> >>         >>
> >>         >> I have ported my patch to OpenJDK 13 according to your
> >>         instructions in your last mail, and the patch is attached in
> >>         this mail. The patch does not change much since PSGC is indeed
> >>         pretty stable.
> >>         >>
> >>         >> Also, I evaluated the correctness and performance of PS full
> >>         GC with benchmarks from the DaCapo, SPECjvm2008, and JOlden suites
> >>         on a machine with dual Intel Xeon E5-2618L v3 CPUs (16 physical
> >>         cores), 64G DRAM and Linux kernel 4.17. The evaluation result,
> >>         indicating 1.9X GC throughput improvement on average, is
> >>         attached, too.
> >>         >>
> >>         >> However, I have no idea how to further test this patch for
> >>         both correctness and performance. Can I please get any
> >>         guidance from you or some sponsor?
> >>         >
> >>         > Sorry I missed that you had sent an updated version of the
> >>         patch.
> >>         >
> >>         > I've run the full regression suite across Oracle-supported
> >>         platforms.  There are some
> >>         > failures, but there are almost always some failures in the
> >>         later tiers right now.  I'll start
> >>         > looking at them tomorrow to figure out whether any of them
> >>         are relevant.
> >>         >
> >>         > I'm also planning to run some of our performance benchmarks.
> >>         >
> >>         > I've lightly skimmed the proposed changes.  There might be
> >>         some code improvements
> >>         > to be made.
> >>         >
> >>         > I'm also wondering if this technique applies to other
> >>         collectors.  It seems like both G1 and
> >>         > Shenandoah full gc's might have similar issues?  If so, a
> >>         solution that is ParallelGC-specific
> >>         > is less interesting than one that has broader
> >>         applicability.  Though maybe this optimization
> >>         > is less important for G1 and Shenandoah, since they actively
> >>         try to avoid full gc's.
> >>         >
> >>         > I'm also not clear on how much additional memory might be
> >>         temporarily allocated by this
> >>         > mechanism.
> >>
> >>         I've created a CR for this:
> >>         https://bugs.openjdk.java.net/browse/JDK-8220465
> >>
> >
>


From rkennke at redhat.com  Thu Oct 10 11:16:55 2019
From: rkennke at redhat.com (Roman Kennke)
Date: Thu, 10 Oct 2019 13:16:55 +0200
Subject: RFR (S) 8231947: Shenandoah: cleanup ShenandoahHumongousMoves
 flag treatment
In-Reply-To: <b124fbb0-6cf2-c3fc-9aff-9ab5d08ee375@redhat.com>
References: <b124fbb0-6cf2-c3fc-9aff-9ab5d08ee375@redhat.com>
Message-ID: <cb2ff4a4-d838-df8c-82f2-aa94ca9cb79c@redhat.com>

Yup! Thanks!

Roman


> RFE:
>   https://bugs.openjdk.java.net/browse/JDK-8231947
> 
> Fix:
>   https://cr.openjdk.java.net/~shade/8231947/webrev.02/
> 
> This has been enabled for a while now. The flag is changed to diagnostic, the comment is updated,
> and the accessors are renamed to make more sense.
> 
> Testing: hotspot_gc_shenandoah, new test
> 


From shade at redhat.com  Thu Oct 10 11:32:28 2019
From: shade at redhat.com (Aleksey Shipilev)
Date: Thu, 10 Oct 2019 13:32:28 +0200
Subject: RFR (M) 8232102: Shenandoah: print everything in proper units
Message-ID: <f0bdc629-febf-f359-bf1e-92a70c409884@redhat.com>

RFE:
  https://bugs.openjdk.java.net/browse/JDK-8232102

Shenandoah is used on smaller heaps as well as large ones. There, the default unit of "M" makes the
logs too coarse and lose information. We have already fixed up some of the uses where it is
critical. This issue handles the rest of the cases. This makes more sense after JDK-8217315 is
fixed. Note the GC timings/heap-sizes themselves are handled by shared code, see JDK-8232100.

Fix:
  https://cr.openjdk.java.net/~shade/8232102/webrev.01/

Testing: hotspot_gc_shenandoah {fastdebug, release}, eyeballing gc logs

-- 
Thanks,
-Aleksey


From rkennke at redhat.com  Thu Oct 10 11:43:03 2019
From: rkennke at redhat.com (Roman Kennke)
Date: Thu, 10 Oct 2019 13:43:03 +0200
Subject: RFR (M) 8232102: Shenandoah: print everything in proper units
In-Reply-To: <f0bdc629-febf-f359-bf1e-92a70c409884@redhat.com>
References: <f0bdc629-febf-f359-bf1e-92a70c409884@redhat.com>
Message-ID: <cb3da458-9a94-40b4-4483-9e5541a21c9b@redhat.com>

Looks good! Thanks!

Roman


> RFE:
>   https://bugs.openjdk.java.net/browse/JDK-8232102
> 
> Shenandoah is used on smaller heaps as well as large ones. There, the default unit of "M" makes the
> logs too coarse and lose information. We have already fixed up some of the uses where it is
> critical. This issue handles the rest of the cases. This makes more sense after JDK-8217315 is
> fixed. Note the GC timings/heap-sizes themselves are handled by shared code, see JDK-8232100.
> 
> Fix:
>   https://cr.openjdk.java.net/~shade/8232102/webrev.01/
> 
> Testing: hotspot_gc_shenandoah {fastdebug, release}, eyeballing gc logs
> 


From shade at redhat.com  Thu Oct 10 12:03:00 2019
From: shade at redhat.com (Aleksey Shipilev)
Date: Thu, 10 Oct 2019 14:03:00 +0200
Subject: RFR (S) 8232100: GC timings should use proper units for heap sizes
Message-ID: <ab5224ff-1842-23b1-e8c6-80f97e0dd1de@redhat.com>

RFE:
  https://bugs.openjdk.java.net/browse/JDK-8232100

Webrev:
  https://cr.openjdk.java.net/~shade/8232100/webrev.01/

The GC log prints heap sizes in selected GC events. Currently, it unconditionally uses "M" as the suffix
for heap sizes, which makes GC logs too coarse on smaller heaps. This loses performance data
accuracy, which is sometimes a dealbreaker in log analysis. Let's print them in proper units.
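
To illustrate the loss of accuracy, here is a minimal Python sketch that
compares a fixed "M" suffix with a unit chosen per value. It is not the code
from the webrev; the function names and the threshold of 10 (switch to a
larger unit only once the value is at least 10 of that unit) are assumptions
made for the example.

```python
K = 1024
M = K * K
G = M * K


def proper_unit(size_bytes, threshold=10):
    """Pick the largest unit in which the value is still at least `threshold`."""
    for unit, name in ((G, "G"), (M, "M"), (K, "K")):
        if size_bytes >= threshold * unit:
            return size_bytes // unit, name
    return size_bytes, "B"


def fmt_fixed_m(size_bytes):
    """Format the way a log with a hard-coded "M" suffix would."""
    return f"{size_bytes // M}M"


def fmt_proper(size_bytes):
    """Format with a unit that fits the magnitude of the value."""
    value, name = proper_unit(size_bytes)
    return f"{value}{name}"


# A small heap: 1536K used before the GC, 1100K after it.
before, after = 1536 * K, 1100 * K
print(fmt_fixed_m(before), "->", fmt_fixed_m(after))   # 1M -> 1M
print(fmt_proper(before), "->", fmt_proper(after))     # 1536K -> 1100K
```

With the fixed suffix the collection appears to have freed nothing, while the
per-value unit shows the 436K difference.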

I ran many tests of my own, but would appreciate it if somebody ran this through a more comprehensive
suite of tests, looking for tests that parse the GC logs for whatever reason.

Testing: eyeballing GC logs, jdk-submit, hotspot_gc {g1, shenandoah, parallel}

-- 
Thanks,
-Aleksey


From stefan.johansson at oracle.com  Thu Oct 10 12:37:18 2019
From: stefan.johansson at oracle.com (Stefan Johansson)
Date: Thu, 10 Oct 2019 14:37:18 +0200
Subject: [PATCH] Exploit Empty Regions in Young Gen to Enhance PS Full GC
 Performance
In-Reply-To: <CAKSDcxv5eccaV54NThgviLK+84U-Z62U0CYr3s+8ncNR1cna5w@mail.gmail.com>
References: <CAKSDcxsPcvigLaDTRyJALk9O0r-JsYV1CcwFzb3KaM98+LVLcg@mail.gmail.com>
 <B723074C-94DF-450D-9715-497736E9CD27@oracle.com>
 <CAKSDcxvTzfq1eR0DXi8iLQ4bNP8LcjsV5Px59-EMOgR0zaHvtQ@mail.gmail.com>
 <4F02DD53-EA98-4A1A-B871-C6E9D9610B2C@oracle.com>
 <9B69AFD1-7AE2-4B50-BFCF-C9C6E2594240@oracle.com>
 <CAKSDcxs131XdbHHErgbH1UDYX6_+=CSa-4dt4n5LgbyPVgjO_w@mail.gmail.com>
 <E4274448-26B2-46B8-883C-70FC7AFFB23B@oracle.com>
 <CAKSDcxvHwOWtAEm8TPPvO=C8q9LRwJRUQX6nbh-HSZXBKKwvYg@mail.gmail.com>
 <fb385d33-c420-4c81-320c-9aa1dad64a44@oracle.com>
 <CAKSDcxv5eccaV54NThgviLK+84U-Z62U0CYr3s+8ncNR1cna5w@mail.gmail.com>
Message-ID: <8ef5b52e-d6fc-3073-5ca7-44c87c1eb981@oracle.com>

Hi,

On 2019-10-10 13:06, Haoyu Li wrote:
> Hi Stefan,
> 
> Thanks for your testing! One possible reason for the regressions in 
> simple tests is that the region dependencies may not be heavy enough. 
> Because the locality of shadow regions is lower than that of heap 
> regions, writing to shadow regions will be slower than to normal 
> regions, and this is a part of the reason why I reuse shadow regions. 
> Therefore, if only a few shadow regions are created and not reused, the 
> overhead may not be amortized.

I guess it is something like this. I thought that for "easy" heaps the 
shadow regions won't be used at all, and should therefore not really cost 
anything.

> 
> As to the OCA, it is the case that I'm the only person signing the 
> agreement. Please let me know if you have any further questions. Thanks 
> again!

Ok, so you are the sole author of the patch. The important part, as the 
agreement states, is:
"no other person or entity, including my employer, has or will have 
rights with respect my contributions"

Is that the case?

Thanks,
Stefan



From leihouyju at gmail.com  Thu Oct 10 13:10:52 2019
From: leihouyju at gmail.com (Haoyu Li)
Date: Thu, 10 Oct 2019 21:10:52 +0800
Subject: [PATCH] Exploit Empty Regions in Young Gen to Enhance PS Full GC
 Performance
In-Reply-To: <8ef5b52e-d6fc-3073-5ca7-44c87c1eb981@oracle.com>
References: <CAKSDcxsPcvigLaDTRyJALk9O0r-JsYV1CcwFzb3KaM98+LVLcg@mail.gmail.com>
 <B723074C-94DF-450D-9715-497736E9CD27@oracle.com>
 <CAKSDcxvTzfq1eR0DXi8iLQ4bNP8LcjsV5Px59-EMOgR0zaHvtQ@mail.gmail.com>
 <4F02DD53-EA98-4A1A-B871-C6E9D9610B2C@oracle.com>
 <9B69AFD1-7AE2-4B50-BFCF-C9C6E2594240@oracle.com>
 <CAKSDcxs131XdbHHErgbH1UDYX6_+=CSa-4dt4n5LgbyPVgjO_w@mail.gmail.com>
 <E4274448-26B2-46B8-883C-70FC7AFFB23B@oracle.com>
 <CAKSDcxvHwOWtAEm8TPPvO=C8q9LRwJRUQX6nbh-HSZXBKKwvYg@mail.gmail.com>
 <fb385d33-c420-4c81-320c-9aa1dad64a44@oracle.com>
 <CAKSDcxv5eccaV54NThgviLK+84U-Z62U0CYr3s+8ncNR1cna5w@mail.gmail.com>
 <8ef5b52e-d6fc-3073-5ca7-44c87c1eb981@oracle.com>
Message-ID: <CAKSDcxsQWJ1tpnsc8UnN3E=XJfVwHEdE9WSd_=6KR_tLSHW6rQ@mail.gmail.com>

Hi Stefan,

Thanks for your quick response! As to your concern about the OCA, I am the
sole author of the patch, and yes, the situation is as the agreement states.

Best Regards,
Haoyu Li


Stefan Johansson <stefan.johansson at oracle.com> wrote on 2019-10-10 at 8:37 PM:

> Hi,
>
> On 2019-10-10 13:06, Haoyu Li wrote:
> > Hi Stefan,
> >
> > Thanks for your testing! One possible reason for the regressions in
> > simple tests is that the region dependencies may not be heavy enough.
> > Because the locality of shadow regions is lower than that of heap
> > regions, writing to shadow regions will be slower than to normal
> > regions, and this is a part of the reason why I reuse shadow regions.
> > Therefore, if only a few shadow regions are created and not reused, the
> > overhead may not be amortized.
>
> I guess it is something like this. I thought that for "easy" heaps the
> shadow regions won't be used at all, and should therefore not really cost
> anything.
>
> >
> > As to the OCA, it is the case that I'm the only person signing the
> > agreement. Please let me know if you have any further questions. Thanks
> > again!
>
> Ok, so you are the sole author of the patch. The important part, as the
> agreement states, is:
> "no other person or entity, including my employer, has or will have
> rights with respect my contributions"
>
> Is that the case?
>
> Thanks,
> Stefan
>


From maoliang.ml at alibaba-inc.com  Thu Oct 10 13:48:42 2019
From: maoliang.ml at alibaba-inc.com (Liang Mao)
Date: Thu, 10 Oct 2019 21:48:42 +0800
Subject: Re: G1 patch of elastic Java heap
In-Reply-To: <b691e0bd-dd9a-32e2-c950-9c84de29101c@oracle.com>
References: <6270ce59-4a8e-431e-9ccf-f6d2c0f927eb.maoliang.ml@alibaba-inc.com>
 <d82e704831f0afbc61f8a3fb6b69bb1463b7ede8.camel@oracle.com>
 <e4ffd4d9-3ec0-4592-ac8c-d5a77c6b2e75.maoliang.ml@alibaba-inc.com>
 <1267a5dd2cf6cc1d03df64d07a06ba0f45195951.camel@oracle.com>
 <3140197d-8cab-4a86-af92-58431c74cb6b.maoliang.ml@alibaba-inc.com>
 <a201e27d-d231-4787-8bba-55f5266206d1.maoliang.ml@alibaba-inc.com>,
 <b691e0bd-dd9a-32e2-c950-9c84de29101c@oracle.com>
Message-ID: <6cc5bdd7-c076-472f-8a36-8294c6cbfe21.maoliang.ml@alibaba-inc.com>

Hi Thomas,

Thank you for the feedback.
You are right on some points: the present code does seem to separate the heap into young
and old gen pools. In OpenJDK 8 there is no adaptive IHOP, so a fixed IHOP and MaxNewSize can clearly separate
young gen and old gen. I'm also thinking about how to design this better for upstream OpenJDK G1.

There is a tradeoff between memory and GC frequency: more frequent GC uses less memory. We found that
our online service applications keep a large young generation for potential query traffic, but most
of the time the young GC frequency is quite low. Memory can easily be saved by using a smaller young gen.
In Shenandoah or ZGC there is only one generation, and it is straightforward to determine whether memory is
wasted and can be returned. G1 has two generations, and in the remark phase MinHeapFreeRatio/MaxHeapFreeRatio
cannot tell that the young generation is largely wasted after running 2 minutes without a young GC, and that
we could return a lot of memory. Each generation's GC interval, or the ratio of time spent in mutator vs. GC
that you mentioned, seems more intuitive.

The explicit limitation of a generation may not be a good design from G1 GC's perspective, but from an
operations point of view it makes the JVM easy to manipulate. There is a simple relationship:
larger network traffic -> higher memory allocation rate -> larger young generation. So cluster
operations can simply set the young generation to 10% of the max young gen size on every Java instance
if the network traffic is guaranteed to be below 10% for a period of time.

I'm not wedded to the current implementation, which creates a clear boundary between young and old gen,
especially for newer OpenJDK versions, and I've been thinking of unifying the resizing of the two generations
within the single memory pool of the heap along with Xms. The periodic uncommit mode does not strictly
separate the young/old gen. The current implementation calculates the average GC interval, keeps it in
a certain range between a low bound and a high bound, and immediately triggers an expansion if a
single GC interval is smaller than a threshold. We can use a similar policy to estimate a target young
generation capacity and adjust the capacity of the old generation after a concurrent cycle. The two parts
together can be the target heap capacity, which can vary between Xms and Xmx. The difference
from current G1 is that the heap can be resized in a young GC, not only at remark.
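
To make the interval policy above concrete, here is a minimal sketch. It is
not the code from the patch: the struct, enum and function names are made up
for illustration. The default values are the ones listed in the user guide,
and treating ElasticHeapYGCIntervalMinMillis as the single-interval threshold
is an assumption.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical holder for the tunables; defaults mirror the user guide.
struct IntervalPolicy {
  int64_t target_ms       = 15000;  // ElasticHeapPeriodicYGCIntervalMillis
  int     floor_percent   = 25;     // ElasticHeapPeriodicYGCIntervalFloorPercent
  int     ceiling_percent = 25;     // ElasticHeapPeriodicYGCIntervalCeilingPercent
  int64_t min_ms          = 5000;   // ElasticHeapYGCIntervalMinMillis (assumed threshold)

  // target * (100 - floor) / 100
  int64_t low_bound_ms() const  { return target_ms * (100 - floor_percent) / 100; }
  // target * (100 + ceiling) / 100
  int64_t high_bound_ms() const { return target_ms * (100 + ceiling_percent) / 100; }
};

enum class Resize { Expand, Shrink, Keep };

// Decide how to resize the young generation after a young GC.
Resize decide(const IntervalPolicy& p, int64_t last_interval_ms, int64_t avg_interval_ms) {
  assert(last_interval_ms >= 0 && avg_interval_ms >= 0);
  // A single very short interval triggers an immediate expansion.
  if (last_interval_ms < p.min_ms) return Resize::Expand;
  // GCs are too frequent on average: give the young generation more memory.
  if (avg_interval_ms < p.low_bound_ms()) return Resize::Expand;
  // GCs are too rare on average: the young generation is oversized.
  if (avg_interval_ms > p.high_bound_ms()) return Resize::Shrink;
  return Resize::Keep;
}
```

With the defaults, the average interval is kept between 11250 ms and 18750 ms.
How much to expand or shrink by is a separate question that this sketch leaves
out.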

In order to do swift heap resizing we have to overcome the overhead of requesting and releasing memory from the OS.
Memory unmap and map (including the page faults) cost significant time, so we use an intuitive approach:
a concurrent thread does the map/unmap/pretouch, and the free regions are synchronized in a GC
pause. In our applications, a typical G1 remark costs ~100ms of pause. I haven't tested the latest G1, but
based on our experimental data the pause can easily double if a considerable number of map/unmaps are done in it.


All of the above are our thoughts, and the present implementation is a kind of reference. Please let me know if
I have answered all your questions. I hope we can come to an agreement on some points and conceive a good design
for the latest G1 GC :)

Thanks,
Liang


------------------------------------------------------------------
From:Thomas Schatzl <thomas.schatzl at oracle.com>
Send Time:2019 Oct. 9 (Wed.) 22:12
To:"MAO, Liang" <maoliang.ml at alibaba-inc.com>; hotspot-gc-dev <hotspot-gc-dev at openjdk.java.net>
Subject:Re: G1 patch of elastic Java heap

Hi,

   sorry for the late reply.

First, I have a more general question: lots of changes deal with 
providing options to separately change properties of generations at 
runtime, as if there were separate pools of young and old gen memory.

G1 is kind of built upon the idea that you pass a pause time goal and 
it then modifies generation sizes and takes memory for the generations from 
a single memory pool as needed.

To me this indicates that automatic sizing is not working correctly, or 
that there are many(?) use cases where it does not work as expected, which 
requires manual tuning of generation sizes for whatever reason.

Can you share your thoughts about this? There seems to be some bit of 
information missing to me - this is probably the reason for some of the 
dumb questions about the flags, and me being not too fond of them.

On 26.09.19 08:49, Liang Mao wrote:
> 
> Hi All,
> 
> Here is the user guide of G1ElasticHeap patch. Hope it will help to 
> understand.
> 
> G1ElasticHeap
> G1ElasticHeap is a GC feature to return memory of Java heap to OS to reduce the
> memory footprint of Java process. To enable this feature, you need to use G1 GC
> by options: -XX:+UseG1GC -XX:+G1ElasticHeap.
> 
> ## Usage
> There are 3 modes which can be enabled in G1ElasticHeap.
> ### 1. Periodic uncommit
> Memory will be uncommitted by periodic GC. To enable periodic uncommit, use option
> -XX:+ElasticHeapPeriodicUncommit or dynamically enable the option via jinfo:
> 
> `jinfo -flag +ElasticHeapPeriodicUncommit PID`

As far as I can tell, this setting periodically scans the heap for (too 
many?) free committed regions and, well, uncommits them.

Not completely sure if that is better than doing periodic gcs - as we do 
not expect to gain memory outside of a GC; in JDK12+ (I think) G1 always 
uncommits at the remark pause which should give most of the benefits.

There *may* be reason to also try to uncommit after the last mixed GC, 
but not sure if uncommit is that urgent - to some degree the existing 
JEP 346: Promptly return unused committed memory from G1 
(https://openjdk.java.net/jeps/346) should cover some of the use cases. 
I.e. after some delay (and inactivity) there will be another Remark 
pause anyway.

The main reason why Remark has been chosen to uncommit memory is because 
we assume that the heap size at Remark (this is what adaptive IHOP 
shoots for) is the "target heap size".


> Related options:
> 
>> ElasticHeapPeriodicYGCIntervalMillis, 15000 \
> (target young GC interval, 15 seconds by default) \
> (e.g., if Java runs with MaxNewSize=4g and a young GC every 30 seconds, G1ElasticHeap will keep a 15s
>   GC interval by capping the young generation at 2g, uncommitting 2g of memory)
> 
>> ElasticHeapPeriodicInitialMarkIntervalMillis, 3600000 \
> (Target initial mark interval, 1 hour by default. Unused memory of the old generation will be uncommitted
>   after the last mixed GC.)

This seems to implement an unconditional concurrent cycle like with the 
CMSTriggerInterval flag for CMS.

Maybe there is a more clever alternative on triggering concurrent cycles 
like ZGC does based on the ratio between time spent by the mutator and 
the gc.

> 
>> ElasticHeapPeriodicUncommitStartupDelay, 300 \
> (Delay after startup to do memory uncommit, 300 seconds by default)
> 
>> ElasticHeapPeriodicMinYoungCommitPercent, 50 \
> (Percentage of young generation to keep; by default 50% of the young generation will not be uncommitted)

See above about separating young/old.

> 
> ### 2. Generation limit
> To limit the young/old generation separately. Use jcmd or MXBean to enable.

I do not understand the reason for those, see above.

[...]
> 
> ### 3. Softmx mode
> Dynamically limit the heap to a percentage of the original Xmx.
> 
> Use jcmd:
> 
> `jcmd PID ElasticHeap softmx_percent=60`
> 
> Use MXBean:
> 
> `elasticHeapMXBean.setSoftmxPercent(70);`

That one sounds good, and actually there is a flag SoftMaxHeapSize 
already in the VM. Only ZGC implements it though.

I think this idea matches the specifications in 
https://bugs.openjdk.java.net/browse/JDK-8222145 (i.e. as far as I can 
tell, the softmxpercent is a "soft"/target heap size), so I think this 
could be implemented under the SoftMaxHeapSize flag.

SoftMaxHeapSize is already manageable too, so could be modified already. 
Only the implementation is missing in G1 :)

> 
> ### Other G1ElasticHeap advanced options:
>> ElasticHeapMinYoungCommitPercent, 10 \
>   (Minimum percentage of young generation)
> 
>> ElasticHeapYGCIntervalMinMillis, 5000 \
>   (Minimum young GC interval)
> 
>> ElasticHeapInitialMarkIntervalMinMillis, 60000 \
> (Minimum initial mark interval)
> 
>> ElasticHeapEagerMixedGCIntervalMillis, 15000 \
> (Guaranteed mixed GC interval, to make sure the mixed will happen in time to uncommit memory after last mixed GC)

These options seem to be mostly useful for when the allocation rate of 
the mutator is not high enough to advance the collection cycle.

Would that feature provide the requested feature? Maybe it needs some 
minor improvement, but to me it seems very burdensome to specify so many 
options...

> 
>> ElasticHeapOldGenReservePercent, 5 \
> (To keep a minimum percentage of Xmx for the old generation when uncommitting after the last mixed GC)

That seems to be related to some strict separation of young/old again.

> 
>> ElasticHeapPeriodicYGCIntervalCeilingPercent, 25 \
> ElasticHeapPeriodicYGCIntervalFloorPercent, 25 \
> (The actual young GC interval will fluctuate between \
> ElasticHeapPeriodicYGCIntervalMillis * (100 - ElasticHeapPeriodicYGCIntervalFloorPercent) / 100 and \
> ElasticHeapPeriodicYGCIntervalMillis * (100 + ElasticHeapPeriodicYGCIntervalCeilingPercent) / 100 )
> 

Thanks,
   Thomas


From stefan.johansson at oracle.com  Thu Oct 10 13:50:56 2019
From: stefan.johansson at oracle.com (Stefan Johansson)
Date: Thu, 10 Oct 2019 15:50:56 +0200
Subject: [PATCH] Exploit Empty Regions in Young Gen to Enhance PS Full GC
 Performance
In-Reply-To: <CAKSDcxsQWJ1tpnsc8UnN3E=XJfVwHEdE9WSd_=6KR_tLSHW6rQ@mail.gmail.com>
References: <CAKSDcxsPcvigLaDTRyJALk9O0r-JsYV1CcwFzb3KaM98+LVLcg@mail.gmail.com>
 <B723074C-94DF-450D-9715-497736E9CD27@oracle.com>
 <CAKSDcxvTzfq1eR0DXi8iLQ4bNP8LcjsV5Px59-EMOgR0zaHvtQ@mail.gmail.com>
 <4F02DD53-EA98-4A1A-B871-C6E9D9610B2C@oracle.com>
 <9B69AFD1-7AE2-4B50-BFCF-C9C6E2594240@oracle.com>
 <CAKSDcxs131XdbHHErgbH1UDYX6_+=CSa-4dt4n5LgbyPVgjO_w@mail.gmail.com>
 <E4274448-26B2-46B8-883C-70FC7AFFB23B@oracle.com>
 <CAKSDcxvHwOWtAEm8TPPvO=C8q9LRwJRUQX6nbh-HSZXBKKwvYg@mail.gmail.com>
 <fb385d33-c420-4c81-320c-9aa1dad64a44@oracle.com>
 <CAKSDcxv5eccaV54NThgviLK+84U-Z62U0CYr3s+8ncNR1cna5w@mail.gmail.com>
 <8ef5b52e-d6fc-3073-5ca7-44c87c1eb981@oracle.com>
 <CAKSDcxsQWJ1tpnsc8UnN3E=XJfVwHEdE9WSd_=6KR_tLSHW6rQ@mail.gmail.com>
Message-ID: <92277aab-0578-9e2c-3f4f-55ae1e8c94a9@oracle.com>

Thanks for the clarification =)

Moving on to the next part, the code in the patch. So this won't be a 
full review of the patch but just an initial comment that I would like 
to be addressed first.

The new function PSParallelCompact::fill_shadow_region() is more or less 
a copy of PSParallelCompact::fill_region() and I understand that from a 
proof of concept point of view it was the easy (and right) way to do it. 
I would prefer if the code could be refactored so that fill_region() and 
fill_shadow_region() share more code. There might be reasons that I've 
missed that prevent it, but we should at least explore how much code 
can be shared.
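
To illustrate the kind of sharing I have in mind, here is a toy sketch. It is
not HotSpot code and not a proposal for the actual shape of the refactoring:
Region, DirectDest, ShadowDest and fill_destination are all made-up names, and
a "region" is just a vector of words. The idea is a single fill routine
parameterized on where the data is written and what happens once the fill is
done.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for a heap region: a fixed-size buffer of words.
using Region = std::vector<int>;

// Normal path: write straight into the destination region.
struct DirectDest {
  Region& region;
  int* base() { return region.data(); }
  void complete() {}  // nothing left to do
};

// Shadow path: write into a shadow buffer first, then copy the
// result back into the real destination region.
struct ShadowDest {
  Region& region;
  Region shadow;
  explicit ShadowDest(Region& r) : region(r), shadow(r.size()) {}
  int* base() { return shadow.data(); }
  void complete() { region = shadow; }
};

// The shared body: the copy loop exists once, and only the
// destination policy differs between the two paths.
template <typename Dest>
void fill_destination(Dest& dest, const std::vector<int>& live_words) {
  assert(live_words.size() <= dest.region.size());
  int* out = dest.base();
  for (std::size_t i = 0; i < live_words.size(); ++i) {
    out[i] = live_words[i];
  }
  dest.complete();
}
```

Both paths produce the same final region contents, and the only code that
differs is the few lines in the two destination types.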

Thanks,
Stefan

On 2019-10-10 15:10, Haoyu Li wrote:
> Hi Stefan,
> 
> Thanks for your quick response! As to your concern about the OCA, I am 
> the sole author of the patch, and what the agreement states is indeed 
> the case.
> Best Regards,
> Haoyu Li,
> 
> 
> Stefan Johansson <stefan.johansson at oracle.com> wrote on 2019-10-10, 8:37:
> 
>     Hi,
> 
>     On 2019-10-10 13:06, Haoyu Li wrote:
>      > Hi Stefan,
>      >
>      > Thanks for your testing! One possible reason for the regressions in
>      > simple tests is that the region dependencies may not be heavy enough.
>      > Because the locality of shadow regions is lower than that of heap
>      > regions, writing to shadow regions will be slower than to normal
>      > regions, and this is part of the reason why I reuse shadow regions.
>      > Therefore, if only a few shadow regions are created and not reused,
>      > the overhead may not be amortized.
> 
>     I guess it is something like this. I thought that for "easy" heaps the
>     shadow regions won't be used at all, and should therefore not really
>     cost anything.
> 
>      >
>      > As to the OCA, it is the case that I'm the only person signing the
>      > agreement. Please let me know if you have any further questions.
>      > Thanks again!
> 
>     Ok, so you are the sole author of the patch. The important part, as the
>     agreement states, is:
>     "no other person or entity, including my employer, has or will have
>     rights with respect my contributions"
> 
>     Is that the case?
> 
>     Thanks,
>     Stefan
> 
>      >
>      > Best Regards,
>      > Haoyu Li
>      >
>      > Stefan Johansson <stefan.johansson at oracle.com> wrote on 2019-10-08, 6:49:
>      >
>      >     Hi Haoyu,
>      >
>      >     I've done some more testing and I haven't seen any issues with the
>      >     patch so far and the performance looks promising in most cases. For
>      >     simple tests I've seen some regressions, but I'm not really sure why.
>      >     Will do some more digging.
>      >
>      >     To move forward with this the first thing we need to do is making
>      >     sure that you being covered by the Oracle Contributor Agreement is
>      >     enough. From what we can see it is only you as an individual that has
>      >     signed the OCA and in that case it is important that this statement
>      >     from the OCA is fulfilled: "no other person or entity, including my
>      >     employer, has or will have rights with respect my contributions"
>      >
>      >     Is this the case for this contribution or should we have the
>      >     university sign the OCA as well? For more information regarding the
>      >     OCA please refer to:
>      > https://www.oracle.com/technetwork/oca-faq-405384.pdf
>      >
>      >     Thanks,
>      >     Stefan
>      >
>      >     On 2019-09-16 16:02, Haoyu Li wrote:
>      >      > FYI, the evaluation results on OpenJDK 14 are plotted in the
>      >      > attachment. I compute the full GC throughput by dividing the heap
>      >      > size before full GC by the GC pause time, and the results are
>      >      > arithmetic mean values of ten runs after a warm-up run. The
>      >      > evaluation is conducted on a machine with dual Intel Xeon E5-2618L
>      >      > v3 CPUs (2 sockets, 16 physical cores with SMT enabled) and 64G
>      >      > DRAM.
>      >      >
>      >      > Best Regards,
>      >      > Haoyu Li,
>      >      > Institute of Parallel and Distributed Systems(IPADS),
>      >      > School of Software,
>      >      > Shanghai Jiao Tong University
>      >      >
>      >      >
>      >      > Stefan Johansson <stefan.johansson at oracle.com> wrote on 2019-09-12, 5:34:
>      >      >
>      >      >     Hi Haoyu,
>      >      >
>      >      >     I recently came across your patch and I would like to pick up on
>      >      >     some of the things Kim mentioned in his mails. I especially want
>      >      >     to evaluate and investigate if this is a technique we can use to
>      >      >     improve the other GCs as well. To start that work I want to take
>      >      >     the patch for a spin in our internal performance testing. The
>      >      >     patch doesn't apply clean to the latest JDK repository, so if you
>      >      >     could provide an updated patch that would be very helpful.
>      >      >
>      >      >     It would also be great if you could share some more information
>      >      >     around the results presented in the paper. For example, it would
>      >      >     be good to get the full command lines for the different
>      >      >     benchmarks so we can run them locally and reproduce the results
>      >      >     you've seen.
>      >      >
>      >      >     Thanks,
>      >      >     Stefan
>      >      >
>      >      >>     On 12 March 2019 at 03:21, Haoyu Li <leihouyju at gmail.com> wrote:
>      >      >>
>      >      >>     Hi Kim,
>      >      >>
>      >      >>     Thanks for reviewing and testing the patch. If there are any
>      >      >>     failures or performance degradation relevant to the work, please
>      >      >>     let me know and I'll be very happy to keep improving it. Also, any
>      >      >>     suggestions about code improvements are well appreciated.
>      >      >>
>      >      >>     I'm not quite sure if both G1 and Shenandoah have the similar
>      >      >>     region dependency issue, since I haven't studied their GC
>      >      >>     behaviors before. If they have, I'm also willing to propose a more
>      >      >>     general optimization.
>      >      >>
>      >      >>     As to the memory overhead, I believe it will be low because this
>      >      >>     patch exploits empty regions in the young space rather than
>      >      >>     off-heap memory to allocate shadow regions, and also reuses the
>      >      >>     /_source_region/ field of each /RegionData/ to record the
>      >      >>     corresponding shadow region index. We only introduce a new
>      >      >>     integer field /_shadow/ in the RegionData class to indicate the
>      >      >>     status of a region, a global /GrowableArray _free_shadow/ to store
>      >      >>     the indices of shadow regions, and a global /Monitor/ to protect
>      >      >>     the array. This information might help if the memory overhead
>      >      >>     needs to be evaluated.
>      >      >>
>      >      >>     Looking forward to your insight.
>      >      >>
>      >      >>     Best Regards,
>      >      >>     Haoyu Li,
>      >      >>     Institute of Parallel and Distributed Systems(IPADS),
>      >      >>     School of Software,
>      >      >>     Shanghai Jiao Tong University
>      >      >>
>      >      >>
>      >      >>     Kim Barrett <kim.barrett at oracle.com> wrote on 2019-03-12, 6:11:
>      >      >>
>      >      >>         > On Mar 11, 2019, at 1:45 AM, Kim Barrett <kim.barrett at oracle.com> wrote:
>      >      >>         >
>      >      >>         >> On Jan 24, 2019, at 3:58 AM, Haoyu Li <leihouyju at gmail.com> wrote:
>      >      >>         >>
>      >      >>         >> Hi Kim,
>      >      >>         >>
>      >      >>         >> I have ported my patch to OpenJDK 13 according to your
>      >      >>         >> instructions in your last mail, and the patch is attached in
>      >      >>         >> this mail. The patch does not change much since PSGC is indeed
>      >      >>         >> pretty stable.
>      >      >>         >>
>      >      >>         >> Also, I evaluate the correctness and performance of PS full
>      >      >>         >> GC with benchmarks from DaCapo, SPECjvm2008, and JOlden suites
>      >      >>         >> on a machine with dual Intel Xeon E5-2618L v3 CPUs(16 physical
>      >      >>         >> cores), 64G DRAM and linux kernel 4.17. The evaluation result,
>      >      >>         >> indicating 1.9X GC throughput improvement on average, is
>      >      >>         >> attached, too.
>      >      >>         >>
>      >      >>         >> However, I have no idea how to further test this patch for
>      >      >>         >> both correctness and performance. Can I please get any
>      >      >>         >> guidance from you or some sponsor?
>      >      >>         >
>      >      >>         > Sorry I missed that you had sent an updated version of the
>      >      >>         > patch.
>      >      >>         >
>      >      >>         > I've run the full regression suite across Oracle-supported
>      >      >>         > platforms.  There are some failures, but there are almost
>      >      >>         > always some failures in the later tiers right now.  I'll start
>      >      >>         > looking at them tomorrow to figure out whether any of them are
>      >      >>         > relevant.
>      >      >>         >
>      >      >>         > I'm also planning to run some of our performance benchmarks.
>      >      >>         >
>      >      >>         > I've lightly skimmed the proposed changes.  There might be
>      >      >>         > some code improvements to be made.
>      >      >>         >
>      >      >>         > I'm also wondering if this technique applies to other
>      >      >>         > collectors.  It seems like both G1 and Shenandoah full gc's
>      >      >>         > might have similar issues?  If so, a solution that is
>      >      >>         > ParallelGC-specific is less interesting than one that has
>      >      >>         > broader applicability.  Though maybe this optimization is less
>      >      >>         > important for G1 and Shenandoah, since they actively try to
>      >      >>         > avoid full gc's.
>      >      >>         >
>      >      >>         > I'm also not clear on how much additional memory might be
>      >      >>         > temporarily allocated by this mechanism.
>      >      >>
>      >      >>         I've created a CR for this:
>      >      >> https://bugs.openjdk.java.net/browse/JDK-8220465
>      >      >>
>      >      >
>      >
> 

From thomas.schatzl at oracle.com  Thu Oct 10 15:23:25 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Thu, 10 Oct 2019 17:23:25 +0200
Subject: RFR: 8232070: ZGC: Remove unused ZVerifyLoadBarriers
In-Reply-To: <1df387c3-bae2-45b3-7930-1baf56dea03c@oracle.com>
References: <1df387c3-bae2-45b3-7930-1baf56dea03c@oracle.com>
Message-ID: <0c0a91af-9be1-d879-bf60-fd3cf0888823@oracle.com>

Hi,

On 09.10.19 23:04, Per Liden wrote:
> After JDK-8230565, we left the develop flag ZVerifyLoadBarriers around, 
> which is no longer used and can be removed.
> 
> Bug: https://bugs.openjdk.java.net/browse/JDK-8232070
> Webrev: http://cr.openjdk.java.net/~pliden/8232070/webrev.0

   looks good and trivial.

Thomas


From erik.osterlund at oracle.com  Thu Oct 10 15:31:14 2019
From: erik.osterlund at oracle.com (erik.osterlund at oracle.com)
Date: Thu, 10 Oct 2019 17:31:14 +0200
Subject: RFR: 8232116: ZGC: Remove redundant ZLock in ZNMethodTable
Message-ID: <99d79fed-1ec2-4e51-55eb-d6c7aa333a42@oracle.com>

Hi,

The safe memory reclamation technique used in the ZNMethodTable has an 
unnecessary ZLock. This lock is statically initialized, which creates 
some bootstrapping issues. We should remove the lock, as in the context 
it is used, we are always protected under the CodeCache_lock.

Bug:
https://bugs.openjdk.java.net/browse/JDK-8232116

Webrev:
http://cr.openjdk.java.net/~eosterlund/8232116/webrev.00/

Thanks,
/Erik


From per.liden at oracle.com  Thu Oct 10 16:41:05 2019
From: per.liden at oracle.com (Per Liden)
Date: Thu, 10 Oct 2019 18:41:05 +0200
Subject: RFR: 8232116: ZGC: Remove redundant ZLock in ZNMethodTable
In-Reply-To: <99d79fed-1ec2-4e51-55eb-d6c7aa333a42@oracle.com>
References: <99d79fed-1ec2-4e51-55eb-d6c7aa333a42@oracle.com>
Message-ID: <c31f29a1-57b4-bd5e-f67b-dd684d8bf5e5@oracle.com>

Looks good!

/Per

On 10/10/19 5:31 PM, erik.osterlund at oracle.com wrote:
> Hi,
> 
> The safe memory reclamation technique used in the ZNMethodTable has an 
> unnecessary ZLock. This lock is statically initialized, which creates 
> some bootstrapping issues. We should remove the lock, as in the context 
> it is used, we are always protected under the CodeCache_lock.
> 
> Bug:
> https://bugs.openjdk.java.net/browse/JDK-8232116
> 
> Webrev:
> http://cr.openjdk.java.net/~eosterlund/8232116/webrev.00/
> 
> Thanks,
> /Erik


From per.liden at oracle.com  Thu Oct 10 16:42:02 2019
From: per.liden at oracle.com (Per Liden)
Date: Thu, 10 Oct 2019 18:42:02 +0200
Subject: RFR: 8232070: ZGC: Remove unused ZVerifyLoadBarriers
In-Reply-To: <0c0a91af-9be1-d879-bf60-fd3cf0888823@oracle.com>
References: <1df387c3-bae2-45b3-7930-1baf56dea03c@oracle.com>
 <0c0a91af-9be1-d879-bf60-fd3cf0888823@oracle.com>
Message-ID: <5b180c79-3ad2-993b-b24b-bc69b963eeb7@oracle.com>

Thanks Thomas!

/Per

On 10/10/19 5:23 PM, Thomas Schatzl wrote:
> Hi,
> 
> On 09.10.19 23:04, Per Liden wrote:
>> After JDK-8230565, we left the develop flag ZVerifyLoadBarriers 
>> around, which is no longer used and can be removed.
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8232070
>> Webrev: http://cr.openjdk.java.net/~pliden/8232070/webrev.0
> 
>    looks good and trivial.
> 
> Thomas
> 


From erik.osterlund at oracle.com  Thu Oct 10 16:51:27 2019
From: erik.osterlund at oracle.com (Erik Osterlund)
Date: Thu, 10 Oct 2019 18:51:27 +0200
Subject: RFR: 8232116: ZGC: Remove redundant ZLock in ZNMethodTable
In-Reply-To: <c31f29a1-57b4-bd5e-f67b-dd684d8bf5e5@oracle.com>
References: <99d79fed-1ec2-4e51-55eb-d6c7aa333a42@oracle.com>
 <c31f29a1-57b4-bd5e-f67b-dd684d8bf5e5@oracle.com>
Message-ID: <A92CDAB1-614F-4C81-9435-1F5D90A91D9A@oracle.com>

Hi Per,

Thanks for the review.

/Erik

> On 10 Oct 2019, at 18:41, Per Liden <per.liden at oracle.com> wrote:
> 
> Looks good!
> 
> /Per
> 
>> On 10/10/19 5:31 PM, erik.osterlund at oracle.com wrote:
>> Hi,
>> The safe memory reclamation technique used in the ZNMethodTable has an unnecessary ZLock. This lock is statically initialized, which creates some bootstrapping issues. We should remove the lock, as in the context it is used, we are always protected under the CodeCache_lock.
>> Bug:
>> https://bugs.openjdk.java.net/browse/JDK-8232116
>> Webrev:
>> http://cr.openjdk.java.net/~eosterlund/8232116/webrev.00/
>> Thanks,
>> /Erik


From stefan.karlsson at oracle.com  Thu Oct 10 17:00:26 2019
From: stefan.karlsson at oracle.com (Stefan Karlsson)
Date: Thu, 10 Oct 2019 19:00:26 +0200
Subject: RFR: 8232116: ZGC: Remove redundant ZLock in ZNMethodTable
In-Reply-To: <99d79fed-1ec2-4e51-55eb-d6c7aa333a42@oracle.com>
References: <99d79fed-1ec2-4e51-55eb-d6c7aa333a42@oracle.com>
Message-ID: <068b00ea-c8cb-606e-7d06-98a22bfcd214@oracle.com>

Looks good.

StefanK

On 2019-10-10 17:31, erik.osterlund at oracle.com wrote:
> Hi,
>
> The safe memory reclamation technique used in the ZNMethodTable has an 
> unnecessary ZLock. This lock is statically initialized, which creates 
> some bootstrapping issues. We should remove the lock, as in the 
> context it is used, we are always protected under the CodeCache_lock.
>
> Bug:
> https://bugs.openjdk.java.net/browse/JDK-8232116
>
> Webrev:
> http://cr.openjdk.java.net/~eosterlund/8232116/webrev.00/
>
> Thanks,
> /Erik


From erik.osterlund at oracle.com  Thu Oct 10 17:08:43 2019
From: erik.osterlund at oracle.com (Erik Osterlund)
Date: Thu, 10 Oct 2019 19:08:43 +0200
Subject: RFR: 8232116: ZGC: Remove redundant ZLock in ZNMethodTable
In-Reply-To: <068b00ea-c8cb-606e-7d06-98a22bfcd214@oracle.com>
References: <99d79fed-1ec2-4e51-55eb-d6c7aa333a42@oracle.com>
 <068b00ea-c8cb-606e-7d06-98a22bfcd214@oracle.com>
Message-ID: <D4EABDC0-7981-4F59-B227-BB6B193D3250@oracle.com>

Hi Stefan,

Thanks for the review.

/Erik

> On 10 Oct 2019, at 19:00, Stefan Karlsson <stefan.karlsson at oracle.com> wrote:
> 
> Looks good.
> 
> StefanK
> 
>> On 2019-10-10 17:31, erik.osterlund at oracle.com wrote:
>> Hi,
>> 
>> The safe memory reclamation technique used in the ZNMethodTable has an unnecessary ZLock. This lock is statically initialized, which creates some bootstrapping issues. We should remove the lock, as in the context it is used, we are always protected under the CodeCache_lock.
>> 
>> Bug:
>> https://bugs.openjdk.java.net/browse/JDK-8232116
>> 
>> Webrev:
>> http://cr.openjdk.java.net/~eosterlund/8232116/webrev.00/
>> 
>> Thanks,
>> /Erik
> 


From kim.barrett at oracle.com  Thu Oct 10 23:34:27 2019
From: kim.barrett at oracle.com (Kim Barrett)
Date: Thu, 10 Oct 2019 19:34:27 -0400
Subject: RFR(XL): 8220310: Implementation: NUMA-Aware Memory Allocation
 for G1, Mutator (1/3)
In-Reply-To: <d8758f38-4818-e6b7-c158-118974f0ff1c@oracle.com>
References: <e4c60a5c-cb08-004a-ce77-e4d20d4d6891@oracle.com>
 <06ACBF87-ADBE-499F-B668-0274E4925B26@oracle.com>
 <2b37edd6-3e0f-013d-1616-9d003f8ac1ed@oracle.com>
 <74ACAF31-8233-482A-892E-0D2E7CA72F4F@oracle.com>
 <4afe9f43-4cfa-9384-f45f-f985399629dd@oracle.com>
 <CD6BDB8C-7777-4872-BCC0-CDFB4978F876@oracle.com>
 <d8758f38-4818-e6b7-c158-118974f0ff1c@oracle.com>
Message-ID: <CB368332-D86B-40D2-B152-43B726938DD2@oracle.com>

> On Oct 9, 2019, at 12:27 AM, sangheon.kim at oracle.com wrote:
> Webrev:
> http://cr.openjdk.java.net/~sangheki/8220310/webrev.3
> http://cr.openjdk.java.net/~sangheki/8220310/webrev.3.inc
> Testing: hs-tier 1~5, with/without UseNUMA

I agree with Stefan and Thomas; this is looking pretty good.

There are some naming issues that I'm not going to comment on here.
Stefan has already commented on some, and a bit of offline discussion
suggests there's a larger naming discussion needed, but which can
follow getting the functionality we want.

There has been further discussion offline toward collapsing
G1MemoryNodeManager to one class without virtual dispatch, and using
G1NUMA name. I won't bother to re-iterate any of that here.

------------------------------------------------------------------------------
src/hotspot/share/gc/g1/g1Allocator.cpp
 186   assert(Heap_lock->owner() != NULL, "Should be owned on this thread's behalf.");

Use assert_lock_strong(Heap_lock).

------------------------------------------------------------------------------
src/hotspot/share/gc/g1/g1RegionToSpaceMapper.cpp
  82       _storage.request_memory_on_node(page, _pages_per_region, node_index);
...
 153         _storage.request_memory_on_node(idx, 1, node_index);

I'm not sure request_memory_on_node belongs on the _storage object.
The current implementation just has the storage object (conditionally)
forward the request to the memory node manager object. These places in
the space mapper could just make the calls on the memory node manager
object directly (it is already being used nearby).  And these places
don't need the conditionalization.

I think making the space mapper directly call the memory node manager
here would remove the need for the proposed changes to the virtual
space class.

------------------------------------------------------------------------------
src/hotspot/share/gc/g1/heapRegion.cpp
 464     st->print("|Node ID %02d", node_ids[this->node_index()]);

The unchecked use of node_index() here can run afoul of an unset (so
UnknownNodeIndex) index.

Also, no need for `this->` in `this->node_index()`.
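
A minimal sketch of a checked lookup for the unset-index case, assuming a sentinel value and an index-to-id table. NodeTable, node_id_for_index, format_node_column and the "|--" placeholder are hypothetical names for illustration, not the webrev's API:

```cpp
#include <cassert>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical sentinel standing in for the "unset" node index.
static const unsigned UnknownNodeIndex = ~0u;

// Hypothetical stand-in for the table that maps node index -> NUMA node id.
struct NodeTable {
  std::vector<int> node_ids;

  // Stores the id in *id and returns true only when the index is usable;
  // an unset or out-of-range index is reported instead of dereferenced.
  bool node_id_for_index(unsigned index, int* id) const {
    if (index == UnknownNodeIndex || index >= node_ids.size()) {
      return false;
    }
    *id = node_ids[index];
    return true;
  }
};

// Formats the node column; falls back to a placeholder for an unset index.
std::string format_node_column(const NodeTable& table, unsigned node_index) {
  int id = 0;
  if (!table.node_id_for_index(node_index, &id)) {
    return "|--";  // placeholder text is an assumption
  }
  char buf[16];
  std::snprintf(buf, sizeof(buf), "|%02d", id);
  return buf;
}
```

With this shape the printing code never indexes the table with an unchecked value.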

------------------------------------------------------------------------------
src/hotspot/share/gc/g1/g1MemoryNodeManager.hpp 
  81   virtual const uint max_search_depth() const { return 1; }

s/const uint/uint/

Similarly for other declarations and definitions.

------------------------------------------------------------------------------
src/hotspot/share/gc/g1/g1MemoryNodeManager.hpp 
  77   virtual void request_memory_on_node(char* aligned_address, size_t size_in_bytes, uint node_index) { }

Shouldn't the aligned_address argument be typed "void*" rather than "char*"?

------------------------------------------------------------------------------
src/hotspot/share/gc/g1/heapRegionManager.cpp
 112   if (mgr->has_multi_nodes() && requested_node_index != G1MemoryNodeManager::AnyNodeIndex) {

I think it would be better to test the requested_node_index value
first.  The "any" case is a common case.
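
A small sketch of the reordering, assuming a manager with a virtual has_multi_nodes() as in the patch. ManagerSketch, its query counter and needs_node_specific_path are hypothetical and only there to make the evaluation order visible:

```cpp
#include <cassert>

// Hypothetical sentinel meaning "no particular node requested".
static const unsigned AnyNodeIndex = ~0u;

// Hypothetical stand-in for the memory node manager. It counts queries so
// the effect of the evaluation order can be observed.
class ManagerSketch {
 public:
  explicit ManagerSketch(bool multi) : _multi(multi), _queries(0) {}
  virtual ~ManagerSketch() {}
  virtual bool has_multi_nodes() const { ++_queries; return _multi; }
  int queries() const { return _queries; }
 private:
  bool _multi;
  mutable int _queries;
};

// Tests the plain integer comparison first; the virtual call is only made
// when a specific node was actually requested.
bool needs_node_specific_path(const ManagerSketch& mgr,
                              unsigned requested_node_index) {
  return requested_node_index != AnyNodeIndex && mgr.has_multi_nodes();
}
```

In the common "any" case the short-circuit skips the manager query entirely.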

------------------------------------------------------------------------------
src/hotspot/share/gc/g1/heapRegionManager.cpp
 200   if(AlwaysPreTouch) {

Add space after "if".

------------------------------------------------------------------------------
src/hotspot/share/gc/g1/heapRegionManager.cpp
 311   return region_node_index == preferred_node_index;

Fix indentation.

------------------------------------------------------------------------------
src/hotspot/share/runtime/os.hpp
 393   static const int InvalidId = -1;

This should probably be "InvalidNUMAId" or something like that.

------------------------------------------------------------------------------


From sangheon.kim at oracle.com  Fri Oct 11 03:23:47 2019
From: sangheon.kim at oracle.com (sangheon.kim at oracle.com)
Date: Thu, 10 Oct 2019 20:23:47 -0700
Subject: RFR(XL): 8220310: Implementation: NUMA-Aware Memory Allocation
 for G1, Mutator (1/3)
In-Reply-To: <93592401-FC69-4B7F-95BE-DE9A0F070F3A@oracle.com>
References: <e4c60a5c-cb08-004a-ce77-e4d20d4d6891@oracle.com>
 <06ACBF87-ADBE-499F-B668-0274E4925B26@oracle.com>
 <2b37edd6-3e0f-013d-1616-9d003f8ac1ed@oracle.com>
 <74ACAF31-8233-482A-892E-0D2E7CA72F4F@oracle.com>
 <4afe9f43-4cfa-9384-f45f-f985399629dd@oracle.com>
 <CD6BDB8C-7777-4872-BCC0-CDFB4978F876@oracle.com>
 <d8758f38-4818-e6b7-c158-118974f0ff1c@oracle.com>
 <93592401-FC69-4B7F-95BE-DE9A0F070F3A@oracle.com>
Message-ID: <48528a18-9da0-a69e-135d-8e56b78ecca3@oracle.com>

Hi Stefan,

On 10/9/19 2:40 PM, Stefan Johansson wrote:
> Hi Sangheon,
>
> Thanks again for a much improved version. Some comments below.
>
>> 9 okt. 2019 kl. 06:27 skrev sangheon.kim at oracle.com:
>>
>> ...
>>
>> Here's the major change list at the webrev. Or arguable list :)
>> 1) Verification at HRM::allocate_free_region() is removed and it will be added somewhere at safepoint by JDK-8220312 (3/3 which is part of this JEP). Probably at the end of young gc?
>> 2) Node id printing is changed. Removed old one and added at HeapRegion::print_on() with new column. Node id is only printed when UseNUMA is enabled and gc+heap+region=trace. If there's single active node, it will print the node id and this is intentional. Another approach would be printing only if there are multiple nodes.
>> 3) If AlwaysPreTouch is enabled, HeapRegion will have actual node index instead of preferred node index.
>> 4) HeapRegion::_node_index is set at HRM::make_regions_available() as there is the only place initializing HeapRegion. Another approach would be setting the index at HeapRegion::initialize(we have to pollute HR with G1MNM stuff) or conditionally(*) setting the index at HeapRegion::node_index(). (*) if the index is unknown etc..
>> 5) G1NUMA class is merged into G1MemoryNodeManager.
> I saw your comment above about suggestions around this area and I can try out one thought I had, something I think Thomas mentioned as well. Making the non-NUMA case work exactly as the NUMA case with one node. I'll need some more time for that, but below are my comments on the current patch.
For the record, Stefan provided me with a patch showing the above idea of 
making the 'non-NUMA case work exactly as the NUMA case with one node'. 
The next webrev will include this change.

>
>> Webrev:
>> http://cr.openjdk.java.net/~sangheki/8220310/webrev.3
>> http://cr.openjdk.java.net/~sangheki/8220310/webrev.3.inc
> src/hotspot/os/linux/os_linux.cpp
> ...
> 3026     warning("Failed to get numa id at " PTR_FORMAT " with errno=%d", p2i((void*)address), errno);
>
> The cast here is no longer needed.
Done

> ...
>
> src/hotspot/share/gc/g1/g1Allocator.hpp
> ...
>   44   G1MemoryNodeManager* _mnm;
>
> I would prefer a more descriptive name like _memory_node_manager.
After changing to G1NUMA, all members will be _numa.

> ...
>
> src/hotspot/share/gc/g1/g1CollectedHeap.hpp
> ...
>   196   // Manages single or multi node memory.
>   197   G1MemoryNodeManager* _mem_node_mgr;
>   ...
>   558   G1MemoryNodeManager* mem_node_mgr() const { return _mem_node_mgr; }
>
> As above, I would prefer spelling out the names to memory_node_manager().
Same as above.

> ...
>
> src/hotspot/share/gc/g1/g1_globals.hpp
> ...
> Last line still removed a "\", please revert this change.
Done

> ...
>
> src/hotspot/share/gc/g1/heapRegion.cpp
> ...
>   462   if (UseNUMA) {
>   463     const int* node_ids = G1MemoryNodeManager::mgr()->node_ids();
>   464     st->print("|Node ID %02d", node_ids[this->node_index()]);
>   465   }
>   466   st->print_cr("");
>
> I would prefer having a function that returns the node id given the index. Like the inverse of index_of_node_id().
>
> I also think it would be more informative to say "NUMA id" or "NUMA node".
I don't have a strong opinion on this, but as Thomas suggests not having 
such a word, I removed it.
It will print something like "| 00". I hope you are okay with this.

> ...
>
> src/hotspot/share/gc/g1/heapRegionManager.cpp
> ...
>   195 // Set node index of the given HeapRegion.
>   196 // If AlwaysPreTouch is enabled, set with actual node index.
>   197 // If it is disabled, set with preferred node index which is already decided.
>   198 static void set_heapregion_node_index(HeapRegion* hr) {
>   199   uint node_index;
>   200   if(AlwaysPreTouch) {
>   201     // If we already pretouched, we can check actual node index here.
>   202     node_index = G1MemoryNodeManager::mgr()->index_of_address(hr->bottom());
>   203   } else {
>   204     node_index = G1MemoryNodeManager::mgr()->preferred_node_index_for_index(hr->hrm_index());
>   205   }
>   206   hr->set_node_index(node_index);
>   207 }
>
> I would prefer to have a helper for calculating the index to set not a helper for setting the index. If you agree, you could move this logic to G1MemoryNodeManager::index_for_region() and then you can change:
>   233     // Set node index of the heap region after initialization but before inserting
>   234     // to free list.
>   235     set_heapregion_node_index(hr);
>
> To just:
>   235     hr->set_node_index(G1MemoryNodeManager::mgr()->index_for_region(hr));
> ...
>   309  bool HeapRegionManager::is_on_preferred_index(uint region_index, uint preferred_node_index) {
>   310    uint region_node_index = G1MemoryNodeManager::mgr()->preferred_node_index_for_index(region_index);
>   311   return region_node_index == preferred_node_index;
>   312  }
>
> Indentation on row 311.
Changed as you suggested.
I had the same opinion, but the reason I didn't choose it was that I 
wanted to avoid a dependency on HeapRegion in G1NUMA.
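
A rough sketch of a helper that computes the index instead of setting it, under stated assumptions. NumaSketch, AddressToIndexFn and always_node_one are hypothetical, and the round-robin policy is only a placeholder. Passing the region index and bottom address as plain values is one possible way to keep HeapRegion out of the interface, not necessarily what the next webrev does:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the OS query mapping an address to a node index.
typedef unsigned (*AddressToIndexFn)(std::uintptr_t address);

// Illustration-only lookup: pretends every address lives on node index 1.
unsigned always_node_one(std::uintptr_t) { return 1; }

class NumaSketch {
 public:
  NumaSketch(unsigned num_nodes, bool always_pretouch, AddressToIndexFn lookup)
    : _num_nodes(num_nodes), _always_pretouch(always_pretouch), _lookup(lookup) {
    assert(num_nodes > 0 && "need at least one node");
  }

  // Placeholder policy (an assumption): spread regions round-robin.
  unsigned preferred_node_index_for_index(unsigned region_index) const {
    return region_index % _num_nodes;
  }

  // Computes, but does not set, the node index for a region. With pretouch
  // the memory is already placed, so the actual node is queried; otherwise
  // the preferred index is returned.
  unsigned index_for_region(unsigned region_index, std::uintptr_t bottom) const {
    if (_always_pretouch) {
      return _lookup(bottom);
    }
    return preferred_node_index_for_index(region_index);
  }

 private:
  unsigned _num_nodes;
  bool _always_pretouch;
  AddressToIndexFn _lookup;
};
```

The caller then does the single assignment itself, in the style of hr->set_node_index(...index_for_region(...)).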

> ...
>
> src/hotspot/share/gc/g1/g1MemoryNodeManager.hpp
> ...
>   44   static G1MemoryNodeManager* mgr() { return _inst; }
>
> I think we should change the name of this getter to manager(), to avoid unnecessary shortenings.
N/A

> ...
> 57   virtual bool has_multi_nodes() const { return false; }
>
> Same as above I would prefer has_multiple_nodes()
N/A

I will post next webrev after applying others' comments.

Thanks,
Sangheon


> ...
>
> Thanks,
> Stefan
>
>> Testing: hs-tier 1~5, with/without UseNUMA
>>
>> Thanks,
>> Sangheon


From sangheon.kim at oracle.com  Fri Oct 11 03:24:44 2019
From: sangheon.kim at oracle.com (sangheon.kim at oracle.com)
Date: Thu, 10 Oct 2019 20:24:44 -0700
Subject: RFR(XL): 8220310: Implementation: NUMA-Aware Memory Allocation
 for G1, Mutator (1/3)
In-Reply-To: <ad57d17b-ae69-1319-25f9-32e48b40cbe1@oracle.com>
References: <e4c60a5c-cb08-004a-ce77-e4d20d4d6891@oracle.com>
 <06ACBF87-ADBE-499F-B668-0274E4925B26@oracle.com>
 <2b37edd6-3e0f-013d-1616-9d003f8ac1ed@oracle.com>
 <74ACAF31-8233-482A-892E-0D2E7CA72F4F@oracle.com>
 <4afe9f43-4cfa-9384-f45f-f985399629dd@oracle.com>
 <CD6BDB8C-7777-4872-BCC0-CDFB4978F876@oracle.com>
 <d8758f38-4818-e6b7-c158-118974f0ff1c@oracle.com>
 <93592401-FC69-4B7F-95BE-DE9A0F070F3A@oracle.com>
 <ad57d17b-ae69-1319-25f9-32e48b40cbe1@oracle.com>
Message-ID: <041edc5a-73d5-27f4-68ab-32c497f930dd@oracle.com>

Hi Thomas,

On 10/10/19 2:48 AM, Thomas Schatzl wrote:
> Hi,
>
> On 09.10.19 23:40, Stefan Johansson wrote:
>> Hi Sangheon,
>>
>> Thanks again for a much improved version. Some comments below.
>
> agree, it looks quite nice now.
:)

>
> [...]
>>
>>> Webrev:
>>> http://cr.openjdk.java.net/~sangheki/8220310/webrev.3
>>> http://cr.openjdk.java.net/~sangheki/8220310/webrev.3.inc
> [...]
>>
>> src/hotspot/share/gc/g1/heapRegion.cpp
>> ...
>>   462   if (UseNUMA) {
>>   463     const int* node_ids = G1MemoryNodeManager::mgr()->node_ids();
>>   464     st->print("|Node ID %02d", node_ids[this->node_index()]);
>>   465   }
>>   466   st->print_cr("");
>>
>> I would prefer having a function that returns the node id given the 
>> index. Like the inverse of index_of_node_id().
>>
>> I also think it would be more informative to say "NUMA id" or "NUMA 
>> node".
>
> I would also remove the "Node ID" string here as it does not convey 
> any information. Most other columns also do not carry their description.
Done.

I will post the webrev after addressing Kim's comment.

Thanks,
Sangheon


>
> Thanks,
>   Thomas
>
>


From thomas.schatzl at oracle.com  Fri Oct 11 11:02:01 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Fri, 11 Oct 2019 13:02:01 +0200
Subject: G1 patch of elastic Java heap
In-Reply-To: <6cc5bdd7-c076-472f-8a36-8294c6cbfe21.maoliang.ml@alibaba-inc.com>
References: <6270ce59-4a8e-431e-9ccf-f6d2c0f927eb.maoliang.ml@alibaba-inc.com>
 <d82e704831f0afbc61f8a3fb6b69bb1463b7ede8.camel@oracle.com>
 <e4ffd4d9-3ec0-4592-ac8c-d5a77c6b2e75.maoliang.ml@alibaba-inc.com>
 <1267a5dd2cf6cc1d03df64d07a06ba0f45195951.camel@oracle.com>
 <3140197d-8cab-4a86-af92-58431c74cb6b.maoliang.ml@alibaba-inc.com>
 <a201e27d-d231-4787-8bba-55f5266206d1.maoliang.ml@alibaba-inc.com>
 <b691e0bd-dd9a-32e2-c950-9c84de29101c@oracle.com>
 <6cc5bdd7-c076-472f-8a36-8294c6cbfe21.maoliang.ml@alibaba-inc.com>
Message-ID: <5f02f337-f479-55f6-351e-867507845f65@oracle.com>

Hi,

On 10.10.19 15:48, Liang Mao wrote:
> Hi Thomas,
> 
> Thank you for the feedback.
> You are right about some points that the present code seems to separate 
> the heap into young and old gen pools. In OpenJDK8, there's no adaptive-ihop so fixed ihop 
> and MaxNewSize can clearly separate young gen and old gen. I'm also thinking about how to design it better 
> in upstream of OpenJDK G1.
> 
> There is a tradeoff between memory and GC frequency. More frequent GC 
> uses less memory. We found our online service applications keep large young generation for 
> potential query traffic but most of time the young GC frequency is quite low. Memory can be easily saved 
> by using smaller young gen
> In Shenandoah or ZGC, there is only 1 generation and it's 
> straightforward to determine if memory is wasted and can be returned. G1 has 2 generations, in remark phase 
> MinHeapFreeRatio/MaxHeapFreeRatio cannot tell the young generation is rather wasted for running 2 minutes 
> without a young GC and we can return a lot of memory. Each generation's GC interval or time ratio 
> spent on mutator/gc you mentioned seems more intuitive.
> 
> The explicit limitation of generation may not be a good design from G1 
> GC's perspective. From the operation's point of view, it is easy for manipulating JVM. There is a 
> simple relationship: larger network traffic -> higher memory allocation rate -> larger young 
> generation. So cluster operation can easily set the young generation as 10% of max young gen 
> size to every Java instance if the network traffic is guaranteed to be below 10% for a period of time.
> 
> I'm not sticking to the current implementation to create clear boundary 
> between young and old gen, especially for newer OpenJDK versions and I've been thinking of unifying 
> the 2 generations' resizing within the single memory pool of heap along with Xms. The periodic 
> uncommit mode does not strictly separate the young/old gen. Current implementation calculates the 
> average GC interval and keep it in a certain range between a low bound and high bound and will immediately 
> trigger an expansion if a single GC interval is smaller than a threshold. We can use a similar 
> policy to estimate a target young generation capacity and adjust the capacity of old generation after a 
> concurrent cycle. The 2 parts together can be the target heap capacity. The capacity can vary between 
> Xms and Xmx. The difference with current G1 is it can be resized in a young GC not only remark.

Thank you for presenting your problem (and not insisting on a particular 
solution upfront).

Summary of this long text:

In case of "low" activity the user wants to limit the heap, resulting in 
memory being given back. Currently, all the user can do is specify the 
maximum amount of work the gc is allowed to use (GCTimeRatio). G1 at 
least, as soon as the time spent in gc compared to mutator time is lower 
than GCTimeRatio (typically achieved by expanding the heap), "never" 
shrinks the heap back (at least not based on that ratio). That wastes 
lots of space, which is the problem.

We all agree that this is a problem :) I believe we only differ on what 
knobs the user should have available to achieve this.

Here are my current suggestions:

One option that I suggested earlier: instead of setting generation 
sizes (or heap sizes) manually (which could be fine in some cases for 
other reasons), we could think a bit differently about GCTimeRatio than 
now. Currently it is the maximum amount of GC activity the user can 
bear, so we make the GC use less than that.
The slight tweak here could be that we assume that any GC activity below 
that is fine :)

Ie. if current GC activity is very low compared to mutator activity (far 
below what GCTimeRatio allows), and expected additional GC activity 
caused by this forced GC cycle would not exceed that GCTimeRatio, why 
not do the GC?

Think of a "minimum" GCTimeRatio; in some way this is very much like 
minimum and maximum GC intervals only with much more flexibility for the 
GC to meet (also this metric is independent of the environment, e.g. 
hardware, while setting actual values of sizes needs tuning).
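
A back-of-the-envelope sketch of that check, assuming the usual reading of GCTimeRatio=n as allowing GC a fraction 1/(1+n) of total time. The function names, the measurement window and the predicted cost of the forced cycle are all hypothetical inputs, not existing G1 code:

```cpp
#include <cassert>

// Fraction of total time that GCTimeRatio = n allows for GC: 1 / (1 + n).
double allowed_gc_fraction(unsigned gc_time_ratio) {
  return 1.0 / (1.0 + gc_time_ratio);
}

// gc_ms / mutator_ms: time measured over some recent window (assumption).
// extra_gc_ms: predicted cost of the additional, forced GC cycle (assumption).
// Returns true when GC activity including the forced cycle would still stay
// within what GCTimeRatio allows, i.e. "why not do the GC?".
bool should_force_gc(double gc_ms, double mutator_ms, double extra_gc_ms,
                     unsigned gc_time_ratio) {
  double total_ms = gc_ms + mutator_ms + extra_gc_ms;
  if (total_ms <= 0.0) {
    return false;  // nothing measured yet, no basis for a decision
  }
  double projected_fraction = (gc_ms + extra_gc_ms) / total_ms;
  return projected_fraction <= allowed_gc_fraction(gc_time_ratio);
}
```

For example, with GCTimeRatio=9 (10% allowed), 10 ms of GC against 990 ms of mutator time leaves room for a 50 ms forced cycle, while 90 ms of GC against 910 ms does not.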

I agree that there is then not an immediately obvious relation between 
external input (the traffic in your example) to what you should set that 
"minimum" GCTimeRatio to. However since there is a relation to young gen 
size and GCTimeRatio I think this can be figured out.

This is what ZGC does and I think would be worth trying out before 
thinking about adding a G1 specific way of achieving this or a similar 
effect.

The other option, which is more direct, would be implementing a target 
heap size that can be changed during runtime: it would also automatically 
shrink the heap. I believe that if you were able to modify the current adaptive 
IHOP's "target" heap size from outside, G1 would already automatically 
give back memory; in conjunction with the "Promptly Return ...", it 
would also make sure that in very low mutator activity cases the GC 
cycle would continue.

As for whether this feature would be accepted for inclusion into G1: 
there is already a SoftMaxHeapSize switch in the JDK, so I guess this is 
a non-issue.

Note that you can *already*, if you know that from a particular time on 
there will be little activity, modify the "Promptly Return..." settings 
so that it will immediately start cleaning up and compacting the heap; 
you can even force maximum compaction at that time by issuing a full gc 
if service interruption is not an issue.

> 
> In order to do swift heap resizing we have to conquer the over head of 
> memory request/release from OS. The memory unmap and map(including the page fault) cost significant 
> time. So we use an intuitive way to have a concurrent thread to do the map/unmap/pretouch. The free 
> regions will be synchronized in GC pause. In our applications, a typical G1 remark cost ~100ms of pause. I 
> haven't tested latest G1 but based on our experimental data, the pause can be easily doubled if done 
> considerable map/unmaps.
> 

That's a related but distinct problem and a solution that seems at least 
worth trying :)

> 
> All of above are our thoughts and the present implementation is kind of 
> reference. Please let me know if
> I answered all your questions. Hope we can come to an agreement in some 
> points and conceive a good design
> in latest G1 GC :)
> 

Thanks,
   Thomas


From zgu at redhat.com  Fri Oct 11 12:30:23 2019
From: zgu at redhat.com (Zhengyu Gu)
Date: Fri, 11 Oct 2019 08:30:23 -0400
Subject: RFR 8232010: Shenandoah: implement self-fixing native barrier
Message-ID: <6cecca8a-a477-53b4-48de-f504a2100955@redhat.com>

Please review this patch that implements self-fixing LRB for in native oops.

Bug: https://bugs.openjdk.java.net/browse/JDK-8232010
Webrev: http://cr.openjdk.java.net/~zgu/JDK-8232010/webrev.00/

Test:
   hotspot_gc_shenandoah (fastdebug and release) with x86_32 and x86_64 
JVM on Linux.

Thanks,

-Zhengyu


From leihouyju at gmail.com  Fri Oct 11 12:49:17 2019
From: leihouyju at gmail.com (Haoyu Li)
Date: Fri, 11 Oct 2019 20:49:17 +0800
Subject: [PATCH] Exploit Empty Regions in Young Gen to Enhance PS Full GC
 Performance
In-Reply-To: <92277aab-0578-9e2c-3f4f-55ae1e8c94a9@oracle.com>
References: <CAKSDcxsPcvigLaDTRyJALk9O0r-JsYV1CcwFzb3KaM98+LVLcg@mail.gmail.com>
 <B723074C-94DF-450D-9715-497736E9CD27@oracle.com>
 <CAKSDcxvTzfq1eR0DXi8iLQ4bNP8LcjsV5Px59-EMOgR0zaHvtQ@mail.gmail.com>
 <4F02DD53-EA98-4A1A-B871-C6E9D9610B2C@oracle.com>
 <9B69AFD1-7AE2-4B50-BFCF-C9C6E2594240@oracle.com>
 <CAKSDcxs131XdbHHErgbH1UDYX6_+=CSa-4dt4n5LgbyPVgjO_w@mail.gmail.com>
 <E4274448-26B2-46B8-883C-70FC7AFFB23B@oracle.com>
 <CAKSDcxvHwOWtAEm8TPPvO=C8q9LRwJRUQX6nbh-HSZXBKKwvYg@mail.gmail.com>
 <fb385d33-c420-4c81-320c-9aa1dad64a44@oracle.com>
 <CAKSDcxv5eccaV54NThgviLK+84U-Z62U0CYr3s+8ncNR1cna5w@mail.gmail.com>
 <8ef5b52e-d6fc-3073-5ca7-44c87c1eb981@oracle.com>
 <CAKSDcxsQWJ1tpnsc8UnN3E=XJfVwHEdE9WSd_=6KR_tLSHW6rQ@mail.gmail.com>
 <92277aab-0578-9e2c-3f4f-55ae1e8c94a9@oracle.com>
Message-ID: <CAKSDcxu_t+Ka0LUU3WSzdR52-_+rsvKo_Vxn8av=duFmS4EPyA@mail.gmail.com>

Hi Stefan,

Thanks for your suggestion! It is indeed redundant that
PSParallelCompact::fill_shadow_region() copies most of its code from
PSParallelCompact::fill_region(), and therefore I've refactored these
two functions to share as much code as possible. The updated patch is
attached.

Specifically, the closure, which moves objects, in
PSParallelCompact::fill_region() is now declared as a template of
either MoveAndUpdateClosure or ShadowClosure. So by controlling the
type of closure when invoking the function, we can decide whether to
fill a normal region or a shadow one. Thus, almost all code in
PSParallelCompact::fill_region() can be reused.

Besides, a virtual function named complete_region() is added in both
closures to do some work after the filling, such as setting states and
copying the shadow region back.
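
A simplified sketch of this structure, not the code in the attached patch. Region, MoveAndUpdateSketch and ShadowSketch are hypothetical stand-ins, and having the shadow closure derive from the normal one is an assumption made to keep the example short:

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a heap region: its contents plus a state flag.
struct Region {
  std::vector<int> words;
  bool completed = false;
};

// Moves objects directly into the destination region.
class MoveAndUpdateSketch {
 public:
  explicit MoveAndUpdateSketch(Region& dest) : _dest(dest) {}
  virtual ~MoveAndUpdateSketch() {}
  void copy_word(int w) { target().push_back(w); }
  // Work done after filling: here only the state update.
  virtual void complete_region() { _dest.completed = true; }
 protected:
  virtual std::vector<int>& target() { return _dest.words; }
  Region& _dest;
};

// Moves objects into a shadow buffer first and copies them back at the end.
class ShadowSketch : public MoveAndUpdateSketch {
 public:
  explicit ShadowSketch(Region& dest) : MoveAndUpdateSketch(dest) {}
  void complete_region() override {
    _dest.words = _shadow;   // copy the shadow region back
    _dest.completed = true;  // then update the state
  }
 protected:
  std::vector<int>& target() override { return _shadow; }
 private:
  std::vector<int> _shadow;
};

// One filling routine; the closure type decides normal vs. shadow filling.
template <typename ClosureT>
void fill_region(Region& dest, const std::vector<int>& live_words) {
  ClosureT closure(dest);
  for (int w : live_words) {
    closure.copy_word(w);
  }
  closure.complete_region();
}
```

Calling fill_region<MoveAndUpdateSketch>(r, words) or fill_region<ShadowSketch>(r, words) then selects the variant at the call site while the loop itself is shared.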

Thanks again for reviewing the patch, looking forward to your insights
and suggestions!

Best Regards,
Haoyu Li

2019-10-10 21:50 GMT+08:00, Stefan Johansson <stefan.johansson at oracle.com>:
> Thanks for the clarification =)
>
> Moving on to the next part, the code in the patch. So this won't be a
> full review of the patch but just an initial comment that I would like
> to be addressed first.
>
> The new function PSParallelCompact::fill_shadow_region() is more or less
> a copy of PSParallelCompact::fill_region() and I understand that from a
> proof of concept point of view it was the easy (and right) way to do it.
> I would prefer if the code could be refactored so that fill_region() and
> fill_shadow_region() share more code. There might be reasons that I've
> missed, that prevents it, but we should at least explore how much code
> can be shared.
>
> Thanks,
> Stefan
>
> On 2019-10-10 15:10, Haoyu Li wrote:
>> Hi Stefan,
>>
>> Thanks for your quick response! As to your concern about the OCA, I am
>> the sole author of the patch. And what the agreement states is indeed
>> the case.
>> Best Regards,
>> Haoyu Li,
>>
>>
>> Stefan Johansson <stefan.johansson at oracle.com
>> <mailto:stefan.johansson at oracle.com>> wrote on 2019-10-10, 8:37:
>>
>>     Hi,
>>
>>     On 2019-10-10 13:06, Haoyu Li wrote:
>>      > Hi Stefan,
>>      >
>>      > Thanks for your testing! One possible reason for the regressions
>> in
>>      > simple tests is that the region dependencies maybe not heavy
>> enough.
>>      > Because the locality of shadow regions is lower than that of heap
>>      > regions, writing to shadow regions will be slower than to normal
>>      > regions, and this is a part of the reason why I reuse shadow
>>     regions.
>>      > Therefore, if only a few shadow regions are created and not
>>     reused, the
>>      > overhead may not be amortized.
>>
>>     I guess it is something like this. I thought that for "easy" heaps
>> the
>>     shadow regions won't be used at all, and should therefore not really
>>     cost
>>     anything.
>>
>>      >
>>      > As to the OCA, it is the case that I'm the only person signing the
>>      > agreement. Please let me know if you have any further questions.
>>     Thanks
>>      > again!
>>
>>     Ok, so you are the sole author of the patch. The important part, as
>> the
>>     agreement states, is:
>>     "no other person or entity, including my employer, has or will have
>>     rights with respect my contributions"
>>
>>     Is that the case?
>>
>>     Thanks,
>>     Stefan
>>
>>      >
>>      > Best Regards,
>>      > Haoyu Li
>>      >
>>      > Stefan Johansson <stefan.johansson at oracle.com
>>     <mailto:stefan.johansson at oracle.com>
>>      > <mailto:stefan.johansson at oracle.com
>>     <mailto:stefan.johansson at oracle.com>>> wrote on 2019-10-08, 6:49:
>>      >
>>      >     Hi Haoyu,
>>      >
>>      >     I've done some more testing and I haven't seen any issues
>>     with the
>>      >     patch
>>      >     so far and the performance looks promising in most cases. For
>>     simple
>>      >     tests I've seen some regressions, but I'm not really sure
>>     why. Will do
>>      >     some more digging.
>>      >
>>      >     To move forward with this the first thing we need to do is
>>     making sure
>>      >     that you being covered by the Oracle Contributor Agreement is
>>     enough.
>>      >       From what we can see it is only you as an individual that
>>     has signed
>>      >     the OCA and in that case it is important that this statement
>>     from the
>>      >     OCA is fulfilled: "no other person or entity, including my
>>     employer,
>>      >     has
>>      >     or will have rights with respect my contributions"
>>      >
>>      >     Is this the case for this contribution or should we have the
>>     university
>>      >     sign the OCA as well? For more information regarding the OCA
>>     please
>>      >     refer to:
>>      > https://www.oracle.com/technetwork/oca-faq-405384.pdf
>>      >
>>      >     Thanks,
>>      >     Stefan
>>      >
>>      >     On 2019-09-16 16:02, Haoyu Li wrote:
>>      >      > FYI, the evaluation results on OpenJDK 14 are plotted in
>> the
>>      >     attachment.
>>      >      > I compute the full GC throughput by dividing the heap size
>>     before
>>      >     full
>>      >      > GC by the GC pause time, and the results are arithmetic
>> mean
>>      >     values of
>>      >      > ten runs after a warm-up run. The evaluation is conducted on
>> a
>>      >     machine
>>      >      > with dual Intel Xeon E5-2618L v3 CPUs (2 sockets, 16
>>     physical
>>      >     cores
>>      >      > with SMT enabled) and 64G DRAM.
>>      >      >
>>      >      > Best Regards,
>>      >      > Haoyu Li,
>>      >      > Institute of Parallel and Distributed Systems(IPADS),
>>      >      > School of Software,
>>      >      > Shanghai Jiao Tong University
>>      >      >
>>      >      >
>>      >      > Stefan Johansson <stefan.johansson at oracle.com
>>     <mailto:stefan.johansson at oracle.com>
>>      >     <mailto:stefan.johansson at oracle.com
>>     <mailto:stefan.johansson at oracle.com>>
>>      >      > <mailto:stefan.johansson at oracle.com
>>     <mailto:stefan.johansson at oracle.com>
>>      >     <mailto:stefan.johansson at oracle.com
>>     <mailto:stefan.johansson at oracle.com>>>> wrote on 2019-09-12, 5:34:
>>      >      >
>>      >      >     Hi Haoyu,
>>      >      >
>>      >      >     I recently came across your patch and I would like to
>>     pick up on
>>      >      >     some of the things Kim mentioned in his mails. I
>>     especially want
>>      >      >     evaluate and investigate if this is a technique we can
>>     use to
>>      >      >     improve the other GCs as well. To start that work I
>>     want to
>>      >     take the
>>      >      >     patch for a spin in our internal performance testing.
>>     The patch
>>      >      >     doesn't apply cleanly to the latest JDK repository, so
>>     if you could
>>      >      >     provide an updated patch that would be very helpful.
>>      >      >
>>      >      >     It would also be great if you could share some more
>>     information
>>      >      >     around the results presented in the paper. For example,
>> it
>>      >     would be
>>      >      >     good to get the full command lines for the different
>>      >     benchmarks so
>>      >      >     we can run them locally and reproduce the
>>     results you've seen.
>>      >      >
>>      >      >     Thanks,
>>      >      >     Stefan
>>      >      >
>>      >      >>     12 mars 2019 kl. 03:21 skrev Haoyu Li
>>     <leihouyju at gmail.com <mailto:leihouyju at gmail.com>
>>      >     <mailto:leihouyju at gmail.com <mailto:leihouyju at gmail.com>>
>>      >      >>     <mailto:leihouyju at gmail.com
>>     <mailto:leihouyju at gmail.com> <mailto:leihouyju at gmail.com
>>     <mailto:leihouyju at gmail.com>>>>:
>>      >      >>
>>      >      >>     Hi Kim,
>>      >      >>
>>      >      >>     Thanks for reviewing and testing the patch. If there
>>     are any
>>      >      >>     failures or performance degradation relevant to the
>>     work, please
>>      >      >>     let me know and I'll be very happy to keep improving
>> it.
>>      >     Also, any
>>      >      >>     suggestions about code improvements are well
>> appreciated.
>>      >      >>
>>      >      >>     I'm not quite sure if both G1 and Shenandoah have the
>>     similar
>>      >      >>     region dependency issue, since I haven't studied their
>> GC
>>      >      >>     behaviors before. If they have, I'm also willing to
>>     propose
>>      >     a more
>>      >      >>     general optimization.
>>      >      >>
>>      >      >>     As to the memory overhead, I believe it will be low
>>     because this
>>      >      >>     patch exploits empty regions in the young space
>>     rather than
>>      >      >>     off-heap memory to allocate shadow regions, and also
>>     reuses the
>>      >      >>     /_source_region/ field of each /RegionData /to record
>> the
>>      >      >>     corresponding shadow region index. We only introduce
>>     a new
>>      >      >>     integer field /_shadow /in the RegionData class to
>>     indicate the
>>      >      >>     status of a region, a global /GrowableArray
>>     _free_shadow/ to
>>      >     store
>>      >      >>     the indices of shadow regions, and a global
>>     /Monitor/ to protect
>>      >      >>     the array. These information might help if the memory
>>     overhead
>>      >      >>     need to be evaluated.
>>      >      >>
>>      >      >>     Looking forward to your insight.
>>      >      >>
>>      >      >>     Best Regards,
>>      >      >>     Haoyu Li,
>>      >      >>     Institute of Parallel and Distributed Systems(IPADS),
>>      >      >>     School of Software,
>>      >      >>     Shanghai Jiao Tong University
>>      >      >>
>>      >      >>
>>      >      >>     Kim Barrett <kim.barrett at oracle.com
>>     <mailto:kim.barrett at oracle.com>
>>      >     <mailto:kim.barrett at oracle.com
>> <mailto:kim.barrett at oracle.com>>
>>      >      >>     <mailto:kim.barrett at oracle.com
>>     <mailto:kim.barrett at oracle.com>
>>      >     <mailto:kim.barrett at oracle.com
>>     <mailto:kim.barrett at oracle.com>>>> wrote on 2019-03-12, 6:11:
>>      >      >>
>>      >      >>         > On Mar 11, 2019, at 1:45 AM, Kim Barrett
>>      >      >>         <kim.barrett at oracle.com
>>     <mailto:kim.barrett at oracle.com> <mailto:kim.barrett at oracle.com
>>     <mailto:kim.barrett at oracle.com>>
>>      >     <mailto:kim.barrett at oracle.com
>>     <mailto:kim.barrett at oracle.com> <mailto:kim.barrett at oracle.com
>>     <mailto:kim.barrett at oracle.com>>>> wrote:
>>      >      >>         >
>>      >      >>         >> On Jan 24, 2019, at 3:58 AM, Haoyu Li
>>      >     <leihouyju at gmail.com <mailto:leihouyju at gmail.com>
>>     <mailto:leihouyju at gmail.com <mailto:leihouyju at gmail.com>>
>>      >      >>         <mailto:leihouyju at gmail.com
>>     <mailto:leihouyju at gmail.com>
>>      >     <mailto:leihouyju at gmail.com <mailto:leihouyju at gmail.com>>>>
>>     wrote:
>>      >      >>         >>
>>      >      >>         >> Hi Kim,
>>      >      >>         >>
>>      >      >>         >> I have ported my patch to OpenJDK 13 according
>>     to your
>>      >      >>         instructions in your last mail, and the patch is
>>     attached in
>>      >      >>         this mail. The patch does not change much since
>>     PSGC is
>>      >     indeed
>>      >      >>         pretty stable.
>>      >      >>         >>
>>      >      >>         >> Also, I evaluate the correctness and
>>     performance of
>>      >     PS full
>>      >      >>         GC with benchmarks from DaCapo, SPECjvm2008, and
>>     JOlden
>>      >     suites
>>      >      >>         on a machine with dual Intel Xeon E5-2618L v3
>> CPUs(16
>>      >     physical
>>      >      >>         cores), 64G DRAM and linux kernel 4.17. The
>>     evaluation
>>      >     result,
>>      >      >>         indicating 1.9X GC throughput improvement on
>>     average, is
>>      >      >>         attached, too.
>>      >      >>         >>
>>      >      >>         >> However, I have no idea how to further test
>> this
>>      >     patch for
>>      >      >>         both correctness and performance. Can I please
>>     get any
>>      >      >>         guidance from you or some sponsor?
>>      >      >>         >
>>      >      >>         > Sorry I missed that you had sent an updated
>>     version of the
>>      >      >>         patch.
>>      >      >>         >
>>      >      >>         > I've run the full regression suite across
>>     Oracle-supported
>>      >      >>         platforms.  There are some
>>      >      >>         > failures, but there are almost always some
>>     failures in the
>>      >      >>         later tiers right now.  I'll start
>>      >      >>         > looking at them tomorrow to figure out whether
>>     any of them
>>      >      >>         are relevant.
>>      >      >>         >
>>      >      >>         > I'm also planning to run some of our performance
>>      >     benchmarks.
>>      >      >>         >
>>      >      >>         > I've lightly skimmed the proposed changes.
>>     There might be
>>      >      >>         some code improvements
>>      >      >>         > to be made.
>>      >      >>         >
>>      >      >>         > I'm also wondering if this technique applies to
>>     other
>>      >      >>         collectors.  It seems like both G1 and
>>      >      >>         > Shenandoah full gc's might have similar
>>     issues?  If so, a
>>      >      >>         solution that is ParallelGC-specific
>>      >      >>         > is less interesting than one that has broader
>>      >      >>         applicability.  Though maybe this optimization
>>      >      >>         > is less important for G1 and Shenandoah, since
>> they
>>      >     actively
>>      >      >>         try to avoid full gc's.
>>      >      >>         >
>>      >      >>         > I'm also not clear on how much additional
>>     memory might be
>>      >      >>         temporarily allocated by this
>>      >      >>         > mechanism.
>>      >      >>
>>      >      >>         I've created a CR for this:
>>      >      >> https://bugs.openjdk.java.net/browse/JDK-8220465
>>      >      >>
>>      >      >
>>      >
>>
>


-- 
Best Regards,
Haoyu Li,
Institute of Parallel and Distributed Systems(IPADS),
School of Software,
Shanghai Jiao Tong University
-------------- next part --------------
A non-text attachment was scrubbed...
Name: shadow-region.patch
Type: text/x-patch
Size: 23000 bytes
Desc: not available
URL: <https://mail.openjdk.org/pipermail/hotspot-gc-dev/attachments/20191011/d8a3cfce/shadow-region.patch>

From zgu at redhat.com  Fri Oct 11 17:11:57 2019
From: zgu at redhat.com (Zhengyu Gu)
Date: Fri, 11 Oct 2019 13:11:57 -0400
Subject: RFR 8232009: Shenandoah: C2 load barrier does not match interpreter
 version
Message-ID: <689f5b52-de3b-6a1a-0032-365dedf58414@redhat.com>

Please review this patch that matches C2 load barrier to interpreter's 
implementation.

Bug: https://bugs.openjdk.java.net/browse/JDK-8232009
Webrev: http://cr.openjdk.java.net/~zgu/JDK-8232009/webrev.00/

Test:
   hotspot_gc_shenandoah (fastdebug and release) with x86_32 and x86_64 
JVMs on Linux


Thanks,

-Zhengyu


From sangheon.kim at oracle.com  Fri Oct 11 17:34:03 2019
From: sangheon.kim at oracle.com (sangheon.kim at oracle.com)
Date: Fri, 11 Oct 2019 10:34:03 -0700
Subject: RFR(XL): 8220310: Implementation: NUMA-Aware Memory Allocation
 for G1, Mutator (1/3)
In-Reply-To: <CB368332-D86B-40D2-B152-43B726938DD2@oracle.com>
References: <e4c60a5c-cb08-004a-ce77-e4d20d4d6891@oracle.com>
 <06ACBF87-ADBE-499F-B668-0274E4925B26@oracle.com>
 <2b37edd6-3e0f-013d-1616-9d003f8ac1ed@oracle.com>
 <74ACAF31-8233-482A-892E-0D2E7CA72F4F@oracle.com>
 <4afe9f43-4cfa-9384-f45f-f985399629dd@oracle.com>
 <CD6BDB8C-7777-4872-BCC0-CDFB4978F876@oracle.com>
 <d8758f38-4818-e6b7-c158-118974f0ff1c@oracle.com>
 <CB368332-D86B-40D2-B152-43B726938DD2@oracle.com>
Message-ID: <77f6c57a-65a6-2727-cbe9-fbc1ed52a015@oracle.com>

Hi Kim,

On 10/10/19 4:34 PM, Kim Barrett wrote:
>> On Oct 9, 2019, at 12:27 AM, sangheon.kim at oracle.com wrote:
>> Webrev:
>> http://cr.openjdk.java.net/~sangheki/8220310/webrev.3
>> http://cr.openjdk.java.net/~sangheki/8220310/webrev.3.inc
>> Testing: hs-tier 1~5, with/without UseNUMA
> I agree with Stefan and Thomas; this is looking pretty good.
:)

>
> There are some naming issues that I'm not going to comment on here.
> Stefan has already commented on some, and a bit of offline discussion
> suggests there's a larger naming discussion needed, but which can
> follow getting the functionality we want.
>
> There has been further discussion offline toward collapsing
> G1MemoryNodeManager to one class without virtual dispatch, and using
> G1NUMA name. I won't bother to re-iterate any of that here.
Okay.

> ------------------------------------------------------------------------------
> src/hotspot/share/gc/g1/g1Allocator.cpp
>   186   assert(Heap_lock->owner() != NULL, "Should be owned on this thread's behalf.");
>
> Use assert_lock_strong(Heap_lock).
It didn't work.
assert_lock_strong() checks "lock->owned_by_self()" which is not 
equivalent to "lock::owner() != NULL". Am I missing something?

Since this is pre-existing code, I would like to leave as is.
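
The difference between the two checks can be illustrated with a toy lock. This is only a minimal sketch, not HotSpot's Mutex: ToyThread, ToyLock, owned_by_anyone() and owned_by() are hypothetical names. It assumes, as the assert message suggests, that the lock may be held by another thread on this thread's behalf.

```cpp
// Toy model only; none of these types exist in HotSpot.
struct ToyThread { int id; };

struct ToyLock {
  const ToyThread* owner;  // nullptr when the lock is not held

  // Analogue of "lock->owner() != NULL": the lock is held by some thread.
  constexpr bool owned_by_anyone() const { return owner != nullptr; }

  // Analogue of "lock->owned_by_self()": the lock is held by the given
  // (current) thread.
  constexpr bool owned_by(const ToyThread* current) const {
    return owner == current;
  }
};

static constexpr ToyThread vm_thread{1};
static constexpr ToyThread gc_worker{2};

// Assumed scenario: the VM thread holds the lock while a GC worker runs
// on its behalf.
static constexpr ToyLock heap_lock{&vm_thread};
```

Seen from gc_worker, heap_lock.owned_by_anyone() holds while heap_lock.owned_by(&gc_worker) does not, which is why the stronger check would fail in such a scenario.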

>
> ------------------------------------------------------------------------------
> src/hotspot/share/gc/g1/g1RegionToSpaceMapper.cpp
>    82       _storage.request_memory_on_node(page, _pages_per_region, node_index);
> ...
>   153         _storage.request_memory_on_node(idx, 1, node_index);
>
> I'm not sure request_memory_on_node belongs on the _storage object.
> The current implementation just has the storage object (conditionally)
> forward the request to the memory node manager object. These places in
> the space mapper could just make the calls on the memory node manager
> object directly (it is already being used nearby).  And these places
> don't need the conditionalization.
>
> I think making the space mapper directly call the memory node manager
> here would remove the need for the proposed changes to the virtual
> space class.
Fixed to directly call G1NUMA::request_memory_on_node() (previously 
G1MemoryNodeManager).
But G1NUMA can't calculate the raw address by itself, so I had to add a 
base address to G1NUMA to get that.

When I implemented it, I had a similar opinion (not a good fit for 
_storage), but I also wanted to avoid adding an extra dependency to 
G1NUMA. Anyway, I realized we can achieve it easily if we have the base 
address.
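
To illustrate why a base address is enough: with it, a page-based request can be turned into a raw address range by simple arithmetic. This is a minimal sketch and not the code in the webrev; PageRange and page_range() are hypothetical names, and a uniform page size is assumed.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical result type: start address and length of a page range.
struct PageRange {
  std::uintptr_t start;
  std::size_t size_in_bytes;
};

// Hypothetical helper: translate (start page, page count) into a raw
// address range, given the base address of the reserved space.
constexpr PageRange page_range(std::uintptr_t base, std::size_t page_size,
                               std::size_t start_page, std::size_t num_pages) {
  return PageRange{base + static_cast<std::uintptr_t>(start_page * page_size),
                   num_pages * page_size};
}
```

For example, page_range(0x1000, 4096, 2, 3) describes the 12288 bytes starting at 0x3000.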

>
> ------------------------------------------------------------------------------
> src/hotspot/share/gc/g1/heapRegion.cpp
>   464     st->print("|Node ID %02d", node_ids[this->node_index()]);
>
> The unchecked use of node_index() here can run afoul of an unset (so
> UnknownNodeIndex) index.
Added such checking.
>
> Also, no need for `this->` in `this->node_index()`.
Removed.
I'm aware but tried to follow local style which uses 'this->' in that code.
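
The kind of check discussed above could look like the following. This is a minimal sketch, not the code in the webrev: node_id_or_invalid() is a hypothetical helper, and the sentinel values chosen here for UnknownNodeIndex and InvalidNUMAId are assumptions for illustration.

```cpp
// Assumed sentinel values, for illustration only.
static constexpr unsigned UnknownNodeIndex = ~0u;
static constexpr int InvalidNUMAId = -1;

// Hypothetical helper: look up the node id only if the index is set and
// within range; otherwise report an invalid id instead of reading past
// the end of the array.
constexpr int node_id_or_invalid(const int* node_ids, unsigned num_nodes,
                                 unsigned node_index) {
  return (node_index != UnknownNodeIndex && node_index < num_nodes)
             ? node_ids[node_index]
             : InvalidNUMAId;
}

// Example mapping from node index to OS node id.
static constexpr int example_node_ids[] = {0, 2};
```

With this, printing a region whose node index was never set would show the invalid id rather than index the array with a sentinel.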

>
> ------------------------------------------------------------------------------
> src/hotspot/share/gc/g1/g1MemoryNodeManager.hpp
>    81   virtual const uint max_search_depth() const { return 1; }
>
> s/const uint/uint/
>
> Similarly for other declarations and definitions.
Done.

>
> ------------------------------------------------------------------------------
> src/hotspot/share/gc/g1/g1MemoryNodeManager.hpp
>    77   virtual void request_memory_on_node(char* aligned_address, size_t size_in_bytes, uint node_index) { }
>
> Shouldn't the aligned_address argument be typed "void*" rather than "char*"?
The signature of that method changed to be page-based, and the newly 
added member is void* (i.e. G1NUMA's void* _base_address).
But eventually we need char* to call numa_make_local(char*, , ).

>
> ------------------------------------------------------------------------------
> src/hotspot/share/gc/g1/heapRegionManager.cpp
>   112   if (mgr->has_multi_nodes() && requested_node_index != G1MemoryNodeManager::AnyNodeIndex) {
>
> I think it would be better to test the requested_node_index value
> first.  The "any" case is a common case.
Done

>
> ------------------------------------------------------------------------------
> src/hotspot/share/gc/g1/heapRegionManager.cpp
>   200   if(AlwaysPreTouch) {
>
> Add space after "if".
Done

>
> ------------------------------------------------------------------------------
> src/hotspot/share/gc/g1/heapRegionManager.cpp
>   311   return region_node_index == preferred_node_index;
>
> Fix indentation.
Done

>
> ------------------------------------------------------------------------------
> src/hotspot/share/runtime/os.hpp
>   393   static const int InvalidId = -1;
>
> This should probably be "InvalidNUMAId" or something like that.
Changed to InvalidNUMAId.

FYI, I filed JDK-8232156 for further investigation of initialization 
order related to G1NUMA. i.e. about removing G1NUMA::set_region_info().

New webrev includes:
1. Addressed most comments from Kim, Stefan and Thomas.
2. Renamed G1MemoryNodeManager to G1NUMA, removing virtual calls.

webrev:
http://cr.openjdk.java.net/~sangheki/8220310/webrev.4
http://cr.openjdk.java.net/~sangheki/8220310/webrev.4.inc

Testing: hs-tier 1 ~ 5 with/without UseNUMA

Thanks,
Sangheon


>
> ------------------------------------------------------------------------------
>


From kim.barrett at oracle.com  Fri Oct 11 18:30:00 2019
From: kim.barrett at oracle.com (Kim Barrett)
Date: Fri, 11 Oct 2019 14:30:00 -0400
Subject: RFR(XL): 8220310: Implementation: NUMA-Aware Memory Allocation
 for G1, Mutator (1/3)
In-Reply-To: <77f6c57a-65a6-2727-cbe9-fbc1ed52a015@oracle.com>
References: <e4c60a5c-cb08-004a-ce77-e4d20d4d6891@oracle.com>
 <06ACBF87-ADBE-499F-B668-0274E4925B26@oracle.com>
 <2b37edd6-3e0f-013d-1616-9d003f8ac1ed@oracle.com>
 <74ACAF31-8233-482A-892E-0D2E7CA72F4F@oracle.com>
 <4afe9f43-4cfa-9384-f45f-f985399629dd@oracle.com>
 <CD6BDB8C-7777-4872-BCC0-CDFB4978F876@oracle.com>
 <d8758f38-4818-e6b7-c158-118974f0ff1c@oracle.com>
 <CB368332-D86B-40D2-B152-43B726938DD2@oracle.com>
 <77f6c57a-65a6-2727-cbe9-fbc1ed52a015@oracle.com>
Message-ID: <8CA80180-7C9A-423D-8804-653CA59E3DF1@oracle.com>

> On Oct 11, 2019, at 1:34 PM, sangheon.kim at oracle.com wrote:
> On 10/10/19 4:34 PM, Kim Barrett wrote:
>> src/hotspot/share/gc/g1/g1Allocator.cpp
>>  186   assert(Heap_lock->owner() != NULL, "Should be owned on this thread's behalf.");
>> 
>> Use assert_lock_strong(Heap_lock).
> It didn't work.
> assert_lock_strong() checks "lock->owned_by_self()" which is not equivalent to "lock::owner() != NULL". Am I missing something?
> 
> Since this is pre-existing code, I would like to leave as is.

Oh, bleh, you are right.  I didn't read the existing code carefully enough.

>> src/hotspot/share/gc/g1/heapRegion.cpp
>>  464     st->print("|Node ID %02d", node_ids[this->node_index()]);
>> 
>> The unchecked use of node_index() here can run afoul of an unset (so
>> UnknownNodeIndex) index.
> Added such checking.
>> 
>> Also, no need for `this->` in `this->node_index()`.
> Removed.
> I'm aware but tried to follow local style which uses 'this->' in that code.

There is one other use of this-> in that function (and one additional one in the whole file).
The *vast* majority of accesses use the implicit this.  So I wouldn't describe that as the
local style, rather a couple of weirdnesses that probably should be cleaned up.

> webrev:
> http://cr.openjdk.java.net/~sangheki/8220310/webrev.4
> http://cr.openjdk.java.net/~sangheki/8220310/webrev.4.inc
> 
> Testing: hs-tier 1 ~ 5 with/without UseNUMA

I've started looking at the new webrev.  Looking good, and no comments yet, but not done yet either.


From shade at redhat.com  Fri Oct 11 18:36:02 2019
From: shade at redhat.com (Aleksey Shipilev)
Date: Fri, 11 Oct 2019 20:36:02 +0200
Subject: RFR (XS/T) 8232176: Shenandoah: new assert in
 ShenandoahEvacuationTask is too strong
Message-ID: <f5b5969e-62cd-1a40-74d8-a42cf6c97931@redhat.com>

Recent regression:
  https://bugs.openjdk.java.net/browse/JDK-8232176

JDK-8231947 added an assert in ShenandoahEvacuationTask that is too strong. There is a corner case
when the region is collection-set-pinned (CSP), and the oom-evac-protocol waits for the GC thread to
complete the evacuation. There is a short window where the GC thread can see the CSP region before
seeing the cancellation request.

It seems easier to remove the too strong assert for now. is_conc_move_allowed() == true is a lie
right now. We can add cancelled_gc() check inside of it, but that would only be safe if we know that
caller holds oom-evac-scope.

The assertion failure reliably reproduces with -XX:ShenandoahGCHeuristics=aggressive
-XX:+ShenandoahOOMDuringEvacALot on SPECjvm2008.

Fix:
  https://cr.openjdk.java.net/~shade/8232176/webrev.01/

Testing: broken tests

-- 
Thanks,
-Aleksey


From kim.barrett at oracle.com  Fri Oct 11 20:12:44 2019
From: kim.barrett at oracle.com (Kim Barrett)
Date: Fri, 11 Oct 2019 16:12:44 -0400
Subject: RFR(XL): 8220310: Implementation: NUMA-Aware Memory Allocation
 for G1, Mutator (1/3)
In-Reply-To: <8CA80180-7C9A-423D-8804-653CA59E3DF1@oracle.com>
References: <e4c60a5c-cb08-004a-ce77-e4d20d4d6891@oracle.com>
 <06ACBF87-ADBE-499F-B668-0274E4925B26@oracle.com>
 <2b37edd6-3e0f-013d-1616-9d003f8ac1ed@oracle.com>
 <74ACAF31-8233-482A-892E-0D2E7CA72F4F@oracle.com>
 <4afe9f43-4cfa-9384-f45f-f985399629dd@oracle.com>
 <CD6BDB8C-7777-4872-BCC0-CDFB4978F876@oracle.com>
 <d8758f38-4818-e6b7-c158-118974f0ff1c@oracle.com>
 <CB368332-D86B-40D2-B152-43B726938DD2@oracle.com>
 <77f6c57a-65a6-2727-cbe9-fbc1ed52a015@oracle.com>
 <8CA80180-7C9A-423D-8804-653CA59E3DF1@oracle.com>
Message-ID: <4D393A46-3ADC-42DC-8C3D-D2132AB68D67@oracle.com>

> On Oct 11, 2019, at 2:30 PM, Kim Barrett <kim.barrett at oracle.com> wrote:
> 
>>> src/hotspot/share/gc/g1/heapRegion.cpp
>>> 464     st->print("|Node ID %02d", node_ids[this->node_index()]);
>>> 
>>> The unchecked use of node_index() here can run afoul of an unset (so
>>> UnknownNodeIndex) index.
>> Added such checking.
>>> 
>>> Also, no need for `this->` in `this->node_index()`.
>> Removed.
>> I'm aware but tried to follow local style which uses 'this->' in that code.
> 
> There is one other use of this-> in that function (and one additional one in the whole file).
> The *vast* majority of accesses use the implicit this.  So I wouldn't describe that as the
> local style, rather a couple of weirdnesses that probably should be cleaned up.

Looks like this has been fixed in the latest version.

>> webrev:
>> http://cr.openjdk.java.net/~sangheki/8220310/webrev.4
>> http://cr.openjdk.java.net/~sangheki/8220310/webrev.4.inc
>> 
>> Testing: hs-tier 1 ~ 5 with/without UseNUMA
> 
> I've started looking at the new webrev.  Looking good, and no comments yet, but not done yet either.

The only other thing I found was this:

------------------------------------------------------------------------------
src/hotspot/share/gc/g1/g1NUMA.hpp
  85   // Print current active memory node count.
  86   uint num_active_nodes() const;
 
"Print"?  Also, "current"? It doesn't change, I think.

------------------------------------------------------------------------------

Other than that, looks good to me.  I don't need another webrev for
a fix to that comment.


From sangheon.kim at oracle.com  Fri Oct 11 22:07:55 2019
From: sangheon.kim at oracle.com (sangheon.kim at oracle.com)
Date: Fri, 11 Oct 2019 15:07:55 -0700
Subject: RFR(XL): 8220310: Implementation: NUMA-Aware Memory Allocation
 for G1, Mutator (1/3)
In-Reply-To: <4D393A46-3ADC-42DC-8C3D-D2132AB68D67@oracle.com>
References: <e4c60a5c-cb08-004a-ce77-e4d20d4d6891@oracle.com>
 <06ACBF87-ADBE-499F-B668-0274E4925B26@oracle.com>
 <2b37edd6-3e0f-013d-1616-9d003f8ac1ed@oracle.com>
 <74ACAF31-8233-482A-892E-0D2E7CA72F4F@oracle.com>
 <4afe9f43-4cfa-9384-f45f-f985399629dd@oracle.com>
 <CD6BDB8C-7777-4872-BCC0-CDFB4978F876@oracle.com>
 <d8758f38-4818-e6b7-c158-118974f0ff1c@oracle.com>
 <CB368332-D86B-40D2-B152-43B726938DD2@oracle.com>
 <77f6c57a-65a6-2727-cbe9-fbc1ed52a015@oracle.com>
 <8CA80180-7C9A-423D-8804-653CA59E3DF1@oracle.com>
 <4D393A46-3ADC-42DC-8C3D-D2132AB68D67@oracle.com>
Message-ID: <f215266f-bf38-4894-db3f-eee7be00a8a4@oracle.com>

Hi Kim,

On 10/11/19 1:12 PM, Kim Barrett wrote:
>> On Oct 11, 2019, at 2:30 PM, Kim Barrett <kim.barrett at oracle.com> wrote:
>>
>>>> src/hotspot/share/gc/g1/heapRegion.cpp
>>>> 464     st->print("|Node ID %02d", node_ids[this->node_index()]);
>>>>
>>>> The unchecked use of node_index() here can run afoul of an unset (so
>>>> UnknownNodeIndex) index.
>>> Added such checking.
>>>> Also, no need for `this->` in `this->node_index()`.
>>> Removed.
>>> I'm aware but tried to follow local style which uses 'this->' in that code.
>> There is one other use of this-> in that function (and one additional one in the whole file).
>> The *vast* majority of accesses use the implicit this.  So I wouldn't describe that as the
>> local style, rather a couple of weirdnesses that probably should be cleaned up.
> Looks like this has been fixed in the latest version.
Yes

>
>>> webrev:
>>> http://cr.openjdk.java.net/~sangheki/8220310/webrev.4
>>> http://cr.openjdk.java.net/~sangheki/8220310/webrev.4.inc
>>>
>>> Testing: hs-tier 1 ~ 5 with/without UseNUMA
>> I've started looking at the new webrev.  Looking good, and no comments yet, but not done yet either.
> The only other thing I found was this:
>
> ------------------------------------------------------------------------------
> src/hotspot/share/gc/g1/g1NUMA.hpp
>    85   // Print current active memory node count.
>    86   uint num_active_nodes() const;
>   
> "Print"?  Also, "current"? It doesn't change, I think.
Okay, changed to 'Returns active memory node count'.

>
> ------------------------------------------------------------------------------
>
> Other than that, looks good to me.  I don't need another webrev for
> a fix to that comment.
Nice to hear!
Many thanks for all your thorough reviews.

Thanks,
Sangheon


From maoliang.ml at alibaba-inc.com  Sat Oct 12 11:51:26 2019
From: maoliang.ml at alibaba-inc.com (Liang Mao)
Date: Sat, 12 Oct 2019 19:51:26 +0800
Subject: Re: G1 patch of elastic Java heap
In-Reply-To: <5f02f337-f479-55f6-351e-867507845f65@oracle.com>
References: <6270ce59-4a8e-431e-9ccf-f6d2c0f927eb.maoliang.ml@alibaba-inc.com>
 <d82e704831f0afbc61f8a3fb6b69bb1463b7ede8.camel@oracle.com>
 <e4ffd4d9-3ec0-4592-ac8c-d5a77c6b2e75.maoliang.ml@alibaba-inc.com>
 <1267a5dd2cf6cc1d03df64d07a06ba0f45195951.camel@oracle.com>
 <3140197d-8cab-4a86-af92-58431c74cb6b.maoliang.ml@alibaba-inc.com>
 <a201e27d-d231-4787-8bba-55f5266206d1.maoliang.ml@alibaba-inc.com>
 <b691e0bd-dd9a-32e2-c950-9c84de29101c@oracle.com>
 <6cc5bdd7-c076-472f-8a36-8294c6cbfe21.maoliang.ml@alibaba-inc.com>,
 <5f02f337-f479-55f6-351e-867507845f65@oracle.com>
Message-ID: <66393648-73b1-4a45-9d48-c8fcf94789fa.maoliang.ml@alibaba-inc.com>

Hi Thomas,

The manual generation limit can be put aside for now, since we know it might not be general enough
for a GC. We can first focus on how to change the heap size and return memory at runtime.

GCTimeRatio is a good metric for measuring the health of a Java application, and I have considered
using it. In the end I chose a simpler way, similar to the periodic old GC: guaranteeing a long
enough young GC interval is an alternative way to keep GCTimeRatio in a healthy state.
I'm absolutely OK with using GCTimeRatio instead of the fixed young GC interval. This part is the
same as in ZGC or Shenandoah: how to balance the desired memory size against GC frequency. I'm open
to any good solution, and I think we are already on the same page on this issue :)

A big difference in our implementation is that heap resizing is evaluated in any young GC instead
of in a concurrent GC cycle, which I think is swifter and more immediate. The concurrent map/unmap
mechanism gets rid of the additional pause time. My thought is that heap shrink/expand can be
decided entirely in the young GC pause and performed in a concurrent thread, which keeps the
considerable time cost of the OS interface out of the pause. Most of our Java users cannot tolerate
the pause spikes caused by page faults, which can be up to seconds. We also found the issue of the
time cost of map/unmap in ZGC.
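
The split described above (decide in the pause, do the OS work concurrently) can be sketched roughly as follows. This is a minimal illustration and not the code of our patch: ConcurrentResizer and all of its members are hypothetical names, the resize request is reduced to a region count delta, and the actual map/unmap/pretouch work is only marked by a comment.

```cpp
#include <cassert>
#include <condition_variable>
#include <deque>
#include <mutex>
#include <thread>

// Hypothetical sketch: the pause only records a resize decision; a
// background thread applies it outside the pause.
class ConcurrentResizer {
 public:
  explicit ConcurrentResizer(long initial_regions)
      : _committed(initial_regions), _stop(false),
        _worker(&ConcurrentResizer::run, this) {}

  ~ConcurrentResizer() { shutdown(); }

  // Called "in the pause": cheap, just enqueues the decision.
  // Positive delta = expand, negative delta = shrink (in regions).
  void request(long region_delta) {
    {
      std::lock_guard<std::mutex> guard(_mutex);
      _queue.push_back(region_delta);
    }
    _cv.notify_one();
  }

  // Lets the worker drain outstanding requests, then stops it.
  void shutdown() {
    {
      std::lock_guard<std::mutex> guard(_mutex);
      _stop = true;
    }
    _cv.notify_one();
    if (_worker.joinable()) _worker.join();
  }

  long committed_regions() {
    std::lock_guard<std::mutex> guard(_mutex);
    return _committed;
  }

 private:
  void run() {
    std::unique_lock<std::mutex> lock(_mutex);
    for (;;) {
      _cv.wait(lock, [this] { return _stop || !_queue.empty(); });
      if (_queue.empty()) return;  // stop requested and nothing left to do
      long delta = _queue.front();
      _queue.pop_front();
      lock.unlock();
      // A real implementation would map/unmap/pretouch memory here,
      // without holding the lock and without stopping the application.
      lock.lock();
      _committed += delta;  // publish the result of the finished request
    }
  }

  std::mutex _mutex;
  std::condition_variable _cv;
  std::deque<long> _queue;
  long _committed;
  bool _stop;
  std::thread _worker;  // declared last so it starts after the other members
};
```

In this sketch request() returns immediately, so the cost of the OS calls never shows up in the caller; how the freed or newly committed regions are handed back to the collector in a later pause is left out.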

A direct advantage of the young GC resizing and concurrent memory free mechanism is for implementing
SoftMaxHeapSize. The heap size can be changed after the last mixed GC. The young GC won't have a
longer pause and the memory can be freed concurrently without side effects.

Thanks,
Liang


------------------------------------------------------------------
From:Thomas Schatzl <thomas.schatzl at oracle.com>
Send Time:2019 Oct. 11 (Fri.) 19:02
To:"MAO, Liang" <maoliang.ml at alibaba-inc.com>; hotspot-gc-dev <hotspot-gc-dev at openjdk.java.net>
Subject:Re: G1 patch of elastic Java heap

Hi,

On 10.10.19 15:48, Liang Mao wrote:
> Hi Thomas,
> 
> Thank you for the feedback.
> You are right about some points: the present code seems to separate 
> the heap into young and old gen pools. In OpenJDK8 there is no adaptive IHOP, so a fixed IHOP 
> and MaxNewSize can clearly separate young gen and old gen. I'm also thinking about how to design it better 
> in upstream OpenJDK G1.
> 
> There is a tradeoff between memory and GC frequency: more frequent GC 
> uses less memory. We found that our online service applications keep a large young generation for 
> potential query traffic, but most of the time the young GC frequency is quite low. Memory can easily be saved 
> by using a smaller young gen.
> In Shenandoah or ZGC there is only one generation, and it's 
> straightforward to determine whether memory is wasted and can be returned. G1 has two generations; in the remark phase, 
> MinHeapFreeRatio/MaxHeapFreeRatio cannot tell that the young generation is largely wasted when it has run for 2 minutes 
> without a young GC, and that we could return a lot of memory. Each generation's GC interval, or the time ratio 
> spent on mutator/gc that you mentioned, seems more intuitive.
> 
> The explicit limitation of a generation may not be a good design from G1 
> GC's perspective. From the operations point of view, it makes the JVM easy to manipulate. There is a 
> simple relationship: larger network traffic -> higher memory allocation rate -> larger young 
> generation. So cluster operations can simply set the young generation to 10% of the max young gen 
> size for every Java instance if the network traffic is guaranteed to stay below 10% for a period of time.
> 
> I'm not sticking to the current implementation, which creates a clear boundary 
> between young and old gen, especially for newer OpenJDK versions, and I've been thinking of unifying 
> the two generations' resizing within the single memory pool of the heap, along with Xms. The periodic 
> uncommit mode does not strictly separate the young/old gen. The current implementation calculates the 
> average GC interval, keeps it in a certain range between a low bound and a high bound, and immediately 
> triggers an expansion if a single GC interval is smaller than a threshold. We can use a similar 
> policy to estimate a target young generation capacity and adjust the capacity of the old generation after a 
> concurrent cycle. The two parts together can be the target heap capacity. The capacity can vary between 
> Xms and Xmx. The difference from current G1 is that it can be resized in a young GC, not only at remark.

Thank you for presenting your problem (and not insisting on a particular 
solution upfront).

Summary of this long text:

In case of "low" activity the user wants to limit the heap, resulting in 
memory being given back. Currently, all the user can do is specify the 
maximum amount of work the gc is allowed to do (GCTimeRatio). In G1 at 
least, as soon as the time spent in gc compared to mutator time is lower 
than GCTimeRatio (typically achieved by expanding the heap), it "never" 
shrinks the heap back (at least not based on that ratio). This wastes 
lots of space, which is the problem.

We all agree that this is a problem :) I believe we only differ on what 
knobs the user should have available to achieve this.

Here are my current suggestions:

One option, which I suggested earlier, is that instead of setting 
generation sizes (or heap sizes) manually (which could be fine in some 
cases for other reasons), we could think a bit differently about 
GCTimeRatio than we do now: currently it is the maximum amount of GC 
activity the user can bear, so we make the GC use less.
The slight tweak here could be that we assume that any GC activity below 
that is fine :)

Ie. if current GC activity is very low compared to mutator activity (far 
below what GCTimeRatio allows), and expected additional GC activity 
caused by this forced GC cycle would not exceed that GCTimeRatio, why 
not do the GC?

Think of a "minimum" GCTimeRatio; in some way this is very much like 
minimum and maximum GC intervals only with much more flexibility for the 
GC to meet (also this metric is independent of the environment, e.g. 
hardware, while setting actual values of sizes needs tuning).
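
As a rough illustration of that check: GCTimeRatio=N is documented as a goal of spending at most 1/(1+N) of the total time in GC, so the question "would one more GC still fit?" becomes a simple comparison. This is only a sketch under that assumption, not G1 code; allowed_gc_fraction() and extra_gc_fits_budget() are hypothetical names, and the time values are assumed to be measured over some recent interval with a non-zero total.

```cpp
// Fraction of total time that GC may use for a given GCTimeRatio value.
constexpr double allowed_gc_fraction(unsigned gc_time_ratio) {
  return 1.0 / (1.0 + gc_time_ratio);
}

// Hypothetical policy check: would the GC share of total time stay within
// the budget if one more GC costing expected_extra_gc_ms were performed?
// Assumes gc_ms + expected_extra_gc_ms + mutator_ms > 0.
constexpr bool extra_gc_fits_budget(double gc_ms, double mutator_ms,
                                    double expected_extra_gc_ms,
                                    unsigned gc_time_ratio) {
  return (gc_ms + expected_extra_gc_ms) /
             (gc_ms + expected_extra_gc_ms + mutator_ms) <=
         allowed_gc_fraction(gc_time_ratio);
}
```

For example, with GCTimeRatio=12 the budget is 1/13 (about 7.7%): after 10ms of GC and 990ms of mutator time, an extra 20ms GC keeps the share near 3% and fits, while the same extra GC after only 90ms of mutator time would push the share to 25% and does not.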

I agree that there is then not an immediately obvious relation between 
external input (the traffic in your example) to what you should set that 
"minimum" GCTimeRatio to. However since there is a relation to young gen 
size and GCTimeRatio I think this can be figured out.

This is what ZGC does and I think would be worth trying out before 
thinking about adding a G1 specific way of achieving this or a similar 
effect.

The other option which is more direct would be implementing and changing 
target heap size during runtime: it would also automatically shrink the 
heap. I believe that if you were able to modify the current adaptive 
IHOP's "target" heap size from outside, G1 would already automatically 
give back memory; in conjunction with the "Promptly Return ...", it 
would also make sure that in very low mutator activity cases the GC 
cycle would continue.

As for whether this feature would be accepted for inclusion into G1: 
there is already a SoftMaxHeapSize switch in the JDK, so I guess this is 
a non-issue.

Note that you can *already*, if you know that from a particular time on 
there will be little activity, modify the "Promptly Return..." settings 
so that it will immediately start cleaning up and compacting the heap; 
you can even force maximum compaction at that time by issuing a full gc 
if service interruption is not an issue.

> 
> In order to do swift heap resizing we have to overcome the overhead of 
> memory request/release from the OS. The memory unmap and map (including the page fault) cost significant 
> time. So we use an intuitive approach: a concurrent thread does the map/unmap/pretouch. The free 
> regions will be synchronized in the GC pause. In our applications, a typical G1 remark costs ~100ms of pause. I 
> haven't tested the latest G1, but based on our experimental data, the pause can easily be doubled if it does 
> considerable map/unmaps.
> 

That's a related but distinct problem and a solution that seems at least 
worth trying :)

> 
> All of above are our thoughts and the present implementation is kind of 
> reference. Please let me know if
> I answered all your questions. Hope we can come to an agreement in some 
> points and conceive a good design
> in latest G1 GC :)
> 

Thanks,
   Thomas


From thomas.schatzl at oracle.com  Sat Oct 12 15:00:19 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Sat, 12 Oct 2019 17:00:19 +0200
Subject: G1 patch of elastic Java heap
In-Reply-To: <66393648-73b1-4a45-9d48-c8fcf94789fa.maoliang.ml@alibaba-inc.com>
References: <6270ce59-4a8e-431e-9ccf-f6d2c0f927eb.maoliang.ml@alibaba-inc.com>
 <d82e704831f0afbc61f8a3fb6b69bb1463b7ede8.camel@oracle.com>
 <e4ffd4d9-3ec0-4592-ac8c-d5a77c6b2e75.maoliang.ml@alibaba-inc.com>
 <1267a5dd2cf6cc1d03df64d07a06ba0f45195951.camel@oracle.com>
 <3140197d-8cab-4a86-af92-58431c74cb6b.maoliang.ml@alibaba-inc.com>
 <a201e27d-d231-4787-8bba-55f5266206d1.maoliang.ml@alibaba-inc.com>
 <b691e0bd-dd9a-32e2-c950-9c84de29101c@oracle.com>
 <6cc5bdd7-c076-472f-8a36-8294c6cbfe21.maoliang.ml@alibaba-inc.com>
 ,<5f02f337-f479-55f6-351e-867507845f65@oracle.com>
 <66393648-73b1-4a45-9d48-c8fcf94789fa.maoliang.ml@alibaba-inc.com>
Message-ID: <858f43f72ea9325907a6bd6955768af3d64e57fc.camel@oracle.com>

Hi,

On Sat, 2019-10-12 at 19:51 +0800, Liang Mao wrote:
> Hi Thomas,
> 
> The manual generation limit can be put aside for now, since we know
> it might not be general enough for a GC. We can first focus on how to
> change the heap size and return memory at runtime.
> 
> GCTimeRatio is a good metric for measuring the health of a Java
> application, and I have considered using it. In the end I chose
> a simpler way, similar to the periodic old GC: guaranteeing a long
> enough young GC interval is an alternative way to keep
> GCTimeRatio in a healthy state.
> I'm absolutely OK with using GCTimeRatio instead of the fixed young GC
> interval. This part is the same as in ZGC or Shenandoah: how to balance
> the desired memory size against GC frequency. I'm open to any good
> solution, and I think we are already on the same page on this
> issue :)

+1

> A big difference in our implementation is that heap resizing is
> evaluated in any young GC instead of in a concurrent GC cycle, which I
> think is swifter and more immediate. The concurrent map/unmap
> mechanism gets rid of the additional pause time. My thought is that
> heap shrink/expand can be decided entirely in the young GC pause and
> performed in a concurrent thread, which keeps the considerable
> time cost of the OS interface out of the pause. Most of our Java users
> cannot tolerate the pause spikes caused by page faults, which can be up
> to seconds. We also found the issue of the time cost of map/unmap in
> ZGC.
>
> A direct advantage of the young GC resizing and concurrent memory
> free mechanism is for implementing SoftMaxHeapSize. The heap size can
> be changed after the last mixed GC. The young GC won't have a longer
> pause and the memory can be freed concurrently without side effects.

Agree and agree. Both evaluating and giving back memory at any gc
sounds nice, and doing that without incurring the costs in the pause is
even better :)

Thanks,
  Thomas


From sangheon.kim at oracle.com  Sun Oct 13 06:00:18 2019
From: sangheon.kim at oracle.com (sangheon.kim at oracle.com)
Date: Sat, 12 Oct 2019 23:00:18 -0700
Subject: RFR(M): 8220311: Implementation: NUMA-Aware Memory Allocation for
 G1, Survivor (2/3)
In-Reply-To: <de0f8e9a-27d8-eaf3-99a7-7b57e1b419c1@oracle.com>
References: <d153b49b-fbb2-0d73-37e0-ff1534a83086@oracle.com>
 <de0f8e9a-27d8-eaf3-99a7-7b57e1b419c1@oracle.com>
Message-ID: <9a78e353-7908-b546-8f6a-7acd92eb40ac@oracle.com>

Hi all,

The previous patch conflicts, so I'm posting a rebased one.

Webrev:
http://cr.openjdk.java.net/~sangheki/8220311/webrev.2
Testing: hs-tier 1 ~ 5, with/without UseNUMA

Thanks,
Sangheon


On 10/1/19 9:53 AM, sangheon.kim at oracle.com wrote:
> Hi all,
>
> As JDK-8220310 changed a lot, I'm posting next webrev.
> Previous webrev just conflicts.
>
> Webrev:
> http://cr.openjdk.java.net/~sangheki/8220311/webrev.1
> http://cr.openjdk.java.net/~sangheki/8220311/webrev.1.inc
> Testing: hs-tier 1 ~ 5 with +- UseNUMA
>
> Thanks,
> Sangheon
>
>
> On 9/4/19 12:16 AM, sangheon.kim at oracle.com wrote:
>> Hi all,
>>
>> Please review this patch making G1 NUMA aware.
>> This is the second part of G1 NUMA implementation:
>> - Making Survivor region NUMA aware.
>>
>> CR: https://bugs.openjdk.java.net/browse/JDK-8220311
>> Webrev: http://cr.openjdk.java.net/~sangheki/8220311/webrev.0
>> Testing: hs-tier 1 ~ 5 with +- UseNUMA
>>
>> Thanks,
>> Sangheon
>


From sangheon.kim at oracle.com  Sun Oct 13 06:16:27 2019
From: sangheon.kim at oracle.com (sangheon.kim at oracle.com)
Date: Sat, 12 Oct 2019 23:16:27 -0700
Subject: RFR(L): 8220312: Implementation: NUMA-Aware Memory Allocation for
 G1, Logging (3/3)
In-Reply-To: <e903223b-90a5-9d01-5421-a47011bd5985@oracle.com>
References: <e7c52f60-a5c7-072a-4e3b-65c608907679@oracle.com>
 <e903223b-90a5-9d01-5421-a47011bd5985@oracle.com>
Message-ID: <ba8c3fa4-9ee1-6a98-d13f-ffaacc59025c@oracle.com>

Hi all,

The previous patch conflicts because of JDK-8220310, so I'm posting a 
rebased one with some refactoring.

Webrev:
http://cr.openjdk.java.net/~sangheki/8220312/webrev.2
Testing: hs-tier 1 ~ 5, with/without UseNUMA

Here's the full patch of 8220310, 8220311 and 8220312.
http://cr.openjdk.java.net/~sangheki/8220312/webrev.full.2/

Thanks,
Sangheon


On 10/2/19 10:11 AM, sangheon.kim at oracle.com wrote:
> Hi,
>
> Here's the rebased webrev with minor changes.
>
> Webrev:
> http://cr.openjdk.java.net/~sangheki/8220312/webrev.1
> http://cr.openjdk.java.net/~sangheki/8220312/webrev.1.inc
> Testing: hs-tier 1 ~ 5 with +- UseNUMA
>
> FYI, here's the full patch including JDK-8220310, 8220311, 8220312.
> http://cr.openjdk.java.net/~sangheki/8220312/webrev.full/
>
> Thanks,
> Sangheon
>
>
> On 9/4/19 12:16 AM, sangheon.kim at oracle.com wrote:
>> Hi all,
>>
>> Please review this patch making G1 NUMA aware.
>> This is the last part of G1 NUMA implementation:
>> - Adding logs and stat.
>>
>> CR: https://bugs.openjdk.java.net/browse/JDK-8220312
>> Webrev: http://cr.openjdk.java.net/~sangheki/8220312/webrev.0
>> Testing: hs-tier 1 ~ 8 with +- UseNUMA
>>
>> Thanks,
>> Sangheon
>


From maoliang.ml at alibaba-inc.com  Mon Oct 14 03:52:19 2019
From: maoliang.ml at alibaba-inc.com (Liang Mao)
Date: Mon, 14 Oct 2019 11:52:19 +0800
Subject: Re: G1 patch of elastic Java heap
In-Reply-To: <858f43f72ea9325907a6bd6955768af3d64e57fc.camel@oracle.com>
References: <6270ce59-4a8e-431e-9ccf-f6d2c0f927eb.maoliang.ml@alibaba-inc.com>	
 <d82e704831f0afbc61f8a3fb6b69bb1463b7ede8.camel@oracle.com>	
 <e4ffd4d9-3ec0-4592-ac8c-d5a77c6b2e75.maoliang.ml@alibaba-inc.com>	
 <1267a5dd2cf6cc1d03df64d07a06ba0f45195951.camel@oracle.com>	
 <3140197d-8cab-4a86-af92-58431c74cb6b.maoliang.ml@alibaba-inc.com>	
 <a201e27d-d231-4787-8bba-55f5266206d1.maoliang.ml@alibaba-inc.com>	
 <b691e0bd-dd9a-32e2-c950-9c84de29101c@oracle.com>	
 <6cc5bdd7-c076-472f-8a36-8294c6cbfe21.maoliang.ml@alibaba-inc.com>	,
 <5f02f337-f479-55f6-351e-867507845f65@oracle.com>	
 <66393648-73b1-4a45-9d48-c8fcf94789fa.maoliang.ml@alibaba-inc.com>,
 <858f43f72ea9325907a6bd6955768af3d64e57fc.camel@oracle.com>
Message-ID: <77e0e95e-8500-46e6-8b80-6f25b33f6c7f.maoliang.ml@alibaba-inc.com>

Hi Thomas,

Thank you for the recognition :) Since we agree on several specific points,
I will try to extract them from the current implementation and create a patch against the
OpenJDK upstream branch so we can continue the discussion at the code level.

Thanks,
Liang


------------------------------------------------------------------
From:Thomas Schatzl <thomas.schatzl at oracle.com>
Send Time:2019 Oct. 12 (Sat.) 23:00
To:"MAO, Liang" <maoliang.ml at alibaba-inc.com>; hotspot-gc-dev <hotspot-gc-dev at openjdk.java.net>
Subject:Re: G1 patch of elastic Java heap

Hi,

On Sat, 2019-10-12 at 19:51 +0800, Liang Mao wrote:
> Hi Thomas,
> 
> The manual generation limit can be put aside for now since we know
> it might not be general enough for a GC. We can focus first on how to
> change the heap size and return memory at runtime.
> 
> GCTimeRatio is a good metric to measure the health of a Java
> application and I have considered using it. But in the end I chose
> a simple way, just like the periodic old GC. Guaranteeing a long
> enough young GC interval is an alternative way to keep the
> GCTimeRatio in a healthy state.
> I'm absolutely OK with using GCTimeRatio instead of the fixed young GC
> interval. This part is the same as in ZGC or Shenandoah: how to balance
> the desired memory size and GC frequency. I'm open to any good
> solution and we are already on the same page on this issue,
> I think :)

+1

> A big difference in our implementation is evaluating heap resizing in
> any young GC instead of in a concurrent GC cycle, which I think is
> swifter and more immediate. The concurrent map/unmap
> mechanism gets rid of the additional pause time. My thought is that the
> heap shrink/expand can be fully determined in the young GC pause and
> performed in a concurrent thread, which excludes the
> considerable time cost of the OS interface. Most of our Java users are
> intolerant of pause spikes caused by page faults, which can be up
> to seconds. And we also found the issue of time cost by map/unmap in
> ZGC.
>
> A direct advantage of the young GC resizing and concurrent memory
> free mechanism is for implementing SoftMaxHeapSize. The heap size can
> be changed after the last mixed GC. The young GC won't have a longer
> pause and the memory can be freed concurrently without side effects.

Agree and agree. Both evaluating and giving back memory at any gc
sounds nice, and doing that without incurring the costs in the pause is
even better :)
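
A minimal sketch of the split discussed above, not code from the patch: the
young GC pause only records the resize decision, and the costly map/unmap work
is applied later, outside the pause. All names (DeferredResizer,
decide_in_pause, apply_pending) are hypothetical, and the sketch is
single-threaded for clarity; a real collector would need synchronization
between the pause and the concurrent thread.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

// All names here are hypothetical; this is not code from the G1 patch.
enum class ResizeKind { Expand, Shrink };

struct ResizeRequest {
  ResizeKind kind;
  std::size_t regions;  // number of heap regions to map or unmap
};

class DeferredResizer {
  std::deque<ResizeRequest> _pending;  // decisions not yet applied
  std::size_t _committed;              // regions actually mapped right now
  std::size_t _planned;                // size once all pending requests are applied

public:
  explicit DeferredResizer(std::size_t committed_regions)
    : _committed(committed_regions), _planned(committed_regions) {}

  // Runs inside the young GC pause: only records the decision, no OS calls.
  void decide_in_pause(std::size_t target_regions) {
    if (target_regions > _planned) {
      _pending.push_back({ResizeKind::Expand, target_regions - _planned});
    } else if (target_regions < _planned) {
      _pending.push_back({ResizeKind::Shrink, _planned - target_regions});
    }
    _planned = target_regions;
  }

  // Runs outside the pause (a concurrent thread in a real collector):
  // this is where the map/unmap cost would be paid.
  void apply_pending() {
    while (!_pending.empty()) {
      ResizeRequest r = _pending.front();
      _pending.pop_front();
      if (r.kind == ResizeKind::Expand) {
        _committed += r.regions;
      } else {
        assert(r.regions <= _committed);
        _committed -= r.regions;
      }
    }
  }

  std::size_t committed_regions() const { return _committed; }
  std::size_t pending_requests() const { return _pending.size(); }
};
```

With this shape, the pause cost of a resize is one queue insertion; the
committed size only changes once apply_pending() has run.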

Thanks,
  Thomas


From rkennke at redhat.com  Mon Oct 14 09:02:45 2019
From: rkennke at redhat.com (Roman Kennke)
Date: Mon, 14 Oct 2019 11:02:45 +0200
Subject: RFR (XS/T) 8232176: Shenandoah: new assert in
 ShenandoahEvacuationTask is too strong
In-Reply-To: <f5b5969e-62cd-1a40-74d8-a42cf6c97931@redhat.com>
References: <f5b5969e-62cd-1a40-74d8-a42cf6c97931@redhat.com>
Message-ID: <82671652-4969-f7d2-2a0b-a8d869f9904e@redhat.com>

Hmm, ok.

Roman


> Recent regression:
>   https://bugs.openjdk.java.net/browse/JDK-8232176
> 
> JDK-8231947 added the assert in ShenandoahEvacuationTask that is too strong. There is a corner case
> when the region is collection-set-pinned (CSP), and the oom-evac-protocol waits for GC thread to
> complete the evacuation. There is a short window where GC thread can see the CSP region before
> seeing cancellation request.
> 
> It seems easier to remove the too strong assert for now. is_conc_move_allowed() == true is a lie
> right now. We can add cancelled_gc() check inside of it, but that would only be safe if we know that
> caller holds oom-evac-scope.
> 
> The assertion failure reliably reproduces with -XX:ShenandoahGCHeuristics=aggressive
> -XX:+ShenandoahOOMDuringEvacALot on SPECjvm2008.
> 
> Fix:
>   https://cr.openjdk.java.net/~shade/8232176/webrev.01/
> 
> Testing: broken tests
> 


From shade at redhat.com  Mon Oct 14 09:07:12 2019
From: shade at redhat.com (Aleksey Shipilev)
Date: Mon, 14 Oct 2019 11:07:12 +0200
Subject: RFR (XS/T) 8232176: Shenandoah: new assert in
 ShenandoahEvacuationTask is too strong
In-Reply-To: <82671652-4969-f7d2-2a0b-a8d869f9904e@redhat.com>
References: <f5b5969e-62cd-1a40-74d8-a42cf6c97931@redhat.com>
 <82671652-4969-f7d2-2a0b-a8d869f9904e@redhat.com>
Message-ID: <db394145-3621-62a6-2bb9-f0df3b226b60@redhat.com>

Thanks, pushed.

-Aleksey

On 10/14/19 11:02 AM, Roman Kennke wrote:
> Hmm, ok.
> 
> Roman
> 
> 
>> Recent regression:
>>   https://bugs.openjdk.java.net/browse/JDK-8232176
>>
>> JDK-8231947 added the assert in ShenandoahEvacuationTask that is too strong. There is a corner case
>> when the region is collection-set-pinned (CSP), and the oom-evac-protocol waits for GC thread to
>> complete the evacuation. There is a short window where GC thread can see the CSP region before
>> seeing cancellation request.
>>
>> It seems easier to remove the too strong assert for now. is_conc_move_allowed() == true is a lie
>> right now. We can add cancelled_gc() check inside of it, but that would only be safe if we know that
>> caller holds oom-evac-scope.
>>
>> The assertion failure reliably reproduces with -XX:ShenandoahGCHeuristics=aggressive
>> -XX:+ShenandoahOOMDuringEvacALot on SPECjvm2008.
>>
>> Fix:
>>   https://cr.openjdk.java.net/~shade/8232176/webrev.01/
>>
>> Testing: broken tests
>>
> 


From shade at redhat.com  Mon Oct 14 09:20:32 2019
From: shade at redhat.com (Aleksey Shipilev)
Date: Mon, 14 Oct 2019 11:20:32 +0200
Subject: RFR (XS) 8232205: Shenandoah: missing "Update References" -> "Update
 Roots" tracing
Message-ID: <d4caad65-03fc-fe9e-89c0-62b364bdd560@redhat.com>

Bug:
  https://bugs.openjdk.java.net/browse/JDK-8232205

Noticed that -Xlog:gc+stats does not print "Update Roots" section for "Update References". This is a
regression since JDK-8223951.

Fix:
  https://cr.openjdk.java.net/~shade/8232205/webrev.01/

Testing: hotspot_gc_shenandoah, eyeballing gc+stats

-- 
Thanks,
-Aleksey


From stefan.johansson at oracle.com  Mon Oct 14 13:00:22 2019
From: stefan.johansson at oracle.com (Stefan Johansson)
Date: Mon, 14 Oct 2019 15:00:22 +0200
Subject: [PATCH] Exploit Empty Regions in Young Gen to Enhance PS Full GC
 Performance
In-Reply-To: <CAKSDcxu_t+Ka0LUU3WSzdR52-_+rsvKo_Vxn8av=duFmS4EPyA@mail.gmail.com>
References: <CAKSDcxsPcvigLaDTRyJALk9O0r-JsYV1CcwFzb3KaM98+LVLcg@mail.gmail.com>
 <B723074C-94DF-450D-9715-497736E9CD27@oracle.com>
 <CAKSDcxvTzfq1eR0DXi8iLQ4bNP8LcjsV5Px59-EMOgR0zaHvtQ@mail.gmail.com>
 <4F02DD53-EA98-4A1A-B871-C6E9D9610B2C@oracle.com>
 <9B69AFD1-7AE2-4B50-BFCF-C9C6E2594240@oracle.com>
 <CAKSDcxs131XdbHHErgbH1UDYX6_+=CSa-4dt4n5LgbyPVgjO_w@mail.gmail.com>
 <E4274448-26B2-46B8-883C-70FC7AFFB23B@oracle.com>
 <CAKSDcxvHwOWtAEm8TPPvO=C8q9LRwJRUQX6nbh-HSZXBKKwvYg@mail.gmail.com>
 <fb385d33-c420-4c81-320c-9aa1dad64a44@oracle.com>
 <CAKSDcxv5eccaV54NThgviLK+84U-Z62U0CYr3s+8ncNR1cna5w@mail.gmail.com>
 <8ef5b52e-d6fc-3073-5ca7-44c87c1eb981@oracle.com>
 <CAKSDcxsQWJ1tpnsc8UnN3E=XJfVwHEdE9WSd_=6KR_tLSHW6rQ@mail.gmail.com>
 <92277aab-0578-9e2c-3f4f-55ae1e8c94a9@oracle.com>
 <CAKSDcxu_t+Ka0LUU3WSzdR52-_+rsvKo_Vxn8av=duFmS4EPyA@mail.gmail.com>
Message-ID: <400df998-171a-5bbe-9f3e-01af1781afb4@oracle.com>

Thanks for the quick update Haoyu,

This is a great improvement and I will try to find time to look into the
patch in more detail in the coming weeks.

Thanks,
Stefan

On 2019-10-11 14:49, Haoyu Li wrote:
> Hi Stefan,
> 
> Thanks for your suggestion! PSParallelCompact::fill_shadow_region()
> indeed duplicated most of the code in
> PSParallelCompact::fill_region(), so I've refactored these
> two functions to share as much code as possible. The attachment is
> the updated patch.
> 
> Specifically, the closure, which moves objects, in
> PSParallelCompact::fill_region() is now declared as a template of
> either MoveAndUpdateClosure or ShadowClosure. So by controlling the
> type of closure when invoking the function, we can decide whether to
> fill a normal region or a shadow one. Thus, almost all code in
> PSParallelCompact::fill_region() can be reused.
> 
> Besides, a virtual function named complete_region() is added to both
> closures to do some work after the filling, such as setting states and
> copying the shadow region back.
> 
> Thanks again for reviewing the patch, looking forward to your insights
> and suggestions!
> 
> Best Regards,
> Haoyu Li
> 
> 2019-10-10 21:50 GMT+08:00, Stefan Johansson <stefan.johansson at oracle.com>:
>> Thanks for the clarification =)
>>
>> Moving on to the next part, the code in the patch. So this won't be a
>> full review of the patch but just an initial comment that I would like
>> to be addressed first.
>>
>> The new function PSParallelCompact::fill_shadow_region() is more or less
>> a copy of PSParallelCompact::fill_region() and I understand that from a
>> proof of concept point of view it was the easy (and right) way to do it.
>> I would prefer if the code could be refactored so that fill_region() and
>> fill_shadow_region() share more code. There might be reasons that I've
>> missed, that prevents it, but we should at least explore how much code
>> can be shared.
>>
>> Thanks,
>> Stefan
>>
>> On 2019-10-10 15:10, Haoyu Li wrote:
>>> Hi Stefan,
>>>
>>> Thanks for your quick response! As to your concern about the OCA, I am
>>> the sole author of the patch. And it is the case as what the agreement
>>> states.
>>> Best Regards,
>>> Haoyu Li,
>>>
>>>
>>> Stefan Johansson <stefan.johansson at oracle.com
>>> <mailto:stefan.johansson at oracle.com>> wrote on 2019-10-10 at 8:37:
>>>
>>>      Hi,
>>>
>>>      On 2019-10-10 13:06, Haoyu Li wrote:
>>>       > Hi Stefan,
>>>       >
>>>       > Thanks for your testing! One possible reason for the regressions
>>> in
>>>       > simple tests is that the region dependencies may not be heavy
>>> enough.
>>>       > Because the locality of shadow regions is lower than that of heap
>>>       > regions, writing to shadow regions will be slower than to normal
>>>       > regions, and this is a part of the reason why I reuse shadow
>>>      regions.
>>>       > Therefore, if only a few shadow regions are created and not
>>>      reused, the
>>>       > overhead may not be amortized.
>>>
>>>      I guess it is something like this. I thought that for "easy" heaps
>>> the
>>>      shadow regions won't be used at all, and should therefore not really
>>>      cost
>>>      anything.
>>>
>>>       >
>>>       > As to the OCA, it is the case that I'm the only person signing the
>>>       > agreement. Please let me know if you have any further questions.
>>>      Thanks
>>>       > again!
>>>
>>>      Ok, so you are the sole author of the patch. The important part, as
>>> the
>>>      agreement states, is:
>>>      "no other person or entity, including my employer, has or will have
>>>      rights with respect my contributions"
>>>
>>>      Is that the case?
>>>
>>>      Thanks,
>>>      Stefan
>>>
>>>       >
>>>       > Best Regards,
>>>       > Haoyu Li
>>>       >
>>>       > Stefan Johansson <stefan.johansson at oracle.com
>>>      <mailto:stefan.johansson at oracle.com>
>>>       > <mailto:stefan.johansson at oracle.com
>>>      <mailto:stefan.johansson at oracle.com>>> wrote on 2019-10-08 at 6:49:
>>>       >
>>>       >     Hi Haoyu,
>>>       >
>>>       >     I've done some more testing and I haven't seen any issues
>>>      with the
>>>       >     patch
>>>       >     so far and the performance looks promising in most cases. For
>>>      simple
>>>       >     tests I've seen some regressions, but I'm not really sure
>>>      why. Will do
>>>       >     some more digging.
>>>       >
>>>       >     To move forward with this the first thing we need to do is
>>>      making sure
>>>       >     that you being covered by the Oracle Contributor Agreement is
>>>      enough.
>>>       >       From what we can see it is only you as an individual that
>>>      has signed
>>>       >     the OCA and in that case it is important that this statement
>>>      from the
>>>       >     OCA is fulfilled: "no other person or entity, including my
>>>      employer,
>>>       >     has
>>>       >     or will have rights with respect my contributions"
>>>       >
>>>       >     Is this the case for this contribution or should we have the
>>>      university
>>>       >     sign the OCA as well? For more information regarding the OCA
>>>      please
>>>       >     refer to:
>>>       > https://www.oracle.com/technetwork/oca-faq-405384.pdf
>>>       >
>>>       >     Thanks,
>>>       >     Stefan
>>>       >
>>>       >     On 2019-09-16 16:02, Haoyu Li wrote:
>>>       >      > FYI, the evaluation results on OpenJDK 14 are plotted in
>>> the
>>>       >     attachment.
>>>       >      > I compute the full GC throughput by dividing the heap size
>>>      before
>>>       >     full
>>>       >      > GC by the GC pause time, and the results are arithmetic
>>> mean
>>>       >     values of
>>>       >      > ten runs after a warm-up run. The evaluation is conducted on
>>> a
>>>       >     machine
>>>       >      > with dual Intel Xeon E5-2618L v3 CPUs (2 sockets, 16
>>>      physical
>>>       >     cores
>>>       >      > with SMT enabled) and 64G DRAM.
>>>       >      >
>>>       >      > Best Regards,
>>>       >      > Haoyu Li,
>>>       >      > Institute of Parallel and Distributed Systems(IPADS),
>>>       >      > School of Software,
>>>       >      > Shanghai Jiao Tong University
>>>       >      >
>>>       >      >
>>>       >      > Stefan Johansson <stefan.johansson at oracle.com
>>>      <mailto:stefan.johansson at oracle.com>
>>>       >     <mailto:stefan.johansson at oracle.com
>>>      <mailto:stefan.johansson at oracle.com>>
>>>       >      > <mailto:stefan.johansson at oracle.com
>>>      <mailto:stefan.johansson at oracle.com>
>>>       >     <mailto:stefan.johansson at oracle.com
>>>      <mailto:stefan.johansson at oracle.com>>>> wrote on 2019-09-12 at 5:34:
>>>       >      >
>>>       >      >     Hi Haoyu,
>>>       >      >
>>>       >      >     I recently came across your patch and I would like to
>>>      pick up on
>>>       >      >     some of the things Kim mentioned in his mails. I
>>>      especially want
>>>       >      >     evaluate and investigate if this is a technique we can
>>>      use to
>>>       >      >     improve the other GCs as well. To start that work I
>>>      want to
>>>       >     take the
>>>       >      >     patch for a spin in our internal performance testing.
>>>      The patch
>>>       >      >     doesn't apply cleanly to the latest JDK repository, so
>>>      if you could
>>>       >      >     provide an updated patch that would be very helpful.
>>>       >      >
>>>       >      >     It would also be great if you could share some more
>>>      information
>>>       >      >     around the results presented in the paper. For example,
>>> it
>>>       >     would be
>>>       >      >     good to get the full command lines for the different
>>>       >     benchmarks so
>>>       >      >     we can run them locally and reproduce the
>>>      results you've seen.
>>>       >      >
>>>       >      >     Thanks,
>>>       >      >     Stefan
>>>       >      >
>>>       >      >>     12 mars 2019 kl. 03:21 skrev Haoyu Li
>>>      <leihouyju at gmail.com <mailto:leihouyju at gmail.com>
>>>       >     <mailto:leihouyju at gmail.com <mailto:leihouyju at gmail.com>>
>>>       >      >>     <mailto:leihouyju at gmail.com
>>>      <mailto:leihouyju at gmail.com> <mailto:leihouyju at gmail.com
>>>      <mailto:leihouyju at gmail.com>>>>:
>>>       >      >>
>>>       >      >>     Hi Kim,
>>>       >      >>
>>>       >      >>     Thanks for reviewing and testing the patch. If there
>>>      are any
>>>       >      >>     failures or performance degradation relevant to the
>>>      work, please
>>>       >      >>     let me know and I'll be very happy to keep improving
>>> it.
>>>       >     Also, any
>>>       >      >>     suggestions about code improvements are well
>>> appreciated.
>>>       >      >>
>>>       >      >>     I'm not quite sure if both G1 and Shenandoah have the
>>>      similar
>>>       >      >>     region dependency issue, since I haven't studied their
>>> GC
>>>       >      >>     behaviors before. If they have, I'm also willing to
>>>      propose
>>>       >     a more
>>>       >      >>     general optimization.
>>>       >      >>
>>>       >      >>     As to the memory overhead, I believe it will be low
>>>      because this
>>>       >      >>     patch exploits empty regions in the young space
>>>      rather than
>>>       >      >>     off-heap memory to allocate shadow regions, and also
>>>      reuses the
>>>       >      >>     /_source_region/ field of each /RegionData /to record
>>> the
>>>       >      >>     corresponding shadow region index. We only introduce
>>>      a new
>>>       >      >>     integer field /_shadow /in the RegionData class to
>>>      indicate the
>>>       >      >>     status of a region, a global /GrowableArray
>>>      _free_shadow/ to
>>>       >     store
>>>       >      >>     the indices of shadow regions, and a global
>>>      /Monitor/ to protect
>>>       >      >>     the array. This information might help if the memory
>>>      overhead
>>>       >      >>     needs to be evaluated.
>>>       >      >>
>>>       >      >>     Looking forward to your insight.
>>>       >      >>
>>>       >      >>     Best Regards,
>>>       >      >>     Haoyu Li,
>>>       >      >>     Institute of Parallel and Distributed Systems(IPADS),
>>>       >      >>     School of Software,
>>>       >      >>     Shanghai Jiao Tong University
>>>       >      >>
>>>       >      >>
>>>       >      >>     Kim Barrett <kim.barrett at oracle.com
>>>      <mailto:kim.barrett at oracle.com>
>>>       >     <mailto:kim.barrett at oracle.com
>>> <mailto:kim.barrett at oracle.com>>
>>>       >      >>     <mailto:kim.barrett at oracle.com
>>>      <mailto:kim.barrett at oracle.com>
>>>       >     <mailto:kim.barrett at oracle.com
>>>      <mailto:kim.barrett at oracle.com>>>> wrote on 2019-03-12 at 6:11:
>>>       >      >>
>>>       >      >>         > On Mar 11, 2019, at 1:45 AM, Kim Barrett
>>>       >      >>         <kim.barrett at oracle.com
>>>      <mailto:kim.barrett at oracle.com> <mailto:kim.barrett at oracle.com
>>>      <mailto:kim.barrett at oracle.com>>
>>>       >     <mailto:kim.barrett at oracle.com
>>>      <mailto:kim.barrett at oracle.com> <mailto:kim.barrett at oracle.com
>>>      <mailto:kim.barrett at oracle.com>>>> wrote:
>>>       >      >>         >
>>>       >      >>         >> On Jan 24, 2019, at 3:58 AM, Haoyu Li
>>>       >     <leihouyju at gmail.com <mailto:leihouyju at gmail.com>
>>>      <mailto:leihouyju at gmail.com <mailto:leihouyju at gmail.com>>
>>>       >      >>         <mailto:leihouyju at gmail.com
>>>      <mailto:leihouyju at gmail.com>
>>>       >     <mailto:leihouyju at gmail.com <mailto:leihouyju at gmail.com>>>>
>>>      wrote:
>>>       >      >>         >>
>>>       >      >>         >> Hi Kim,
>>>       >      >>         >>
>>>       >      >>         >> I have ported my patch to OpenJDK 13 according
>>>      to your
>>>       >      >>         instructions in your last mail, and the patch is
>>>      attached in
>>>       >      >>         this mail. The patch does not change much since
>>>      PSGC is
>>>       >     indeed
>>>       >      >>         pretty stable.
>>>       >      >>         >>
>>>       >      >>         >> Also, I evaluate the correctness and
>>>      performance of
>>>       >     PS full
>>>       >      >>         GC with benchmarks from DaCapo, SPECjvm2008, and
>>>      JOlden
>>>       >     suites
>>>       >      >>         on a machine with dual Intel Xeon E5-2618L v3
>>> CPUs(16
>>>       >     physical
>>>       >      >>         cores), 64G DRAM and linux kernel 4.17. The
>>>      evaluation
>>>       >     result,
>>>       >      >>         indicating 1.9X GC throughput improvement on
>>>      average, is
>>>       >      >>         attached, too.
>>>       >      >>         >>
>>>       >      >>         >> However, I have no idea how to further test
>>> this
>>>       >     patch for
>>>       >      >>         both correctness and performance. Can I please
>>>      get any
>>>       >      >>         guidance from you or some sponsor?
>>>       >      >>         >
>>>       >      >>         > Sorry I missed that you had sent an updated
>>>      version of the
>>>       >      >>         patch.
>>>       >      >>         >
>>>       >      >>         > I've run the full regression suite across
>>>      Oracle-supported
>>>       >      >>         platforms.  There are some
>>>       >      >>         > failures, but there are almost always some
>>>      failures in the
>>>       >      >>         later tiers right now.  I'll start
>>>       >      >>         > looking at them tomorrow to figure out whether
>>>      any of them
>>>       >      >>         are relevant.
>>>       >      >>         >
>>>       >      >>         > I'm also planning to run some of our performance
>>>       >     benchmarks.
>>>       >      >>         >
>>>       >      >>         > I've lightly skimmed the proposed changes.
>>>      There might be
>>>       >      >>         some code improvements
>>>       >      >>         > to be made.
>>>       >      >>         >
>>>       >      >>         > I'm also wondering if this technique applies to
>>>      other
>>>       >      >>         collectors.  It seems like both G1 and
>>>       >      >>         > Shenandoah full gc's might have similar
>>>      issues?  If so, a
>>>       >      >>         solution that is ParallelGC-specific
>>>       >      >>         > is less interesting than one that has broader
>>>       >      >>         applicability.  Though maybe this optimization
>>>       >      >>         > is less important for G1 and Shenandoah, since
>>> they
>>>       >     actively
>>>       >      >>         try to avoid full gc's.
>>>       >      >>         >
>>>       >      >>         > I'm also not clear on how much additional
>>>      memory might be
>>>       >      >>         temporarily allocated by this
>>>       >      >>         > mechanism.
>>>       >      >>
>>>       >      >>         I've created a CR for this:
>>>       >      >> https://bugs.openjdk.java.net/browse/JDK-8220465
>>>       >      >>
>>>       >      >
>>>       >
>>>
>>
> 
> 


From stefan.johansson at oracle.com  Mon Oct 14 15:29:43 2019
From: stefan.johansson at oracle.com (Stefan Johansson)
Date: Mon, 14 Oct 2019 17:29:43 +0200
Subject: RFR(XL): 8220310: Implementation: NUMA-Aware Memory Allocation
 for G1, Mutator (1/3)
In-Reply-To: <77f6c57a-65a6-2727-cbe9-fbc1ed52a015@oracle.com>
References: <e4c60a5c-cb08-004a-ce77-e4d20d4d6891@oracle.com>
 <06ACBF87-ADBE-499F-B668-0274E4925B26@oracle.com>
 <2b37edd6-3e0f-013d-1616-9d003f8ac1ed@oracle.com>
 <74ACAF31-8233-482A-892E-0D2E7CA72F4F@oracle.com>
 <4afe9f43-4cfa-9384-f45f-f985399629dd@oracle.com>
 <CD6BDB8C-7777-4872-BCC0-CDFB4978F876@oracle.com>
 <d8758f38-4818-e6b7-c158-118974f0ff1c@oracle.com>
 <CB368332-D86B-40D2-B152-43B726938DD2@oracle.com>
 <77f6c57a-65a6-2727-cbe9-fbc1ed52a015@oracle.com>
Message-ID: <b3b70e9d-5be9-b069-b631-5733f157c9eb@oracle.com>

Hi Sangheon (and Kim),

On 2019-10-11 19:34, sangheon.kim at oracle.com wrote:
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/share/gc/g1/g1RegionToSpaceMapper.cpp
>>    82       _storage.request_memory_on_node(page, _pages_per_region,
>> node_index);
>> ...
>>   153         _storage.request_memory_on_node(idx, 1, node_index);
>>
>> I'm not sure request_memory_on_node belongs on the _storage object.
>> The current implementation just has the storage object (conditionally)
>> forward the request to the memory node manager object. These places in
>> the space mapper could just make the calls on the memory node manager
>> object directly (it is already being used nearby).  And these places
>> don't need the conditionalization.
>>
>> I think making the space mapper directly call the memory node manager
>> here would remove the need for the proposed changes to the virtual
>> space class.
> Fixed to directly call G1NUMA::request_memory_on_node() (previously 
> G1MemoryNodeManager).
> But G1NUMA can't calculate raw address, so I had to add base address at 
> G1NUMA to get that.
> 
> When I implemented it, I had similar opinion (not good fit for _storage) 
> but I also wanted to avoid adding extra dependency at G1NUMA. But anyway 
> I realized we can achieve it easily if we have base address.

I don't fully agree here. I think having the storage do the call to
G1NUMA does make sense because it knows how to translate a page index to 
a real address. It also goes along the same lines as the pretouch() call 
in commit_regions(), but I won't object if we want to leave it in the 
mapper.

If we do that, there are still some changes required, because we 
currently will call G1NUMA::request_memory_on_node() for all mappers and 
all mappers will then use the heap's base address when calling
numa_make_local(). So I propose two changes:
1. Expose G1PageBasedVirtualSpace::page_start() or use 
G1CollectedHeap::bottom_addr_for_region(uint index) and let the mapper 
use it to call request_memory_on_node() with a real address rather than 
a page index. Another solution could be to change the function even more 
and call it request_heap_region_on_node() and just pass in the region 
index and then use G1CollectedHeap::bottom_addr_for_region(uint index) 
in G1NUMA.
2. Add a state to the mappers to say if they are NUMA aware or not, and 
currently only the heap mapper should be NUMA aware. We could either set 
this state to true using the mtJavaHeap type as we have checked before 
or add an explicit setter that we only call for the heap mapper.

I know that only doing 2) will fix the current problem, but I think it 
would be nice to avoid having the base address in G1NUMA, thoughts?
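
The two proposed changes can be sketched roughly as follows. This is an
illustration under stated assumptions, not the webrev: PageBasedSpace, Mapper
and NumaRequest are hypothetical stand-ins (for G1PageBasedVirtualSpace, the
region-to-space mappers and the call into G1NUMA), and no real NUMA call is
made. The storage translates a page index to an address, so G1NUMA needs no
base address of its own, and a per-mapper flag keeps non-heap mappers from
issuing requests.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a page-based virtual space.
struct PageBasedSpace {
  std::uintptr_t base;    // start address of this mapper's reserved range
  std::size_t page_size;  // commit granularity in bytes

  // Translate a page index into the address where that page starts.
  std::uintptr_t page_start(std::size_t page_index) const {
    return base + page_index * page_size;
  }
};

// What would be handed to the NUMA layer: a real address range and a node.
struct NumaRequest {
  std::uintptr_t addr;
  std::size_t bytes;
  unsigned node;
  bool issued;  // false when the mapper is not NUMA-aware
};

// Hypothetical mapper carrying the "NUMA aware or not" state from change 2.
struct Mapper {
  PageBasedSpace storage;
  bool numa_aware;  // in this sketch, true only for the Java heap mapper

  // Build the request for num_pages pages starting at start_page.
  // Mappers that are not NUMA-aware issue nothing.
  NumaRequest request_on_node(std::size_t start_page, std::size_t num_pages,
                              unsigned node) const {
    if (!numa_aware) {
      return NumaRequest{0, 0, node, false};
    }
    return NumaRequest{storage.page_start(start_page),
                       num_pages * storage.page_size, node, true};
  }
};
```

Because the address comes from the mapper's own storage, an auxiliary mapper
could never be bound using the heap's base address by accident.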

> 
> 
> FYI, I filed JDK-8232156 for further investigation of initialization 
> order related to G1NUMA. i.e. about removing G1NUMA::set_region_info().
> 
Thanks for filing this.

> New webrev includes:
> 1. Addressed most comments from Kim, Stefan and Thomas.
> 2. Rename G1MemoryNodeManager to G1NUMA with removing virtual calls.
> 
> webrev:
> http://cr.openjdk.java.net/~sangheki/8220310/webrev.4
> http://cr.openjdk.java.net/~sangheki/8220310/webrev.4.inc
Apart from my comment above I think this looks really good, just one 
small additional comment:
src/hotspot/os/linux/os_linux.cpp
---
3021 #endif
3022
3023   int id = InvalidNUMAId;

Extra whitespace on line 3022.
---

Thanks,
Stefan

> 
> Testing: hs-tier 1 ~ 5 with/without UseNUMA
> 
> Thanks,
> Sangheon
> 
> 
>>
>> ------------------------------------------------------------------------------ 
>>
>>
> 


From kim.barrett at oracle.com  Mon Oct 14 21:03:58 2019
From: kim.barrett at oracle.com (Kim Barrett)
Date: Mon, 14 Oct 2019 17:03:58 -0400
Subject: RFR(XL): 8220310: Implementation: NUMA-Aware Memory Allocation
 for G1, Mutator (1/3)
In-Reply-To: <b3b70e9d-5be9-b069-b631-5733f157c9eb@oracle.com>
References: <e4c60a5c-cb08-004a-ce77-e4d20d4d6891@oracle.com>
 <06ACBF87-ADBE-499F-B668-0274E4925B26@oracle.com>
 <2b37edd6-3e0f-013d-1616-9d003f8ac1ed@oracle.com>
 <74ACAF31-8233-482A-892E-0D2E7CA72F4F@oracle.com>
 <4afe9f43-4cfa-9384-f45f-f985399629dd@oracle.com>
 <CD6BDB8C-7777-4872-BCC0-CDFB4978F876@oracle.com>
 <d8758f38-4818-e6b7-c158-118974f0ff1c@oracle.com>
 <CB368332-D86B-40D2-B152-43B726938DD2@oracle.com>
 <77f6c57a-65a6-2727-cbe9-fbc1ed52a015@oracle.com>
 <b3b70e9d-5be9-b069-b631-5733f157c9eb@oracle.com>
Message-ID: <7C1985BF-A769-49FB-A658-E1B1060B5897@oracle.com>

> On Oct 14, 2019, at 11:29 AM, Stefan Johansson <stefan.johansson at oracle.com> wrote:
> 
> Hi Sangheon (and Kim),
> 
> On 2019-10-11 19:34, sangheon.kim at oracle.com wrote:
>>> ------------------------------------------------------------------------------ 
>>> src/hotspot/share/gc/g1/g1RegionToSpaceMapper.cpp
>>>    82       _storage.request_memory_on_node(page, _pages_per_region, node_index);
>>> ...
>>>   153         _storage.request_memory_on_node(idx, 1, node_index);
>>> 
>>> I'm not sure request_memory_on_node belongs on the _storage object.
>>> The current implementation just has the storage object (conditionally)
>>> forward the request to the memory node manager object. These places in
>>> the space mapper could just make the calls on the memory node manager
>>> object directly (it is already being used nearby).  And these places
>>> don't need the conditionalization.
>>> 
>>> I think making the space mapper directly call the memory node manager
>>> here would remove the need for the proposed changes to the virtual
>>> space class.
>> Fixed to directly call G1NUMA::request_memory_on_node() (previously G1MemoryNodeManager).
>> But G1NUMA can't calculate raw address, so I had to add base address at G1NUMA to get that.
>> When I implemented it, I had similar opinion (not good fit for _storage) but I also wanted to avoid adding extra dependency at G1NUMA. But anyway I realized we can achieve it easily if we have base address.
> 
> I don't fully agree here. I think having the storage do the call to G1NUMA does make sense because it knows how to translate a page index to a real address. It also goes along the same lines as the pretouch() call in commit_regions(), but I won't object if we want to leave it in the mapper.
> 
> If we do that, there are still some changes required, because we currently will call G1NUMA::request_memory_on_node() for all mappers and all mappers will then use the heap's base address when calling numa_make_local(). So I propose two changes:
> 1. Expose G1PageBasedVirtualSpace::page_start() or use G1CollectedHeap::bottom_addr_for_region(uint index) and let the mapper use it to call request_memory_on_node() with a real address rather than a page index. Another solution could be to change the function even more and call it request_heap_region_on_node() and just pass in the region index and then use G1CollectedHeap::bottom_addr_for_region(uint index) in G1NUMA.

I overlooked part of how my suggestion was handled.

Yeah, I don't think I like having the base address added to G1NUMA.  I
like Stefan's change #1 (specifically, adding G1PBVS::page_start()).

I also missed that there seems to be a units mismatch in the call to
request_memory_on_node in G1RegionsSmallerThanCommitSizeMapper. It's
passing a region index rather than a page index (before above change)
or address (after above change).

> 2. Add a state to the mappers to say if they are NUMA aware or not, and currently only the heap mapper should be NUMA aware. We could either set this state to true using the mtJavaHeap type as we have checked before or add an explicit setter that we only call for the heap mapper.
> 
> I know that only doing 2) will fix the current problem, but I think it would be nice to avoid having the base address in G1NUMA, thoughts?

I don't understand the point about mappers needing to know if they are
NUMA or not. request_memory_on_node is only called by the two relevant
region->space mappers, with the memory involved always in the Java
heap (after fixing the units mismatch mentioned above). That is,
G1NUMA::request_memory_on_node should only be called for Java heap
memory. (It might be able to assert is_in_reserved or something like
that, though initialization order might prevent that.)
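
A minimal sketch of what Stefan's change #1 could look like, assuming a
page_start() accessor on the page-based space. PageSpaceSketch, NumaSketch
and request_pages_on_node are hypothetical stand-ins, not the classes in the
webrev; the point is only that the mapper asks the storage for a real
address, so the NUMA helper never needs to hold the heap base address.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

using std::size_t;
using std::uintptr_t;

// Hypothetical stand-in for the page-based virtual space: the only place
// that knows the reserved base address and the commit page size.
class PageSpaceSketch {
  uintptr_t _low_boundary;
  size_t    _page_size;
public:
  PageSpaceSketch(uintptr_t low_boundary, size_t page_size)
    : _low_boundary(low_boundary), _page_size(page_size) {}

  // Translate a page index into the address of the first byte of that page.
  uintptr_t page_start(size_t page_index) const {
    return _low_boundary + page_index * _page_size;
  }
  size_t page_size() const { return _page_size; }
};

// One recorded placement request (address, size, target node).
struct NumaRequestSketch {
  uintptr_t addr;
  size_t    size_in_bytes;
  unsigned  node_index;
};

// Hypothetical stand-in for the NUMA helper: it takes a real address and a
// byte size, and only records the request instead of calling into the OS.
class NumaSketch {
  NumaRequestSketch _last;
public:
  NumaSketch() : _last{0, 0, 0} {}
  void request_memory_on_node(uintptr_t addr, size_t size_in_bytes,
                              unsigned node_index) {
    _last = NumaRequestSketch{addr, size_in_bytes, node_index};
  }
  const NumaRequestSketch& last_request() const { return _last; }
};

// What the mapper would do: convert page units to an address range via the
// storage object, then hand the range to the NUMA helper.
inline void request_pages_on_node(const PageSpaceSketch& storage,
                                  NumaSketch& numa,
                                  size_t start_page, size_t num_pages,
                                  unsigned node_index) {
  numa.request_memory_on_node(storage.page_start(start_page),
                              num_pages * storage.page_size(),
                              node_index);
}
```

With this shape the helper only ever sees addresses, so an is_in_reserved
style assert on its side would be straightforward to add later.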


From kim.barrett at oracle.com  Mon Oct 14 21:31:01 2019
From: kim.barrett at oracle.com (Kim Barrett)
Date: Mon, 14 Oct 2019 17:31:01 -0400
Subject: RFR(XL): 8220310: Implementation: NUMA-Aware Memory Allocation
 for G1, Mutator (1/3)
In-Reply-To: <7C1985BF-A769-49FB-A658-E1B1060B5897@oracle.com>
References: <e4c60a5c-cb08-004a-ce77-e4d20d4d6891@oracle.com>
 <06ACBF87-ADBE-499F-B668-0274E4925B26@oracle.com>
 <2b37edd6-3e0f-013d-1616-9d003f8ac1ed@oracle.com>
 <74ACAF31-8233-482A-892E-0D2E7CA72F4F@oracle.com>
 <4afe9f43-4cfa-9384-f45f-f985399629dd@oracle.com>
 <CD6BDB8C-7777-4872-BCC0-CDFB4978F876@oracle.com>
 <d8758f38-4818-e6b7-c158-118974f0ff1c@oracle.com>
 <CB368332-D86B-40D2-B152-43B726938DD2@oracle.com>
 <77f6c57a-65a6-2727-cbe9-fbc1ed52a015@oracle.com>
 <b3b70e9d-5be9-b069-b631-5733f157c9eb@oracle.com>
 <7C1985BF-A769-49FB-A658-E1B1060B5897@oracle.com>
Message-ID: <23E3BBAA-A298-42FE-B594-7061DC3E0FD9@oracle.com>

> On Oct 14, 2019, at 5:03 PM, Kim Barrett <kim.barrett at oracle.com> wrote:
> I also missed that there seems to be a units mismatch in the call to
> request_memory_on_node in G1RegionsSmallerThanCommitSizeMapper. It's
> passing a region index rather than a page index (before above change)
> or address (after above change).

That's wrong; there are some problematic variable namings here.  start_idx is a
heap region index, idx is a page index.  Sangheon and I discussed this offline
and he's planning to change some variable names here.
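
For readers following along, a small sketch of the two index spaces when
several heap regions share one commit page. SmallRegionMapperSketch and its
members are hypothetical names chosen to make the units explicit; this is not
the code in the webrev.

```cpp
#include <cassert>
#include <cstddef>

using std::size_t;

// Hypothetical mapper for the case where the commit page is larger than a
// heap region, so more than one region lives on each page.
class SmallRegionMapperSketch {
  size_t _regions_per_page;
public:
  SmallRegionMapperSketch(size_t region_size, size_t page_size)
    : _regions_per_page(page_size / region_size) {
    assert(region_size > 0 && page_size % region_size == 0);
  }

  // A heap region index and a page index are different units; this is the
  // only place where one is converted into the other.
  size_t region_idx_to_page_idx(size_t region_idx) const {
    return region_idx / _regions_per_page;
  }

  // Number of distinct pages touched by the region range
  // [start_region_idx, start_region_idx + num_regions).
  size_t pages_spanned(size_t start_region_idx, size_t num_regions) const {
    assert(num_regions > 0);
    size_t first_page = region_idx_to_page_idx(start_region_idx);
    size_t last_page  = region_idx_to_page_idx(start_region_idx + num_regions - 1);
    return last_page - first_page + 1;
  }
};
```

Naming the parameters region_idx and page_idx, as here, is one way to keep
the two from being mixed up at a call site.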


From kim.barrett at oracle.com  Mon Oct 14 22:20:04 2019
From: kim.barrett at oracle.com (Kim Barrett)
Date: Mon, 14 Oct 2019 18:20:04 -0400
Subject: RFR(XL): 8220310: Implementation: NUMA-Aware Memory Allocation
 for G1, Mutator (1/3)
In-Reply-To: <7C1985BF-A769-49FB-A658-E1B1060B5897@oracle.com>
References: <e4c60a5c-cb08-004a-ce77-e4d20d4d6891@oracle.com>
 <06ACBF87-ADBE-499F-B668-0274E4925B26@oracle.com>
 <2b37edd6-3e0f-013d-1616-9d003f8ac1ed@oracle.com>
 <74ACAF31-8233-482A-892E-0D2E7CA72F4F@oracle.com>
 <4afe9f43-4cfa-9384-f45f-f985399629dd@oracle.com>
 <CD6BDB8C-7777-4872-BCC0-CDFB4978F876@oracle.com>
 <d8758f38-4818-e6b7-c158-118974f0ff1c@oracle.com>
 <CB368332-D86B-40D2-B152-43B726938DD2@oracle.com>
 <77f6c57a-65a6-2727-cbe9-fbc1ed52a015@oracle.com>
 <b3b70e9d-5be9-b069-b631-5733f157c9eb@oracle.com>
 <7C1985BF-A769-49FB-A658-E1B1060B5897@oracle.com>
Message-ID: <3F549477-A2DF-42CF-A0E5-586F78BBCC47@oracle.com>

> On Oct 14, 2019, at 5:03 PM, Kim Barrett <kim.barrett at oracle.com> wrote:
>> 2. Add a state to the mappers to say if they are NUMA aware or not, and currently only the heap mapper should be NUMA aware. We could either set this state to true using the mtJavaHeap type as we have checked before or add an explicit setter that we only call for the heap mapper.
>> 
>> I know that only doing 2) will fix the current problem, but I think it would be nice to avoid having the base address in G1NUMA, thoughts?
> 
> I don't understand the point about mappers needing to know if they are
> NUMA or not. request_memory_on_node is only called by the two relevant
> region->space mappers, with the memory involved always in the Java
> heap (after fixing the units mismatch mentioned above). That is,
> G1NUMA::request_memory_on_node should only be called for Java heap
> memory. (It might be able to assert is_in_reserved or something like
> that, though initialization order might prevent that.)

I was confused here too.  Sangheon has repaired my confusion, and he's
got another change in the works to tidy things up here in a way that I think
will make both me and Stefan happy.


From rs at jelastic.com  Mon Oct 14 22:46:07 2019
From: rs at jelastic.com (Ruslan Synytsky)
Date: Mon, 14 Oct 2019 18:46:07 -0400
Subject: G1 patch of elastic Java heap
In-Reply-To: <mailman.4044.1571058573.25747.hotspot-gc-dev@openjdk.java.net>
References: <mailman.4044.1571058573.25747.hotspot-gc-dev@openjdk.java.net>
Message-ID: <CA++bR4O2_W=JZ5cmLdt8bGAgST4+mkwRR226+yiQRpaXvQVZ6Q@mail.gmail.com>

Dear Liang and Thomas, thank you for your contribution to Java elasticity.

I would like to draw attention to the softmx option, which as I understand
is planned to be renamed to SoftMaxHeapSize. According to the feedback in
another thread, if memory usage reaches the softmx limit then the JVM will
throw an OOM Error. This differs from the logic described at
https://bugs.openjdk.java.net/browse/JDK-8222145. Personally I believe an OOM
Error inside the JVM is a slightly safer approach than the potential
termination of the Java process by the OOM Killer. But how can we avoid
confusion? Should we use different naming?

Thanks
-- 
Ruslan Synytsky

Date: Mon, 14 Oct 2019 11:52:19 +0800
> Subject: Re: G1 patch of elastic Java heap
> Hi Thomas,
>
> Thank you for the recognition:) Since we both agree on some clear specific
> points,
>  I will try to extract them from current implementation and create a patch
> in OpenJDK
> upstream branch so we can continue discussion on the code level.
>
> Thanks,
> Liang
>
>
>
>
>
>
> ------------------------------------------------------------------
> From:Thomas Schatzl <thomas.schatzl at oracle.com>
> Send Time:2019 Oct. 12 (Sat.) 23:00
> To:"MAO, Liang" <maoliang.ml at alibaba-inc.com>; hotspot-gc-dev <
> hotspot-gc-dev at openjdk.java.net>
> Subject:Re: G1 patch of elastic Java heap
>
> Hi,
>
> On Sat, 2019-10-12 at 19:51 +0800, Liang Mao wrote:
> > Hi Thomas,
> >
> > The manual generation limit can be put aside currently since we know
> > it might not be so general for a GC. We can focus on how to change
> > heap size and return memory in runtime first.
> >
> > GCTimeRatio is a good metric to measure the health of a Java
> > application and I have considered to use that. But finally I chose
> > a simple way just like the periodic old GC. Guarantee a long
> > enough young GC interval is an alternative way to make sure the
> > GCTimeRatio at a healthy state.
> > I'm absolutely ok to use GCTimeRatio instead of the fixed young GC
> > interval. This part is same to ZGC or Shenandoah for how to balance
> > the desired memory size and GC frequency. I'm open to  any good
> > solution and we are already in the same page for this issue
> > I think:)
>
> +1
>
> > A big difference of our implementation is evaluating heap resizing in
> > any young GC instead of a concurrent gc cycle which I think is
> > swifter and more immediate. The concurrent map/unmap
> > mechanism gets rid of the additional pause time. My thought is the
> > heap shrink/expand can be all determined in young GC pause and
> > performed in concurrent thread which could exclude the
> > considerable time cost by OS interface. Most of our Java users are
> > intolerant to those pause spikes caused by page fault which can be up
> > to seconds. And we also found the issue of time cost by map/unmap in
> > ZGC.
> >
> > A direct advantage of the young GC resizing and concurrent memory
> > free mechanism is for implementing SoftMaxHeapSize. The heap size can
> > be changed after last mixed GC. The young GC won't have longer
> > pause and the memory can be freed concurrently without side effect.
>
> Agree and agree. Both evaluating and giving back memory at any gc
> sounds nice, and doing that without incurring the costs in the pause is
> even better :)
>
> Thanks,
>   Thomas
>
>
>


From timberonce at gmail.com  Tue Oct 15 01:33:44 2019
From: timberonce at gmail.com (Mingyu Wu)
Date: Tue, 15 Oct 2019 09:33:44 +0800
Subject: G1GC: The design choice of prefetching
Message-ID: <CAN81=WR-q5HMkGVgzB0JJ33tMmzc0V2kanymv2jr_m821fv7vg@mail.gmail.com>

Hi all,
I find that G1GC (in OpenJDK12) implements a method named
*prefetch_and_push*, which prefetches the header and the first field of an
object referenced by a pointer *p *while *p* is about to be enqueued.
However, the effect of this prefetch instruction can be unstable as the
time when the object is processed is unknown. It is possible that many
references are enqueued before *p *(the data structure is actually
First-In-Last-Out) and finally evict the cache line storing the object,
making the prefetch useless. Therefore, what is the design choice of those
prefetch instructions? Do they stand for some tradeoffs related to the
overhead of prefetching?

Thanks,
Mingyu


From manc at google.com  Tue Oct 15 01:56:12 2019
From: manc at google.com (Man Cao)
Date: Mon, 14 Oct 2019 18:56:12 -0700
Subject: RFR(S): 8232232: G1RemSetSummary::_rs_threads_vtimes is not
 initialized to zero
Message-ID: <CA+w6HxYyx7iRXHRYUjbuSwwkiwKWsATSMNyVovOqPLNYuEXRVQ@mail.gmail.com>

Hi all,

Can I have reviews for this fix for logging messages of "Concurrent
refinement threads times (s)", and code cleanup?

Webrev: https://cr.openjdk.java.net/~manc/8232232/webrev.00/
Bug: https://bugs.openjdk.java.net/browse/JDK-8232232

-Man


From kim.barrett at oracle.com  Tue Oct 15 05:57:33 2019
From: kim.barrett at oracle.com (Kim Barrett)
Date: Tue, 15 Oct 2019 01:57:33 -0400
Subject: RFR(S): 8232232: G1RemSetSummary::_rs_threads_vtimes is not
 initialized to zero
In-Reply-To: <CA+w6HxYyx7iRXHRYUjbuSwwkiwKWsATSMNyVovOqPLNYuEXRVQ@mail.gmail.com>
References: <CA+w6HxYyx7iRXHRYUjbuSwwkiwKWsATSMNyVovOqPLNYuEXRVQ@mail.gmail.com>
Message-ID: <1A6E72C3-A0F1-4683-809A-EB8436485715@oracle.com>

> On Oct 14, 2019, at 9:56 PM, Man Cao <manc at google.com> wrote:
> 
> Hi all,
> 
> Can I have reviews for this fix for logging messages of "Concurrent
> refinement threads times (s)", and code cleanup?
> 
> Webrev: https://cr.openjdk.java.net/~manc/8232232/webrev.00/
> Bug: https://bugs.openjdk.java.net/browse/JDK-8232232
> 
> -Man

Looks good.


From maoliang.ml at alibaba-inc.com  Tue Oct 15 06:10:52 2019
From: maoliang.ml at alibaba-inc.com (Liang Mao)
Date: Tue, 15 Oct 2019 14:10:52 +0800
Subject: =?UTF-8?B?UmU6IEcxIHBhdGNoIG9mIGVsYXN0aWMgSmF2YSBoZWFw?=
In-Reply-To: <CA++bR4O2_W=JZ5cmLdt8bGAgST4+mkwRR226+yiQRpaXvQVZ6Q@mail.gmail.com>
References: <mailman.4044.1571058573.25747.hotspot-gc-dev@openjdk.java.net>,
 <CA++bR4O2_W=JZ5cmLdt8bGAgST4+mkwRR226+yiQRpaXvQVZ6Q@mail.gmail.com>
Message-ID: <9e1ea9d1-2340-4c47-9249-12cb04886230.maoliang.ml@alibaba-inc.com>

Hi Ruslan and OpenJDK developers,

I noticed this difference too. The softmx in OpenJ9 seems not to allow the application to go beyond
the new limit, while JDK-8222145 treats SoftMaxHeapSize as a *soft* limit which can be 
exceeded. Personally I prefer the former a little bit. But introducing another name seems more
confusing to users. Maybe use an option to control this? Like "bool SoftMaxHeapSizeOOM"?

Thanks,
Liang


------------------------------------------------------------------
From:Ruslan Synytsky <rs at jelastic.com>
Send Time:2019 Oct. 15 (Tue.) 06:46
To:hotspot-gc-dev at openjdk.java.net openjdk.java.net <hotspot-gc-dev at openjdk.java.net>; "MAO, Liang" <maoliang.ml at alibaba-inc.com>; Thomas Schatzl <thomas.schatzl at oracle.com>
Subject:Re: G1 patch of elastic Java heap

Dear Liang and Thomas, thank you for your contribution to Java elasticity.

I would like to pay attention to the softmx option which is planned to be renamed to SoftMaxHeapSize as I understand. According to the feedback in another thread, if the memory usage reaches the softmx limit then JVM will throw OOM Error. It differs from the logic described at https://bugs.openjdk.java.net/browse/JDK-8222145. Personally I believe OOM Error inside JVM is a little bit safer approach compared to the potential termination of java process by OOM Killer. But how can we avoid confusions? Should we use different naming?  

Thanks 
-- 
Ruslan Synytsky

Date: Mon, 14 Oct 2019 11:52:19 +0800
Subject: Re: G1 patch of elastic Java heap
Hi Thomas,

 Thank you for the recognition:) Since we both agree on some clear specific points,
  I will try to extract them from current implementation and create a patch in OpenJDK
 upstream branch so we can continue discussion on the code level.

 Thanks,
 Liang


 ------------------------------------------------------------------
 From:Thomas Schatzl <thomas.schatzl at oracle.com>
 Send Time:2019 Oct. 12 (Sat.) 23:00
 To:"MAO, Liang" <maoliang.ml at alibaba-inc.com>; hotspot-gc-dev <hotspot-gc-dev at openjdk.java.net>
 Subject:Re: G1 patch of elastic Java heap

 Hi,

 On Sat, 2019-10-12 at 19:51 +0800, Liang Mao wrote:
 > Hi Thomas,
 > 
 > The manual generation limit can be put aside currently since we know
 > it might not be so general for a GC. We can focus on how to change
 > heap size and return memory in runtime first. 
 > 
 > GCTimeRatio is a good metric to measure the health of a Java
 > application and I have considered to use that. But finally I chose
 > a simple way just like the periodic old GC. Guarantee a long 
 > enough young GC interval is an alternative way to make sure the
 > GCTimeRatio at a healthy state. 
 > I'm absolutely ok to use GCTimeRatio instead of the fixed young GC
 > interval. This part is same to ZGC or Shenandoah for how to balance
 > the desired memory size and GC frequency. I'm open to  any good
 > solution and we are already in the same page for this issue
 > I think:)

 +1

 > A big difference of our implementation is evaluating heap resizing in
 > any young GC instead of a concurrent gc cycle which I think is
 > swifter and more immediate. The concurrent map/unmap 
 > mechanism gets rid of the additional pause time. My thought is the
 > heap shrink/expand can be all determined in young GC pause and
 > performed in concurrent thread which could exclude the 
 > considerable time cost by OS interface. Most of our Java users are
 > intolerant to those pause spikes caused by page fault which can be up
 > to seconds. And we also found the issue of time cost by map/unmap in
 > ZGC.
 >
 > A direct advantage of the young GC resizing and concurrent memory
 > free mechanism is for implementing SoftMaxHeapSize. The heap size can
 > be changed after last mixed GC. The young GC won't have longer
 > pause and the memory can be freed concurrently without side effect.

 Agree and agree. Both evaluating and giving back memory at any gc
 sounds nice, and doing that without incurring the costs in the pause is
 even better :)

 Thanks,
   Thomas


From maoliang.ml at alibaba-inc.com  Tue Oct 15 06:18:47 2019
From: maoliang.ml at alibaba-inc.com (Liang Mao)
Date: Tue, 15 Oct 2019 14:18:47 +0800
Subject: =?UTF-8?B?UmU6IEcxR0M6IFRoZSBkZXNpZ24gY2hvaWNlIG9mIHByZWZldGNoaW5n?=
In-Reply-To: <CAN81=WR-q5HMkGVgzB0JJ33tMmzc0V2kanymv2jr_m821fv7vg@mail.gmail.com>
References: <CAN81=WR-q5HMkGVgzB0JJ33tMmzc0V2kanymv2jr_m821fv7vg@mail.gmail.com>
Message-ID: <4e9e89f4-69e7-4429-93e5-09f09088b64b.maoliang.ml@alibaba-inc.com>

Hi Mingyu,

The prefetch design is not only available in new versions of G1 GC; it was introduced
very early on in HotSpot and in other GCs like ParNew. It is a kind of aggressive
prefetching, imho, which prefetches all the addresses in the ref queue, which contains
*grey pointers*, and also creates enough latency between issuing the prefetch instructions
and the memory access to maximize cache utilization. 
There could be the problem you mentioned, that prefetched cache lines get evicted on overflow.
Maintaining a proper length of the ref queue is the way to avoid this. You can
look into the issue below, which fixed this problem and improved performance in G1.
https://bugs.openjdk.java.net/browse/JDK-6672778
OpenJDK developers may correct me if there's something I misunderstood.

Thanks,
Liang


------------------------------------------------------------------
From:Mingyu Wu <timberonce at gmail.com>
Send Time:2019 Oct. 15 (Tue.) 09:34
To:hotspot-gc-dev <hotspot-gc-dev at openjdk.java.net>
Subject:G1GC: The design choice of prefetching

Hi all,
I find that G1GC (in OpenJDK12) implements a method named
*prefetch_and_push*, which prefetches the header and the first field of an
object referenced by a pointer *p *while *p* is about to be enqueued.
However, the effect of this prefetch instruction can be unstable as the
time when the object is processed is unknown. It is possible that many
references are enqueued before *p *(the data structure is actually
First-In-Last-Out) and finally evict the cache line storing the object,
making the prefetch useless. Therefore, what is the design choice of those
prefetch instructions? Do they stand for some tradeoffs related to the
overhead of prefetching?

Thanks,
Mingyu

From per.liden at oracle.com  Tue Oct 15 06:46:46 2019
From: per.liden at oracle.com (Per Liden)
Date: Tue, 15 Oct 2019 08:46:46 +0200
Subject: RFR: 8232235: ZGC: Move ZValue inline funtions to zValue.inline.hpp
Message-ID: <0cafef55-7ae5-f2c5-e6b5-a6db5c0facc3@oracle.com>

Please review this clean up patch to move ZValue inline functions to 
zValue.inline.hpp.

Bug: https://bugs.openjdk.java.net/browse/JDK-8232235
Webrev: http://cr.openjdk.java.net/~pliden/8232235/webrev.0

/Per


From per.liden at oracle.com  Tue Oct 15 06:47:06 2019
From: per.liden at oracle.com (Per Liden)
Date: Tue, 15 Oct 2019 08:47:06 +0200
Subject: RFR: 8232236: ZGC: Move ZThread inline funtions to zThread.inline.hpp
Message-ID: <9528eb9e-ae58-fb7b-c593-434f97ec1c0e@oracle.com>

Please review this clean up patch to move ZThread inline functions to 
zThread.inline.hpp.

Bug: https://bugs.openjdk.java.net/browse/JDK-8232236
Webrev: http://cr.openjdk.java.net/~pliden/8232236/webrev.0

/Per


From per.liden at oracle.com  Tue Oct 15 06:47:21 2019
From: per.liden at oracle.com (Per Liden)
Date: Tue, 15 Oct 2019 08:47:21 +0200
Subject: RFR: 8232237: ZGC: Move ZArray inline funtions to zArray.inline.hpp
Message-ID: <02bf6bce-9e00-ed88-b5b1-f7e50c218446@oracle.com>

Please review this clean up patch to move ZArray inline functions to 
zArray.inline.hpp.

Bug: https://bugs.openjdk.java.net/browse/JDK-8232237
Webrev: http://cr.openjdk.java.net/~pliden/8232237/webrev.0

/Per


From per.liden at oracle.com  Tue Oct 15 06:47:35 2019
From: per.liden at oracle.com (Per Liden)
Date: Tue, 15 Oct 2019 08:47:35 +0200
Subject: RFR: 8232238: ZGC: Move ZList inline funtions to zList.inline.hpp
Message-ID: <41ab05b7-01e3-3e3f-cf1b-7a5a358763ac@oracle.com>

Please review this clean up patch to move ZList inline functions to 
zList.inline.hpp.

Bug: https://bugs.openjdk.java.net/browse/JDK-8232238
Webrev: http://cr.openjdk.java.net/~pliden/8232238/webrev.0

/Per


From per.liden at oracle.com  Tue Oct 15 06:48:57 2019
From: per.liden at oracle.com (Per Liden)
Date: Tue, 15 Oct 2019 08:48:57 +0200
Subject: RFR: 8232239: ZGC: Inline ZCPU::count() and ZCPU:id()
Message-ID: <cb85b13c-4a46-6639-8470-9c6a36caa55d@oracle.com>

Please review this patch to enable inlining of ZCPU::count() and 
ZCPU::id(), which are used in some fairly hot paths.

Bug: https://bugs.openjdk.java.net/browse/JDK-8232239
Webrev: http://cr.openjdk.java.net/~pliden/8232239/webrev.0

/Per


From per.liden at oracle.com  Tue Oct 15 08:12:15 2019
From: per.liden at oracle.com (Per Liden)
Date: Tue, 15 Oct 2019 10:12:15 +0200
Subject: G1 patch of elastic Java heap
In-Reply-To: <9e1ea9d1-2340-4c47-9249-12cb04886230.maoliang.ml@alibaba-inc.com>
References: <mailman.4044.1571058573.25747.hotspot-gc-dev@openjdk.java.net>
 <CA++bR4O2_W=JZ5cmLdt8bGAgST4+mkwRR226+yiQRpaXvQVZ6Q@mail.gmail.com>
 <9e1ea9d1-2340-4c47-9249-12cb04886230.maoliang.ml@alibaba-inc.com>
Message-ID: <e0d84051-114e-66a9-6c06-85deb74cc6ae@oracle.com>

Hi,

On 10/15/19 8:10 AM, Liang Mao wrote:
> Hi Ruslan and OpenJDK developers,
> 
> I noticed this difference too. The softmx in OpenJ9 seems to not allow the application beyong
> the new limit while JDK-8222145 treats the SoftMaxHeapSize as a *soft* limit which can be
> exceeded. Personally I prefer the former a little bit. But introducing another name seems more
> confused to users. Maybe use an option to control? Like "bool SoftMaxHeapSizeOOM" ?

I personally think the OpenJ9 softmx option is misnamed, as it's not a 
*soft* limit, but a *hard* limit. Hotspot's SoftMaxHeapSize is *soft* by 
design. Today's hard limit in Hotspot is of course MaxHeapSize (-Xmx). 
The only problem is that it's not a manageable flag, so it can't be 
changed at runtime. Making it manageable is tricky for GCs that size 
data structures at startup based on MaxHeapSize. One option could be to 
simply reject changes to MaxHeapSize unless the currently used GC 
declares that it supports changing it. Another option could be to keep 
MaxHeapSize as is, and introduce a separate flag (e.g. HardMaxHeapSize 
or CurrentMaxHeapSize). In that case MaxHeapSize would act as the upper 
limit for the "hard limit" flag.
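
To illustrate the distinction, a minimal sketch of the second option, assuming
a separate runtime-adjustable hard limit bounded by the startup MaxHeapSize.
HeapLimitsSketch, try_expand and set_current_max are hypothetical names, and
the policy is deliberately simplified; it is not how any existing collector
implements these flags.

```cpp
#include <cassert>
#include <cstddef>

using std::size_t;

enum class ExpandResult { Expanded, NeedsGC, OutOfMemory };

// Hypothetical model of three limits:
//   max_heap_size : fixed at startup (like -Xmx), sizes GC data structures
//   current_max   : hard limit, changeable at runtime, <= max_heap_size
//   soft_max      : soft limit, may be exceeded to avoid an OOM error
class HeapLimitsSketch {
  size_t _committed;
  size_t _soft_max;
  size_t _current_max;
  size_t _max_heap_size;
public:
  HeapLimitsSketch(size_t committed, size_t soft_max,
                   size_t current_max, size_t max_heap_size)
    : _committed(committed), _soft_max(soft_max),
      _current_max(current_max), _max_heap_size(max_heap_size) {}

  size_t committed() const { return _committed; }

  // Try to grow the committed heap by 'bytes'.
  ExpandResult try_expand(size_t bytes, bool gc_already_tried) {
    size_t wanted = _committed + bytes;
    if (wanted > _current_max) {
      return ExpandResult::OutOfMemory;   // hard limit: never exceeded
    }
    if (wanted > _soft_max && !gc_already_tried) {
      return ExpandResult::NeedsGC;       // soft limit: collect first
    }
    _committed = wanted;                  // may end up above the soft limit
    return ExpandResult::Expanded;
  }

  // Runtime change of the hard limit. Rejected if it exceeds the startup
  // maximum or would cut into memory that is already committed.
  bool set_current_max(size_t new_max) {
    if (new_max > _max_heap_size || new_max < _committed) {
      return false;
    }
    _current_max = new_max;
    return true;
  }
};
```

In this model the soft limit only changes when a collection is attempted,
while the hard limit is the one that turns into an OOM error.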

cheers,
Per

> 
> Thanks,
> Liang
> 
> 
> 
> 
> 
> 
> ------------------------------------------------------------------
> From:Ruslan Synytsky <rs at jelastic.com>
> Send Time:2019 Oct. 15 (Tue.) 06:46
> To:hotspot-gc-dev at openjdk.java.net openjdk.java.net <hotspot-gc-dev at openjdk.java.net>; "MAO, Liang" <maoliang.ml at alibaba-inc.com>; Thomas Schatzl <thomas.schatzl at oracle.com>
> Subject:Re: G1 patch of elastic Java heap
> 
> Dear Liang and Thomas, thank you for your contribution to Java elasticity.
> 
> I would like to pay attention to the softmx option which is planned to be renamed to SoftMaxHeapSize as I understand. According to the feedback in another thread, if the memory usage reaches the softmx limit then JVM will throw OOM Error. It differs from the logic described at https://bugs.openjdk.java.net/browse/JDK-8222145. Personally I believe OOM Error inside JVM is a little bit safer approach compared to the potential termination of java process by OOM Killer. But how can we avoid confusions? Should we use different naming?
> 
> Thanks
> 


From thomas.schatzl at oracle.com  Tue Oct 15 08:17:20 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Tue, 15 Oct 2019 10:17:20 +0200
Subject: G1GC: The design choice of prefetching
In-Reply-To: <4e9e89f4-69e7-4429-93e5-09f09088b64b.maoliang.ml@alibaba-inc.com>
References: <CAN81=WR-q5HMkGVgzB0JJ33tMmzc0V2kanymv2jr_m821fv7vg@mail.gmail.com>
 <4e9e89f4-69e7-4429-93e5-09f09088b64b.maoliang.ml@alibaba-inc.com>
Message-ID: <05ce27a3-4e4b-0a46-b9b9-5a7155ac8314@oracle.com>

Hi Mingyu,

 >> ------------------------------------------------------------------
 >> From:Mingyu Wu <timberonce at gmail.com>
 >> Send Time:2019 Oct. 15 (Tue.) 09:34
 >> To:hotspot-gc-dev <hotspot-gc-dev at openjdk.java.net>
 >> Subject:G1GC: The design choice of prefetching
 >>
 >> Hi all,
 >> I find that G1GC (in OpenJDK12) implements a method named
 >> *prefetch_and_push*, which prefetches the header and the first field
 >> of an object referenced by a pointer *p *while *p* is about to be
 >> enqueued.
 >> However, the effect of this prefetch instruction can be unstable as
 >> the time when the object is processed is unknown. It is possible that
 >> many references are enqueued before *p *(the data structure is
 >> actually First-In-Last-Out) and finally evict the cache line storing
 >> the object, making the prefetch useless. Therefore, what is the
 >> design choice of those prefetch instructions? Do they stand for some
 >> tradeoffs related to the overhead of prefetching?
 >>
 >> Thanks,
 >> Mingyu
 >
On 15.10.19 08:18, Liang Mao wrote:
> Hi Mingyu,
> 
> The prefetch design is not only available in new versions of G1 GC but introduced
> in very early years in hotspot and other GCs like ParNew. It is kind of aggressive
> prefecting imho which prefetches all the addresses in the ref queue which contains
> *grey pointers* and also creates enough latency between issuing prefetch instructions
> and memory access to maximize the cache utilization.
> There could be the problem you mentioned that cache is evicted if overflowed.
> Maintaining the proper length of the ref queue is the way to avoid this. You can
>   look into the issue below which fixed this problem and improved performance in G1.
> https://bugs.openjdk.java.net/browse/JDK-6672778
> OpenJDK developers may correct me if there's something I misunderstood.
> 
> Thanks,
> Liang
> 

As Liang correctly pointed out, the current oop prefetch design in G1 is 
mostly based on existing precedent in other GCs and lots of testing. 
There are some differences noted below.

As you also pointed out correctly, there is a tradeoff to be made between 
the complexity of this code and the actual gains. This code path is in 
my experience *extremely* sensitive to changes, so adding some simple 
heuristic here might nullify all the gains from more timely prefetching.

In my tests, when implementing JDK-6672778 I performed many tests with 
variants of this scheme. The currently implemented one (with the 
upper/lower "trim" bound) proved to be fastest overall.

Compared to other collectors, G1 also always prefetches and pushes as 
indicated in the

   // We're not going to even bother checking whether the object is
   // already forwarded or not, as this usually causes an immediate
   // stall. We'll try to prefetch the object (for write, given that
   // we might need to install the forwarding reference) and we'll
   // get back to it when pop it from the queue

comment in G1ScanClosureBase::prefetch_and_push, contrary to the other 
collectors which first check whether the reference has already been 
forwarded. The current code proved better for G1 at the time.
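
A simplified sketch of the scheme described above, for illustration only:
prefetch unconditionally, push onto a LIFO queue, and partially drain the
queue once it grows past an upper bound so that entries do not sit long
enough for their cache lines to be evicted. ObjSketch, BoundedScanQueueSketch
and the bound values are hypothetical, and folding the trim check into the
push is a shortcut for brevity; this is not the HotSpot code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using std::size_t;

struct ObjSketch { int header; int first_field; };

class BoundedScanQueueSketch {
  std::vector<ObjSketch*> _stack;   // LIFO: last pushed is processed first
  size_t _lower;
  size_t _upper;
  size_t _processed;
  size_t _max_depth;

  static void prefetch_for_write(const void* p) {
#if defined(__GNUC__) || defined(__clang__)
    __builtin_prefetch(p, 1);       // 1 = prefetch with intent to write
#else
    (void)p;                        // no-op where the builtin is unavailable
#endif
  }

  // Stand-in for the real work (copying, installing a forwarding pointer).
  void process(ObjSketch* o) { o->header = 1; ++_processed; }

public:
  BoundedScanQueueSketch(size_t lower, size_t upper)
    : _lower(lower), _upper(upper), _processed(0), _max_depth(0) {}

  // No check of the object's state before prefetching: touching the object
  // here would usually stall, which is what the prefetch tries to avoid.
  void prefetch_and_push(ObjSketch* o) {
    prefetch_for_write(o);
    _stack.push_back(o);
    if (_stack.size() > _max_depth) _max_depth = _stack.size();
    if (_stack.size() > _upper) trim_to(_lower);
  }

  // Pop and process entries until at most 'target' remain.
  void trim_to(size_t target) {
    while (_stack.size() > target) {
      ObjSketch* o = _stack.back();
      _stack.pop_back();
      process(o);
    }
  }

  void drain() { trim_to(0); }

  size_t size() const      { return _stack.size(); }
  size_t processed() const { return _processed; }
  size_t max_depth() const { return _max_depth; }
};
```

The two bounds are the tuning knobs: the upper bound caps how stale a
prefetch can get, and the lower bound keeps some entries queued so that the
prefetches issued for them still have time to complete.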

Other attempted changes, like prepending a small FIFO of entries in the 
push/pop path (to induce some "fixed" latency between prefetching and 
work on these references), just made the whole evacuation slower. But 
maybe I did something wrong here.

These measurements might be invalid at this time, particularly because 
of changes to how the Java heap roots are traversed (JDK-8213108), so 
revisiting this may be interesting and fruitful.

It would be really interesting to me to hear back from you or anybody 
else in the future about experiments you did whatever the results are; 
even "failed" attempts can be learned from. :)

Thanks,
   Thomas


From thomas.schatzl at oracle.com  Tue Oct 15 08:21:33 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Tue, 15 Oct 2019 10:21:33 +0200
Subject: RFR: 8232235: ZGC: Move ZValue inline funtions to
 zValue.inline.hpp
In-Reply-To: <0cafef55-7ae5-f2c5-e6b5-a6db5c0facc3@oracle.com>
References: <0cafef55-7ae5-f2c5-e6b5-a6db5c0facc3@oracle.com>
Message-ID: <0e415dad-e040-749b-b241-cbe798851aa4@oracle.com>

Hi,

On 15.10.19 08:46, Per Liden wrote:
> Please review this clean up patch to move ZValue inline funtions to 
> zValue.inline.hpp.
> 
> Bug: https://bugs.openjdk.java.net/browse/JDK-8232235
> Webrev: http://cr.openjdk.java.net/~pliden/8232235/webrev.0
> 
> /Per

zObjectAllocator.hpp: this only seems to change the copyright dates, with 
no actual change.

No need to re-review removal of this hunk for me.

Thanks,
   Thomas


From thomas.schatzl at oracle.com  Tue Oct 15 08:23:02 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Tue, 15 Oct 2019 10:23:02 +0200
Subject: RFR: 8232236: ZGC: Move ZThread inline funtions to
 zThread.inline.hpp
In-Reply-To: <9528eb9e-ae58-fb7b-c593-434f97ec1c0e@oracle.com>
References: <9528eb9e-ae58-fb7b-c593-434f97ec1c0e@oracle.com>
Message-ID: <6e82e947-598f-41db-96dd-eaaf5bc32a20@oracle.com>

Hi,

On 15.10.19 08:47, Per Liden wrote:
> Please review this clean up patch to move ZThread inline funtions to 
> zThread.inline.hpp.
> 
> Bug: https://bugs.openjdk.java.net/browse/JDK-8232236
> Webrev: http://cr.openjdk.java.net/~pliden/8232236/webrev.0
> 
> /Per

   looks good.

Thomas


From thomas.schatzl at oracle.com  Tue Oct 15 08:23:50 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Tue, 15 Oct 2019 10:23:50 +0200
Subject: RFR: 8232237: ZGC: Move ZArray inline funtions to
 zArray.inline.hpp
In-Reply-To: <02bf6bce-9e00-ed88-b5b1-f7e50c218446@oracle.com>
References: <02bf6bce-9e00-ed88-b5b1-f7e50c218446@oracle.com>
Message-ID: <4da7de41-ddc8-bf25-27aa-abcee4dfeb14@oracle.com>

Hi,

On 15.10.19 08:47, Per Liden wrote:
> Please review this clean up patch to move ZArray inline funtions to 
> zArray.inline.hpp.
> 
> Bug: https://bugs.openjdk.java.net/browse/JDK-8232237
> Webrev: http://cr.openjdk.java.net/~pliden/8232237/webrev.0
> 
> /Per

   looks good (and trivial?).

Thomas


From thomas.schatzl at oracle.com  Tue Oct 15 08:32:23 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Tue, 15 Oct 2019 10:32:23 +0200
Subject: RFR(S): 8232232: G1RemSetSummary::_rs_threads_vtimes is not
 initialized to zero
In-Reply-To: <CA+w6HxYyx7iRXHRYUjbuSwwkiwKWsATSMNyVovOqPLNYuEXRVQ@mail.gmail.com>
References: <CA+w6HxYyx7iRXHRYUjbuSwwkiwKWsATSMNyVovOqPLNYuEXRVQ@mail.gmail.com>
Message-ID: <75800cc4-9d67-bc11-c0b7-4b0a56b85ab5@oracle.com>

Hi Man,

On 15.10.19 03:56, Man Cao wrote:
> Hi all,
> 
> Can I have reviews for this fix for logging messages of "Concurrent
> refinement threads times (s)", and code cleanup?
> 
> Webrev: https://cr.openjdk.java.net/~manc/8232232/webrev.00/
> Bug: https://bugs.openjdk.java.net/browse/JDK-8232232
> 
> -Man
> 

   looks good.

Thanks,
   Thomas


From per.liden at oracle.com  Tue Oct 15 09:13:19 2019
From: per.liden at oracle.com (Per Liden)
Date: Tue, 15 Oct 2019 11:13:19 +0200
Subject: RFR: 8232237: ZGC: Move ZArray inline funtions to
 zArray.inline.hpp
In-Reply-To: <4da7de41-ddc8-bf25-27aa-abcee4dfeb14@oracle.com>
References: <02bf6bce-9e00-ed88-b5b1-f7e50c218446@oracle.com>
 <4da7de41-ddc8-bf25-27aa-abcee4dfeb14@oracle.com>
Message-ID: <f33e9edf-bb41-182f-ddca-4881f0c7a038@oracle.com>

Thanks Thomas!

/Per

On 10/15/19 10:23 AM, Thomas Schatzl wrote:
> Hi,
> 
> On 15.10.19 08:47, Per Liden wrote:
>> Please review this clean up patch to move ZArray inline funtions to 
>> zArray.inline.hpp.
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8232237
>> Webrev: http://cr.openjdk.java.net/~pliden/8232237/webrev.0
>>
>> /Per
> 
>    looks good (and trivial?).
> 
> Thomas


From per.liden at oracle.com  Tue Oct 15 09:13:30 2019
From: per.liden at oracle.com (Per Liden)
Date: Tue, 15 Oct 2019 11:13:30 +0200
Subject: RFR: 8232236: ZGC: Move ZThread inline funtions to
 zThread.inline.hpp
In-Reply-To: <6e82e947-598f-41db-96dd-eaaf5bc32a20@oracle.com>
References: <9528eb9e-ae58-fb7b-c593-434f97ec1c0e@oracle.com>
 <6e82e947-598f-41db-96dd-eaaf5bc32a20@oracle.com>
Message-ID: <0ad7e93c-af8a-d0be-202c-9dca7eafa27f@oracle.com>

Thanks Thomas!

/Per

On 10/15/19 10:23 AM, Thomas Schatzl wrote:
> Hi,
> 
> On 15.10.19 08:47, Per Liden wrote:
>> Please review this clean up patch to move ZThread inline funtions to 
>> zThread.inline.hpp.
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8232236
>> Webrev: http://cr.openjdk.java.net/~pliden/8232236/webrev.0
>>
>> /Per
> 
>    looks good.
> 
> Thomas


From per.liden at oracle.com  Tue Oct 15 09:14:28 2019
From: per.liden at oracle.com (Per Liden)
Date: Tue, 15 Oct 2019 11:14:28 +0200
Subject: RFR: 8232235: ZGC: Move ZValue inline funtions to
 zValue.inline.hpp
In-Reply-To: <0e415dad-e040-749b-b241-cbe798851aa4@oracle.com>
References: <0cafef55-7ae5-f2c5-e6b5-a6db5c0facc3@oracle.com>
 <0e415dad-e040-749b-b241-cbe798851aa4@oracle.com>
Message-ID: <8ec5a1ea-fb86-ae4c-86f1-95ac36ffe975@oracle.com>

On 10/15/19 10:21 AM, Thomas Schatzl wrote:
> Hi,
> 
> On 15.10.19 08:46, Per Liden wrote:
>> Please review this clean up patch to move ZValue inline functions to 
>> zValue.inline.hpp.
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8232235
>> Webrev: http://cr.openjdk.java.net/~pliden/8232235/webrev.0
>>
>> /Per
> 
> zObjectAllocator.hpp: only seems to change the copyright dates, with no 
> actual change.

Good catch, I'll revert that.

Thanks for reviewing, Thomas!

/Per

> 
> No need to re-review removal of this hunk for me.
> 
> Thanks,
>    Thomas


From thomas.schatzl at oracle.com  Tue Oct 15 09:21:37 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Tue, 15 Oct 2019 11:21:37 +0200
Subject: RFR: 8232239: ZGC: Inline ZCPU::count() and ZCPU:id()
In-Reply-To: <cb85b13c-4a46-6639-8470-9c6a36caa55d@oracle.com>
References: <cb85b13c-4a46-6639-8470-9c6a36caa55d@oracle.com>
Message-ID: <89e96eb4-8ae3-ed24-8cf0-dccf95967ab6@oracle.com>

Hi,

On 15.10.19 08:48, Per Liden wrote:
> Please review this patch to enable inlining of ZCPU::count() and 
> ZCPU:id(), which are used in some fairly hot paths.
> 
> Bug: https://bugs.openjdk.java.net/browse/JDK-8232239
> Webrev: http://cr.openjdk.java.net/~pliden/8232239/webrev.0
> 
> /Per

   looks good.

Thomas


From per.liden at oracle.com  Tue Oct 15 09:33:41 2019
From: per.liden at oracle.com (Per Liden)
Date: Tue, 15 Oct 2019 11:33:41 +0200
Subject: RFR: 8232239: ZGC: Inline ZCPU::count() and ZCPU:id()
In-Reply-To: <89e96eb4-8ae3-ed24-8cf0-dccf95967ab6@oracle.com>
References: <cb85b13c-4a46-6639-8470-9c6a36caa55d@oracle.com>
 <89e96eb4-8ae3-ed24-8cf0-dccf95967ab6@oracle.com>
Message-ID: <c3bc5035-9a4d-36d1-3f73-802ba895206a@oracle.com>

Thanks Thomas!

/Per

On 10/15/19 11:21 AM, Thomas Schatzl wrote:
> Hi,
> 
> On 15.10.19 08:48, Per Liden wrote:
>> Please review this patch to enable inlining of ZCPU::count() and 
>> ZCPU:id(), which are used in some fairly hot paths.
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8232239
>> Webrev: http://cr.openjdk.java.net/~pliden/8232239/webrev.0
>>
>> /Per
> 
>    looks good.
> 
> Thomas


From erik.osterlund at oracle.com  Tue Oct 15 10:43:15 2019
From: erik.osterlund at oracle.com (erik.osterlund at oracle.com)
Date: Tue, 15 Oct 2019 12:43:15 +0200
Subject: RFR: 8232238: ZGC: Move ZList inline funtions to zList.inline.hpp
In-Reply-To: <41ab05b7-01e3-3e3f-cf1b-7a5a358763ac@oracle.com>
References: <41ab05b7-01e3-3e3f-cf1b-7a5a358763ac@oracle.com>
Message-ID: <93d9ec60-2458-e6f2-d64b-e2857225f46a@oracle.com>

Hi Per,

Looks good.

Thanks,
/Erik

On 10/15/19 8:47 AM, Per Liden wrote:
> Please review this clean up patch to move ZList inline functions to 
> zList.inline.hpp.
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8232238
> Webrev: http://cr.openjdk.java.net/~pliden/8232238/webrev.0
>
> /Per


From per.liden at oracle.com  Tue Oct 15 13:07:56 2019
From: per.liden at oracle.com (Per Liden)
Date: Tue, 15 Oct 2019 15:07:56 +0200
Subject: RFR: 8232238: ZGC: Move ZList inline funtions to zList.inline.hpp
In-Reply-To: <93d9ec60-2458-e6f2-d64b-e2857225f46a@oracle.com>
References: <93d9ec60-2458-e6f2-d64b-e2857225f46a@oracle.com>
Message-ID: <76AD753C-42A0-4743-A85F-4BFA4BFC2551@oracle.com>

Thanks Erik!

/Per

> On 15 Oct 2019, at 12:43, erik.osterlund at oracle.com wrote:
> 
> Hi Per,
> 
> Looks good.
> 
> Thanks,
> /Erik
> 
>> On 10/15/19 8:47 AM, Per Liden wrote:
>> Please review this clean up patch to move ZList inline functions to zList.inline.hpp.
>> 
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8232238
>> Webrev: http://cr.openjdk.java.net/~pliden/8232238/webrev.0
>> 
>> /Per
> 


From zgu at redhat.com  Tue Oct 15 13:13:38 2019
From: zgu at redhat.com (Zhengyu Gu)
Date: Tue, 15 Oct 2019 09:13:38 -0400
Subject: RFR (XS) 8232205: Shenandoah: missing "Update References" ->
 "Update Roots" tracing
In-Reply-To: <d4caad65-03fc-fe9e-89c0-62b364bdd560@redhat.com>
References: <d4caad65-03fc-fe9e-89c0-62b364bdd560@redhat.com>
Message-ID: <eab8115c-1f0b-7a55-2b8d-871fb69f1b1b@redhat.com>

Looks good to me.

-Zhengyu

On 10/14/19 5:20 AM, Aleksey Shipilev wrote:
> Bug:
>    https://bugs.openjdk.java.net/browse/JDK-8232205
> 
> Noticed that -Xlog:gc+stats does not print "Update Roots" section for "Update References". This is a
> regression since JDK-8223951.
> 
> Fix:
>    https://cr.openjdk.java.net/~shade/8232205/webrev.01/
> 
> Testing: hotspot_gc_shenandoah, eyeballing gc+stats
> 


From thomas.schatzl at oracle.com  Tue Oct 15 13:13:47 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Tue, 15 Oct 2019 15:13:47 +0200
Subject: RFR (XXS): 8232260: Remove g1 prefix in
 G1CollectedHeap::g1_hot_card_cache() getter
Message-ID: <6eda50a7-fe97-a101-5402-6a09c005e209@oracle.com>

Hi,

   can I have reviews for this small cleanup that removes some "g1_" 
prefix from some getter and some related unnecessary friend declaration.


CR:
https://bugs.openjdk.java.net/browse/JDK-8232260
Webrev:
http://cr.openjdk.java.net/~tschatzl/8232260/webrev/
Testing:
local compilation

Thanks,
   Thomas


From stefan.johansson at oracle.com  Tue Oct 15 13:20:25 2019
From: stefan.johansson at oracle.com (Stefan Johansson)
Date: Tue, 15 Oct 2019 15:20:25 +0200
Subject: RFR (XXS): 8232260: Remove g1 prefix in
 G1CollectedHeap::g1_hot_card_cache() getter
In-Reply-To: <6eda50a7-fe97-a101-5402-6a09c005e209@oracle.com>
References: <6eda50a7-fe97-a101-5402-6a09c005e209@oracle.com>
Message-ID: <89805899-3c84-4632-c995-4d130408aebf@oracle.com>

Looks good!

On 2019-10-15 15:13, Thomas Schatzl wrote:
> Hi,
> 
>    can I have reviews for this small cleanup that removes some "g1_" 
> prefix from some getter and some related unnecessary friend declaration.
> 
> 
> CR:
> https://bugs.openjdk.java.net/browse/JDK-8232260
> Webrev:
> http://cr.openjdk.java.net/~tschatzl/8232260/webrev/
> Testing:
> local compilation
> 
> Thanks,
>    Thomas


From thomas.schatzl at oracle.com  Tue Oct 15 13:23:34 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Tue, 15 Oct 2019 15:23:34 +0200
Subject: RFR (XXS): 8232260: Remove g1 prefix in
 G1CollectedHeap::g1_hot_card_cache() getter
In-Reply-To: <89805899-3c84-4632-c995-4d130408aebf@oracle.com>
References: <6eda50a7-fe97-a101-5402-6a09c005e209@oracle.com>
 <89805899-3c84-4632-c995-4d130408aebf@oracle.com>
Message-ID: <9ecc0e8a-d1a1-7858-7778-7eb709873d9a@oracle.com>

Hi Stefan,

On 15.10.19 15:20, Stefan Johansson wrote:
> Looks good!
> 
> On 2019-10-15 15:13, Thomas Schatzl wrote:
>> Hi,
>>
>>    can I have reviews for this small cleanup that removes some "g1_" 
>> prefix from some getter and some related unnecessary friend declaration.
>>

Thanks for your review,
   Thomas


From sangheon.kim at oracle.com  Tue Oct 15 14:33:07 2019
From: sangheon.kim at oracle.com (sangheon.kim at oracle.com)
Date: Tue, 15 Oct 2019 07:33:07 -0700
Subject: RFR(XL): 8220310: Implementation: NUMA-Aware Memory Allocation
 for G1, Mutator (1/3)
In-Reply-To: <3F549477-A2DF-42CF-A0E5-586F78BBCC47@oracle.com>
References: <e4c60a5c-cb08-004a-ce77-e4d20d4d6891@oracle.com>
 <06ACBF87-ADBE-499F-B668-0274E4925B26@oracle.com>
 <2b37edd6-3e0f-013d-1616-9d003f8ac1ed@oracle.com>
 <74ACAF31-8233-482A-892E-0D2E7CA72F4F@oracle.com>
 <4afe9f43-4cfa-9384-f45f-f985399629dd@oracle.com>
 <CD6BDB8C-7777-4872-BCC0-CDFB4978F876@oracle.com>
 <d8758f38-4818-e6b7-c158-118974f0ff1c@oracle.com>
 <CB368332-D86B-40D2-B152-43B726938DD2@oracle.com>
 <77f6c57a-65a6-2727-cbe9-fbc1ed52a015@oracle.com>
 <b3b70e9d-5be9-b069-b631-5733f157c9eb@oracle.com>
 <7C1985BF-A769-49FB-A658-E1B1060B5897@oracle.com>
 <3F549477-A2DF-42CF-A0E5-586F78BBCC47@oracle.com>
Message-ID: <f348053c-ef9b-df03-ae17-393ace99182b@oracle.com>

Hi all,

Here's revised webrev which addresses:
1) G1RegionToSpaceMapper checks mtJavaHeap and then conditionally calls 
G1NUMA::request_memory_on_node() (Kim)
2) The signature of G1NUMA::request_memory_on_node(void* address, ,) is 
changed to have actual address instead of page index. (Stefan)
3) Some local variable name changes at G1RegionToSpaceMapper. i -> 
region_idx, idx -> page_idx (for local style, used idx instead of index)

webrev:
http://cr.openjdk.java.net/~sangheki/8220310/webrev.5/
http://cr.openjdk.java.net/~sangheki/8220310/webrev.5.inc/
Testing: hs-tier 1 ~ 5, with/without UseNUMA

Thanks,
Sangheon


On 10/14/19 3:20 PM, Kim Barrett wrote:
>> On Oct 14, 2019, at 5:03 PM, Kim Barrett <kim.barrett at oracle.com> wrote:
>>> 2. Add a state to the mappers to say if they are NUMA aware or not, and currently only the heap mapper should be NUMA aware. We could either set this state to true using the mtJavaHeap type as we have checked before or add an explicit setter that we only call for the heap mapper.
>>>
>>> I know that only doing 2) will fix the current problem, but I think it would be nice to avoid having the base address in G1NUMA, thoughts?
>> I don't understand the point about mappers needing to know if they are
>> NUMA or not. request_memory_on_node is only called by the two relevant
>> region->space mappers, with the memory involved always in the Java
>> heap (after fixing the units mismatch mentioned above). That is,
>> G1NUMA::request_memory_on_node should only be called for Java heap
>> memory. (It might be able to assert is_in_reserved or something like
>> that, though initialization order might prevent that.)
> I was confused here too.  Sangheon has repaired my confusion, and he's
> got another change in the works to tidy things up here in a way that I think
> will make both me and Stefan happy.
>
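
Items 1) and 2) of the revised webrev above can be pictured with a minimal
sketch. Everything here is a hypothetical stand-in written for illustration:
FakeNuma, Mapper, MemType and the two-argument request_memory_on_node() are
assumptions, not the actual G1NUMA or G1RegionToSpaceMapper code (the real
signature is elided in the message). The sketch only shows the shape of the
idea: the mapper checks the memory type before asking for NUMA placement, and
it passes an actual address computed from the page index.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the mtJavaHeap / other memory type distinction.
enum class MemType { JavaHeap, Other };

struct NumaRequest {
  std::uintptr_t address;
  std::size_t size;
};

// Records requests instead of talking to the OS, so the sketch is testable.
class FakeNuma {
 public:
  // Assumed shape: takes an actual address rather than a page index.
  void request_memory_on_node(std::uintptr_t address, std::size_t size) {
    _requests.push_back({address, size});
  }
  const std::vector<NumaRequest>& requests() const { return _requests; }

 private:
  std::vector<NumaRequest> _requests;
};

class Mapper {
 public:
  Mapper(std::uintptr_t base, std::size_t page_size, MemType type, FakeNuma* numa)
      : _base(base), _page_size(page_size), _type(type), _numa(numa) {
    assert(numa != nullptr && page_size > 0);
  }

  void commit_pages(std::size_t start_page, std::size_t num_pages) {
    // Only Java heap memory is placed on a NUMA node; auxiliary data is not.
    if (_type != MemType::JavaHeap) return;
    for (std::size_t page_idx = start_page; page_idx < start_page + num_pages; page_idx++) {
      // Convert the page index to an address before calling into NUMA code.
      _numa->request_memory_on_node(_base + page_idx * _page_size, _page_size);
    }
  }

 private:
  std::uintptr_t _base;
  std::size_t _page_size;
  MemType _type;
  FakeNuma* _numa;
};
```

With this shape the NUMA code does not need to know the base address of any
mapper, which appears to be the concern raised in the quoted discussion.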


From kim.barrett at oracle.com  Tue Oct 15 14:34:21 2019
From: kim.barrett at oracle.com (Kim Barrett)
Date: Tue, 15 Oct 2019 10:34:21 -0400
Subject: RFR (XXS): 8232260: Remove g1 prefix in
 G1CollectedHeap::g1_hot_card_cache() getter
In-Reply-To: <6eda50a7-fe97-a101-5402-6a09c005e209@oracle.com>
References: <6eda50a7-fe97-a101-5402-6a09c005e209@oracle.com>
Message-ID: <07917E2B-A8CC-43E2-BCA5-F4E0548EB6EE@oracle.com>

> On Oct 15, 2019, at 9:13 AM, Thomas Schatzl <thomas.schatzl at oracle.com> wrote:
> 
> Hi,
> 
>  can I have reviews for this small cleanup that removes some "g1_" prefix from some getter and some related unnecessary friend declaration.
> 
> 
> CR:
> https://bugs.openjdk.java.net/browse/JDK-8232260
> Webrev:
> http://cr.openjdk.java.net/~tschatzl/8232260/webrev/
> Testing:
> local compilation
> 
> Thanks,
>  Thomas

Looks good.


From thomas.schatzl at oracle.com  Tue Oct 15 14:40:44 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Tue, 15 Oct 2019 16:40:44 +0200
Subject: RFR (XXS): 8232260: Remove g1 prefix in
 G1CollectedHeap::g1_hot_card_cache() getter
In-Reply-To: <07917E2B-A8CC-43E2-BCA5-F4E0548EB6EE@oracle.com>
References: <6eda50a7-fe97-a101-5402-6a09c005e209@oracle.com>
 <07917E2B-A8CC-43E2-BCA5-F4E0548EB6EE@oracle.com>
Message-ID: <47bb3be2-dffc-3822-d19c-8fad3a5dd986@oracle.com>

Hi Kim,

On 15.10.19 16:34, Kim Barrett wrote:
>> On Oct 15, 2019, at 9:13 AM, Thomas Schatzl <thomas.schatzl at oracle.com> wrote:
>>
>> Hi,
>>
>>   can I have reviews for this small cleanup that removes some "g1_" prefix from some getter and some related unnecessary friend declaration.
>>
>>[...]
>>
>> Thanks,
>>   Thomas
> 
> Looks good.
> 

   thanks for your review.

Thomas


From rkennke at redhat.com  Tue Oct 15 14:45:12 2019
From: rkennke at redhat.com (Roman Kennke)
Date: Tue, 15 Oct 2019 16:45:12 +0200
Subject: RFR (XS) 8232205: Shenandoah: missing "Update References" ->
 "Update Roots" tracing
In-Reply-To: <d4caad65-03fc-fe9e-89c0-62b364bdd560@redhat.com>
References: <d4caad65-03fc-fe9e-89c0-62b364bdd560@redhat.com>
Message-ID: <4da2e730-9649-4199-f7c2-a5fc1c64dde8@redhat.com>

Ok. Thanks!

Roman


> Bug:
>   https://bugs.openjdk.java.net/browse/JDK-8232205
> 
> Noticed that -Xlog:gc+stats does not print "Update Roots" section for "Update References". This is a
> regression since JDK-8223951.
> 
> Fix:
>   https://cr.openjdk.java.net/~shade/8232205/webrev.01/
> 
> Testing: hotspot_gc_shenandoah, eyeballing gc+stats
> 


From rs at jelastic.com  Tue Oct 15 16:26:08 2019
From: rs at jelastic.com (Ruslan Synytsky)
Date: Tue, 15 Oct 2019 12:26:08 -0400
Subject: G1 patch of elastic Java heap
In-Reply-To: <e0d84051-114e-66a9-6c06-85deb74cc6ae@oracle.com>
References: <mailman.4044.1571058573.25747.hotspot-gc-dev@openjdk.java.net>
 <CA++bR4O2_W=JZ5cmLdt8bGAgST4+mkwRR226+yiQRpaXvQVZ6Q@mail.gmail.com>
 <9e1ea9d1-2340-4c47-9249-12cb04886230.maoliang.ml@alibaba-inc.com>
 <e0d84051-114e-66a9-6c06-85deb74cc6ae@oracle.com>
Message-ID: <CA++bR4N6MAMD+NvANYypYbgsKKV2z_A1aH13f6RSCWsGaOmSTw@mail.gmail.com>

HardMaxHeapSize sounds logical to me, so we will have SoftMaxHeapSize and
HardMaxHeapSize - easier to understand and remember.
Regards
-- 
Ruslan Synytsky

On Tue, 15 Oct 2019 at 04:12, Per Liden <per.liden at oracle.com> wrote:

> Hi,
>
> On 10/15/19 8:10 AM, Liang Mao wrote:
> > Hi Ruslan and OpenJDK developers,
> >
> > I noticed this difference too. The softmx in OpenJ9 seems to not allow
> > the application to go beyond the new limit, while JDK-8222145 treats
> > SoftMaxHeapSize as a *soft* limit which can be exceeded. Personally I
> > prefer the former a little bit. But introducing another name seems more
> > confusing to users. Maybe use an option to control it? Like "bool
> > SoftMaxHeapSizeOOM"?
>
> I personally think the OpenJ9 softmx option is misnamed, as it's not a
> *soft* limit, but a *hard* limit. Hotspot's SoftMaxHeapSize is *soft* by
> design. Today's hard limit in Hotspot is of course MaxHeapSize (-Xmx).
> The only problem is that it's not a manageable flag so it can't be
> changed at runtime. Making it manageable is tricky for GCs that size
> data structures at startup based on MaxHeapSize. One option could be to
> simply reject changes to MaxHeapSize unless the currently used GC
> declares that it supports changing it. Another option could be to keep
> MaxHeapSize as is, and introduce a separate flag (e.g. HardMaxHeapSize
> or CurrentMaxHeapSize). In that case MaxHeapSize would act as the upper
> limit for the "hard limit" flag.
>
> cheers,
> Per
>
> >
> > Thanks,
> > Liang
> >
> >
> >
> >
> >
> >
> > ------------------------------------------------------------------
> > From:Ruslan Synytsky <rs at jelastic.com>
> > Send Time:2019 Oct. 15 (Tue.) 06:46
> > To:hotspot-gc-dev at openjdk.java.net openjdk.java.net <
> hotspot-gc-dev at openjdk.java.net>; "MAO, Liang" <
> maoliang.ml at alibaba-inc.com>; Thomas Schatzl <thomas.schatzl at oracle.com>
> > Subject:Re: G1 patch of elastic Java heap
> >
> > Dear Liang and Thomas, thank you for your contribution to Java
> > elasticity.
> >
> > I would like to draw attention to the softmx option, which is planned to
> > be renamed to SoftMaxHeapSize as I understand. According to the feedback in
> > another thread, if the memory usage reaches the softmx limit then the JVM
> > will throw an OOM Error. This differs from the logic described at
> > https://bugs.openjdk.java.net/browse/JDK-8222145. Personally I believe
> > an OOM Error inside the JVM is a slightly safer approach compared to the
> > potential termination of the Java process by the OOM Killer. But how can we
> > avoid confusion? Should we use different naming?
> >
> > Thanks
> >
>
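
The soft versus hard distinction Per describes above can be sketched as
follows. This is a minimal model under stated assumptions, not HotSpot code:
HeapLimits, try_grow() and GrowResult are hypothetical names, and
hard_max_heap_size stands for the proposed separate flag (HardMaxHeapSize or
CurrentMaxHeapSize), which did not exist at the time of this thread.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical model of the limits under discussion; not HotSpot code.
struct HeapLimits {
  std::size_t max_heap_size;       // -Xmx, fixed at startup
  std::size_t hard_max_heap_size;  // proposed runtime-adjustable hard limit
  std::size_t soft_max_heap_size;  // soft target, may be exceeded

  // MaxHeapSize acts as the upper bound for the hard limit.
  bool set_hard_max(std::size_t value) {
    if (value > max_heap_size) return false;
    hard_max_heap_size = value;
    return true;
  }
};

enum class GrowResult { Ok, OverSoftLimit, OutOfMemory };

// Classifies a request to grow heap usage from `used` by `request` bytes.
GrowResult try_grow(const HeapLimits& limits, std::size_t used, std::size_t request) {
  assert(limits.hard_max_heap_size <= limits.max_heap_size);
  // Written this way to avoid overflow in used + request.
  if (used > limits.hard_max_heap_size ||
      request > limits.hard_max_heap_size - used) {
    return GrowResult::OutOfMemory;    // hard limit: never exceeded
  }
  if (used + request > limits.soft_max_heap_size) {
    return GrowResult::OverSoftLimit;  // soft limit: allowed, but a hint to shrink
  }
  return GrowResult::Ok;
}
```

In this model an OpenJ9-style softmx behaves like hard_max_heap_size (the
request fails), while SoftMaxHeapSize as described in JDK-8222145 behaves like
soft_max_heap_size (the request succeeds and the collector may try harder to
get back under the target).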


From shade at redhat.com  Tue Oct 15 17:33:45 2019
From: shade at redhat.com (Aleksey Shipilev)
Date: Tue, 15 Oct 2019 19:33:45 +0200
Subject: RFR (S) 8232051: Epsilon should warn about Xms/Xmx/AlwaysPreTouch
 configuration
In-Reply-To: <dac4aee1-a4df-5ae1-ea95-57c4913c22b5@redhat.com>
References: <dac4aee1-a4df-5ae1-ea95-57c4913c22b5@redhat.com>
Message-ID: <4d818e96-e0b4-00f2-6a5a-85cdfabb91c9@redhat.com>

On 10/9/19 4:15 PM, Aleksey Shipilev wrote:
> RFE:
>   https://bugs.openjdk.java.net/browse/JDK-8232051
> 
> This is arguably the UX bug: users expect low latency, but may not be aware that additional
> configuration is needed for GCs to perform well in those conditions. Epsilon already enables LSM,
> and should warn about Xms/Xmx/AlwaysPreTouch config too. It cannot adjust these settings, though,
> because it would affect startup time -- users would have to opt-in.
> 
> Fix:
>   https://cr.openjdk.java.net/~shade/8232051/webrev.01/

Friendly reminder.

-- 
Thanks,
-Aleksey
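
The kind of check described in the RFE can be sketched like this. It is a
minimal illustration, assuming a hypothetical HeapConfig struct and made-up
message wording; the actual patch lives in the webrev above and may look
different.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical settings holder; the fields mirror the JVM options discussed
// above, but this is not HotSpot code.
struct HeapConfig {
  std::size_t initial_heap_size;  // -Xms
  std::size_t max_heap_size;      // -Xmx
  bool always_pre_touch;          // -XX:+AlwaysPreTouch
};

// Returns advisory messages only. It never changes the configuration,
// because adjusting it silently would affect startup time.
std::vector<std::string> latency_warnings(const HeapConfig& cfg) {
  assert(cfg.initial_heap_size <= cfg.max_heap_size);
  std::vector<std::string> warnings;
  if (cfg.initial_heap_size != cfg.max_heap_size) {
    warnings.push_back("Consider setting -Xms equal to -Xmx to avoid heap resizing");
  }
  if (!cfg.always_pre_touch) {
    warnings.push_back("Consider enabling -XX:+AlwaysPreTouch to avoid commit stalls");
  }
  return warnings;
}
```

Returning the messages instead of printing them keeps the decision to opt in
with the user, which matches the reasoning in the RFE text.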


From zgu at redhat.com  Tue Oct 15 17:42:40 2019
From: zgu at redhat.com (Zhengyu Gu)
Date: Tue, 15 Oct 2019 13:42:40 -0400
Subject: RFR (S) 8232051: Epsilon should warn about Xms/Xmx/AlwaysPreTouch
 configuration
In-Reply-To: <4d818e96-e0b4-00f2-6a5a-85cdfabb91c9@redhat.com>
References: <dac4aee1-a4df-5ae1-ea95-57c4913c22b5@redhat.com>
 <4d818e96-e0b4-00f2-6a5a-85cdfabb91c9@redhat.com>
Message-ID: <c16066f9-fb2d-2377-5caf-e9cd30b566db@redhat.com>

Looks good to me.

Thanks,

-Zhengyu


On 10/15/19 1:33 PM, Aleksey Shipilev wrote:
> On 10/9/19 4:15 PM, Aleksey Shipilev wrote:
>> RFE:
>>    https://bugs.openjdk.java.net/browse/JDK-8232051
>>
>> This is arguably the UX bug: users expect low latency, but may not be aware that additional
>> configuration is needed for GCs to perform well in those conditions. Epsilon already enables LSM,
>> and should warn about Xms/Xmx/AlwaysPreTouch config too. It cannot adjust these settings, though,
>> because it would affect startup time -- users would have to opt-in.
>>
>> Fix:
>>    https://cr.openjdk.java.net/~shade/8232051/webrev.01/
> 
> Friendly reminder.
> 


From shade at redhat.com  Tue Oct 15 18:00:08 2019
From: shade at redhat.com (Aleksey Shipilev)
Date: Tue, 15 Oct 2019 20:00:08 +0200
Subject: RFR (S) 8232051: Epsilon should warn about Xms/Xmx/AlwaysPreTouch
 configuration
In-Reply-To: <c16066f9-fb2d-2377-5caf-e9cd30b566db@redhat.com>
References: <dac4aee1-a4df-5ae1-ea95-57c4913c22b5@redhat.com>
 <4d818e96-e0b4-00f2-6a5a-85cdfabb91c9@redhat.com>
 <c16066f9-fb2d-2377-5caf-e9cd30b566db@redhat.com>
Message-ID: <c3e8a5c3-8617-86e0-7603-4eb799b2edd1@redhat.com>

Thank you, pushed.

-Aleksey

On 10/15/19 7:42 PM, Zhengyu Gu wrote:
> Looks good to me.
> 
> Thanks,
> 
> -Zhengyu
> 
> 
> On 10/15/19 1:33 PM, Aleksey Shipilev wrote:
>> On 10/9/19 4:15 PM, Aleksey Shipilev wrote:
>>> RFE:
>>>    https://bugs.openjdk.java.net/browse/JDK-8232051
>>>
>>> This is arguably the UX bug: users expect low latency, but may not be aware that additional
>>> configuration is needed for GCs to perform well in those conditions. Epsilon already enables LSM,
>>> and should warn about Xms/Xmx/AlwaysPreTouch config too. It cannot adjust these settings, though,
>>> because it would affect startup time -- users would have to opt-in.
>>>
>>> Fix:
>>>    https://cr.openjdk.java.net/~shade/8232051/webrev.01/
>>
>> Friendly reminder.


From kishor.kharbas at intel.com  Wed Oct 16 01:23:30 2019
From: kishor.kharbas at intel.com (Kharbas, Kishor)
Date: Wed, 16 Oct 2019 01:23:30 +0000
Subject: RFR(S): 8215893: Add better abstraction for pinning G1
 concurrent marking bitmaps.
In-Reply-To: <b6b879d9-fc88-c719-d939-6d64070ae13f@oracle.com>
References: <F89640DCD01A85489FCBA68183A6A0F3CB569D68@ORSMSX116.amr.corp.intel.com>
 <F89640DCD01A85489FCBA68183A6A0F3CB56A055@ORSMSX116.amr.corp.intel.com>
 <b6b879d9-fc88-c719-d939-6d64070ae13f@oracle.com>
Message-ID: <F89640DCD01A85489FCBA68183A6A0F3CB57D45D@ORSMSX116.amr.corp.intel.com>

Thank you for the suggestions.
In this webrev I added a flag to ReservedSpace constructors to direct it to pin the memory space. So now G1PageBasedVirtualSpace does not have to do special handling.

http://cr.openjdk.java.net/~kkharbas/8215893/webrev.02/

To add more to Sangheon's reply to Stefan's question,
> Another thing, can you remind me why we need the bitmaps to be pinned but not other structures such as the card table?
When I implemented this feature I ran into an issue with the default implementation of the concurrent marking bitmaps.

Thanks,
Kishor

From: sangheon.kim at oracle.com [mailto:sangheon.kim at oracle.com]
Sent: Wednesday, October 9, 2019 2:42 PM
To: Kharbas, Kishor <kishor.kharbas at intel.com>; hotspot-gc-dev at openjdk.java.net
Cc: Stefan Johansson <stefan.johansson at oracle.com>
Subject: Re: RFR(S): 8215893: Add better abstraction for pinning G1 concurrent marking bitmaps.

Hi Kishor,
On 10/4/19 4:15 PM, Kharbas, Kishor wrote:

Hi Stefan,

Thanks for the review. Some comments inline.

New webrev : http://cr.openjdk.java.net/~kkharbas/8215893/webrev.00_to_01/

                              http://cr.openjdk.java.net/~kkharbas/8215893/webrev.01/
I am reviewing the patch but have a question on top of Stefan's question[1].
Why are the bitmap mappers committed? I think all the trouble started from 'committing but treating as special' here. Couldn't we just treat the bitmap mappers as 'special' without committing?
If 'not committing' is doable, couldn't we simply create the ReservedSpace with 'special' enabled (independent of the large page setting, which is the same as Stefan's comment)? Or add a PinnedReservedSpace to force 'special enabled'.

[1]: Another thing, can you remind me why we need the bitmaps to be pinned but not other structures such as the card table?

+HeterogeneousHeapRegionManager::initialize()
...
+  // We commit bitmap for all regions during initialization and mark the bitmap space as special.
+  // This allows regions to be un-committed while concurrent-marking threads are accessing the bitmap concurrently.


Thanks,
Sangheon


> Hi Kishor,
>
> On 04.10.19 03:00, Kharbas, Kishor wrote:
>> Hi,
>> When I worked on JDK-8211425 <https://bugs.openjdk.java.net/browse/JDK-8211425>, there was a request for better abstraction for pinning G1's CM bitmaps. The RFE for the request is JDK-8215893 <https://bugs.openjdk.java.net/browse/JDK-8215893>.
>>
>> Here is a proposal : http://cr.openjdk.java.net/~kkharbas/8215893/webrev.00/
>>
>> Here G1PageBasedVirtualSpace pins the entire reserved space in memory during construction. The constructor takes an additional bool flag which says "does it need to pin the memory".
>> If the memory is pinned, the '_special' flag is set to true. I piggyback on the _special flag's behavior, which is to not do actual OS (un-)commits on calls to (un)commit().
>> The rest of the changes is the mechanism to pass this flag from CM bitmap creation in G1CollectedHeap all the way to G1PageBasedVirtualSpace.
>>
>> Let me know if this is a good abstraction and if there is any better way.
>>
>> Thanks
>> Kishor
>>
>
> Some comments:
>
> - in the parameter lists, if the parameters are already laid out
> line-by-line, if adding a new one, please put it on a new line as well.
>

Fixed in the new webrev.

> - this code
>
>    if (_special) {
>      if (!rs.special()) {
>        commit_internal(addr_to_page_index(_low_boundary),
>                        addr_to_page_index(_high_boundary));
>      }
>
> in g1PageBasedVirtualSpace looks very incomprehensible.  :)
>
> I would prefer (pending the second reviewer's comment) to either use the
> "pinned" flag here, or even better, move the necessary commit calls into
> the (now removed) HeterogeneousHeapRegionManager::initialize().
>

Made it a little more comprehensible. Will see what other reviewers think about moving it somewhere else.

> - I would just purely from feeling prefer if the "pinned" flag parameter
> would be listed after the "type" parameter in the G1RegionToSpaceMapper.
> But that's probably just me.
>

I did it this way to logically group the parameters. MemTracker is a tracker used by the VM everywhere and does not pertain to this class as such, so I kept it at the end.

> Also, finally one parameter per line for the declaration/definition of
> the constructor would improve readability.
>

Done.

Thank you,
Kishor

> Thanks,
>    Thomas
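
The behavior the thread relies on, a "special" space where commit and
uncommit calls do not reach the OS, can be sketched as below. PageSpace is a
hypothetical stand-in and not G1PageBasedVirtualSpace; the os_calls() counter
exists only to make the effect observable.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a page-based virtual space; not the HotSpot class.
class PageSpace {
 public:
  // A pinned space is committed up front and marked special.
  PageSpace(std::size_t pages, bool pinned)
      : _committed(pages, pinned), _special(pinned), _os_calls(0) {}

  void commit(std::size_t start, std::size_t count) { update(start, count, true); }
  void uncommit(std::size_t start, std::size_t count) { update(start, count, false); }

  bool is_committed(std::size_t page) const { return _committed.at(page); }
  std::size_t os_calls() const { return _os_calls; }

 private:
  void update(std::size_t start, std::size_t count, bool committed) {
    assert(start <= _committed.size() && count <= _committed.size() - start);
    if (_special) return;  // special: no actual OS (un-)commit happens
    for (std::size_t i = start; i < start + count; i++) {
      _committed[i] = committed;
    }
    _os_calls++;           // stands in for the real OS call
  }

  std::vector<bool> _committed;
  bool _special;
  std::size_t _os_calls;
};
```

Because uncommit() is a no-op on a pinned space, a bitmap stored there stays
readable even while the corresponding heap regions are being uncommitted,
which is the property the concurrent marking threads need.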


From erik.osterlund at oracle.com  Wed Oct 16 06:51:08 2019
From: erik.osterlund at oracle.com (Erik Osterlund)
Date: Wed, 16 Oct 2019 08:51:08 +0200
Subject: RFR: 8231940: ZGC: Print correct low/high capacity
In-Reply-To: <2742b8ba-7fa0-b789-a250-4c9de40e1fc0@oracle.com>
References: <2742b8ba-7fa0-b789-a250-4c9de40e1fc0@oracle.com>
Message-ID: <EE498126-A757-4209-94D8-72DE4D0A5B0F@oracle.com>

Hi Per,

Looks good.

Thanks,
/Erik

> On 7 Oct 2019, at 13:37, Per Liden <per.liden at oracle.com> wrote:
> 
> After JDK-8222480, heap capacity can go down, not just up. The heap logging should take that into account when printing capacity high/low numbers.
> 
> Bug: https://bugs.openjdk.java.net/browse/JDK-8231940
> Webrev: http://cr.openjdk.java.net/~pliden/8231940/webrev.0
> 
> /Per
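
One way to track high/low capacity once capacity can move in both directions
is to keep a running maximum and minimum. The sketch below assumes a
hypothetical CapacityStats class and a per-cycle reset; it is not the ZGC
implementation from the webrev.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical tracker; not ZGC code.
class CapacityStats {
 public:
  explicit CapacityStats(std::size_t capacity)
      : _capacity(capacity), _high(capacity), _low(capacity) {}

  // Called at the start of each GC cycle.
  void reset() { _high = _low = _capacity; }

  // Called whenever capacity changes, in either direction.
  void set_capacity(std::size_t capacity) {
    _capacity = capacity;
    _high = std::max(_high, capacity);
    _low = std::min(_low, capacity);
    assert(_low <= _high);
  }

  std::size_t high() const { return _high; }
  std::size_t low() const { return _low; }

 private:
  std::size_t _capacity;
  std::size_t _high;
  std::size_t _low;
};
```

If capacity only ever grew, the low value would always be the capacity at the
start of the cycle and the high value the capacity at the end; once the heap
can shrink, that shortcut no longer holds and both extremes have to be
recorded explicitly.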


From erik.osterlund at oracle.com  Wed Oct 16 06:56:08 2019
From: erik.osterlund at oracle.com (Erik Osterlund)
Date: Wed, 16 Oct 2019 08:56:08 +0200
Subject: RFR: 8231943: ZGC: Enable serviceability/dcmd/gc/RunGCTest
In-Reply-To: <0a2eee49-9bb4-a1be-f8fc-b2efcc01fd59@oracle.com>
References: <0a2eee49-9bb4-a1be-f8fc-b2efcc01fd59@oracle.com>
Message-ID: <7FC905BD-8F45-4D04-9C2C-C473AB0FA3DD@oracle.com>

+1

/Erik

> On 10 Oct 2019, at 14:28, Per Liden <per.liden at oracle.com> wrote:
> 
> (CC:ing serviceability-dev)
> 
>> On 10/7/19 2:38 PM, Per Liden wrote:
>> This test is currently disabled for ZGC, but it can easily be enabled by adjusting the expected log string. ZGC doesn't print "Pause Full", but it still prints the "(Diagnostic Command)" part.
>> Also, the test enables gc=debug logging, which is unnecessary since this is always printed on the gc=info level.
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8231943
>> Webrev: http://cr.openjdk.java.net/~pliden/8231943/webrev.0
>> Testing: Manually ran test with all GCs (except Epsilon)
>> /Per


From erik.osterlund at oracle.com  Wed Oct 16 07:01:48 2019
From: erik.osterlund at oracle.com (Erik Osterlund)
Date: Wed, 16 Oct 2019 09:01:48 +0200
Subject: RFR: 8232001: ZGC: Ignore metaspace GC threshold until GC is warm
In-Reply-To: <bf25c1c7-c8a1-b7d6-85cd-d5ff96c189a7@oracle.com>
References: <bf25c1c7-c8a1-b7d6-85cd-d5ff96c189a7@oracle.com>
Message-ID: <489ABCE3-B37A-46D9-AC4B-5535957B7DCE@oracle.com>

Hi Per,

Looks good.

/Erik

> On 8 Oct 2019, at 15:03, Per Liden <per.liden at oracle.com> wrote:
> 
> As reported here:
> 
> https://mail.openjdk.java.net/pipermail/zgc-dev/2019-September/000736.html
> 
> The ZDirector heuristics can get off to a bad start if the statistics are contaminated by early "Metaspace GC Threshold" GC requests. To avoid this, we could simply ignore such requests until the GC is warm, at the potential cost of expanding metaspace a bit more during startup.
> 
> Bug: https://bugs.openjdk.java.net/browse/JDK-8232001
> Webrev: http://cr.openjdk.java.net/~pliden/8232001/webrev.0
> 
> /Per
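
The proposal to ignore metaspace-triggered requests until the GC is warm can
be sketched as a simple gate. Director, GCCause and the warmup_cycles
parameter are hypothetical; how ZGC actually decides that it is warm is not
stated in this message.

```cpp
#include <cassert>

// Hypothetical subset of GC causes; not the HotSpot GCCause enum.
enum class GCCause { MetadataGCThreshold, AllocationRate };

// Hypothetical director; not ZDirector.
class Director {
 public:
  explicit Director(unsigned warmup_cycles)
      : _warmup_cycles(warmup_cycles), _completed_cycles(0) {
    assert(warmup_cycles > 0);
  }

  void cycle_completed() { _completed_cycles++; }

  // Assumption: warm means a fixed number of cycles have completed.
  bool is_warm() const { return _completed_cycles >= _warmup_cycles; }

  bool should_start_gc(GCCause cause) const {
    // Early metaspace requests would skew the statistics, so skip them and
    // let metaspace expand a bit more during startup instead.
    if (cause == GCCause::MetadataGCThreshold && !is_warm()) {
      return false;
    }
    return true;
  }

 private:
  const unsigned _warmup_cycles;
  unsigned _completed_cycles;
};
```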


From erik.osterlund at oracle.com  Wed Oct 16 07:19:26 2019
From: erik.osterlund at oracle.com (Erik Osterlund)
Date: Wed, 16 Oct 2019 09:19:26 +0200
Subject: RFR: 8231996: ZGC: Replace ZStatisticsForceTrace with check if
 JFR event is enabled
In-Reply-To: <b3f50d5d-002a-02c2-39d5-d07d4e9a0c27@oracle.com>
References: <b3f50d5d-002a-02c2-39d5-d07d4e9a0c27@oracle.com>
Message-ID: <32019E41-9014-450F-BA62-AB1B71A9B886@oracle.com>

Hi Per,

Looks good.

Thanks,
/Erik

> On 10 Oct 2019, at 12:28, Per Liden <per.liden at oracle.com> wrote:
> 
> Remove and replace the diagnostic flag ZStatisticsForceTrace with a check if JFR event is enabled. This flag was introduced as a safety measure back when sending JFR events was problematic in some contexts. This is no longer the case, so we can just let the default.jfc/profile.jfc control when those events should be sent.
> 
> Bug: https://bugs.openjdk.java.net/browse/JDK-8231996
> Webrev: http://cr.openjdk.java.net/~pliden/8231996/webrev.0
> 
> /Per


From per.liden at oracle.com  Wed Oct 16 07:44:13 2019
From: per.liden at oracle.com (Per Liden)
Date: Wed, 16 Oct 2019 09:44:13 +0200
Subject: RFR: 8231940: ZGC: Print correct low/high capacity
In-Reply-To: <EE498126-A757-4209-94D8-72DE4D0A5B0F@oracle.com>
References: <2742b8ba-7fa0-b789-a250-4c9de40e1fc0@oracle.com>
 <EE498126-A757-4209-94D8-72DE4D0A5B0F@oracle.com>
Message-ID: <80469a8e-da74-dc06-3a0c-7b1c3dbdbd08@oracle.com>

Thanks Erik!

/Per

On 10/16/19 8:51 AM, Erik Osterlund wrote:
> Hi Per,
> 
> Looks good.
> 
> Thanks,
> /Erik
> 
>> On 7 Oct 2019, at 13:37, Per Liden <per.liden at oracle.com> wrote:
>>
>> After JDK-8222480, heap capacity can go down, not just up. The heap logging should take that into account when printing capacity high/low numbers.
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8231940
>> Webrev: http://cr.openjdk.java.net/~pliden/8231940/webrev.0
>>
>> /Per
> 


From per.liden at oracle.com  Wed Oct 16 07:44:33 2019
From: per.liden at oracle.com (Per Liden)
Date: Wed, 16 Oct 2019 09:44:33 +0200
Subject: RFR: 8232001: ZGC: Ignore metaspace GC threshold until GC is warm
In-Reply-To: <489ABCE3-B37A-46D9-AC4B-5535957B7DCE@oracle.com>
References: <bf25c1c7-c8a1-b7d6-85cd-d5ff96c189a7@oracle.com>
 <489ABCE3-B37A-46D9-AC4B-5535957B7DCE@oracle.com>
Message-ID: <9e0cc3fe-3430-2503-295b-da3831ce7121@oracle.com>

Thanks Erik!

/Per

On 10/16/19 9:01 AM, Erik Osterlund wrote:
> Hi Per,
> 
> Looks good.
> 
> /Erik
> 
>> On 8 Oct 2019, at 15:03, Per Liden <per.liden at oracle.com> wrote:
>>
>> As reported here:
>>
>> https://mail.openjdk.java.net/pipermail/zgc-dev/2019-September/000736.html
>>
>> The ZDirector heuristics can get off to a bad start if the statistics are contaminated by early "Metaspace GC Threshold" GC requests. To avoid this, we could simply ignore such requests until the GC is warm, at the potential cost of expanding metaspace a bit more during startup.
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8232001
>> Webrev: http://cr.openjdk.java.net/~pliden/8232001/webrev.0
>>
>> /Per
> 


From per.liden at oracle.com  Wed Oct 16 07:44:43 2019
From: per.liden at oracle.com (Per Liden)
Date: Wed, 16 Oct 2019 09:44:43 +0200
Subject: RFR: 8231996: ZGC: Replace ZStatisticsForceTrace with check if
 JFR event is enabled
In-Reply-To: <32019E41-9014-450F-BA62-AB1B71A9B886@oracle.com>
References: <b3f50d5d-002a-02c2-39d5-d07d4e9a0c27@oracle.com>
 <32019E41-9014-450F-BA62-AB1B71A9B886@oracle.com>
Message-ID: <17b35be2-a22f-f610-64ac-5c409890b6c5@oracle.com>

Thanks Erik!

/Per

On 10/16/19 9:19 AM, Erik Osterlund wrote:
> Hi Per,
> 
> Looks good.
> 
> Thanks,
> /Erik
> 
>> On 10 Oct 2019, at 12:28, Per Liden <per.liden at oracle.com> wrote:
>>
>> Remove and replace the diagnostic flag ZStatisticsForceTrace with a check if JFR event is enabled. This flag was introduced as a safety measure back when sending JFR events was problematic in some contexts. This is no longer the case, so we can just let the default.jfc/profile.jfc control when those events should be sent.
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8231996
>> Webrev: http://cr.openjdk.java.net/~pliden/8231996/webrev.0
>>
>> /Per
> 


From per.liden at oracle.com  Wed Oct 16 07:44:21 2019
From: per.liden at oracle.com (Per Liden)
Date: Wed, 16 Oct 2019 09:44:21 +0200
Subject: RFR: 8231943: ZGC: Enable serviceability/dcmd/gc/RunGCTest
In-Reply-To: <7FC905BD-8F45-4D04-9C2C-C473AB0FA3DD@oracle.com>
References: <0a2eee49-9bb4-a1be-f8fc-b2efcc01fd59@oracle.com>
 <7FC905BD-8F45-4D04-9C2C-C473AB0FA3DD@oracle.com>
Message-ID: <e431baec-8e9d-42e4-e552-26789e82901f@oracle.com>

Thanks Erik!

/Per

On 10/16/19 8:56 AM, Erik Osterlund wrote:
> +1
> 
> /Erik
> 
>> On 10 Oct 2019, at 14:28, Per Liden <per.liden at oracle.com> wrote:
>>
>> (CC:ing serviceability-dev)
>>
>>> On 10/7/19 2:38 PM, Per Liden wrote:
>>> This test is currently disabled for ZGC, but it can easily be enabled by adjusting the expected log string. ZGC doesn't print "Pause Full", but it still prints the "(Diagnostic Command)" part.
>>> Also, the test enables gc=debug logging, which is unnecessary since this is always printed on the gc=info level.
>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8231943
>>> Webrev: http://cr.openjdk.java.net/~pliden/8231943/webrev.0
>>> Testing: Manually ran test with all GCs (except Epsilon)
>>> /Per
> 


From thomas.schatzl at oracle.com  Wed Oct 16 08:07:03 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Wed, 16 Oct 2019 10:07:03 +0200
Subject: RFR: 8231943: ZGC: Enable serviceability/dcmd/gc/RunGCTest
In-Reply-To: <ebe51564-a8c5-1255-173f-e2038d5bd602@oracle.com>
References: <ebe51564-a8c5-1255-173f-e2038d5bd602@oracle.com>
Message-ID: <54d7c6e1-097d-0e70-0c6a-8fa12f788d74@oracle.com>

Hi,

On 07.10.19 14:38, Per Liden wrote:
> This test is currently disabled for ZGC, but it can easily be enabled by 
> adjusting the expected log string. ZGC doesn't print "Pause Full", but 
> it still prints the "(Diagnostic Command)" part.
> 
I am not sure that checking only for that string satisfies the 
requirements of the test. This is a test to verify that jcmd executes 
(or starts) a GC. I do not think checking for "(Diagnostic Command)" is 
enough: any diagnostic command could have been executed.

What does ZGC print here? Can the check be made more specific?

> Also, the test enables gc=debug logging, which is unnecessary since this 
> is always printed on the gc=info level.
> 
> Bug: https://bugs.openjdk.java.net/browse/JDK-8231943
> Webrev: http://cr.openjdk.java.net/~pliden/8231943/webrev.0
> 

Thanks,
   Thomas


From per.liden at oracle.com  Wed Oct 16 08:41:57 2019
From: per.liden at oracle.com (Per Liden)
Date: Wed, 16 Oct 2019 10:41:57 +0200
Subject: RFR: 8231552: ZGC: Refine address space reservation
In-Reply-To: <5015ca7b-3e3e-b2bd-c3f8-0a83ecdb41d8@oracle.com>
References: <5015ca7b-3e3e-b2bd-c3f8-0a83ecdb41d8@oracle.com>
Message-ID: <c412fdf3-8f74-390e-6c6d-0d6df4e273f5@oracle.com>

Latest version of this patch, rebased on today's jdk/jdk:

http://cr.openjdk.java.net/~pliden/8231552/webrev.2

/Per

On 10/3/19 11:45 AM, Per Liden wrote:
> We could be slightly more sophisticated and do a better job reserving 
> address space in situations where parts of the address space are already 
> occupied or when the process is running with address space limitations.
> 
> Bug: https://bugs.openjdk.java.net/browse/JDK-8231552
> Webrev: http://cr.openjdk.java.net/~pliden/8231552/webrev.0
> 
> /Per


From per.liden at oracle.com  Wed Oct 16 10:27:32 2019
From: per.liden at oracle.com (Per Liden)
Date: Wed, 16 Oct 2019 12:27:32 +0200
Subject: RFR: 8231943: ZGC: Enable serviceability/dcmd/gc/RunGCTest
In-Reply-To: <54d7c6e1-097d-0e70-0c6a-8fa12f788d74@oracle.com>
References: <ebe51564-a8c5-1255-173f-e2038d5bd602@oracle.com>
 <54d7c6e1-097d-0e70-0c6a-8fa12f788d74@oracle.com>
Message-ID: <674868d8-b391-9e86-698b-1e510b68dc36@oracle.com>

Hi Thomas,

On 10/16/19 10:07 AM, Thomas Schatzl wrote:
> Hi,
> 
> On 07.10.19 14:38, Per Liden wrote:
>> This test is currently disabled for ZGC, but it can easily be enabled 
>> by adjusting the expected log string. ZGC doesn't print "Pause Full", 
>> but it still prints the "(Diagnostic Command)" part.
>>
> Not sure if that checking only for that satisfies the requirements of 
> the test. I mean that this is a test to verify that jcmd executes (or 
> starts) a GC. I do not think checking for "(Diagnostic Command)" is 
> enough - it could be any diagnostic command that could be executed.

I don't think that's quite true, since the file we're grepping in is the 
GC log (not stdout), which we know only contains stuff from gc=info. So, 
only if the GC itself is printing "(Diagnostic Command)" on gc=info 
level somewhere else is this a problem, which I would find somewhat 
surprising, no?

> 
> What does ZGC print here? Can the check be made more specific?

"Garbage Collection (Diagnostic Command)"

I opted to search for just "(Diagnostic Command)" mainly to keep the 
test GC agnostic. I don't have a strong opinion, but I don't believe 
making more specific greps will make the test more robust in practice, 
for the reason described above.
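
To make the argument concrete, here is a minimal sketch of the kind of check being discussed. It is written in C++ purely for illustration: the actual test is Java, and the helper name `gc_log_shows_dcmd_gc` is hypothetical and not taken from the webrev. It scans GC log lines for the GC-agnostic marker rather than for a collector-specific prefix:

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not part of the webrev): returns true if any line
// of the GC log carries the GC-agnostic "(Diagnostic Command)" marker.
bool gc_log_shows_dcmd_gc(const std::vector<std::string>& gc_log_lines) {
  const std::string marker = "(Diagnostic Command)";
  for (const std::string& line : gc_log_lines) {
    // A substring match is enough here because the input is assumed to be
    // the GC log only, not general stdout.
    if (line.find(marker) != std::string::npos) {
      return true;
    }
  }
  return false;
}
```

Under this sketch, a ZGC line such as "Garbage Collection (Diagnostic Command)" matches, and so would a line from a collector that additionally prints "Pause Full" in front of the marker.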

cheers,
Per

> 
>> Also, the test enables gc=debug logging, which is unnecessary since 
>> this is always printed on the gc=info level.
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8231943
>> Webrev: http://cr.openjdk.java.net/~pliden/8231943/webrev.0
>>
> 
> Thanks,
>    Thomas


From thomas.schatzl at oracle.com  Wed Oct 16 11:13:32 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Wed, 16 Oct 2019 13:13:32 +0200
Subject: RFR: 8231943: ZGC: Enable serviceability/dcmd/gc/RunGCTest
In-Reply-To: <674868d8-b391-9e86-698b-1e510b68dc36@oracle.com>
References: <ebe51564-a8c5-1255-173f-e2038d5bd602@oracle.com>
 <54d7c6e1-097d-0e70-0c6a-8fa12f788d74@oracle.com>
 <674868d8-b391-9e86-698b-1e510b68dc36@oracle.com>
Message-ID: <2fed46aa-f281-94f1-4aec-c5e45aed8fbc@oracle.com>

Hi,

On 16.10.19 12:27, Per Liden wrote:
> Hi Thomas,
> 
> On 10/16/19 10:07 AM, Thomas Schatzl wrote:
>> Hi,
>>
>> On 07.10.19 14:38, Per Liden wrote:
>>> This test is currently disabled for ZGC, but it can easily be enabled 
>>> by adjusting the expected log string. ZGC doesn't print "Pause Full", 
>>> but it still prints the "(Diagnostic Command)" part.
>>>
>> Not sure if that checking only for that satisfies the requirements of 
>> the test. I mean that this is a test to verify that jcmd executes (or 
>> starts) a GC. I do not think checking for "(Diagnostic Command)" is 
>> enough - it could be any diagnostic command that could be executed.
> 
> I don't think that's quite true, since the file we're grepping in is the 
> GC log (not stdout), which we know only contains stuff from gc=info. So, 
> only if the GC itself is printing "(Diagnostic Command)" on gc=info 
> level somewhere else is this a problem, which I would find somewhat 
> surprising, no?

Okay, ship it then :) Thanks for the clarification.

Thomas


From stefan.johansson at oracle.com  Wed Oct 16 12:55:01 2019
From: stefan.johansson at oracle.com (Stefan Johansson)
Date: Wed, 16 Oct 2019 14:55:01 +0200
Subject: RFR(XL): 8220310: Implementation: NUMA-Aware Memory Allocation
 for G1, Mutator (1/3)
In-Reply-To: <f348053c-ef9b-df03-ae17-393ace99182b@oracle.com>
References: <e4c60a5c-cb08-004a-ce77-e4d20d4d6891@oracle.com>
 <06ACBF87-ADBE-499F-B668-0274E4925B26@oracle.com>
 <2b37edd6-3e0f-013d-1616-9d003f8ac1ed@oracle.com>
 <74ACAF31-8233-482A-892E-0D2E7CA72F4F@oracle.com>
 <4afe9f43-4cfa-9384-f45f-f985399629dd@oracle.com>
 <CD6BDB8C-7777-4872-BCC0-CDFB4978F876@oracle.com>
 <d8758f38-4818-e6b7-c158-118974f0ff1c@oracle.com>
 <CB368332-D86B-40D2-B152-43B726938DD2@oracle.com>
 <77f6c57a-65a6-2727-cbe9-fbc1ed52a015@oracle.com>
 <b3b70e9d-5be9-b069-b631-5733f157c9eb@oracle.com>
 <7C1985BF-A769-49FB-A658-E1B1060B5897@oracle.com>
 <3F549477-A2DF-42CF-A0E5-586F78BBCC47@oracle.com>
 <f348053c-ef9b-df03-ae17-393ace99182b@oracle.com>
Message-ID: <e949efa7-054a-dc5c-3bb5-c89328b34993@oracle.com>

Hi Sangheon,

On 2019-10-15 16:33, sangheon.kim at oracle.com wrote:
> Hi all,
> 
> Here's revised webrev which addresses:
> 1) G1RegionToSpaceMapper checks mtJavaHeap and then conditionally calls 
> G1NUMA::request_memory_on_node() (Kim)
> 2) The signature of G1NUMA::request_memory_on_node(void* address, ,) is 
> changed to have actual address instead of page index. (Stefan)
> 3) Some local variable name changes at G1RegionToSpaceMapper. i -> 
> region_idx, idx -> page_idx (for local style, used idx instead of index)
> 
> webrev:
> http://cr.openjdk.java.net/~sangheki/8220310/webrev.5/
> http://cr.openjdk.java.net/~sangheki/8220310/webrev.5.inc/

This looks good!

Thanks for all your hard work,
Stefan

> Testing: hs-tier 1 ~ 5, with/without UseNUMA
> 
> Thanks,
> Sangheon
> 
> 
> On 10/14/19 3:20 PM, Kim Barrett wrote:
>>> On Oct 14, 2019, at 5:03 PM, Kim Barrett <kim.barrett at oracle.com> wrote:
>>>> 2. Add a state to the mappers to say if they are NUMA aware or not, 
>>>> and currently only the heap mapper should be NUMA aware. We could 
>>>> either set this state to true using the mtJavaHeap type as we have 
>>>> checked before or add an explicit setter that we only call for the 
>>>> heap mapper.
>>>>
>>>> I know that only doing 2) will fix the current problem, but I think 
>>>> it would be nice to avoid having the base address in G1NUMA, thoughts?
>>> I don't understand the point about mappers needing to know if they are
>>> NUMA or not. request_memory_on_node is only called by the two relevant
>>> region->space mappers, with the memory involved always in the Java
>>> heap (after fixing the units mismatch mentioned above). That is,
>>> G1NUMA::request_memory_on_node should only be called for Java heap
>>> memory. (It might be able to assert is_in_reserved or something like
>>> that, though initialization order might prevent that.)
>> I was confused here too.  Sangheon has repaired my confusion, and he's
>> got another change in the works to tidy things up here in a way that I
>> think will make both me and Stefan happy.
>>
> 


From per.liden at oracle.com  Wed Oct 16 13:04:13 2019
From: per.liden at oracle.com (Per Liden)
Date: Wed, 16 Oct 2019 15:04:13 +0200
Subject: RFR: 8231943: ZGC: Enable serviceability/dcmd/gc/RunGCTest
In-Reply-To: <2fed46aa-f281-94f1-4aec-c5e45aed8fbc@oracle.com>
References: <ebe51564-a8c5-1255-173f-e2038d5bd602@oracle.com>
 <54d7c6e1-097d-0e70-0c6a-8fa12f788d74@oracle.com>
 <674868d8-b391-9e86-698b-1e510b68dc36@oracle.com>
 <2fed46aa-f281-94f1-4aec-c5e45aed8fbc@oracle.com>
Message-ID: <6230542f-d42f-0672-a454-7cf65123e35e@oracle.com>


On 10/16/19 1:13 PM, Thomas Schatzl wrote:
> Hi,
> 
> On 16.10.19 12:27, Per Liden wrote:
>> Hi Thomas,
>>
>> On 10/16/19 10:07 AM, Thomas Schatzl wrote:
>>> Hi,
>>>
>>> On 07.10.19 14:38, Per Liden wrote:
>>>> This test is currently disabled for ZGC, but it can easily be 
>>>> enabled by adjusting the expected log string. ZGC doesn't print 
>>>> "Pause Full", but it still prints the "(Diagnostic Command)" part.
>>>>
>>> Not sure if that checking only for that satisfies the requirements of 
>>> the test. I mean that this is a test to verify that jcmd executes (or 
>>> starts) a GC. I do not think checking for "(Diagnostic Command)" is 
>>> enough - it could be any diagnostic command that could be executed.
>>
>> I don't think that's quite true, since the file we're grepping in is 
>> the GC log (not stdout), which we know only contains stuff from 
>> gc=info. So, only if the GC itself is printing "(Diagnostic Command)" 
>> on gc=info level somewhere else is this a problem, which I would find 
>> somewhat surprising, no?
> 
> Okay, ship it then :) Thanks for the clarification.

Ok, thanks for reviewing, Thomas!

/Per


From kim.barrett at oracle.com  Wed Oct 16 14:00:35 2019
From: kim.barrett at oracle.com (Kim Barrett)
Date: Wed, 16 Oct 2019 10:00:35 -0400
Subject: RFR(XL): 8220310: Implementation: NUMA-Aware Memory Allocation
 for G1, Mutator (1/3)
In-Reply-To: <f348053c-ef9b-df03-ae17-393ace99182b@oracle.com>
References: <e4c60a5c-cb08-004a-ce77-e4d20d4d6891@oracle.com>
 <06ACBF87-ADBE-499F-B668-0274E4925B26@oracle.com>
 <2b37edd6-3e0f-013d-1616-9d003f8ac1ed@oracle.com>
 <74ACAF31-8233-482A-892E-0D2E7CA72F4F@oracle.com>
 <4afe9f43-4cfa-9384-f45f-f985399629dd@oracle.com>
 <CD6BDB8C-7777-4872-BCC0-CDFB4978F876@oracle.com>
 <d8758f38-4818-e6b7-c158-118974f0ff1c@oracle.com>
 <CB368332-D86B-40D2-B152-43B726938DD2@oracle.com>
 <77f6c57a-65a6-2727-cbe9-fbc1ed52a015@oracle.com>
 <b3b70e9d-5be9-b069-b631-5733f157c9eb@oracle.com>
 <7C1985BF-A769-49FB-A658-E1B1060B5897@oracle.com>
 <3F549477-A2DF-42CF-A0E5-586F78BBCC47@oracle.com>
 <f348053c-ef9b-df03-ae17-393ace99182b@oracle.com>
Message-ID: <AB6E68AA-F9AC-4E62-9CF5-1886C469A702@oracle.com>

> On Oct 15, 2019, at 10:33 AM, sangheon.kim at oracle.com wrote:
> 
> Hi all,
> 
> Here's revised webrev which addresses:
> 1) G1RegionToSpaceMapper checks mtJavaHeap and then conditionally calls G1NUMA::request_memory_on_node() (Kim)
> 2) The signature of G1NUMA::request_memory_on_node(void* address, ,) is changed to have actual address instead of page index. (Stefan)
> 3) Some local variable name changes at G1RegionToSpaceMapper. i -> region_idx, idx -> page_idx (for local style, used idx instead of index)
> 
> webrev:
> http://cr.openjdk.java.net/~sangheki/8220310/webrev.5/
> http://cr.openjdk.java.net/~sangheki/8220310/webrev.5.inc/
> Testing: hs-tier 1 ~ 5, with/without UseNUMA

Looks good.

In g1PageBasedVirtualSpace.cpp, could the newly added definition of page_size()
be moved to be near the existing definition of page_start()?  I don't need a new
webrev if you move it.


From thomas.schatzl at oracle.com  Wed Oct 16 14:05:45 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Wed, 16 Oct 2019 16:05:45 +0200
Subject: RFR(XL): 8220310: Implementation: NUMA-Aware Memory Allocation
 for G1, Mutator (1/3)
In-Reply-To: <f348053c-ef9b-df03-ae17-393ace99182b@oracle.com>
References: <e4c60a5c-cb08-004a-ce77-e4d20d4d6891@oracle.com>
 <06ACBF87-ADBE-499F-B668-0274E4925B26@oracle.com>
 <2b37edd6-3e0f-013d-1616-9d003f8ac1ed@oracle.com>
 <74ACAF31-8233-482A-892E-0D2E7CA72F4F@oracle.com>
 <4afe9f43-4cfa-9384-f45f-f985399629dd@oracle.com>
 <CD6BDB8C-7777-4872-BCC0-CDFB4978F876@oracle.com>
 <d8758f38-4818-e6b7-c158-118974f0ff1c@oracle.com>
 <CB368332-D86B-40D2-B152-43B726938DD2@oracle.com>
 <77f6c57a-65a6-2727-cbe9-fbc1ed52a015@oracle.com>
 <b3b70e9d-5be9-b069-b631-5733f157c9eb@oracle.com>
 <7C1985BF-A769-49FB-A658-E1B1060B5897@oracle.com>
 <3F549477-A2DF-42CF-A0E5-586F78BBCC47@oracle.com>
 <f348053c-ef9b-df03-ae17-393ace99182b@oracle.com>
Message-ID: <0cfc451b-292c-2ea1-f275-08b186c1e044@oracle.com>

Hi,

On 15.10.19 16:33, sangheon.kim at oracle.com wrote:
> Hi all,
> 
> Here's revised webrev which addresses:
> 1) G1RegionToSpaceMapper checks mtJavaHeap and then conditionally calls 
> G1NUMA::request_memory_on_node() (Kim)
> 2) The signature of G1NUMA::request_memory_on_node(void* address, ,) is 
> changed to have actual address instead of page index. (Stefan)
> 3) Some local variable name changes at G1RegionToSpaceMapper. i -> 
> region_idx, idx -> page_idx (for local style, used idx instead of index)
> 
> webrev:
> http://cr.openjdk.java.net/~sangheki/8220310/webrev.5/
> http://cr.openjdk.java.net/~sangheki/8220310/webrev.5.inc/
> Testing: hs-tier 1 ~ 5, with/without UseNUMA
> 

   looks good.

Thomas


From zgu at redhat.com  Wed Oct 16 14:44:13 2019
From: zgu at redhat.com (Zhengyu Gu)
Date: Wed, 16 Oct 2019 10:44:13 -0400
Subject: RFR 8231999: Shenandoah: Traversal failed
 compiler/jsr292/CallSiteDepContextTest.java
Message-ID: <1721621f-a5df-58fc-e007-8d0bf713afa1@redhat.com>

This patch partially reverts JDK-8231293's fix, because it hides dead 
oops from GC, by returning NULL, which causes the failure of this test case.

The root cause of JDK-8231293 is that Traversal deactivates the SATB 
barrier too late; it should be turned off before weak root processing.

Bug: https://bugs.openjdk.java.net/browse/JDK-8231999
Webrev: http://cr.openjdk.java.net/~zgu/JDK-8231999/webrev.00/

Test:
   hotspot_gc_shenandoah (fastdebug and release) on Linux

After this fix, the CallSiteDepContextTest.java test hangs in traversal 
mode, but that is a separate issue, tracked by JDK-8232380.

Thanks,

-Zhengyu


From rkennke at redhat.com  Wed Oct 16 15:25:08 2019
From: rkennke at redhat.com (Roman Kennke)
Date: Wed, 16 Oct 2019 17:25:08 +0200
Subject: RFR 8231999: Shenandoah: Traversal failed
 compiler/jsr292/CallSiteDepContextTest.java
In-Reply-To: <1721621f-a5df-58fc-e007-8d0bf713afa1@redhat.com>
References: <1721621f-a5df-58fc-e007-8d0bf713afa1@redhat.com>
Message-ID: <f772b550-286f-ac83-13db-dcf35e6d6ff3@redhat.com>

So for traversal, it falls to the normal LRB, is that what you intended?

Roman

> This patch partially reverts JDK-8231293's fix, because it hides dead
> oops from GC, by returning NULL, which causes the failure of this test
> case.
> 
> The root cause of JDK-8231293 is that, Traversal deactivates SATB
> barrier too late, it should be turned off before weak root processing.
> 
> Bug: https://bugs.openjdk.java.net/browse/JDK-8231999
> Webrev: http://cr.openjdk.java.net/~zgu/JDK-8231999/webrev.00/
> 
> Test:
>   hotspot_gc_shenandoah (fastdebug and release) on Linux
> 
> After this fix, CallSiteDepContextTest.java test hangs in traversal
> mode, but it is separate issue, tracked by JDK-8232380.
> 
> Thanks,
> 
> -Zhengyu
> 
> 
> 


From zgu at redhat.com  Wed Oct 16 15:40:33 2019
From: zgu at redhat.com (Zhengyu Gu)
Date: Wed, 16 Oct 2019 11:40:33 -0400
Subject: RFR 8231999: Shenandoah: Traversal failed
 compiler/jsr292/CallSiteDepContextTest.java
In-Reply-To: <f772b550-286f-ac83-13db-dcf35e6d6ff3@redhat.com>
References: <1721621f-a5df-58fc-e007-8d0bf713afa1@redhat.com>
 <f772b550-286f-ac83-13db-dcf35e6d6ff3@redhat.com>
Message-ID: <86750e0f-d71a-0ad6-26e9-866e6f022686@redhat.com>


On 10/16/19 11:25 AM, Roman Kennke wrote:
> So for traversal, it falls to the normal LRB, is that what you intended?

It is always the case, isn't it?


-Zhengyu

> 
> Roman
> 
>> This patch partially reverts JDK-8231293's fix, because it hides dead
>> oops from GC, by returning NULL, which causes the failure of this test
>> case.
>>
>> The root cause of JDK-8231293 is that, Traversal deactivates SATB
>> barrier too late, it should be turned off before weak root processing.
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8231999
>> Webrev: http://cr.openjdk.java.net/~zgu/JDK-8231999/webrev.00/
>>
>> Test:
>>   hotspot_gc_shenandoah (fastdebug and release) on Linux
>>
>> After this fix, CallSiteDepContextTest.java test hangs in traversal
>> mode, but it is separate issue, tracked by JDK-8232380.
>>
>> Thanks,
>>
>> -Zhengyu
>>
>>
>>
> 


From rkennke at redhat.com  Wed Oct 16 15:47:17 2019
From: rkennke at redhat.com (Roman Kennke)
Date: Wed, 16 Oct 2019 17:47:17 +0200
Subject: RFR 8231999: Shenandoah: Traversal failed
 compiler/jsr292/CallSiteDepContextTest.java
In-Reply-To: <86750e0f-d71a-0ad6-26e9-866e6f022686@redhat.com>
References: <1721621f-a5df-58fc-e007-8d0bf713afa1@redhat.com>
 <f772b550-286f-ac83-13db-dcf35e6d6ff3@redhat.com>
 <86750e0f-d71a-0ad6-26e9-866e6f022686@redhat.com>
Message-ID: <fe52bc24-b465-20b2-7ccd-d187cb6e6113@redhat.com>

>> So for traversal, it falls to the normal LRB, is that what you intended?
> 
> It is always the case, isn't it?

Yeah sure, just wanted to check if that is what you intended.

The patch is ok. Ideally, the code that calls into GC path wouldn't go
through the barrier to begin with, though. Can you keep a record of
which code path does that?

Roman

> 
> 
> -Zhengyu
> 
>>
>> Roman
>>
>>> This patch partially reverts JDK-8231293's fix, because it hides dead
>>> oops from GC, by returning NULL, which causes the failure of this test
>>> case.
>>>
>>> The root cause of JDK-8231293 is that, Traversal deactivates SATB
>>> barrier too late, it should be turned off before weak root processing.
>>>
>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8231999
>>> Webrev: http://cr.openjdk.java.net/~zgu/JDK-8231999/webrev.00/
>>>
>>> Test:
>>>   hotspot_gc_shenandoah (fastdebug and release) on Linux
>>>
>>> After this fix, CallSiteDepContextTest.java test hangs in traversal
>>> mode, but it is separate issue, tracked by JDK-8232380.
>>>
>>> Thanks,
>>>
>>> -Zhengyu
>>>
>>>
>>>
>>


From sangheon.kim at oracle.com  Wed Oct 16 17:54:02 2019
From: sangheon.kim at oracle.com (sangheon.kim at oracle.com)
Date: Wed, 16 Oct 2019 10:54:02 -0700
Subject: RFR(XL): 8220310: Implementation: NUMA-Aware Memory Allocation
 for G1, Mutator (1/3)
In-Reply-To: <AB6E68AA-F9AC-4E62-9CF5-1886C469A702@oracle.com>
References: <e4c60a5c-cb08-004a-ce77-e4d20d4d6891@oracle.com>
 <06ACBF87-ADBE-499F-B668-0274E4925B26@oracle.com>
 <2b37edd6-3e0f-013d-1616-9d003f8ac1ed@oracle.com>
 <74ACAF31-8233-482A-892E-0D2E7CA72F4F@oracle.com>
 <4afe9f43-4cfa-9384-f45f-f985399629dd@oracle.com>
 <CD6BDB8C-7777-4872-BCC0-CDFB4978F876@oracle.com>
 <d8758f38-4818-e6b7-c158-118974f0ff1c@oracle.com>
 <CB368332-D86B-40D2-B152-43B726938DD2@oracle.com>
 <77f6c57a-65a6-2727-cbe9-fbc1ed52a015@oracle.com>
 <b3b70e9d-5be9-b069-b631-5733f157c9eb@oracle.com>
 <7C1985BF-A769-49FB-A658-E1B1060B5897@oracle.com>
 <3F549477-A2DF-42CF-A0E5-586F78BBCC47@oracle.com>
 <f348053c-ef9b-df03-ae17-393ace99182b@oracle.com>
 <AB6E68AA-F9AC-4E62-9CF5-1886C469A702@oracle.com>
Message-ID: <9219a118-0c1d-2cee-10e5-f9bb87c72eb9@oracle.com>

Hi Kim, Stefan and Thomas,

Many thanks for the reviews and suggestions!

Kim,
I will move page_size() near page_start() before push as you suggested.
As you know, all 3 patches will be pushed together though.

Thanks,
Sangheon


On 10/16/19 7:00 AM, Kim Barrett wrote:
>> On Oct 15, 2019, at 10:33 AM, sangheon.kim at oracle.com wrote:
>>
>> Hi all,
>>
>> Here's revised webrev which addresses:
>> 1) G1RegionToSpaceMapper checks mtJavaHeap and then conditionally calls G1NUMA::request_memory_on_node() (Kim)
>> 2) The signature of G1NUMA::request_memory_on_node(void* address, ,) is changed to have actual address instead of page index. (Stefan)
>> 3) Some local variable name changes at G1RegionToSpaceMapper. i -> region_idx, idx -> page_idx (for local style, used idx instead of index)
>>
>> webrev:
>> http://cr.openjdk.java.net/~sangheki/8220310/webrev.5/
>> http://cr.openjdk.java.net/~sangheki/8220310/webrev.5.inc/
>> Testing: hs-tier 1 ~ 5, with/without UseNUMA
> Looks good.
>
> In g1PageBasedVirtualSpace.cpp, could the newly added definition of page_size()
> be moved to be near the existing definition of page_start()?  I don't need a new
> webrev if you move it.
>


From sangheon.kim at oracle.com  Wed Oct 16 18:02:50 2019
From: sangheon.kim at oracle.com (sangheon.kim at oracle.com)
Date: Wed, 16 Oct 2019 11:02:50 -0700
Subject: RFR(S): 8215893: Add better abstraction for pinning G1 concurrent
 marking bitmaps.
In-Reply-To: <F89640DCD01A85489FCBA68183A6A0F3CB57D45D@ORSMSX116.amr.corp.intel.com>
References: <F89640DCD01A85489FCBA68183A6A0F3CB569D68@ORSMSX116.amr.corp.intel.com>
 <F89640DCD01A85489FCBA68183A6A0F3CB56A055@ORSMSX116.amr.corp.intel.com>
 <b6b879d9-fc88-c719-d939-6d64070ae13f@oracle.com>
 <F89640DCD01A85489FCBA68183A6A0F3CB57D45D@ORSMSX116.amr.corp.intel.com>
Message-ID: <f9e8443e-9f4a-510f-4ef4-b7356114e929@oracle.com>

Hi Kishor,

Before reviewing webrev.02, could you remind us what the motivation was 
for pinning the bitmap mappers here?
In addition to an explanation of the problematic situation, any logs or 
stack traces may also help.

We think the root cause should be understood first.

Thanks,
Sangheon


On 10/15/19 6:23 PM, Kharbas, Kishor wrote:
>
> Thank you for the suggestions.
>
> In this webrev I added a flag to ReservedSpace constructors to direct 
> it to pin the memory space. So now G1PageBasedVirtualSpace does not 
> have to do special handling.
>
> http://cr.openjdk.java.net/~kkharbas/8215893/webrev.02/
>
> To add more to Sangheon's reply to Stefan's question,
>
> > Another thing, can you remind me why we need the bitmaps to be pinned
> > but not other structures such as the card table?
>
> When I implemented this feature I had run into an issue with the default 
> implementation of concurrent marking bitmaps.
>
> Thanks,
>
> Kishor
>
> From: sangheon.kim at oracle.com [mailto:sangheon.kim at oracle.com]
> Sent: Wednesday, October 9, 2019 2:42 PM
> To: Kharbas, Kishor <kishor.kharbas at intel.com>; 
> hotspot-gc-dev at openjdk.java.net
> Cc: Stefan Johansson <stefan.johansson at oracle.com>
> Subject: Re: RFR(S): 8215893: Add better abstraction for pinning G1 
> concurrent marking bitmaps.
>
> Hi Kishor,
>
> On 10/4/19 4:15 PM, Kharbas, Kishor wrote:
>
>     Hi Stefan,
>
>     Thanks for the review. Some comments inline.
>
>     New webrev:
>     http://cr.openjdk.java.net/~kkharbas/8215893/webrev.00_to_01/
>     http://cr.openjdk.java.net/~kkharbas/8215893/webrev.01/
>
> I am reviewing the patch but have a question on top of Stefan's 
> question[1].
> Why are the bitmap mappers committed? I think all the trouble started from 
> 'committing but treating as special' here. Couldn't we just treat the 
> bitmap mappers as 'special' without committing?
> If 'not committing' is doable, couldn't we simply create the ReservedSpace 
> with 'special' enabled (independent of the large page setting, which is 
> the same as Stefan's comment)? Or add a PinnedReservedSpace to force 
> 'special' enabled.
>
> [1]: Another thing, can you remind me why we need the bitmaps to be 
> pinned but not other structures such as the card table?
>
> +HeterogeneousHeapRegionManager::initialize()
> ...
> +  // We commit bitmap for all regions during initialization and mark the bitmap space as special.
> +  // This allows regions to be un-committed while concurrent-marking threads are accessing the bitmap concurrently.
>
> Thanks,
> Sangheon
>
>
>
>     > Hi Kishor,
>     >
>     > On 04.10.19 03:00, Kharbas, Kishor wrote:
>     >> Hi,
>     >> When I worked on JDK-8211425
>     >> <https://bugs.openjdk.java.net/browse/JDK-8211425>, there was a
>     >> request for better abstraction for pinning G1's CM bitmaps. RFE
>     >> for the request is here - JDK-8215893
>     >> <https://bugs.openjdk.java.net/browse/JDK-8215893>.
>     >>
>     >> Here is a proposal:
>     >> http://cr.openjdk.java.net/~kkharbas/8215893/webrev.00/
>     >>
>     >> Here G1PageBasedVirtualSpace pins the entire reserved memory to
>     >> memory during construction. The constructor takes an additional
>     >> bool flag which says "does it need to pin the memory".
>     >> If the memory is pinned, '_special' flag is set to true. I
>     >> piggy back on _special flag's behavior which is to not do actual
>     >> OS (un-)commits on calls to (un)commit().
>     >> Rest of the changes is the mechanism to pass this flag from CM
>     >> bitmaps creation in G1CollectedHeap all the way to
>     >> G1PageBasedVirtualSpace.
>     >>
>     >> Let me know if this is a good abstraction and if there is any
>     >> better way.
>     >>
>     >> Thanks
>     >> Kishor
>     >>
>     >
>     > Some comments:
>     >
>     > - in the parameter lists, if the parameters are already laid out
>     > line-by-line, if adding a new one, please put it on a new line
>     > as well.
>     >
>     Fixed in the new webrev.
>
>     > - this code
>     >
>     >    if (_special) {
>     >      if (!rs.special()) {
>     >        commit_internal(addr_to_page_index(_low_boundary),
>     >                        addr_to_page_index(_high_boundary));
>     >      }
>     >
>     > in g1PageBasedVirtualSpace looks very incomprehensible.  :)
>     >
>     > I would prefer (pending the second reviewer's comment) to either
>     > use the "pinned" flag here, or even better, move the necessary
>     > commit calls into the (now removed)
>     > HeterogeneousHeapRegionManager::initialize().
>     >
>     Made it little more comprehensible. Will see what other reviewers
>     think about moving it somewhere else.
>
>     > - I would just purely from feeling prefer if the "pinned" flag
>     > parameter would be listed after the "type" parameter in the
>     > G1RegionToSpaceMapper.
>     > But that's probably just me.
>     >
>     I did it this way to logically group the parameters. MemTracker is
>     a tracker used by the VM everywhere and does not pertain to this
>     class as such, so I kept it in the end.
>
>     > Also, finally one parameter per line for the
>     > declaration/definition of the constructor would improve
>     > readability.
>     >
>     Done.
>
>     Thank you,
>     Kishor
>
>     > Thanks,
>     >    Thomas
>


From manc at google.com  Wed Oct 16 18:16:46 2019
From: manc at google.com (Man Cao)
Date: Wed, 16 Oct 2019 11:16:46 -0700
Subject: RFR(S): 8232232: G1RemSetSummary::_rs_threads_vtimes is not
 initialized to zero
In-Reply-To: <75800cc4-9d67-bc11-c0b7-4b0a56b85ab5@oracle.com>
References: <CA+w6HxYyx7iRXHRYUjbuSwwkiwKWsATSMNyVovOqPLNYuEXRVQ@mail.gmail.com>
 <75800cc4-9d67-bc11-c0b7-4b0a56b85ab5@oracle.com>
Message-ID: <CA+w6HxavJ=EaABNJPc06Jb+QS4Lqhd8BJdBBHu3xuBis-D2wQA@mail.gmail.com>

Thanks for the reviews.

-Man


From serguei.spitsyn at oracle.com  Wed Oct 16 23:21:55 2019
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Wed, 16 Oct 2019 16:21:55 -0700
Subject: RFR: 8231943: ZGC: Enable serviceability/dcmd/gc/RunGCTest
In-Reply-To: <7FC905BD-8F45-4D04-9C2C-C473AB0FA3DD@oracle.com>
References: <0a2eee49-9bb4-a1be-f8fc-b2efcc01fd59@oracle.com>
 <7FC905BD-8F45-4D04-9C2C-C473AB0FA3DD@oracle.com>
Message-ID: <d561917a-a597-27ec-e7f8-7b4b4dedb7be@oracle.com>

Hi Per,

Looks good.

Thanks,
Serguei


On 10/15/19 23:56, Erik Osterlund wrote:
> +1
>
> /Erik
>
>> On 10 Oct 2019, at 14:28, Per Liden <per.liden at oracle.com> wrote:
>>
>> (CC:ing serviceability-dev)
>>
>>> On 10/7/19 2:38 PM, Per Liden wrote:
>>> This test is currently disabled for ZGC, but it can easily be enabled by adjusting the expected log string. ZGC doesn't print "Pause Full", but it still prints the "(Diagnostic Command)" part.
>>> Also, the test enables gc=debug logging, which is unnecessary since this is always printed on the gc=info level.
>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8231943
>>> Webrev: http://cr.openjdk.java.net/~pliden/8231943/webrev.0
>>> Testing: Manually ran test with all GCs (except Epsilon)
>>> /Per


From kishor.kharbas at intel.com  Thu Oct 17 01:39:48 2019
From: kishor.kharbas at intel.com (Kharbas, Kishor)
Date: Thu, 17 Oct 2019 01:39:48 +0000
Subject: RFR(S): 8215893: Add better abstraction for pinning G1
 concurrent marking bitmaps.
In-Reply-To: <f9e8443e-9f4a-510f-4ef4-b7356114e929@oracle.com>
References: <F89640DCD01A85489FCBA68183A6A0F3CB569D68@ORSMSX116.amr.corp.intel.com>
 <F89640DCD01A85489FCBA68183A6A0F3CB56A055@ORSMSX116.amr.corp.intel.com>
 <b6b879d9-fc88-c719-d939-6d64070ae13f@oracle.com>
 <F89640DCD01A85489FCBA68183A6A0F3CB57D45D@ORSMSX116.amr.corp.intel.com>
 <f9e8443e-9f4a-510f-4ef4-b7356114e929@oracle.com>
Message-ID: <F89640DCD01A85489FCBA68183A6A0F3CB57DA2E@ORSMSX116.amr.corp.intel.com>

Hi Sangheon,

From: sangheon.kim at oracle.com [mailto:sangheon.kim at oracle.com]
Sent: Wednesday, October 16, 2019 11:03 AM
To: Kharbas, Kishor <kishor.kharbas at intel.com>
Cc: hotspot-gc-dev at openjdk.java.net; Stefan Johansson <stefan.johansson at oracle.com>
Subject: Re: RFR(S): 8215893: Add better abstraction for pinning G1 concurrent marking bitmaps.

Hi Kishor,

Before reviewing webrev.02, could you remind us what was the motivation of pinning the bitmap mappers here?
In addition to explanations of the problematic situation, any logs / stack-trace also may help.

We think that understanding of the root cause should be considered first.

Unfortunately, I do not have a log or stack trace of the problem I faced.
I am trying to reproduce it by running the SPECjbb workload over and over again.

I haven't looked at the GC code since the end of last year, so I am having a difficult time pinning down what the problem was.
I am looking at G1ClearBitMapTask, which iterates over the bitmap for all available regions. I am not sure when this task is performed.
There is a comment in HeapRegionManager::par_iterate(), as shown below:
  // This also (potentially) iterates over regions newly allocated during GC. This
  // is no problem except for some extra work.

This method is eventually called from G1ClearBitMapTask. The comment suggests that regions are allocated concurrently when the function is run. This also means with AllocateOldGenAt flag enabled, regions can also be un-committed.
Pardon me if my understanding is incorrect.

Regards,
Kishor

Thanks,
Sangheon

On 10/15/19 6:23 PM, Kharbas, Kishor wrote:
Thank you for the suggestions.
In this webrev I added a flag to ReservedSpace constructors to direct it to pin the memory space. So now G1PageBasedVirtualSpace does not have to do special handling.

http://cr.openjdk.java.net/~kkharbas/8215893/webrev.02/
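
A minimal sketch of what such a pin flag could mean for a page-based space. This is not the webrev code: PageSpaceModel and its methods are hypothetical names, and the only behavior modelled is the one described in this thread, namely that a pinned ("special") space is fully backed from construction on and ignores later commit and un-commit requests at the OS level.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model of a page-based virtual space; not the HotSpot class.
class PageSpaceModel {
 public:
  // A pinned space starts with every page backed by memory.
  PageSpaceModel(std::size_t pages, bool pinned)
      : _committed(pages, pinned), _special(pinned) {}

  // Returns true if an OS-level commit would have been issued.
  bool commit(std::size_t start, std::size_t count) {
    assert(start + count <= _committed.size());
    if (_special) return false;  // pinned: the memory is already there
    for (std::size_t i = start; i < start + count; i++) _committed[i] = true;
    return true;
  }

  // Returns true if an OS-level un-commit would have been issued.
  bool uncommit(std::size_t start, std::size_t count) {
    assert(start + count <= _committed.size());
    if (_special) return false;  // pinned: never release the backing memory
    for (std::size_t i = start; i < start + count; i++) _committed[i] = false;
    return true;
  }

  bool is_committed(std::size_t page) const { return _committed[page]; }

 private:
  std::vector<bool> _committed;  // one entry per page
  bool _special;                 // set when the space is pinned
};
```

With this model, a caller that un-commits a range of a pinned space sees no change: PageSpaceModel(4, true).uncommit(0, 4) returns false and every page stays committed.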

To add more to Sangheon's reply to Stefan's question,
> Another thing, can you remind me why we need the bitmaps to be pinned but not other structures such as the card table?
When I implemented this feature, I ran into an issue with the default implementation of the concurrent marking bitmaps.

Thanks,
Kishor

From: sangheon.kim at oracle.com [mailto:sangheon.kim at oracle.com]
Sent: Wednesday, October 9, 2019 2:42 PM
To: Kharbas, Kishor <kishor.kharbas at intel.com>; hotspot-gc-dev at openjdk.java.net
Cc: Stefan Johansson <stefan.johansson at oracle.com>
Subject: Re: RFR(S): 8215893: Add better abstraction for pinning G1 concurrent marking bitmaps.

Hi Kishor,
On 10/4/19 4:15 PM, Kharbas, Kishor wrote:

Hi Stefan,

Thanks for the review. Some comments inline.

New webrev : http://cr.openjdk.java.net/~kkharbas/8215893/webrev.00_to_01/

                              http://cr.openjdk.java.net/~kkharbas/8215893/webrev.01/
I am reviewing the patch but have a question on top of Stefan's question[1].
Why are the bitmap mappers committed? I think all the trouble started from 'committing but treating as special' here. Couldn't we just treat the bitmap mappers as 'special' without committing?
If 'not committing' is doable, couldn't we simply create the ReservedSpace with 'special' enabled (independent of the large page setting, which is the same as Stefan's comment)? Or add a PinnedReservedSpace to force 'special' enabled.

[1]: Another thing, can you remind me why we need the bitmaps to be pinned but not other structures such as the card table?

+HeterogeneousHeapRegionManager::initialize()

...

+  // We commit bitmap for all regions during initialization and mark the bitmap space as special.

+  // This allows regions to be un-committed while concurrent-marking threads are accesing the bitmap concurrently.


Thanks,
Sangheon


> Hi Kishor,

>

> On 04.10.19 03:00, Kharbas, Kishor wrote:

>> Hi,

>> When I worked on JDK-8211425 <https://bugs.openjdk.java.net/browse/JDK-8211425>, there was a request for a better abstraction for pinning G1's CM bitmaps. The RFE for the request is JDK-8215893 <https://bugs.openjdk.java.net/browse/JDK-8215893>.

>>

>> Here is a proposal : http://cr.openjdk.java.net/~kkharbas/8215893/webrev.00/

>>

>> Here G1PageBasedVirtualSpace pins the entire reserved space in memory during construction. The constructor takes an additional bool flag that says whether the memory needs to be pinned.

>> If the memory is pinned, '_special' flag is set to true. I piggy back on _special flag's behavior which is to not do actual OS (un-)commits on calls to (un)commit().

>> Rest of the changes is the mechanism to pass this flag from CM bitmaps creation in G1CollectedHeap all the way to G1PageBasedVirtualSpace.

>>

>> Let me know if this is a good abstraction and if there is any better way.

>>

>> Thanks

>> Kishor

>>

>

> Some comments:

>

> - in the parameter lists, if the parameters are already laid out

> line-by-line, if adding a new one, please put it on a new line as well.

>

Fixed in the new webrev.


> - this code

>

>    if (_special) {

>      if (!rs.special()) {

>        commit_internal(addr_to_page_index(_low_boundary),

> addr_to_page_index(_high_boundary));

>      }

>

> in g1PageBasedVirtualSpace looks very incomprehensible.  :)

>

> I would prefer (pending the second reviewer's comment) to either use the

> "pinned" flag here, or even better, move the necessary commit calls into

> the (now removed) HeterogeneousHeapRegionManager::initialize().

>

Made it a little more comprehensible. Will see what other reviewers think about moving it somewhere else.


> - I would just purely from feeling prefer if the "pinned" flag parameter

> would be listed after the "type" parameter in the G1RegionToSpaceMapper.

> But that's probably just me.

>

I did it this way to logically group the parameters. MemTracker is a tracker used by the VM everywhere and does not pertain to this class as such, so I kept it at the end.


> Also, finally one parameter per line for the declaration/definition of

> the constructor would improve readability.

>

Done.

Thank you,

Kishor


> Thanks,

>    Thomas


From per.liden at oracle.com  Thu Oct 17 08:44:03 2019
From: per.liden at oracle.com (Per Liden)
Date: Thu, 17 Oct 2019 10:44:03 +0200
Subject: RFR: 8231943: ZGC: Enable serviceability/dcmd/gc/RunGCTest
In-Reply-To: <d561917a-a597-27ec-e7f8-7b4b4dedb7be@oracle.com>
References: <0a2eee49-9bb4-a1be-f8fc-b2efcc01fd59@oracle.com>
 <7FC905BD-8F45-4D04-9C2C-C473AB0FA3DD@oracle.com>
 <d561917a-a597-27ec-e7f8-7b4b4dedb7be@oracle.com>
Message-ID: <9b63f39b-8fb6-a6e9-57be-a63df0e6ede4@oracle.com>

Thanks Serguei!

/Per

On 2019-10-17 01:21, serguei.spitsyn at oracle.com wrote:
> Hi Per,
> 
> Looks good.
> 
> Thanks,
> Serguei
> 
> 
> On 10/15/19 23:56, Erik Osterlund wrote:
>> +1
>>
>> /Erik
>>
>>> On 10 Oct 2019, at 14:28, Per Liden <per.liden at oracle.com> wrote:
>>>
>>> (CC:ing serviceability-dev)
>>>
>>>> On 10/7/19 2:38 PM, Per Liden wrote:
>>>> This test is currently disabled for ZGC, but it can easily be 
>>>> enabled by adjusting the expected log string. ZGC doesn't print 
>>>> "Pause Full", but it still prints the "(Diagnostic Command)" part.
>>>> Also, the test enables gc=debug logging, which is unnecessary since 
>>>> this is always printed on the gc=info level.
>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8231943
>>>> Webrev: http://cr.openjdk.java.net/~pliden/8231943/webrev.0
>>>> Testing: Manually ran test with all GCs (except Epsilon)
>>>> /Per
> 


From stefan.johansson at oracle.com  Thu Oct 17 11:34:00 2019
From: stefan.johansson at oracle.com (Stefan Johansson)
Date: Thu, 17 Oct 2019 13:34:00 +0200
Subject: RFR(S): 8215893: Add better abstraction for pinning G1 concurrent
 marking bitmaps.
In-Reply-To: <F89640DCD01A85489FCBA68183A6A0F3CB57DA2E@ORSMSX116.amr.corp.intel.com>
References: <F89640DCD01A85489FCBA68183A6A0F3CB569D68@ORSMSX116.amr.corp.intel.com>
 <F89640DCD01A85489FCBA68183A6A0F3CB56A055@ORSMSX116.amr.corp.intel.com>
 <b6b879d9-fc88-c719-d939-6d64070ae13f@oracle.com>
 <F89640DCD01A85489FCBA68183A6A0F3CB57D45D@ORSMSX116.amr.corp.intel.com>
 <f9e8443e-9f4a-510f-4ef4-b7356114e929@oracle.com>
 <F89640DCD01A85489FCBA68183A6A0F3CB57DA2E@ORSMSX116.amr.corp.intel.com>
Message-ID: <c4f9c0ec-f242-82fd-95b9-b640dd389715@oracle.com>

Hi Kishor,

On 2019-10-17 03:39, Kharbas, Kishor wrote:
> Hi Sangheon,
> 
> *From:*sangheon.kim at oracle.com [mailto:sangheon.kim at oracle.com]
> *Sent:* Wednesday, October 16, 2019 11:03 AM
> *To:* Kharbas, Kishor <kishor.kharbas at intel.com>
> *Cc:* hotspot-gc-dev at openjdk.java.net; Stefan Johansson 
> <stefan.johansson at oracle.com>
> *Subject:* Re: RFR(S): 8215893: Add better abstraction for pinning G1 
> concurrent marking bitmaps.
> 
>> Hi Kishor,
>> 
>> Before reviewing webrev.02, could you remind us what was the motivation 
>> of pinning the bitmap mappers here?
>> In addition to explanations of the problematic situation, any logs / 
>> stack-trace also may help.
>> 
>> We think that understanding of the root cause should be considered first.
> 
> Unfortunately, I do not have log/stack-trace of the problem I had faced.
> 
> I am trying to reproduce it by running SPECjbb workload over and over again.
> 
> I haven't looked at GC code since end of last year. So I am having a 
> difficult time pinning what the problem was.
> 
> I am looking at G1ClearBitMapTask which iterates over bitmap for all 
> available regions. I am not sure when this task is performed.
> 
> There is comment in HeapRegionManager::par_iterate() as shown below,
> 
>   // This also (potentially) iterates over regions newly allocated during GC. This
>   // is no problem except for some extra work.
> 
> This method is eventually called from G1ClearBitMapTask. The comment 
> suggests that regions are allocated concurrently when the function is 
> run. This also means with AllocateOldGenAt flag enabled, regions can 
> also be un-committed.

I don't understand how AllocateOldGenAt would make any difference, 
regions can be un-committed without it as well and there are mechanisms 
in place to make sure only the correct parts of the side structures are 
un-committed when that happens.

I want to reiterate what Sangheon said about identifying the root cause. 
If we don't know why this is needed and can't reproduce any failures 
without the special pinning of the bitmaps, I would rather see that we 
remove the pinning code to make things work more like normal G1.

Thanks,
Stefan


> 
> Pardon me if my understanding is incorrect.
> 
> Regards,
> 
> Kishor


From shade at redhat.com  Thu Oct 17 17:00:30 2019
From: shade at redhat.com (Aleksey Shipilev)
Date: Thu, 17 Oct 2019 19:00:30 +0200
Subject: RFR (XS) 8232534: Shenandoah: guard against reentrant
 ShenandoahHeapLock locking
Message-ID: <75af279f-5a46-de65-9f16-dd064ac98210@redhat.com>

RFE:
  https://bugs.openjdk.java.net/browse/JDK-8232534

This one was very useful for debugging:

diff -r 55fe0d93bdd3 src/hotspot/share/gc/shenandoah/shenandoahLock.hpp
--- a/src/hotspot/share/gc/shenandoah/shenandoahLock.hpp        Tue Oct 15 22:22:23 2019 -0400
+++ b/src/hotspot/share/gc/shenandoah/shenandoahLock.hpp        Thu Oct 17 18:59:27 2019 +0200
@@ -39,10 +39,13 @@

 public:
   ShenandoahLock() : _state(unlocked), _owner(NULL) {};

   void lock() {
+#ifdef ASSERT
+    assert(_owner != Thread::current(), "reentrant locking attempt, would deadlock");
+#endif
     Thread::SpinAcquire(&_state, "Shenandoah Heap Lock");
 #ifdef ASSERT
     assert(_state == locked, "must be locked");
     assert(_owner == NULL, "must not be owned");
     _owner = Thread::current();

Testing: hotspot_gc_shenandoah; multiple assert failures due to bugs in development
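
For readers without the HotSpot sources at hand, the idea of the patch can be sketched in portable C++. This is an illustration only: OwnedSpinLock is a hypothetical class built on std::atomic, not ShenandoahLock, and it uses the standard assert macro instead of HotSpot's. The point it shows is that the owner check has to run before the spin starts, because a thread that re-enters a non-reentrant spin lock never gets past the acquire.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical stand-in for a non-reentrant spin lock with owner tracking.
class OwnedSpinLock {
 public:
  // True if the calling thread already holds the lock.
  bool owned_by_self() const {
    return _owner.load(std::memory_order_relaxed) == std::this_thread::get_id();
  }

  void lock() {
    // Must be checked before spinning: a reentrant caller would spin forever.
    assert(!owned_by_self() && "reentrant locking attempt, would deadlock");
    while (_locked.exchange(true, std::memory_order_acquire)) {
      std::this_thread::yield();
    }
    _owner.store(std::this_thread::get_id(), std::memory_order_relaxed);
  }

  void unlock() {
    assert(owned_by_self() && "must be owned by the unlocking thread");
    // Clear the owner first so it never names a thread that has released the lock.
    _owner.store(std::thread::id(), std::memory_order_relaxed);
    _locked.store(false, std::memory_order_release);
  }

 private:
  std::atomic<bool> _locked{false};
  std::atomic<std::thread::id> _owner{std::thread::id()};
};
```

As in the patch, the check costs nothing in builds where assertions are compiled out.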

-- 
Thanks,
-Aleksey


From shade at redhat.com  Thu Oct 17 18:10:44 2019
From: shade at redhat.com (Aleksey Shipilev)
Date: Thu, 17 Oct 2019 20:10:44 +0200
Subject: RFR (S) 8232573: Shenandoah: cleanup and add more logging for
 in-pause phases
Message-ID: <b116b10a-f326-e882-d1c2-bb725de65738@redhat.com>

RFE:
  https://bugs.openjdk.java.net/browse/JDK-8232573

Fix:
  https://cr.openjdk.java.net/~shade/8232573/webrev.01

This improves profiling for pauses, fixing issues recently found when doing some performance
investigations and development work.

Testing: hotspot_gc_shenandoah, eyeballing -Xlog:gc+stats

-- 
Thanks,
-Aleksey


From rkennke at redhat.com  Thu Oct 17 18:13:53 2019
From: rkennke at redhat.com (Roman Kennke)
Date: Thu, 17 Oct 2019 20:13:53 +0200
Subject: RFR (XS) 8232534: Shenandoah: guard against reentrant
 ShenandoahHeapLock locking
In-Reply-To: <75af279f-5a46-de65-9f16-dd064ac98210@redhat.com>
References: <75af279f-5a46-de65-9f16-dd064ac98210@redhat.com>
Message-ID: <fc360664-b413-dfd5-bf31-779a980b26de@redhat.com>

Yep, good and useful.

Roman


> RFE:
>   https://bugs.openjdk.java.net/browse/JDK-8232534
> 
> This one was very useful for debugging:
> 
> diff -r 55fe0d93bdd3 src/hotspot/share/gc/shenandoah/shenandoahLock.hpp
> --- a/src/hotspot/share/gc/shenandoah/shenandoahLock.hpp        Tue Oct 15 22:22:23 2019 -0400
> +++ b/src/hotspot/share/gc/shenandoah/shenandoahLock.hpp        Thu Oct 17 18:59:27 2019 +0200
> @@ -39,10 +39,13 @@
> 
>  public:
>    ShenandoahLock() : _state(unlocked), _owner(NULL) {};
> 
>    void lock() {
> +#ifdef ASSERT
> +    assert(_owner != Thread::current(), "reentrant locking attempt, would deadlock");
> +#endif
>      Thread::SpinAcquire(&_state, "Shenandoah Heap Lock");
>  #ifdef ASSERT
>      assert(_state == locked, "must be locked");
>      assert(_owner == NULL, "must not be owned");
>      _owner = Thread::current();
> 
> Testing: hotspot_gc_shenandoah; multiple assert failures due to bugs in development
> 


From rkennke at redhat.com  Thu Oct 17 18:15:37 2019
From: rkennke at redhat.com (Roman Kennke)
Date: Thu, 17 Oct 2019 20:15:37 +0200
Subject: RFR (S) 8232573: Shenandoah: cleanup and add more logging for
 in-pause phases
In-Reply-To: <b116b10a-f326-e882-d1c2-bb725de65738@redhat.com>
References: <b116b10a-f326-e882-d1c2-bb725de65738@redhat.com>
Message-ID: <65205df1-b696-6f0e-5ef8-29f6e57a45ca@redhat.com>

Good, that seems useful. Patch looks good.

Roman

> RFE:
>   https://bugs.openjdk.java.net/browse/JDK-8232573
> 
> Fix:
>   https://cr.openjdk.java.net/~shade/8232573/webrev.01
> 
> This improves profiling for pauses, fixing issues recently found when doing some performance
> investigations and development work.
> 
> Testing: hotspot_gc_shenandoah, eyeballing -Xlog:gc+stats
> 


From zgu at redhat.com  Thu Oct 17 18:16:35 2019
From: zgu at redhat.com (Zhengyu Gu)
Date: Thu, 17 Oct 2019 14:16:35 -0400
Subject: RFR 8231324: Shenandoah: avoid duplicated weak root works during
 final traversal
In-Reply-To: <d94f8b07-5cb6-9df2-d312-576ef1b9b99a@redhat.com>
References: <d94f8b07-5cb6-9df2-d312-576ef1b9b99a@redhat.com>
Message-ID: <07f55f1f-15ac-339f-37aa-135be1ff2bde@redhat.com>

Updated after JDK-8231999.

Changed: heap->is_concurrent_traversal_in_progress() to 
heap->is_traversal_mode()

Webrev: http://cr.openjdk.java.net/~zgu/JDK-8231324/webrev.01/

Reran hotspot_gc_shenandoah test.

-Zhengyu


On 10/4/19 10:51 AM, Zhengyu Gu wrote:
> Please review this patch that avoids traversal GC to walk weak roots 
> twice during final traversal.
> 
> Also, it should process weak roots first, so that, fixup phase does not 
> visit dead CLDs/codes, etc.
> 
> Bug: https://bugs.openjdk.java.net/browse/JDK-8231324
> Webrev: http://cr.openjdk.java.net/~zgu/JDK-8231324/webrev.00/
> 
> Test:
>    hotspot_gc_shenandoah (fastdebug and release) on Linux x86_64
> 
> Thanks,
> 
> -Zhengyu


From kishor.kharbas at intel.com  Thu Oct 17 21:28:10 2019
From: kishor.kharbas at intel.com (Kharbas, Kishor)
Date: Thu, 17 Oct 2019 21:28:10 +0000
Subject: RFR(S): 8215893: Add better abstraction for pinning G1
 concurrent marking bitmaps.
In-Reply-To: <c4f9c0ec-f242-82fd-95b9-b640dd389715@oracle.com>
References: <F89640DCD01A85489FCBA68183A6A0F3CB569D68@ORSMSX116.amr.corp.intel.com>
 <F89640DCD01A85489FCBA68183A6A0F3CB56A055@ORSMSX116.amr.corp.intel.com>
 <b6b879d9-fc88-c719-d939-6d64070ae13f@oracle.com>
 <F89640DCD01A85489FCBA68183A6A0F3CB57D45D@ORSMSX116.amr.corp.intel.com>
 <f9e8443e-9f4a-510f-4ef4-b7356114e929@oracle.com>
 <F89640DCD01A85489FCBA68183A6A0F3CB57DA2E@ORSMSX116.amr.corp.intel.com>
 <c4f9c0ec-f242-82fd-95b9-b640dd389715@oracle.com>
Message-ID: <F89640DCD01A85489FCBA68183A6A0F3CB57DDDE@ORSMSX116.amr.corp.intel.com>

Hi Stefan,

> -----Original Message-----
> From: Stefan Johansson [mailto:stefan.johansson at oracle.com]
> Sent: Thursday, October 17, 2019 4:34 AM
> To: Kharbas, Kishor <kishor.kharbas at intel.com>; sangheon.kim at oracle.com
> Cc: hotspot-gc-dev at openjdk.java.net
> Subject: Re: RFR(S): 8215893: Add better abstraction for pinning G1
> concurrent marking bitmaps.
> 
> Hi Kishor,
> 
> On 2019-10-17 03:39, Kharbas, Kishor wrote:
> > Hi Sangheon,
> >
> > *From:*sangheon.kim at oracle.com [mailto:sangheon.kim at oracle.com]
> > *Sent:* Wednesday, October 16, 2019 11:03 AM
> > *To:* Kharbas, Kishor <kishor.kharbas at intel.com>
> > *Cc:* hotspot-gc-dev at openjdk.java.net; Stefan Johansson
> > <stefan.johansson at oracle.com>
> > *Subject:* Re: RFR(S): 8215893: Add better abstraction for pinning G1
> > concurrent marking bitmaps.
> >
> >> Hi Kishor,
> >>
> >> Before reviewing webrev.02, could you remind us what was the
> >> motivation of pinning the bitmap mappers here?
> >> In addition to explanations of the problematic situation, any logs /
> >> stack-trace also may help.
> >>
> >> We think that understanding of the root cause should be considered first.
> >
> > Unfortunately, I do not have log/stack-trace of the problem I had faced.
> >
> > I am trying to reproduce it by running SPECjbb workload over and over
> again.
> >
> > I haven't looked at GC code since end of last year. So I am having a
> > difficult time pinning what the problem was.
> >
> > I am looking at G1ClearBitMapTask which iterates over bitmap for all
> > available regions. I am not sure when this task is performed.
> >
> > There is comment in HeapRegionManager::par_iterate() as shown below,
> >
> >   // This also (potentially) iterates over regions newly allocated during GC. This
> >   // is no problem except for some extra work.
> >
> > This method is eventually called from G1ClearBitMapTask. The comment
> > suggests that regions are allocated concurrently when the function is
> > run. This also means with AllocateOldGenAt flag enabled, regions can
> > also be un-committed.
> 
> I don't understand how AllocateOldGenAt would make any difference,
> regions can be un-committed without it as well and there are mechanisms in
> place to make sure only the correct parts of the side structures are un-
> committed when that happens.

In the regular code un-commit is only done by VM thread during safepoint. Un-commit of region also causes its corresponding bitmap to be un-committed.
But it never happens that CM threads are iterating over bitmap while regions are being un-committed concurrently.

Whereas when AllocateOldGenAt is used, because of the way regions are managed between
dram and nvdimms, regions can be un-committed by mutator threads and GC threads.
1. Mutator threads - during mutator region allocation and humongous region allocation.
2. GC worker threads - during survivor region and old region allocation.
3. VMThread - heap size adjustment as in default and after full GC to allocate enough regions in dram for young gen (may require to un-commit some regions from nvdimm).

Could any of these be running concurrently when CM threads are iterating over the bitmap?
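
The interleaving in question can be replayed deterministically in a small model. This is only a sketch under assumptions: BitmapModel and replay_interleaving are hypothetical names, a "fault" stands for touching a bitmap slice whose memory has been un-committed, and the step order is fixed by hand rather than produced by real threads.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model: one bitmap slice per heap region.
struct BitmapModel {
  std::vector<bool> backed;  // does the slice for region i have committed memory?
  bool pinned;               // if true, slices are never un-committed
  std::size_t faults = 0;    // accesses to an un-backed slice (a crash in a real VM)

  BitmapModel(std::size_t regions, bool pin) : backed(regions, true), pinned(pin) {}

  // Region un-commit, for example by a mutator during a humongous allocation.
  void uncommit_region(std::size_t i) {
    if (!pinned) backed[i] = false;
  }

  // One step of the concurrent bitmap clearing task.
  void clear_slice(std::size_t i) {
    if (!backed[i]) faults++;
  }
};

// Replays one fixed interleaving: the clearing task has started on four
// regions, a mutator un-commits the last region, and the task carries on.
inline std::size_t replay_interleaving(bool pinned) {
  BitmapModel bm(4, pinned);
  bm.clear_slice(0);
  bm.uncommit_region(3);
  for (std::size_t i = 1; i < 4; i++) bm.clear_slice(i);
  return bm.faults;
}
```

In this model the unpinned bitmap faults once and the pinned one never does, which is the effect the pinning is meant to have if such an interleaving is possible.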

> 
> I want to reiterate what Sangheon said about identifying the root cause.
> If we don't know why this is needed and can't reproduce any failures without
> the special pinning of the bitmaps, I would rather see that we remove the
> pinning code to make things work more like normal G1.

I am trying to reproduce it, but as you can imagine it is a very rare and hard-to-reproduce bug, if it is one.

Thanks,
Kishor
> 
> Thanks,
> Stefan
> 
> 
> >
> > Pardon me if my understanding is incorrect.
> >
> > Regards,
> >
> > Kishor


From zgu at redhat.com  Fri Oct 18 12:23:17 2019
From: zgu at redhat.com (Zhengyu Gu)
Date: Fri, 18 Oct 2019 08:23:17 -0400
Subject: RFR 8232009: Shenandoah: C2 load barrier does not match
 interpreter version
In-Reply-To: <689f5b52-de3b-6a1a-0032-365dedf58414@redhat.com>
References: <689f5b52-de3b-6a1a-0032-365dedf58414@redhat.com>
Message-ID: <737de3b3-fb29-1795-89c7-99781da22a09@redhat.com>

Updated: Webrev: http://cr.openjdk.java.net/~zgu/JDK-8232009/webrev.01/

Added PHANTOM_OOP_REF references.

Test:
  Reran tests

Thanks,

-Zhengyu

On 10/11/19 1:11 PM, Zhengyu Gu wrote:
> Please review this patch that matches C2 load barrier to interpreter's 
> implementation.
> 
> Bug: https://bugs.openjdk.java.net/browse/JDK-8232009
> Webrev: http://cr.openjdk.java.net/~zgu/JDK-8232009/webrev.00/
> 
> Test:
>    hotspot_gc_shenandoah (fastdebug and release) with x86_32 and x86_64 
> JVMs on Linux
> 
> 
> Thanks,
> 
> -Zhengyu


From rkennke at redhat.com  Fri Oct 18 13:07:45 2019
From: rkennke at redhat.com (Roman Kennke)
Date: Fri, 18 Oct 2019 15:07:45 +0200
Subject: RFR 8232009: Shenandoah: C2 load barrier does not match
 interpreter version
In-Reply-To: <737de3b3-fb29-1795-89c7-99781da22a09@redhat.com>
References: <689f5b52-de3b-6a1a-0032-365dedf58414@redhat.com>
 <737de3b3-fb29-1795-89c7-99781da22a09@redhat.com>
Message-ID: <1b753db4-70a8-2b77-fb44-c42dc34aec41@redhat.com>

Ok. Thank you!

Roman


> Updated: Webrev: http://cr.openjdk.java.net/~zgu/JDK-8232009/webrev.01/
> 
> Added PHANTOM_OOP_REF references.
> 
> Test:
>   Reran tests
> 
> Thanks,
> 
> -Zhengyu
> 
> On 10/11/19 1:11 PM, Zhengyu Gu wrote:
>> Please review this patch that matches C2 load barrier to interpreter's
>> implementation.
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8232009
>> Webrev: http://cr.openjdk.java.net/~zgu/JDK-8232009/webrev.00/
>>
>> Test:
>>    hotspot_gc_shenandoah (fastdebug and release) with x86_32 and
>> x86_64 JVMs on Linux
>>
>>
>> Thanks,
>>
>> -Zhengyu
> 


From rkennke at redhat.com  Fri Oct 18 13:09:32 2019
From: rkennke at redhat.com (Roman Kennke)
Date: Fri, 18 Oct 2019 15:09:32 +0200
Subject: RFR 8231324: Shenandoah: avoid duplicated weak root works during
 final traversal
In-Reply-To: <07f55f1f-15ac-339f-37aa-135be1ff2bde@redhat.com>
References: <d94f8b07-5cb6-9df2-d312-576ef1b9b99a@redhat.com>
 <07f55f1f-15ac-339f-37aa-135be1ff2bde@redhat.com>
Message-ID: <1ee00447-919e-c127-3f0f-d65c60aab057@redhat.com>

Looks good. (I was pretty sure I looked through it yesterday already. Hmm.)

Thanks,
Roman

> Updated after JDK-8231999.
> 
> Changed: heap->is_concurrent_traversal_in_progress() to
> heap->is_traversal_mode()
> 
> Webrev: http://cr.openjdk.java.net/~zgu/JDK-8231324/webrev.01/
> 
> Reran hotspot_gc_shenandoah test.
> 
> -Zhengyu
> 
> 
> On 10/4/19 10:51 AM, Zhengyu Gu wrote:
>> Please review this patch that avoids traversal GC to walk weak roots
>> twice during final traversal.
>>
>> Also, it should process weak roots first, so that, fixup phase does
>> not visit dead CLDs/codes, etc.
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8231324
>> Webrev: http://cr.openjdk.java.net/~zgu/JDK-8231324/webrev.00/
>>
>> Test:
>>    hotspot_gc_shenandoah (fastdebug and release) on Linux x86_64
>>
>> Thanks,
>>
>> -Zhengyu
> 


From rkennke at redhat.com  Fri Oct 18 13:10:09 2019
From: rkennke at redhat.com (Roman Kennke)
Date: Fri, 18 Oct 2019 15:10:09 +0200
Subject: RFR 8232008: Shenandoah: C1 load barrier does not match
 interpreter version
In-Reply-To: <d531e080-00bd-ea38-3ab8-3aae46a1d960@redhat.com>
References: <d531e080-00bd-ea38-3ab8-3aae46a1d960@redhat.com>
Message-ID: <f441edff-1311-465b-275d-84bd80f75d97@redhat.com>

Looks good. Thanks!

Roman


> 
> Bug: https://bugs.openjdk.java.net/browse/JDK-8232008
> Webrev: http://cr.openjdk.java.net/~zgu/JDK-8232008/webrev.00/
> 
> Test:
>   hotspot_gc_shenandoah (fastdebug and release) with x86_64 and x86-32
> JVM on Linux.
> 
> Thanks,
> 
> -Zhengyu
> 


From rkennke at redhat.com  Fri Oct 18 13:13:52 2019
From: rkennke at redhat.com (Roman Kennke)
Date: Fri, 18 Oct 2019 15:13:52 +0200
Subject: RFR 8232010: Shenandoah: implement self-fixing native barrier
In-Reply-To: <6cecca8a-a477-53b4-48de-f504a2100955@redhat.com>
References: <6cecca8a-a477-53b4-48de-f504a2100955@redhat.com>
Message-ID: <ced96959-b847-d99d-f33c-1f91ed278d48@redhat.com>

Would a similar implementation also work for the non-native LRB?

It's lacking an aarch64 implementation, right?

Roman


> Please review this patch that implements self-fixing LRB for in native
> oops.
> 
> Bug: https://bugs.openjdk.java.net/browse/JDK-8232010
> Webrev: http://cr.openjdk.java.net/~zgu/JDK-8232010/webrev.00/
> 
> Test:
>   hotspot_gc_shenandoah (fastdebug and release) with x86_32 and x86_64
> JVM on Linux.
> 
> Thanks,
> 
> -Zhengyu
> 


From zgu at redhat.com  Fri Oct 18 13:24:36 2019
From: zgu at redhat.com (Zhengyu Gu)
Date: Fri, 18 Oct 2019 09:24:36 -0400
Subject: RFR 8232010: Shenandoah: implement self-fixing native barrier
In-Reply-To: <ced96959-b847-d99d-f33c-1f91ed278d48@redhat.com>
References: <6cecca8a-a477-53b4-48de-f504a2100955@redhat.com>
 <ced96959-b847-d99d-f33c-1f91ed278d48@redhat.com>
Message-ID: <ac7f3cac-b7b4-c562-ed9c-25cd56245dae@redhat.com>


On 10/18/19 9:13 AM, Roman Kennke wrote:
> Would a similar implementation also work for the non-native LRB?

Yes, just need to make LRB stub to take the second parameter.

> 
> It's lacking an aarch64 implementation, right?

aarch64 is missing all the recent barrier changes. I intend to implement them 
after stabilizing x86.

Thanks,

-Zhengyu

> 
> Roman
> 
> 
>> Please review this patch that implements self-fixing LRB for in native
>> oops.
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8232010
>> Webrev: http://cr.openjdk.java.net/~zgu/JDK-8232010/webrev.00/
>>
>> Test:
>>    hotspot_gc_shenandoah (fastdebug and release) with x86_32 and x86_64
>> JVM on Linux.
>>
>> Thanks,
>>
>> -Zhengyu
>>
> 


From shade at redhat.com  Fri Oct 18 13:58:23 2019
From: shade at redhat.com (Aleksey Shipilev)
Date: Fri, 18 Oct 2019 15:58:23 +0200
Subject: RFR (M) 8232575: Shenandoah: asynchronous object/region pinning
Message-ID: <ad3e50dd-103f-527c-74d5-15fd82198f99@redhat.com>

RFE:
  https://bugs.openjdk.java.net/browse/JDK-8232575

Current object/region pinning scheme bottlenecks on a lock, rendering some non-exceptional scenarios
quite slow. The way out is to collect critical pins atomically, and then update the region states
near the code that needs it (mostly selecting collection set). See the bug for more info.
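
The scheme can be sketched roughly as follows. This is not the code in the webrev: RegionModel, pin, unpin, sync_pinned_status and choose_collection_set are hypothetical names, and the sketch assumes the collector calls the sync step from a single thread right before it needs accurate region states.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <vector>

enum class RegionState { Regular, Pinned };

// Hypothetical per-region bookkeeping.
struct RegionModel {
  std::atomic<std::size_t> critical_pins{0};  // updated lock-free by mutators
  RegionState state = RegionState::Regular;   // updated only by the sync step
};

// Mutator side: no lock, just an atomic counter update.
inline void pin(RegionModel& r) {
  r.critical_pins.fetch_add(1, std::memory_order_relaxed);
}

inline void unpin(RegionModel& r) {
  std::size_t old = r.critical_pins.fetch_sub(1, std::memory_order_relaxed);
  assert(old > 0 && "unpin without matching pin");
  (void)old;  // silence unused-variable warnings when asserts are disabled
}

// Collector side: fold the pin counts into the region states.
inline void sync_pinned_status(std::vector<RegionModel>& regions) {
  for (RegionModel& r : regions) {
    bool pinned = r.critical_pins.load(std::memory_order_relaxed) > 0;
    r.state = pinned ? RegionState::Pinned : RegionState::Regular;
  }
}

// Returns the indices of regions that may go into the collection set.
inline std::vector<std::size_t> choose_collection_set(std::vector<RegionModel>& regions) {
  sync_pinned_status(regions);
  std::vector<std::size_t> cset;
  for (std::size_t i = 0; i < regions.size(); i++) {
    if (regions[i].state != RegionState::Pinned) cset.push_back(i);
  }
  return cset;
}
```

The region state is allowed to be stale between sync points; only the code that depends on it, here the collection set selection, pays for bringing it up to date.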

Fix:
  https://cr.openjdk.java.net/~shade/8232575/webrev.02/

Testing: hotspot_gc_shenandoah {fastdebug,release}; tier{1,2,3} with Shenandoah; GZIP workload with
{normal, traversal} x {adaptive, aggressive}

-- 
Thanks,
-Aleksey


From rkennke at redhat.com  Fri Oct 18 14:03:50 2019
From: rkennke at redhat.com (Roman Kennke)
Date: Fri, 18 Oct 2019 16:03:50 +0200
Subject: RFR (M) 8232575: Shenandoah: asynchronous object/region pinning
In-Reply-To: <ad3e50dd-103f-527c-74d5-15fd82198f99@redhat.com>
References: <ad3e50dd-103f-527c-74d5-15fd82198f99@redhat.com>
Message-ID: <63521dc4-53e8-72b4-852b-1d77def03c62@redhat.com>

Patch looks good! Thank you!

Roman


> RFE:
>   https://bugs.openjdk.java.net/browse/JDK-8232575
> 
> Current object/region pinning scheme bottlenecks on a lock, rendering some non-exceptional scenarios
> quite slow. The way out is to collect critical pins atomically, and then update the region states
> near the code that needs it (mostly selecting collection set). See the bug for more info.
> 
> Fix:
>   https://cr.openjdk.java.net/~shade/8232575/webrev.02/
> 
> Testing: hotspot_gc_shenandoah {fastdebug,release}; tier{1,2,3} with Shenandoah; GZIP workload with
> {normal, traversal} x {adaptive, aggressive}
> 


From stefan.johansson at oracle.com  Fri Oct 18 14:31:48 2019
From: stefan.johansson at oracle.com (Stefan Johansson)
Date: Fri, 18 Oct 2019 16:31:48 +0200
Subject: RFR(S): 8215893: Add better abstraction for pinning G1 concurrent
 marking bitmaps.
In-Reply-To: <F89640DCD01A85489FCBA68183A6A0F3CB57DDDE@ORSMSX116.amr.corp.intel.com>
References: <F89640DCD01A85489FCBA68183A6A0F3CB569D68@ORSMSX116.amr.corp.intel.com>
 <F89640DCD01A85489FCBA68183A6A0F3CB56A055@ORSMSX116.amr.corp.intel.com>
 <b6b879d9-fc88-c719-d939-6d64070ae13f@oracle.com>
 <F89640DCD01A85489FCBA68183A6A0F3CB57D45D@ORSMSX116.amr.corp.intel.com>
 <f9e8443e-9f4a-510f-4ef4-b7356114e929@oracle.com>
 <F89640DCD01A85489FCBA68183A6A0F3CB57DA2E@ORSMSX116.amr.corp.intel.com>
 <c4f9c0ec-f242-82fd-95b9-b640dd389715@oracle.com>
 <F89640DCD01A85489FCBA68183A6A0F3CB57DDDE@ORSMSX116.amr.corp.intel.com>
Message-ID: <2CB4D3B2-02E7-46A8-85D3-CCEA34C0695B@oracle.com>

Hi Kishor,

> 17 okt. 2019 kl. 23:28 skrev Kharbas, Kishor <kishor.kharbas at intel.com>:
> 
> Hi Stefan,
> 
>> -----Original Message-----
>> From: Stefan Johansson [mailto:stefan.johansson at oracle.com]
>> Sent: Thursday, October 17, 2019 4:34 AM
>> To: Kharbas, Kishor <kishor.kharbas at intel.com>; sangheon.kim at oracle.com
>> Cc: hotspot-gc-dev at openjdk.java.net
>> Subject: Re: RFR(S): 8215893: Add better abstraction for pinning G1
>> concurrent marking bitmaps.
>> 
>> Hi Kishor,
>> 
>> On 2019-10-17 03:39, Kharbas, Kishor wrote:
>>> Hi Sangheon,
>>> 
>>> *From:*sangheon.kim at oracle.com [mailto:sangheon.kim at oracle.com]
>>> *Sent:* Wednesday, October 16, 2019 11:03 AM
>>> *To:* Kharbas, Kishor <kishor.kharbas at intel.com>
>>> *Cc:* hotspot-gc-dev at openjdk.java.net; Stefan Johansson
>>> <stefan.johansson at oracle.com>
>>> *Subject:* Re: RFR(S): 8215893: Add better abstraction for pinning G1
>>> concurrent marking bitmaps.
>>> 
>>>> Hi Kishor,
>>>> 
>>>> Before reviewing webrev.02, could you remind us what was the
>>>> motivation of pinning the bitmap mappers here?
>>>> In addition to explanations of the problematic situation, any logs /
>>>> stack-trace also may help.
>>>> 
>>>> We think that understanding of the root cause should be considered first.
>>> 
>>> Unfortunately, I do not have log/stack-trace of the problem I had faced.
>>> 
>>> I am trying to reproduce it by running SPECjbb workload over and over
>> again.
>>> 
>>> I haven't looked at GC code since end of last year. So I am having a
>>> difficult time pinning what the problem was.
>>> 
>>> I am looking at G1ClearBitMapTask which iterates over bitmap for all
>>> available regions. I am not sure when this task is performed.
>>> 
>>> There is comment in HeapRegionManager::par_iterate() as shown below,
>>> 
>>>   // This also (potentially) iterates over regions newly allocated during GC. This
>>>   // is no problem except for some extra work.
>>> 
>>> This method is eventually called from G1ClearBitMapTask. The comment
>>> suggests that regions are allocated concurrently when the function is
>>> run. This also means with AllocateOldGenAt flag enabled, regions can
>>> also be un-committed.
>> 
>> I don't understand how AllocateOldGenAt would make any difference,
>> regions can be un-committed without it as well and there are mechanisms in
>> place to make sure only the correct parts of the side structures are un-
>> committed when that happens.
> 
> In the regular code un-commit is only done by VM thread during safepoint. Un-commit of region also causes its corresponding bitmap to be un-committed.
> But it never happens that CM threads are iterating over bitmap while regions are being un-committed concurrently.
> 
> Whereas when AllocateOldGenAt is used, because of the way regions are managed between
> dram and nvdimms, regions can be un-committed by mutator threads and GC threads.
> 1. Mutator threads - during mutator region allocation and humongous region allocation.

This is the problem. I managed to reproduce this by adding a short sleep in the clearing code and forcing back-to-back concurrent cycles in SPECjvm2008 with a 2g heap. I think this is only a problem for humongous allocations, because we should never allocate more young regions than we have already made available at the end of the previous GC. But the humongous allocations can very well happen while we clear the bitmaps in the concurrent cycle, so that is probably why the pinning was added. 

Thinking more about this, a different solution would be to not un-commit memory in this situation. This all depends on how one sees the amount of committed memory when using AllocateOldGenAt: should the amount committed on dram + nvdimm never be more than Xmx, or is the important thing that the number of regions in use never exceeds Xmx? I think I'm leaning towards the latter, but there might be reasons I haven't thought about here. This would break the current invariant:
assert(total_committed_before == total_regions_committed(), "invariant not met");

But that might be ok. If using that approach, instead of un-committing (shrink_dram), just remove from the freelist the same number of regions that you expand on nvdimm. The unused removed regions need to be kept track of so we can add them again during the GC. To me this is more or less the same concept we use when borrowing regions during the GC. There might be issues with this approach but I think it would be interesting to explore. 
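
A rough bookkeeping sketch of this idea, with all names hypothetical (HeteroHeapModel, expand_nvdimm, unpark) and regions in use not modelled separately from free ones. It assumes that parked DRAM regions stay committed but invisible to the allocator, and that they are handed back only as far as the region limit allows.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical bookkeeping model; none of these names exist in HotSpot.
struct HeteroHeapModel {
  std::size_t max_regions;       // the Xmx limit, expressed in regions
  std::size_t dram_visible;      // committed DRAM regions the allocator can use
  std::size_t dram_parked;       // committed DRAM regions withheld from allocation
  std::size_t nvdimm_committed;  // committed NVDIMM regions

  HeteroHeapModel(std::size_t max, std::size_t dram, std::size_t nvdimm)
      : max_regions(max), dram_visible(dram), dram_parked(0), nvdimm_committed(nvdimm) {
    assert(usable_regions() <= max_regions);
  }

  // Regions that can hold objects; this is what must stay within the limit.
  std::size_t usable_regions() const { return dram_visible + nvdimm_committed; }

  // Committed memory may temporarily exceed the limit by the parked regions.
  std::size_t committed_regions() const { return usable_regions() + dram_parked; }

  // Expand on NVDIMM without un-committing DRAM: park DRAM regions instead.
  bool expand_nvdimm(std::size_t n) {
    if (dram_visible < n) return false;
    dram_visible -= n;
    dram_parked += n;
    nvdimm_committed += n;
    assert(usable_regions() <= max_regions);
    return true;
  }

  // During a GC pause, hand parked regions back as far as the limit allows.
  std::size_t unpark(std::size_t n) {
    std::size_t headroom = max_regions - usable_regions();
    std::size_t moved = std::min(n, std::min(dram_parked, headroom));
    dram_parked -= moved;
    dram_visible += moved;
    return moved;
  }
};
```

In this model the number of usable regions never exceeds the limit, while the committed total does for as long as regions are parked, which is exactly the relaxation of the invariant discussed above.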

I also wonder whether we should ever need to expand_dram during allocate_new_region. I see that it happens now during GC, and that is probably because we do this at the end of the GC:
_manager->adjust_dram_regions((uint)young_list_target_length() ...

If this adjustment included the expected number of survivors as well, we should have enough DRAM regions, and if we then end up getting an NVDIMM region when asking for a survivor, we should return NULL to signal that survivor is full.

What do you think about that approach?

Thanks,
Stefan

> 2. GC worker threads - during survivor region and old region allocation.
> 3. VMThread - heap size adjustment as in default and after full GC to allocate enough regions in dram for young gen (may require to un-commit some regions from nvdimm).
> 
> Could any of these be running concurrently when CM threads are iterating over the bitmap?
> 
>> 
>> I want to reiterate what Sangheon said about identifying the root cause.
>> If we don't know why this is needed and can't reproduce any failures without
>> the special pinning of the bitmaps, I would rather see that we remove the
>> pinning code to make things work more like normal G1.
> 
> I am trying to reproduce it, but as you can imagine it is a very rare and hard-to-reproduce bug, if it is one.
> 
> Thanks,
> Kishor
>> 
>> Thanks,
>> Stefan
>> 
>> 
>>> 
>>> Pardon me if my understanding is incorrect.
>>> 
>>> Regards,
>>> 
>>> Kishor


From thomas.schatzl at oracle.com  Sat Oct 19 13:06:09 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Sat, 19 Oct 2019 15:06:09 +0200
Subject: RFR (L): 8230706: Waiting on completion of strong nmethod
 processing causes long pause times with G1
In-Reply-To: <80DA3FD5-C2FA-44BF-83C5-AE0EA6AA3684@oracle.com>
References: <f19bc69a-9bde-c063-2674-9564721ceede@oracle.com>
 <0F637570-EC97-47C5-B493-B33681133149@oracle.com>
 <5c6b06b1-de44-3cb7-7fc8-0b641df5f353@oracle.com>
 <DE7A950D-A877-4093-AFE8-363E6E079A28@oracle.com>
 <80DA3FD5-C2FA-44BF-83C5-AE0EA6AA3684@oracle.com>
Message-ID: <d0b624a1fc2c7310986b79da1f65f3a8a851d20a.camel@oracle.com>

Hi all,

  there is a new webrev at 

http://cr.openjdk.java.net/~tschatzl/8230706/webrev.2/ (full only,
there is no point in providing a diff)

since I like this solution a lot as it removes a lot of additional
post-processing.

Testing has been a bit of a headache: interference between strong and
weak processing is extremely rare, so I had to make it pretty common by

1) only a single thread doing strong processing
2) the weak processing stage has to be moved right after the root
processing so they overlap with a lot higher probability

hs-tier 1-5 passes with and without these changes, with a noticeable
amount of overlap according to additional log messages. That change can
be viewed at 
http://cr.openjdk.java.net/~tschatzl/8230706/webrev.2.testing/ .
Obviously I am not going to push this.

Surprisingly there had to be no changes to Shenandoah as it does not
use the claim mechanism changed here, implementing something else.
Shenandoah also passed vmTestbase/gc with these changes with no
problem.

Below this email is a copy of Kim's suggestion about the state machine
again for reference. I also added documentation about why and how the
code is supposed to work.

Thanks,
  Thomas

On Wed, 2019-10-09 at 17:23 -0400, Kim Barrett wrote:
> > On Oct 8, 2019, at 7:48 PM, Kim Barrett <kim.barrett at oracle.com>
> > wrote:
> > src/hotspot/share/gc/g1/g1CollectedHeap.cpp
> > 3874   if (collector_state()->in_initial_mark_gc()) {
> > 3875     remark_strong_nmethods(per_thread_states);
> > 3876   }
> > 
> > I think this additional task and the associated pending strong
> > nmethod
> > sets in the pss can be eliminated by using a 2-bit tag and a more
> > complex state machine earlier.
> 
> I thought about this some more and have some improvements to the
> previous pseudo-code, including eliminating the loop in
> strong_processor.  More careful consideration of the possible states
> showed them to be more limited than I'd previously thought they were.
> I hadn't noticed the benefit from delaying weak_processor's push onto
> the global list and combining it with the transition to the "weak
> done" state.
> 
> States, encoded in the link member of nmethod N:
> - unclaimed: NULL
> - weak: N, tag 00
> - weak done: NEXT, tag 01
> - weak, need strong: N, tag 10
> - strong: NEXT, tag 11
> 
> where NEXT is the next nmethod in the global list, or N if it is the
> last entry, e.g. self-loop indicates end of list.
> 
> weak_processor(n):
>     if n->link != NULL:
>         # already claimed; nothing to do here.
>         return
>     elif not replace_if_null(tagged(n, 0), &n->link):
>         # just claimed by another thread; nothing to do here.
>         return
>     # successfully claimed for weak processing.
>     assert n->link == tagged(n, 0)
>     do_weak_processing(n)
>     # push onto global list.  self-loop end of list to avoid
>     # tagged NULL.
>     # not pushing onto global list until ready to mark weak
>     # processing done significantly simplifies the set of states.
>     next = xchg(n, &_list_head) 
>     if next == NULL: next = n 
>     # try to install end of list + weak done tag.
>     if cmpxchg(tagged(next, 1), &n->link, tagged(n, 0)) == tagged(n, 0):
>         return
>     # failed, which means some other thread added strong request.
>     assert n->link == tagged(n, 2)
>     # do deferred strong processing.
>     n->link = tagged(next, 3)
>     do_strong_processing(n)
> 
> strong_processor(n):
>     raw_next = cmpxchg(tagged(n, 3), &n->link, NULL)
>     if raw_next == NULL:
>         # successfully claimed for strong processing.
>         do_strong_processing(n)
>         # push onto global list.  self-loop end of list to avoid
>         # tagged NULL.
>         next = xchg(n, &_list_head)
>         if next == NULL: next = n
>         n->link = tagged(next, 3)
>         return
>     # claim failed.  figure out why and handle it.
>     next = strip_tag(raw_next)
>     if raw_next == next:          # (raw_next - next) == 0
>         # claim failed because being weak processed
>         # (state == "weak").
>         # try to request deferred strong processing.
>         assert next == tagged(n, 0)
>         raw_next = cmpxchg(tagged(n, 2), &n->link, next)
>         if (raw_next == next):
>             # successfully requested deferred strong processing.
>             return
>         # failed because of a concurrent transition.
>         # no longer in "weak" state.
>         next = strip_tag(raw_next)
>     if (raw_next - next) >= 2:
>         # already claimed for strong processing or requested for
>         # such.
>         return
>     # weak processing is complete.
>     # raw_next: tag == 1, NEXT == next list entry or N
>     if cmpxchg(tagged(next, 3), &n->link, raw_next) == raw_next:
>         # claimed "weak done" to "strong".
>         do_strong_processing(n)
>     # if claim failed then some other thread got it.
> 
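
For readers following the state machine quoted above, the tag handling can be
sketched in C++ roughly as follows. This is only an illustration under stated
assumptions: Node, tagged, strip_tag, tag_of, try_claim_weak and
try_request_strong are hypothetical names rather than the code in the webrev,
std::atomic stands in for HotSpot's own atomic primitives, and the sketch
assumes the link target is at least 4-byte aligned so the two low bits are
free for the tag.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for nmethod; only the link word matters here.
struct Node {
  std::atomic<std::uintptr_t> link{0};  // 0 plays the role of NULL (unclaimed)
};
static_assert(alignof(Node) >= 4, "two low pointer bits must be free for the tag");

// Combine a pointer and a 2-bit tag into one word.
inline std::uintptr_t tagged(const Node* n, std::uintptr_t tag) {
  assert(tag < 4);
  return reinterpret_cast<std::uintptr_t>(n) | tag;
}

// Recover the pointer part of a tagged word.
inline Node* strip_tag(std::uintptr_t raw) {
  return reinterpret_cast<Node*>(raw & ~std::uintptr_t{3});
}

// Recover the tag part of a tagged word.
inline std::uintptr_t tag_of(std::uintptr_t raw) {
  return raw & std::uintptr_t{3};
}

// unclaimed (NULL) -> "weak" (N, tag 00); fails if already claimed.
inline bool try_claim_weak(Node* n) {
  std::uintptr_t expected = 0;
  return n->link.compare_exchange_strong(expected, tagged(n, 0));
}

// "weak" (N, tag 00) -> "weak, need strong" (N, tag 10); fails on any
// concurrent transition out of the "weak" state.
inline bool try_request_strong(Node* n) {
  std::uintptr_t expected = tagged(n, 0);
  return n->link.compare_exchange_strong(expected, tagged(n, 2));
}
```

Because a node in the "weak" state links to itself, a claimed node never holds
a NULL link even with tag 00, which is why the self-loop is used to mark the
end of the list.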


From shade at redhat.com  Sun Oct 20 19:29:06 2019
From: shade at redhat.com (Aleksey Shipilev)
Date: Sun, 20 Oct 2019 21:29:06 +0200
Subject: RFR 8232010: Shenandoah: implement self-fixing native barrier
In-Reply-To: <6cecca8a-a477-53b4-48de-f504a2100955@redhat.com>
References: <6cecca8a-a477-53b4-48de-f504a2100955@redhat.com>
Message-ID: <7c73ed8d-ce53-c54b-28d5-6806f1000af7@redhat.com>

On 10/11/19 2:30 PM, Zhengyu Gu wrote:
> Please review this patch that implements self-fixing LRB for in native oops.
> 
> Bug: https://bugs.openjdk.java.net/browse/JDK-8232010
> Webrev: http://cr.openjdk.java.net/~zgu/JDK-8232010/webrev.00/

This breaks Windows builds, see:
  https://bugs.openjdk.java.net/browse/JDK-8232674

-- 
Thanks,
-Aleksey


From shade at redhat.com  Sun Oct 20 20:33:56 2019
From: shade at redhat.com (Aleksey Shipilev)
Date: Sun, 20 Oct 2019 22:33:56 +0200
Subject: RFR (S) 8232674: Fix build and rename
 ShenandoahBarrierSet::oop_load_from_native_barrier
Message-ID: <d42d8cbb-0f37-9732-fe23-6c6a3a407553@redhat.com>

P1 bug:
  https://bugs.openjdk.java.net/browse/JDK-8232674

I believe this is caused by missing definition of this method:
  oop oop_load_from_native_barrier(oop obj, narrowOop* load_addr);

The way out is to generify it, the same way as we do it for SBS::load_reference_barrier. I also took
this opportunity to rename the method to match the other LRB flavor: now the
ShenandoahRuntime::load_reference_barrier_native wrapper looks right.

Fix:
  https://cr.openjdk.java.net/~shade/8232674/webrev.01/

Testing: Windows x86_64 build, Linux x86_64 build, hotspot_gc_shenandoah

-- 
Thanks,
-Aleksey


From zgu at redhat.com  Mon Oct 21 00:19:21 2019
From: zgu at redhat.com (Zhengyu Gu)
Date: Sun, 20 Oct 2019 20:19:21 -0400
Subject: RFR (S) 8232674: Fix build and rename
 ShenandoahBarrierSet::oop_load_from_native_barrier
In-Reply-To: <d42d8cbb-0f37-9732-fe23-6c6a3a407553@redhat.com>
References: <d42d8cbb-0f37-9732-fe23-6c6a3a407553@redhat.com>
Message-ID: <c619d38b-c5a4-941e-d512-ce4daf0e92cb@redhat.com>

Thanks for fixing it, Aleksey

On 10/20/19 4:33 PM, Aleksey Shipilev wrote:
> P1 bug:
>    https://bugs.openjdk.java.net/browse/JDK-8232674
> 
> I believe this is caused by missing definition of this method:
>    oop oop_load_from_native_barrier(oop obj, narrowOop* load_addr);

This method should never be used. I thought I could get away with not 
implementing it (just like with gcc on Linux).

I think the method body should just be:
  ...
   ShouldNotReachHere();
   return NULL;
  ...

-Zhengyu


> 
> The way out is to generify it, the same way as we do it for SBS::load_reference_barrier. I also took
> this opportunity to rename the method to match the other LRB flavor: now the
> ShenandoahRuntime::load_reference_barrier_native wrapper looks right.
> 
> Fix:
>    https://cr.openjdk.java.net/~shade/8232674/webrev.01/
> 
> Testing: Windows x86_64 build, Linux x86_64 build, hotspot_gc_shenandoah
> 


From shade at redhat.com  Mon Oct 21 08:00:21 2019
From: shade at redhat.com (Aleksey Shipilev)
Date: Mon, 21 Oct 2019 10:00:21 +0200
Subject: RFR (S) 8232674: Fix build and rename
 ShenandoahBarrierSet::oop_load_from_native_barrier
In-Reply-To: <c619d38b-c5a4-941e-d512-ce4daf0e92cb@redhat.com>
References: <d42d8cbb-0f37-9732-fe23-6c6a3a407553@redhat.com>
 <c619d38b-c5a4-941e-d512-ce4daf0e92cb@redhat.com>
Message-ID: <b5eeeebd-531c-be70-8ded-364bab4bae70@redhat.com>

On 10/21/19 2:19 AM, Zhengyu Gu wrote:
> On 10/20/19 4:33 PM, Aleksey Shipilev wrote:
>> P1 bug:
>>    https://bugs.openjdk.java.net/browse/JDK-8232674
>>
>> I believe this is caused by missing definition of this method:
>>    oop oop_load_from_native_barrier(oop obj, narrowOop* load_addr);
> 
> This method should never be used. I thought I could get away with not implementing it (just like
> with gcc on Linux).
> 
> I think the method body should just be:
>   ...
>    ShouldNotReachHere();
>    return NULL;
>   ...

Right! Let's do that:
 https://cr.openjdk.java.net/~shade/8232674/webrev.02/
Testing: {Linux, Windows} x86_64 hotspot_gc_shenandoah; tier1 with Shenandoah

-- 
Thanks,
-Aleksey


From sakamoto.osamu at nttcom.co.jp  Mon Oct 21 08:50:23 2019
From: sakamoto.osamu at nttcom.co.jp (Osamu Sakamoto)
Date: Mon, 21 Oct 2019 17:50:23 +0900
Subject: Segmentation Fault occurs when ClassLoader and Metaspace is released
 in JDK 8
Message-ID: <fb308571-cbdf-1f4b-177f-aa6bac986a5f@nttcom.co.jp_1>

Hi all,

I am seeing a segmentation fault (SEGV) during GC and cannot determine 
the cause.
Could you help me solve the problem?

Our system uses OpenJDK 1.8.0.181, and it crashed with a SEGV while purging 
a ClassLoader at a safepoint.
We cannot reproduce the problem on demand, but it has happened 4 times in a few 
months.

The following is the summary of my investigation.

=============================================================================

First I checked the hs_err log, which shows that the SEGV occurred in the VMThread.
The VM_Operation is GenCollectForAllocation, at a safepoint.

-----------------------------------------------------------------------------
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x00007f6080c97f88, pid=23931, tid=0x00007f607c3ed700
#
# JRE version: OpenJDK Runtime Environment (8.0_181-b13) (build 
1.8.0_181-b13)
# Java VM: OpenJDK 64-Bit Server VM (25.181-b13 mixed mode linux-amd64 
compressed oops)
# Problematic frame:
# V  [libjvm.so+0x84bf88]
#
# Core dump written. Default location: /opt/tomcate0/core or core.23931
#
# If you would like to submit a bug report, please visit:
#   http://bugreport.java.com/bugreport/crash.jsp
#

---------------  T H R E A D  ---------------

Current thread (0x00007f6078c00000):  VMThread [stack: 0x00007f607c2ed000,0x00007f607c3ee000] [id=23939]

siginfo: si_signo: 11 (SIGSEGV), si_code: 1 (SEGV_MAPERR), si_addr: 
0x0000000000000018

Registers:
RAX=0x0000000000000010, RBX=0x00007f5ff800ad30, RCX=0x0000000000000010, 
RDX=0x0000000000000000
RSP=0x00007f607c3ecb50, RBP=0x00007f607c3ecb80, RSI=0x0000000000000002, 
RDI=0x0000000001cfe570
R8 =0x00007f5ff80ae320, R9 =0x00007f5ff8052480, R10=0x0000000000000000, 
R11=0x0000000000000400
R12=0x0000000001cfe570, R13=0x00007f6081419470, R14=0x0000000000000002, 
R15=0x00007f6081418640
RIP=0x00007f6080c97f88, EFLAGS=0x0000000000010202, 
CSGSFS=0x0000000000000033, ERR=0x0000000000000004
  TRAPNO=0x000000000000000e

Top of Stack: (sp=0x00007f607c3ecb50)
0x00007f607c3ecb50:   00007f607c3ecba0 00007f5ff800ad30
0x00007f607c3ecb60:   00007f5ff800ad00 0000000000000000
0x00007f607c3ecb70:   0000000000000000 0000000000000001
0x00007f607c3ecb80:   00007f607c3ecba0 00007f6080c995fa
0x00007f607c3ecb90:   00007f5ff800ad00 00007f5ff800ac20
0x00007f607c3ecba0:   00007f607c3ecbc0 00007f60808bff5e
0x00007f607c3ecbb0:   00007f5ff800ac20 00007f5ff8052870
0x00007f607c3ecbc0:   00007f607c3ecbe0 00007f60808c0f0f
0x00007f607c3ecbd0:   00007f607c3ecbf0 00007f608140f308
0x00007f607c3ecbe0:   00007f607c3ecc30 00007f6080daa0b7
0x00007f607c3ecbf0:   00007f6069000100 0000000000000000
0x00007f607c3ecc00:   00007f607c3ecc20 00007f6080ed0800
0x00007f607c3ecc10:   00000000000000f9 88e95c3ba257ab00
0x00007f607c3ecc20:   431bde82d7b634db 00007f607800aa00
0x00007f607c3ecc30:   00007f607c3eccc0 00007f6080daa9d5
0x00007f607c3ecc40:   0000000000000000 00007f607803bf20
0x00007f607c3ecc50:   00007f607803be20 00000000000003e8
0x00007f607c3ecc60:   0000000000000001 00007f6078c00000
0x00007f607c3ecc70:   00007f607c3eccc0 0000000000000000
0x00007f607c3ecc80:   00000004000000f9 00007f60813e2b99
0x00007f607c3ecc90:   00007f607803bfa0 00007f6078c00000
0x00007f607c3ecca0:   0000000000000000 0000000000000000
0x00007f607c3eccb0:   00007f6081418bd0 00007f607803bf20
0x00007f607c3eccc0:   00007f607c3ece60 00007f6080f2048a
0x00007f607c3eccd0:   00007f607c3ecd20 00007f607c3ecce0
0x00007f607c3ecce0:   00007f6078c00000 00007f6078c00980
0x00007f607c3eccf0:   00007f6078c009c0 00007f6078c009d0
0x00007f607c3ecd00:   00007f6078c00aa8 00000000000000d8
0x00007f607c3ecd10:   00007f6078c00be0 0000000000000000
0x00007f607c3ecd20:   00007f607c3ecd28 6e69747563657845
0x00007f607c3ecd30:   65706f204d562067 203a6e6f69746172
0x00007f607c3ecd40:   656c6c6f436e6547 6c6c41726f467463

Instructions: (pc=0x00007f6080c97f88)
0x00007f6080c97f68:   b6 12 80 fa 00 74 01 f0 48 0f c1 01 31 c9 31 f6
0x00007f6080c97f78:   48 8b 44 0b 10 31 d2 48 85 c0 74 11 0f 1f 40 00
0x00007f6080c97f88:   48 8b 40 08 48 83 c2 01 48 85 c0 75 f3 48 83 c1
0x00007f6080c97f98:   08 48 01 d6 48 83 f9 20 75 d6 8b 7b 08 48 8b 05

Register to memory mapping:

RAX=0x0000000000000010 is an unknown value
RBX=0x00007f5ff800ad30 is an unknown value
RCX=0x0000000000000010 is an unknown value
RDX=0x0000000000000000 is an unknown value
RSP=0x00007f607c3ecb50 is an unknown value
RBP=0x00007f607c3ecb80 is an unknown value
RSI=0x0000000000000002 is an unknown value
RDI=0x0000000001cfe570 is an unknown value
R8 =0x00007f5ff80ae320 is an unknown value
R9 =0x00007f5ff8052480 is an unknown value
R10=0x0000000000000000 is an unknown value
R11=0x0000000000000400 is an unknown value
R12=0x0000000001cfe570 is an unknown value
R13=0x00007f6081419470: <offset 0xfcd470> in 
/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/jre/lib/amd64/server/libjvm.so 
at 0x00007f608044c000
R14=0x0000000000000002 is an unknown value
R15=0x00007f6081418640: <offset 0xfcc640> in 
/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/jre/lib/amd64/server/libjvm.so 
at 0x00007f608044c000


Stack: [0x00007f607c2ed000,0x00007f607c3ee000], sp=0x00007f607c3ecb50,  free space=1022k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V  [libjvm.so+0x84bf88]
V  [libjvm.so+0x84d5fa]
V  [libjvm.so+0x473f5e]
V  [libjvm.so+0x474f0f]
V  [libjvm.so+0x95e0b7]
V  [libjvm.so+0x95e9d5]
V  [libjvm.so+0xad448a]
V  [libjvm.so+0xad48f1]
V  [libjvm.so+0x8beb82]

VM_Operation (0x00007f5fd69e6120): GenCollectForAllocation, mode: 
safepoint, requested by thread 0x00007f6079013800

...
-----------------------------------------------------------------------------


Next, I used GDB to get the backtrace of the crashing thread from the 
core dump; it is shown below.
The SEGV occurred while a ClassLoader was being purged and its Metaspace 
destroyed.
Frames #6 and #7 show that the signal (SEGV) handler was called while 
SpaceManager::~SpaceManager() was executing.

-----------------------------------------------------------------------------
(gdb) bt
#0  0x00007f608146f1f7 in __GI_raise (sig=sig@entry=6) at 
../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1  0x00007f60814708e8 in __GI_abort () at abort.c:90
#2  0x00007f6080d0bc39 in os::abort (dump_core=<optimized out>) at 
/usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/os/linux/vm/os_linux.cpp:1519
#3  0x00007f6080f1b816 in VMError::report_and_die 
(this=this@entry=0x7f607c3ebd10) at 
/usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/share/vm/utilities/vmError.cpp:1060
#4  0x00007f6080d15927 in JVM_handle_linux_signal (sig=11, 
info=0x7f607c3ebfb0, ucVoid=0x7f607c3ebe80, 
abort_if_unrecognized=<optimized out>)
    at 
/usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/os_cpu/linux_x86/vm/os_linux_x86.cpp:541
#5  0x00007f6080d09038 in signalHandler (sig=11, info=0x7f607c3ebfb0, 
uc=0x7f607c3ebe80) at 
/usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/os/linux/vm/os_linux.cpp:4446
#6  <signal handler called>
#7  SpaceManager::~SpaceManager (this=0x7f5ff800ad30, 
__in_chrg=<optimized out>) at 
/usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/share/vm/memory/metaspace.cpp:2028
#8  0x00007f6080c995fa in Metaspace::~Metaspace (this=0x7f5ff800ad00, 
__in_chrg=<optimized out>)
    at 
/usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/share/vm/memory/metaspace.cpp:2971
#9  0x00007f60808bff5e in ClassLoaderData::~ClassLoaderData 
(this=0x7f5ff800ac20, __in_chrg=<optimized out>)
    at 
/usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/share/vm/classfile/classLoaderData.cpp:383
#10 0x00007f60808c0f0f in ClassLoaderDataGraph::purge () at 
/usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/share/vm/classfile/classLoaderData.cpp:818
#11 0x00007f6080daa0b7 in ClassLoaderDataGraph::purge_if_needed () at 
/usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/share/vm/classfile/classLoaderData.hpp:104
#12 SafepointSynchronize::do_cleanup_tasks () at 
/usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/share/vm/runtime/safepoint.cpp:551
#13 0x00007f6080daa9d5 in SafepointSynchronize::begin () at 
/usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/share/vm/runtime/safepoint.cpp:402
#14 0x00007f6080f2048a in VMThread::loop 
(this=this@entry=0x7f6078c00000) at 
/usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/share/vm/runtime/vmThread.cpp:501
#15 0x00007f6080f208f1 in VMThread::run (this=0x7f6078c00000) at 
/usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/share/vm/runtime/vmThread.cpp:276
#16 0x00007f6080d0ab82 in java_start (thread=0x7f6078c00000) at 
/usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/os/linux/vm/os_linux.cpp:796
#17 0x00007f6081e2de25 in start_thread (arg=0x7f607c3ed700) at 
pthread_create.c:308
#18 0x00007f608153234d in clone () at 
../sysdeps/unix/sysv/linux/x86_64/clone.S:113
-----------------------------------------------------------------------------


In frame #7, line 2028 (chunk = chunk->next()) is the crash point.
The variable "chunk" is initialized at line 2025 (Metachunk* chunk = 
chunks_in_use(i);).
"chunks_in_use(i)" is defined at line 648 (Metachunk* 
chunks_in_use(ChunkIndex index) const { return _chunks_in_use[index]; }).
So I checked the values of "_chunks_in_use" and found that 
"_chunks_in_use[2]" holds the invalid address "0x10".
Therefore, I think the SEGV occurred because "chunk = chunk->next()" 
dereferenced the invalid address "0x10".

-----------------------------------------------------------------------------
(gdb) f 7
#7  SpaceManager::~SpaceManager (this=0x7f5ff800ad30, 
__in_chrg=<optimized out>) at 
/usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/share/vm/memory/metaspace.cpp:2028
2028        chunk = chunk->next();
(gdb) list
2023    size_t SpaceManager::sum_count_in_chunks_in_use(ChunkIndex i) {
2024      size_t count = 0;
2025      Metachunk* chunk = chunks_in_use(i);
2026      while (chunk != NULL) {
2027        count++;
2028        chunk = chunk->next();
2029      }
2030      return count;
2031    }
2032
(gdb) list SpaceManager::chunks_in_use
647       // Accessors
648       Metachunk* chunks_in_use(ChunkIndex index) const { return 
_chunks_in_use[index]; }
...
(gdb) p _chunks_in_use
$11 = {0x7f5fcd41c400, 0x7f5fcd41a000, 0x10, 0x0}
-----------------------------------------------------------------------------


The following is the disassembly of "SpaceManager::~SpaceManager()".
%rax holds 0x10 at "0x00007f6080c97f88 <+200>", but I don't understand how 
this "0x10" got into %rax.

-----------------------------------------------------------------------------
(gdb) disas
Dump of assembler code for function SpaceManager::~SpaceManager():
   0x00007f6080c97ec0 <+0>:     push   %rbp
   0x00007f6080c97ec1 <+1>:     mov    %rsp,%rbp
   0x00007f6080c97ec4 <+4>:     push   %r15
   0x00007f6080c97ec6 <+6>:     push   %r14
   0x00007f6080c97ec8 <+8>:     push   %r13
   0x00007f6080c97eca <+10>:    push   %r12
   0x00007f6080c97ecc <+12>:    push   %rbx
   0x00007f6080c97ecd <+13>:    mov    %rdi,%rbx
   0x00007f6080c97ed0 <+16>:    sub    $0x8,%rsp
   0x00007f6080c97ed4 <+20>:    mov    0x780785(%rip),%r12        # 0x7f6081418660 <_ZN12SpaceManager12_expand_lockE>
   0x00007f6080c97edb <+27>:    test   %r12,%r12
   0x00007f6080c97ede <+30>:    je     0x7f6080c97ee8 <SpaceManager::~SpaceManager()+40>
   0x00007f6080c97ee0 <+32>:    mov    %r12,%rdi
   0x00007f6080c97ee3 <+35>:    callq  0x7f6080cce2f0 <Monitor::lock_without_safepoint_check()>
   0x00007f6080c97ee8 <+40>:    movslq 0x8(%rbx),%rcx
   0x00007f6080c97eec <+44>:    lea    0x78075d(%rip),%rdx        # 0x7f6081418650 <_ZN12MetaspaceAux15_capacity_wordsE>
   0x00007f6080c97ef3 <+51>:    lea    0x781576(%rip),%r13        # 0x7f6081419470 <_ZN2os16_processor_countE>
   0x00007f6080c97efa <+58>:    lea    0x78073f(%rip),%r15        # 0x7f6081418640 <_ZN12MetaspaceAux11_used_wordsE>
   0x00007f6080c97f01 <+65>:    mov    (%rdx,%rcx,8),%rax
   0x00007f6080c97f05 <+69>:    sub    0x40(%rbx),%rax
   0x00007f6080c97f09 <+73>:    mov    %rax,(%rdx,%rcx,8)
   0x00007f6080c97f0d <+77>:    mov    0x38(%rbx),%rax
   0x00007f6080c97f11 <+81>:    movslq 0x8(%rbx),%rdx
   0x00007f6080c97f15 <+85>:    neg    %rax
   0x00007f6080c97f18 <+88>:    cmpl   $0x1,0x0(%r13)
   0x00007f6080c97f1d <+93>:    lea    (%r15,%rdx,8),%rcx
   0x00007f6080c97f21 <+97>:    mov    $0x1,%edx
   0x00007f6080c97f26 <+102>:   jne    0x7f6080c97f32 <SpaceManager::~SpaceManager()+114>
   0x00007f6080c97f28 <+104>:   lea    0x74acb4(%rip),%rdx        # 0x7f60813e2be3 <AssumeMP>
   0x00007f6080c97f2f <+111>:   movzbl (%rdx),%edx
   0x00007f6080c97f32 <+114>:   cmp    $0x0,%dl
   0x00007f6080c97f35 <+117>:   je     0x7f6080c97f38 <SpaceManager::~SpaceManager()+120>
   0x00007f6080c97f37 <+119>:   lock xadd %rax,(%rcx)
   0x00007f6080c97f3c <+124>:   mov    0x48(%rbx),%r14
   0x00007f6080c97f40 <+128>:   callq  0x7f6080c951a0 <Metachunk::overhead()>
   0x00007f6080c97f45 <+133>:   movslq 0x8(%rbx),%rdx
   0x00007f6080c97f49 <+137>:   imul   %r14,%rax
   0x00007f6080c97f4d <+141>:   lea    (%r15,%rdx,8),%rcx
   0x00007f6080c97f51 <+145>:   mov    $0x1,%edx
   0x00007f6080c97f56 <+150>:   neg    %rax
   0x00007f6080c97f59 <+153>:   cmpl   $0x1,0x0(%r13)
   0x00007f6080c97f5e <+158>:   jne    0x7f6080c97f6a <SpaceManager::~SpaceManager()+170>
   0x00007f6080c97f60 <+160>:   lea    0x74ac7c(%rip),%rdx        # 0x7f60813e2be3 <AssumeMP>
   0x00007f6080c97f67 <+167>:   movzbl (%rdx),%edx
   0x00007f6080c97f6a <+170>:   cmp    $0x0,%dl
   0x00007f6080c97f6d <+173>:   je     0x7f6080c97f70 <SpaceManager::~SpaceManager()+176>
   0x00007f6080c97f6f <+175>:   lock xadd %rax,(%rcx)
   0x00007f6080c97f74 <+180>:   xor    %ecx,%ecx
   0x00007f6080c97f76 <+182>:   xor    %esi,%esi
   0x00007f6080c97f78 <+184>:   mov    0x10(%rbx,%rcx,1),%rax
   0x00007f6080c97f7d <+189>:   xor    %edx,%edx
   0x00007f6080c97f7f <+191>:   test   %rax,%rax
   0x00007f6080c97f82 <+194>:   je     0x7f6080c97f95 <SpaceManager::~SpaceManager()+213>
   0x00007f6080c97f84 <+196>:   nopl   0x0(%rax)
=> 0x00007f6080c97f88 <+200>:   mov    0x8(%rax),%rax
   0x00007f6080c97f8c <+204>:   add    $0x1,%rdx
   0x00007f6080c97f90 <+208>:   test   %rax,%rax
...
(gdb) info registers
rax            0x10    16
rbx            0x7f5ff800ad30    140050159414576
rcx            0x10    16
rdx            0x0    0
rsi            0x2    2
rdi            0x1cfe570    30401904
rbp            0x7f607c3ecb80    0x7f607c3ecb80
rsp            0x7f607c3ecb50    0x7f607c3ecb50
r8             0x7f5ff80ae320    140050160083744
r9             0x7f5ff8052480    140050159707264
r10            0x0    0
r11            0x400    1024
r12            0x1cfe570    30401904
r13            0x7f6081419470    140052462146672
r14            0x2    2
r15            0x7f6081418640    140052462143040
rip            0x7f6080c97f88    0x7f6080c97f88 <SpaceManager::~SpaceManager()+200>
eflags         0x206    [ PF IF ]
cs             0x33    51
ss             0x2b    43
ds             0x0    0
es             0x0    0
fs             0x0    0
gs             0x0    0
k0             <unavailable>
k1             <unavailable>
k2             <unavailable>
k3             <unavailable>
k4             <unavailable>
k5             <unavailable>
k6             <unavailable>
k7             <unavailable>
-----------------------------------------------------------------------------
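
As a cross-check, the numbers in the dumps above are consistent with each
other. The small C++ sketch below just redoes the address arithmetic. It
assumes that the displacement in "mov 0x8(%rax),%rax" is the offset of the
next pointer inside a Metachunk and that the displacement in
"mov 0x10(%rbx,%rcx,1),%rax" is the offset of _chunks_in_use inside
SpaceManager; both are read off the disassembly, not taken from the HotSpot
sources, and all names in the sketch are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Displacements read off the disassembly; treating them as field offsets is
// an assumption of this sketch.
constexpr std::uint64_t kChunksInUseOffset = 0x10;  // mov 0x10(%rbx,%rcx,1),%rax
constexpr std::uint64_t kNextOffset        = 0x8;   // mov 0x8(%rax),%rax
constexpr std::uint64_t kPointerSize       = 8;     // 64-bit pointers

// Address touched by `chunk = chunk->next()` for a given chunk pointer.
constexpr std::uint64_t fault_address(std::uint64_t chunk) {
  return chunk + kNextOffset;
}

// Array index implied by the byte offset held in %rcx.
constexpr std::uint64_t chunk_index(std::uint64_t rcx) {
  return rcx / kPointerSize;
}

// Address of the array slot loaded at <+184>, given `this` (%rbx) and %rcx.
constexpr std::uint64_t slot_address(std::uint64_t self, std::uint64_t rcx) {
  return self + kChunksInUseOffset + rcx;
}

// rax == 0x10 gives the si_addr reported in the hs_err log.
static_assert(fault_address(0x10) == 0x18, "0x10 + 0x8 == si_addr 0x18");
// rcx == 0x10 corresponds to the third array entry, i.e. _chunks_in_use[2].
static_assert(chunk_index(0x10) == 2, "byte offset 0x10 is index 2");
```

In other words, the load at <+184> read the bad value 0x10 from
_chunks_in_use[2], and the next load at <+200> faulted at 0x10 + 0x8 = 0x18.
This narrows the question to how _chunks_in_use[2] came to hold 0x10; it does
not by itself explain the corruption.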

=============================================================================


Does anyone know about this case?

Thanks, Osamu


From shade at redhat.com  Mon Oct 21 10:08:39 2019
From: shade at redhat.com (Aleksey Shipilev)
Date: Mon, 21 Oct 2019 12:08:39 +0200
Subject: RFR (S) 8232702: Shenandoah: gc/shenandoah/TestVerifyJCStress.java
 uses non-existent -XX:+VerifyObjectEquals
Message-ID: <5106d033-9d06-4190-8fcb-9fbc984ec736@redhat.com>

Testbug:
  https://bugs.openjdk.java.net/browse/JDK-8232702

Fix:
  https://cr.openjdk.java.net/~shade/8232702/webrev.01/

This is the left-over from the days when ShenandoahVerifyObjectEquals was just VerifyObjectEquals.
It was removed by JDK-8231946. This test never noticed it, because it ignored unrecognized VM
options wholesale, but should really only do it for the ShVerifyOptoBarriers.

Testing: affected test on Linux x86_64 {release, fastdebug, slowdebug}

-- 
Thanks,
-Aleksey


From rkennke at redhat.com  Mon Oct 21 10:34:56 2019
From: rkennke at redhat.com (Roman Kennke)
Date: Mon, 21 Oct 2019 12:34:56 +0200
Subject: RFR (S) 8232702: Shenandoah:
 gc/shenandoah/TestVerifyJCStress.java uses non-existent
 -XX:+VerifyObjectEquals
In-Reply-To: <5106d033-9d06-4190-8fcb-9fbc984ec736@redhat.com>
References: <5106d033-9d06-4190-8fcb-9fbc984ec736@redhat.com>
Message-ID: <702435be-2a6c-3a32-ce8f-7471454db7ac@redhat.com>

Looks good. Thanks!

Roman


> Testbug:
>   https://bugs.openjdk.java.net/browse/JDK-8232702
> 
> Fix:
>   https://cr.openjdk.java.net/~shade/8232702/webrev.01/
> 
> This is the left-over from the days when ShenandoahVerifyObjectEquals was just VerifyObjectEquals.
> It was removed by JDK-8231946. This test never noticed it, because it ignored unrecognized VM
> options wholesale, but should really only do it for the ShVerifyOptoBarriers.
> 
> Testing: affected test on Linux x86_64 {release, fastdebug, slowdebug}
> 


From thomas.schatzl at oracle.com  Mon Oct 21 11:18:46 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Mon, 21 Oct 2019 13:18:46 +0200
Subject: RFR(M): 8220311: Implementation: NUMA-Aware Memory Allocation for
 G1, Survivor (2/3)
In-Reply-To: <9a78e353-7908-b546-8f6a-7acd92eb40ac@oracle.com>
References: <d153b49b-fbb2-0d73-37e0-ff1534a83086@oracle.com>
 <de0f8e9a-27d8-eaf3-99a7-7b57e1b419c1@oracle.com>
 <9a78e353-7908-b546-8f6a-7acd92eb40ac@oracle.com>
Message-ID: <4b3026b8-10bd-7439-84c6-d906bacd7774@oracle.com>

Hi Sangheon,

On 13.10.19 08:00, sangheon.kim at oracle.com wrote:
> Hi all,
> 
> Previous patch conflicts, so I'm posting rebased one.
> 
> Webrev:
> http://cr.openjdk.java.net/~sangheki/8220311/webrev.2
> Testing: hs-tier 1 ~ 5, with/without UseNUMA
> 

Looks good to me.

Thanks,
   Thomas


From stefan.karlsson at oracle.com  Mon Oct 21 13:00:14 2019
From: stefan.karlsson at oracle.com (Stefan Karlsson)
Date: Mon, 21 Oct 2019 15:00:14 +0200
Subject: RFR: 8232601: ZGC: Parameterize the ZGranuleMap table size
Message-ID: <f486753e-cd02-5c22-7390-041cf248f423@oracle.com>

Hi all,

Please review this patch to parameterize the ZGranuleMap table size.

https://cr.openjdk.java.net/~stefank/8232601/webrev.01/
https://bugs.openjdk.java.net/browse/JDK-8232601

Previously, the maps were always bound by the range of a virtual address 
space view (ZAddressOffsetMax). We want to be able to use ZGranuleMap to 
map against physical memory offsets, so this RFE suggests that we allow 
users of ZGranuleMap to specify the max offset.
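
To illustrate the idea only (this is not the code in the webrev), a granule
map whose table size is derived from a caller-supplied max offset could look
roughly like the sketch below. GranuleMap, kGranuleSizeShift and the 2M
granule size are assumptions made for the example, and std::vector stands in
for whatever backing storage the real class uses.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumed granule size for this sketch: 2M (shift of 21).
constexpr std::size_t kGranuleSizeShift = 21;
constexpr std::size_t kGranuleSize = std::size_t{1} << kGranuleSizeShift;

template <typename T>
class GranuleMap {
 public:
  // The table covers [0, max_offset) instead of a fixed global range;
  // a partial trailing granule is rounded up to a full entry.
  explicit GranuleMap(std::size_t max_offset)
      : _table((max_offset + kGranuleSize - 1) >> kGranuleSizeShift, T()) {}

  T get(std::size_t offset) const { return _table.at(index_for(offset)); }
  void put(std::size_t offset, T value) { _table.at(index_for(offset)) = value; }
  std::size_t size() const { return _table.size(); }

 private:
  // Scale an offset down to the index of the granule that contains it.
  static std::size_t index_for(std::size_t offset) {
    return offset >> kGranuleSizeShift;
  }

  std::vector<T> _table;
};
```

A user mapping a virtual address view would pass the size of that view, while
a user mapping physical memory would pass the maximum physical offset, so each
table is only as large as its own range requires.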

Thanks,
StefanK


From zgu at redhat.com  Mon Oct 21 13:03:19 2019
From: zgu at redhat.com (Zhengyu Gu)
Date: Mon, 21 Oct 2019 09:03:19 -0400
Subject: RFR (S) 8232674: Fix build and rename
 ShenandoahBarrierSet::oop_load_from_native_barrier
In-Reply-To: <b5eeeebd-531c-be70-8ded-364bab4bae70@redhat.com>
References: <d42d8cbb-0f37-9732-fe23-6c6a3a407553@redhat.com>
 <c619d38b-c5a4-941e-d512-ce4daf0e92cb@redhat.com>
 <b5eeeebd-531c-be70-8ded-364bab4bae70@redhat.com>
Message-ID: <77e762ba-df60-ef10-b110-e6260e75cf77@redhat.com>

Looks good to me.

Thanks,

-Zhengyu

On 10/21/19 4:00 AM, Aleksey Shipilev wrote:
> On 10/21/19 2:19 AM, Zhengyu Gu wrote:
>> On 10/20/19 4:33 PM, Aleksey Shipilev wrote:
>>> P1 bug:
>>>  ?? https://bugs.openjdk.java.net/browse/JDK-8232674
>>>
>>> I believe this is caused by missing definition of this method:
>>>  ?? oop oop_load_from_native_barrier(oop obj, narrowOop* load_addr);
>>
>> This method should never been used. I thought I can get away with not implementing it (just like
>> with gcc on Linux).
>>
>> I think the method body should just be:
>>  ?...
>>  ? ShouldNotReacheHere();
>>  ? return NULL;
>>  ?...
> 
> Right! Let's do that:
>   https://cr.openjdk.java.net/~shade/8232674/webrev.02/
>   Testing: {Linux, Windows} x86_64 hotspot_gc_shenandoah; tier1 with Shenandoah
> 


From stefan.karlsson at oracle.com  Mon Oct 21 13:09:57 2019
From: stefan.karlsson at oracle.com (Stefan Karlsson)
Date: Mon, 21 Oct 2019 15:09:57 +0200
Subject: RFR: 8232602: ZGC: Make ZGranuleMap ZAddress agnostic
Message-ID: <4638080b-9f2e-6965-6ed9-a17b32ad3b94@oracle.com>

Hi all,

Please review this patch to make ZGranuleMap ZAddress agnostic.

https://cr.openjdk.java.net/~stefank/8232602/webrev.01/
https://bugs.openjdk.java.net/browse/JDK-8232602

Currently, the ZGranuleMap get and put functions take an address in the 
heap as a parameter. The address is then converted into an offset (into 
a heap view), before being scaled to a granule.

We want to be able to use the ZGranuleMap for physical memory offsets, 
and not only heap addresses. Therefore, I propose that we move the 
conversions from address to offset out from ZGranuleMap, and move it to 
the current users of ZGranuleMap.

This patch applies on-top of the patch for JDK-8232601.
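
To make the split of responsibilities concrete, here is a minimal sketch (not the webrev code) in which the caller turns a heap address into an offset and the map-side helper only ever sees offsets. The mask width, the granule shift and both function names are assumptions for the example.

```cpp
#include <cstddef>
#include <cstdint>

// Assumptions for the sketch: the low 42 bits of an address are the offset,
// higher bits select a heap view; granules are 2 MiB.
constexpr uint64_t kOffsetMask   = (uint64_t(1) << 42) - 1;
constexpr unsigned kGranuleShift = 21;

// Caller-side step: strip the view bits to get an offset.
inline uint64_t address_to_offset(uint64_t address) {
  return address & kOffsetMask;
}

// Map-side step: works for any kind of offset (heap view or physical memory),
// because it no longer knows anything about addresses.
inline size_t granule_index(uint64_t offset) {
  return static_cast<size_t>(offset >> kGranuleShift);
}
```

A heap user would call granule_index(address_to_offset(addr)), while a physical memory user would pass its offset to granule_index() directly.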

Thanks,
StefanK


From stefan.karlsson at oracle.com  Mon Oct 21 13:22:00 2019
From: stefan.karlsson at oracle.com (Stefan Karlsson)
Date: Mon, 21 Oct 2019 15:22:00 +0200
Subject: RFR: 8232648: ZGC: Move ATTRIBUTE_ALIGNED to the front of declarations
Message-ID: <b981daf2-2924-1708-5a2f-7475cef3d85a@oracle.com>

Hi all,

Please review this patch to move ATTRIBUTE_ALIGNED to the front of 
declarations.

https://cr.openjdk.java.net/~stefank/8232648/webrev.01/
https://bugs.openjdk.java.net/browse/JDK-8232648

This is done because the Windows compiler requires ATTRIBUTE_ALIGNED to 
be put at the front of declarations. A new macro (ZCACHE_ALIGNED) is 
introduced, and used, to shorten the affected lines.
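
For readers unfamiliar with the placement issue, the sketch below shows the "attribute first" style using standard C++11 alignas as a stand-in; HotSpot's ATTRIBUTE_ALIGNED and the new ZCACHE_ALIGNED are defined differently. The macro names, the 64-byte line size and the struct are hypothetical.

```cpp
#include <cstddef>

// Hypothetical stand-ins; the real macros in the patch are not reproduced here.
#define SKETCH_CACHE_LINE_SIZE 64
#define SKETCH_CACHE_ALIGNED   alignas(SKETCH_CACHE_LINE_SIZE)

struct PaddedCountersSketch {
  // The alignment specifier leads the declaration, which is the form the
  // mail says the Windows compiler requires.
  SKETCH_CACHE_ALIGNED volatile size_t _allocated;
  SKETCH_CACHE_ALIGNED volatile size_t _freed;
};
```

Each member then starts on its own 64-byte boundary, so the two counters do not share a cache line.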

Thanks,
StefanK


From suenaga at oss.nttdata.com  Mon Oct 21 13:29:22 2019
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Mon, 21 Oct 2019 22:29:22 +0900
Subject: Segmentation Fault occurs when ClassLoader and Metaspace is
 released in JDK 8
In-Reply-To: <fb308571-cbdf-1f4b-177f-aa6bac986a5f@nttcom.co.jp_1>
References: <fb308571-cbdf-1f4b-177f-aa6bac986a5f@nttcom.co.jp_1>
Message-ID: <422c9ca2-5053-c761-cb61-f075877bb666@oss.nttdata.com>

Hi Osamu,

What JVM options did you pass?

I guess you used CMS because this problem seems to occur on CMS only [1] [2].
So not using CMS might be a workaround.

I'm not sure of the root cause of this issue, but something seems to corrupt ClassLoaderDataGraph::_unloading
(like a double free (delete) of a CLD).


Thanks,

Yasumasa


[1] http://hg.openjdk.java.net/jdk8u/jdk8u-dev/hotspot/file/eed8e846c982/src/share/vm/classfile/classLoaderData.hpp#l100
[2] http://hg.openjdk.java.net/jdk8u/jdk8u-dev/hotspot/file/eed8e846c982/src/share/vm/gc_implementation/concurrentMarkSweep/concurrentMarkSweepGeneration.cpp#l6384


On 2019/10/21 17:50, Osamu Sakamoto wrote:
> Hi all,
> 
> I have a problem with a segmentation fault (SEGV) in GC, and I can't determine the cause.
> Could you help me solve the problem?
> 
> Our system uses OpenJDK 1.8.0.181, and it crashed with a SEGV when purging a ClassLoader at a safepoint.
> We can't reproduce the problem on demand, but it has happened 4 times in a few months.
> 
> The following is the summary of my investigation.
> 
> =============================================================================
> 
> First I checked hs_err, and that shows that the SEGV occurred.
> VM_Operation is GenCollectForAllocation at safepoint.
> 
> -----------------------------------------------------------------------------
> #
> # A fatal error has been detected by the Java Runtime Environment:
> #
> #  SIGSEGV (0xb) at pc=0x00007f6080c97f88, pid=23931, tid=0x00007f607c3ed700
> #
> # JRE version: OpenJDK Runtime Environment (8.0_181-b13) (build 1.8.0_181-b13)
> # Java VM: OpenJDK 64-Bit Server VM (25.181-b13 mixed mode linux-amd64 compressed oops)
> # Problematic frame:
> # V  [libjvm.so+0x84bf88]
> #
> # Core dump written. Default location: /opt/tomcate0/core or core.23931
> #
> # If you would like to submit a bug report, please visit:
> #   http://bugreport.java.com/bugreport/crash.jsp
> #
> 
> ---------------  T H R E A D  ---------------
> 
> Current thread (0x00007f6078c00000):  VMThread [stack: 0x00007f607c2ed000,0x00007f607c3ee000] [id=23939]
> 
> siginfo: si_signo: 11 (SIGSEGV), si_code: 1 (SEGV_MAPERR), si_addr: 0x0000000000000018
> 
> Registers:
> RAX=0x0000000000000010, RBX=0x00007f5ff800ad30, RCX=0x0000000000000010, RDX=0x0000000000000000
> RSP=0x00007f607c3ecb50, RBP=0x00007f607c3ecb80, RSI=0x0000000000000002, RDI=0x0000000001cfe570
> R8 =0x00007f5ff80ae320, R9 =0x00007f5ff8052480, R10=0x0000000000000000, R11=0x0000000000000400
> R12=0x0000000001cfe570, R13=0x00007f6081419470, R14=0x0000000000000002, R15=0x00007f6081418640
> RIP=0x00007f6080c97f88, EFLAGS=0x0000000000010202, CSGSFS=0x0000000000000033, ERR=0x0000000000000004
>   TRAPNO=0x000000000000000e
> 
> Top of Stack: (sp=0x00007f607c3ecb50)
> 0x00007f607c3ecb50:   00007f607c3ecba0 00007f5ff800ad30
> 0x00007f607c3ecb60:   00007f5ff800ad00 0000000000000000
> 0x00007f607c3ecb70:   0000000000000000 0000000000000001
> 0x00007f607c3ecb80:   00007f607c3ecba0 00007f6080c995fa
> 0x00007f607c3ecb90:   00007f5ff800ad00 00007f5ff800ac20
> 0x00007f607c3ecba0:   00007f607c3ecbc0 00007f60808bff5e
> 0x00007f607c3ecbb0:   00007f5ff800ac20 00007f5ff8052870
> 0x00007f607c3ecbc0:   00007f607c3ecbe0 00007f60808c0f0f
> 0x00007f607c3ecbd0:   00007f607c3ecbf0 00007f608140f308
> 0x00007f607c3ecbe0:   00007f607c3ecc30 00007f6080daa0b7
> 0x00007f607c3ecbf0:   00007f6069000100 0000000000000000
> 0x00007f607c3ecc00:   00007f607c3ecc20 00007f6080ed0800
> 0x00007f607c3ecc10:   00000000000000f9 88e95c3ba257ab00
> 0x00007f607c3ecc20:   431bde82d7b634db 00007f607800aa00
> 0x00007f607c3ecc30:   00007f607c3eccc0 00007f6080daa9d5
> 0x00007f607c3ecc40:   0000000000000000 00007f607803bf20
> 0x00007f607c3ecc50:   00007f607803be20 00000000000003e8
> 0x00007f607c3ecc60:   0000000000000001 00007f6078c00000
> 0x00007f607c3ecc70:   00007f607c3eccc0 0000000000000000
> 0x00007f607c3ecc80:   00000004000000f9 00007f60813e2b99
> 0x00007f607c3ecc90:   00007f607803bfa0 00007f6078c00000
> 0x00007f607c3ecca0:   0000000000000000 0000000000000000
> 0x00007f607c3eccb0:   00007f6081418bd0 00007f607803bf20
> 0x00007f607c3eccc0:   00007f607c3ece60 00007f6080f2048a
> 0x00007f607c3eccd0:   00007f607c3ecd20 00007f607c3ecce0
> 0x00007f607c3ecce0:   00007f6078c00000 00007f6078c00980
> 0x00007f607c3eccf0:   00007f6078c009c0 00007f6078c009d0
> 0x00007f607c3ecd00:   00007f6078c00aa8 00000000000000d8
> 0x00007f607c3ecd10:   00007f6078c00be0 0000000000000000
> 0x00007f607c3ecd20:   00007f607c3ecd28 6e69747563657845
> 0x00007f607c3ecd30:   65706f204d562067 203a6e6f69746172
> 0x00007f607c3ecd40:   656c6c6f436e6547 6c6c41726f467463
> 
> Instructions: (pc=0x00007f6080c97f88)
> 0x00007f6080c97f68:   b6 12 80 fa 00 74 01 f0 48 0f c1 01 31 c9 31 f6
> 0x00007f6080c97f78:   48 8b 44 0b 10 31 d2 48 85 c0 74 11 0f 1f 40 00
> 0x00007f6080c97f88:   48 8b 40 08 48 83 c2 01 48 85 c0 75 f3 48 83 c1
> 0x00007f6080c97f98:   08 48 01 d6 48 83 f9 20 75 d6 8b 7b 08 48 8b 05
> 
> Register to memory mapping:
> 
> RAX=0x0000000000000010 is an unknown value
> RBX=0x00007f5ff800ad30 is an unknown value
> RCX=0x0000000000000010 is an unknown value
> RDX=0x0000000000000000 is an unknown value
> RSP=0x00007f607c3ecb50 is an unknown value
> RBP=0x00007f607c3ecb80 is an unknown value
> RSI=0x0000000000000002 is an unknown value
> RDI=0x0000000001cfe570 is an unknown value
> R8 =0x00007f5ff80ae320 is an unknown value
> R9 =0x00007f5ff8052480 is an unknown value
> R10=0x0000000000000000 is an unknown value
> R11=0x0000000000000400 is an unknown value
> R12=0x0000000001cfe570 is an unknown value
> R13=0x00007f6081419470: <offset 0xfcd470> in /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/jre/lib/amd64/server/libjvm.so at 0x00007f608044c000
> R14=0x0000000000000002 is an unknown value
> R15=0x00007f6081418640: <offset 0xfcc640> in /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/jre/lib/amd64/server/libjvm.so at 0x00007f608044c000
> 
> 
> Stack: [0x00007f607c2ed000,0x00007f607c3ee000], sp=0x00007f607c3ecb50, free space=1022k
> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
> V  [libjvm.so+0x84bf88]
> V  [libjvm.so+0x84d5fa]
> V  [libjvm.so+0x473f5e]
> V  [libjvm.so+0x474f0f]
> V  [libjvm.so+0x95e0b7]
> V  [libjvm.so+0x95e9d5]
> V  [libjvm.so+0xad448a]
> V  [libjvm.so+0xad48f1]
> V  [libjvm.so+0x8beb82]
> 
> VM_Operation (0x00007f5fd69e6120): GenCollectForAllocation, mode: safepoint, requested by thread 0x00007f6079013800
> 
> ...
> -----------------------------------------------------------------------------
> 
> 
> 
> Next, I used GDB to check the backtrace of the SEGV thread from the coredump.
> The following is the backtrace.
> The SEGV occurred while a ClassLoader was being purged and its Metaspace destroyed.
> Frame #7 shows that the signal (SEGV) handler was called while SpaceManager::~SpaceManager() was executing.
> 
> -----------------------------------------------------------------------------
> (gdb) bt
> #0  0x00007f608146f1f7 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
> #1  0x00007f60814708e8 in __GI_abort () at abort.c:90
> #2  0x00007f6080d0bc39 in os::abort (dump_core=<optimized out>) at /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/os/linux/vm/os_linux.cpp:1519
> #3  0x00007f6080f1b816 in VMError::report_and_die (this=this@entry=0x7f607c3ebd10) at /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/share/vm/utilities/vmError.cpp:1060
> #4  0x00007f6080d15927 in JVM_handle_linux_signal (sig=11, info=0x7f607c3ebfb0, ucVoid=0x7f607c3ebe80, abort_if_unrecognized=<optimized out>)
>     at /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/os_cpu/linux_x86/vm/os_linux_x86.cpp:541
> #5  0x00007f6080d09038 in signalHandler (sig=11, info=0x7f607c3ebfb0, uc=0x7f607c3ebe80) at /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/os/linux/vm/os_linux.cpp:4446
> #6  <signal handler called>
> #7  SpaceManager::~SpaceManager (this=0x7f5ff800ad30, __in_chrg=<optimized out>) at /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/share/vm/memory/metaspace.cpp:2028
> #8  0x00007f6080c995fa in Metaspace::~Metaspace (this=0x7f5ff800ad00, __in_chrg=<optimized out>)
>     at /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/share/vm/memory/metaspace.cpp:2971
> #9  0x00007f60808bff5e in ClassLoaderData::~ClassLoaderData (this=0x7f5ff800ac20, __in_chrg=<optimized out>)
>     at /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/share/vm/classfile/classLoaderData.cpp:383
> #10 0x00007f60808c0f0f in ClassLoaderDataGraph::purge () at /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/share/vm/classfile/classLoaderData.cpp:818
> #11 0x00007f6080daa0b7 in ClassLoaderDataGraph::purge_if_needed () at /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/share/vm/classfile/classLoaderData.hpp:104
> #12 SafepointSynchronize::do_cleanup_tasks () at /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/share/vm/runtime/safepoint.cpp:551
> #13 0x00007f6080daa9d5 in SafepointSynchronize::begin () at /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/share/vm/runtime/safepoint.cpp:402
> #14 0x00007f6080f2048a in VMThread::loop (this=this@entry=0x7f6078c00000) at /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/share/vm/runtime/vmThread.cpp:501
> #15 0x00007f6080f208f1 in VMThread::run (this=0x7f6078c00000) at /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/share/vm/runtime/vmThread.cpp:276
> #16 0x00007f6080d0ab82 in java_start (thread=0x7f6078c00000) at /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/os/linux/vm/os_linux.cpp:796
> #17 0x00007f6081e2de25 in start_thread (arg=0x7f607c3ed700) at pthread_create.c:308
> #18 0x00007f608153234d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
> -----------------------------------------------------------------------------
> 
> 
> In frame #7, line 2028 (chunk = chunk->next()) is the crash point.
> The variable "chunk" is defined at line 2025 (Metachunk* chunk = chunks_in_use(i);).
> "chunks_in_use(i)" is defined at line 648 (Metachunk* chunks_in_use(ChunkIndex index) const { return _chunks_in_use[index]; }).
> So I checked the values of "_chunks_in_use" and found that "_chunks_in_use[2]" holds the illegal address "0x10".
> Therefore, I think the SEGV occurred because "chunk = chunk->next()" dereferenced the illegal address "0x10".
> 
> -----------------------------------------------------------------------------
> (gdb) f 7
> #7  SpaceManager::~SpaceManager (this=0x7f5ff800ad30, __in_chrg=<optimized out>) at /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/share/vm/memory/metaspace.cpp:2028
> 2028        chunk = chunk->next();
> (gdb) list
> 2023    size_t SpaceManager::sum_count_in_chunks_in_use(ChunkIndex i) {
> 2024      size_t count = 0;
> 2025      Metachunk* chunk = chunks_in_use(i);
> 2026      while (chunk != NULL) {
> 2027        count++;
> 2028        chunk = chunk->next();
> 2029      }
> 2030      return count;
> 2031    }
> 2032
> (gdb) list SpaceManager::chunks_in_use
> 647      // Accessors
> 648      Metachunk* chunks_in_use(ChunkIndex index) const { return _chunks_in_use[index]; }
> ...
> (gdb) p _chunks_in_use
> $11 = {0x7f5fcd41c400, 0x7f5fcd41a000, 0x10, 0x0}
> -----------------------------------------------------------------------------
> 
> 
> 
> The following is the disassembly of "SpaceManager::~SpaceManager()".
> %rax holds 0x10 at "0x00007f6080c97f88 <+200>", but I don't understand how this "0x10" ended up in %rax.
> 
> -----------------------------------------------------------------------------
> (gdb) disas
> Dump of assembler code for function SpaceManager::~SpaceManager():
>    0x00007f6080c97ec0 <+0>:    push   %rbp
>    0x00007f6080c97ec1 <+1>:    mov    %rsp,%rbp
>    0x00007f6080c97ec4 <+4>:    push   %r15
>    0x00007f6080c97ec6 <+6>:    push   %r14
>    0x00007f6080c97ec8 <+8>:    push   %r13
>    0x00007f6080c97eca <+10>:    push   %r12
>    0x00007f6080c97ecc <+12>:    push   %rbx
>    0x00007f6080c97ecd <+13>:    mov    %rdi,%rbx
>    0x00007f6080c97ed0 <+16>:    sub    $0x8,%rsp
>    0x00007f6080c97ed4 <+20>:    mov    0x780785(%rip),%r12        # 0x7f6081418660 <_ZN12SpaceManager12_expand_lockE>
>    0x00007f6080c97edb <+27>:    test   %r12,%r12
>    0x00007f6080c97ede <+30>:    je     0x7f6080c97ee8 <SpaceManager::~SpaceManager()+40>
>    0x00007f6080c97ee0 <+32>:    mov    %r12,%rdi
>    0x00007f6080c97ee3 <+35>:    callq  0x7f6080cce2f0 <Monitor::lock_without_safepoint_check()>
>    0x00007f6080c97ee8 <+40>:    movslq 0x8(%rbx),%rcx
>    0x00007f6080c97eec <+44>:    lea    0x78075d(%rip),%rdx        # 0x7f6081418650 <_ZN12MetaspaceAux15_capacity_wordsE>
>    0x00007f6080c97ef3 <+51>:    lea    0x781576(%rip),%r13        # 0x7f6081419470 <_ZN2os16_processor_countE>
>    0x00007f6080c97efa <+58>:    lea    0x78073f(%rip),%r15        # 0x7f6081418640 <_ZN12MetaspaceAux11_used_wordsE>
>    0x00007f6080c97f01 <+65>:    mov    (%rdx,%rcx,8),%rax
>    0x00007f6080c97f05 <+69>:    sub    0x40(%rbx),%rax
>    0x00007f6080c97f09 <+73>:    mov    %rax,(%rdx,%rcx,8)
>    0x00007f6080c97f0d <+77>:    mov    0x38(%rbx),%rax
>    0x00007f6080c97f11 <+81>:    movslq 0x8(%rbx),%rdx
>    0x00007f6080c97f15 <+85>:    neg    %rax
>    0x00007f6080c97f18 <+88>:    cmpl   $0x1,0x0(%r13)
>    0x00007f6080c97f1d <+93>:    lea    (%r15,%rdx,8),%rcx
>    0x00007f6080c97f21 <+97>:    mov    $0x1,%edx
>    0x00007f6080c97f26 <+102>:    jne    0x7f6080c97f32 <SpaceManager::~SpaceManager()+114>
>    0x00007f6080c97f28 <+104>:    lea    0x74acb4(%rip),%rdx        # 0x7f60813e2be3 <AssumeMP>
>    0x00007f6080c97f2f <+111>:    movzbl (%rdx),%edx
>    0x00007f6080c97f32 <+114>:    cmp    $0x0,%dl
>    0x00007f6080c97f35 <+117>:    je     0x7f6080c97f38 <SpaceManager::~SpaceManager()+120>
>    0x00007f6080c97f37 <+119>:    lock xadd %rax,(%rcx)
>    0x00007f6080c97f3c <+124>:    mov    0x48(%rbx),%r14
>    0x00007f6080c97f40 <+128>:    callq  0x7f6080c951a0 <Metachunk::overhead()>
>    0x00007f6080c97f45 <+133>:    movslq 0x8(%rbx),%rdx
>    0x00007f6080c97f49 <+137>:    imul   %r14,%rax
>    0x00007f6080c97f4d <+141>:    lea    (%r15,%rdx,8),%rcx
>    0x00007f6080c97f51 <+145>:    mov    $0x1,%edx
>    0x00007f6080c97f56 <+150>:    neg    %rax
>    0x00007f6080c97f59 <+153>:    cmpl   $0x1,0x0(%r13)
>    0x00007f6080c97f5e <+158>:    jne    0x7f6080c97f6a <SpaceManager::~SpaceManager()+170>
>    0x00007f6080c97f60 <+160>:    lea    0x74ac7c(%rip),%rdx        # 0x7f60813e2be3 <AssumeMP>
>    0x00007f6080c97f67 <+167>:    movzbl (%rdx),%edx
>    0x00007f6080c97f6a <+170>:    cmp    $0x0,%dl
>    0x00007f6080c97f6d <+173>:    je     0x7f6080c97f70 <SpaceManager::~SpaceManager()+176>
>    0x00007f6080c97f6f <+175>:    lock xadd %rax,(%rcx)
>    0x00007f6080c97f74 <+180>:    xor    %ecx,%ecx
>    0x00007f6080c97f76 <+182>:    xor    %esi,%esi
>    0x00007f6080c97f78 <+184>:    mov    0x10(%rbx,%rcx,1),%rax
>    0x00007f6080c97f7d <+189>:    xor    %edx,%edx
>    0x00007f6080c97f7f <+191>:    test   %rax,%rax
>    0x00007f6080c97f82 <+194>:    je     0x7f6080c97f95 <SpaceManager::~SpaceManager()+213>
>    0x00007f6080c97f84 <+196>:    nopl   0x0(%rax)
> => 0x00007f6080c97f88 <+200>:    mov    0x8(%rax),%rax
>    0x00007f6080c97f8c <+204>:    add    $0x1,%rdx
>    0x00007f6080c97f90 <+208>:    test   %rax,%rax
> ...
> (gdb) info registers
> rax            0x10    16
> rbx            0x7f5ff800ad30    140050159414576
> rcx            0x10    16
> rdx            0x0    0
> rsi            0x2    2
> rdi            0x1cfe570    30401904
> rbp            0x7f607c3ecb80    0x7f607c3ecb80
> rsp            0x7f607c3ecb50    0x7f607c3ecb50
> r8             0x7f5ff80ae320    140050160083744
> r9             0x7f5ff8052480    140050159707264
> r10            0x0    0
> r11            0x400    1024
> r12            0x1cfe570    30401904
> r13            0x7f6081419470    140052462146672
> r14            0x2    2
> r15            0x7f6081418640    140052462143040
> rip            0x7f6080c97f88    0x7f6080c97f88 <SpaceManager::~SpaceManager()+200>
> eflags         0x206    [ PF IF ]
> cs             0x33    51
> ss             0x2b    43
> ds             0x0    0
> es             0x0    0
> fs             0x0    0
> gs             0x0    0
> k0             <unavailable>
> k1             <unavailable>
> k2             <unavailable>
> k3             <unavailable>
> k4             <unavailable>
> k5             <unavailable>
> k6             <unavailable>
> k7             <unavailable>
> -----------------------------------------------------------------------------
> 
> =============================================================================
> 
> 
> 
> Does anyone know about this case?
> 
> Thanks, Osamu
> 
> 


From stefan.karlsson at oracle.com  Mon Oct 21 14:06:40 2019
From: stefan.karlsson at oracle.com (Stefan Karlsson)
Date: Mon, 21 Oct 2019 16:06:40 +0200
Subject: RFR: 8232649: ZGC: Add callbacks to ZMemoryManager
Message-ID: <8793cda6-bec6-dac7-5164-8fc34454286e@oracle.com>

Hi all,

Please review this patch to add callbacks to ZMemoryManager.

https://cr.openjdk.java.net/~stefank/8232649/webrev.01/
https://bugs.openjdk.java.net/browse/JDK-8232649

This allows users of ZMemoryManager to get callbacks when memory regions 
are inserted, removed, split, and coalesced. This is needed to support 
Windows' stricter requirements for placeholder reserved memory.

Thanks,
StefanK


From thomas.schatzl at oracle.com  Mon Oct 21 14:09:16 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Mon, 21 Oct 2019 16:09:16 +0200
Subject: RFR(L): 8220312: Implementation: NUMA-Aware Memory Allocation for
 G1, Logging (3/3)
In-Reply-To: <ba8c3fa4-9ee1-6a98-d13f-ffaacc59025c@oracle.com>
References: <e7c52f60-a5c7-072a-4e3b-65c608907679@oracle.com>
 <e903223b-90a5-9d01-5421-a47011bd5985@oracle.com>
 <ba8c3fa4-9ee1-6a98-d13f-ffaacc59025c@oracle.com>
Message-ID: <b5f39fc2-3319-a81c-25b4-f979282aef9f@oracle.com>

Hi,

   some initial comments looking at the log output:

On 13.10.19 08:16, sangheon.kim at oracle.com wrote:
> Hi all,
> 
> Previous patch conflicts because of JDK-8220310, I'm posting rebased one 
> with some refactoring.
> 
> Webrev:
> http://cr.openjdk.java.net/~sangheki/8220312/webrev.2
> Testing: hs-tier 1 ~ 5, with/without UseNUMA
> 
> Here's the full patch of 8220310, 8220311 and 8220312.
> http://cr.openjdk.java.net/~sangheki/8220312/webrev.full.2/
> 

   - I have not tested the performance impact of the additional logging 
yet, but I do not expect issues.

   - that's something from the first NUMA patch:

There is this gc+heap+numa=debug log message "Request memory [address, 
address] to be numa id (X)." for every region.

First, it seems to be on the wrong level: consider a heap with tens of 
thousands of regions. This imo clogs the log too much, and I would 
prefer to move this information to trace level.

Second, the full stop at the end is not necessary :)

   - the G1HRPrinter should be made NUMA aware, i.e. print expected NUMA 
id for this region

   - the casing of NUMA changes depending on the message, i.e. sometimes 
"NUMA" and other times "numa" in the log messages themselves. I would 
recommend using "NUMA" uniformly.

However I think that all the "NUMA id" in these messages should read 
"node id" as at that level we do not manage the OS level NUMA ids any more.

   - the "numa id" values in the various messages are formatted 
differently in the different messages with no apparent guideline: 
sometimes the code adds the leading zeros, sometimes not. Also the 
separator between node id and value is sometimes ":" and once "="

E.g.

"NUMA id verification: preferred id (matched #): 00 (32), 01 (32), ..."
"Region Allocated / Requested: 99% xxxx/yyyy (numa id 0: 99% ..."

I am kind of undecided what is best, but probably simply leaving out the 
leading zeros is best for the large majority of cases.

   - just a suggestion: "Region Allocated / Requested" -> "Placement 
Match Ratio" or so. Maybe somebody else has a better name.

Also in that message I would not print "numa id" at all to make the 
message shorter.

   - "Worker threads local object process rate" -> "Worker task locality 
match rate" seems shorter.

Again, to make the message shorter I would prefer that "numa id" were 
not printed at all in the details.

Not sure if that rate at this point is extremely interesting since G1 
won't even try to improve it at this time, but you can leave it in if 
you want.

   - I would *probably* like to have most of these messages split into 
"recent" and "total" statistics. Maybe others think that the totals are 
okay.

   - Again, to save space I would prefer to have the per-node details in 
the region summaries in the same line as the original output. I.e. 
instead of

Eden regions: 28->0 (29)
   From numa id 0: 18->0
   From numa id 1: 10->0

the following would be much shorter:

Eden regions: 28->0 (29) (0: 18->0, 1: 10->0)

As with higher node counts you will get lots of lines with little 
content imho. Maybe others think differently?

Also, this would "fix" the problem that when you enable gc+heap+numa 
but not gc+heap, you will see these "From numa id" numbers in the log 
without their required context. Alternatively, gc+heap+numa could 
automatically enable gc+heap at the same level.
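
A minimal sketch of the one-line format suggested above, written with standard C++ streams rather than HotSpot's logging framework. The function name, the RegionTransition struct and the meaning of the first parenthesized number (called "limit" here) are assumptions for the example.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical pair of region counts: before and after the collection.
struct RegionTransition {
  unsigned before;
  unsigned after;
};

// Builds "<name> regions: B->A (limit) (0: b0->a0, 1: b1->a1, ...)".
// The per-node part is appended to the same line, indexed by node id.
std::string format_region_summary(const std::string& name,
                                  RegionTransition total,
                                  unsigned limit,
                                  const std::vector<RegionTransition>& per_node) {
  std::ostringstream out;
  out << name << " regions: " << total.before << "->" << total.after
      << " (" << limit << ")";
  if (!per_node.empty()) {
    out << " (";
    for (size_t node = 0; node < per_node.size(); node++) {
      if (node > 0) {
        out << ", ";
      }
      out << node << ": " << per_node[node].before << "->" << per_node[node].after;
    }
    out << ")";
  }
  return out.str();
}
```

Because the per-node details ride on the same line as the totals, they can never appear in the log without their context.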

Comments after some superficial look at the changes themselves:

   - G1Regions should be renamed as G1RegionCounts and get a single line 
comment like: "Contains per Node id region count".

   - G1NodeTimes::Stat: it would probably be useful to have a "rate()" 
getter that recalculates the value as needed instead of the member.
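
Roughly what I mean, as a sketch only; the struct name and fields below are made up for the example and are not the ones in the patch:

```cpp
#include <cstddef>

// Hypothetical per-node statistic: only the raw counters are stored.
struct NodeStatSketch {
  size_t _hit = 0;
  size_t _requested = 0;

  void update(bool hit) {
    _requested++;
    if (hit) {
      _hit++;
    }
  }

  // Computed on demand, so it can never go stale relative to the counters.
  // Returns a percentage; 0 when nothing has been requested yet.
  double rate() const {
    return _requested == 0 ? 0.0 : 100.0 * _hit / _requested;
  }
};
```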

   - G1HeapTransition::Data::~Data: the "if (something != NULL)" checks 
are unnecessary. FREE_C_HEAP_ARRAY does that already.

Same in G1ParScanThreadState::G1ParScanThreadState.

   - I do not understand the name "G1NodeTimes" :) What "time" is that 
referring to?

   - G1NUMA::clear_statistics() seems to be unused.

   - G1NodeTimes::print_mutator_alloc_stat_info() and 
G1NodeTimes::copy_to_survivor_stat_info() look very similar. Could the 
code be refactored a bit?

Thanks,
   Thomas


From stefan.karlsson at oracle.com  Mon Oct 21 14:37:34 2019
From: stefan.karlsson at oracle.com (Stefan Karlsson)
Date: Mon, 21 Oct 2019 16:37:34 +0200
Subject: RFR: 8232650: ZGC: Add initialization hooks for OS specific code
Message-ID: <5cdd2722-26a4-8e6c-1262-5d97dfd7f46c@oracle.com>

Hi all,

Please review this patch to add initialization hooks for OS specific code.

https://cr.openjdk.java.net/~stefank/8232650/webrev.01/
https://bugs.openjdk.java.net/browse/JDK-8232650

These hooks are needed for a Windows port. ZInitialize allows 
syscalls to be dynamically resolved. ZVirtualMemory allows callbacks 
from 8232649 to be initialized.

Thanks,
StefanK


From shade at redhat.com  Mon Oct 21 16:55:33 2019
From: shade at redhat.com (Aleksey Shipilev)
Date: Mon, 21 Oct 2019 18:55:33 +0200
Subject: RFR (XS) 8232729: Shenandoah: assert ShenandoahHeap::cas_oop
 addresses are aligned
Message-ID: <c5b62eb1-ea9f-9f4c-3c8d-49025bf4658b@redhat.com>

RFE:
  https://bugs.openjdk.java.net/browse/JDK-8232729

Fix:
  https://cr.openjdk.java.net/~shade/8232729/webrev.01/

Current ShenandoahHeap::cas_oop routines perform CASes on given address, hoping the hardware would
handle it properly. In most cases, this is guaranteed by callers who pass aligned addresses to it:
those are aligned narrowOop*/oop* fields or the roots that we can update concurrently.

However, we should assert the alignment directly to catch bugs. This would fail the asserts with
proper message rather than obscure SIGBUS on some platforms like AArch64. These new asserts are
known to legitimately fail with Traversal (JDK-8232730) on x86_64 and with jcstress on AArch64
(JDK-8232712), so I am going to push this after the fixes land to ensure clean test results.
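
For illustration, the idea of asserting alignment before a CAS can be sketched with std::atomic as below. This is not the webrev code: the helper names are hypothetical, and HotSpot uses its own Atomic and assert facilities.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

// True if 'p' is a multiple of 'alignment' (which must be a power of two).
inline bool is_aligned_to(const void* p, size_t alignment) {
  return (reinterpret_cast<uintptr_t>(p) & (alignment - 1)) == 0;
}

// Hypothetical checked CAS: fail with a readable assert in debug builds
// instead of relying on the hardware to cope with a misaligned address.
template <typename T>
bool checked_cas(std::atomic<T>* addr, T expected, T desired) {
  assert(is_aligned_to(addr, alignof(std::atomic<T>)) && "CAS address must be aligned");
  return addr->compare_exchange_strong(expected, desired);
}
```

The check costs nothing in release builds where asserts are compiled out, which matches the intent of catching caller bugs during testing.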

Testing: {x86_64, x86_32} hotspot_gc_shenandoah; x86_64 tier1 with Shenandoah (running)

-- 
Thanks,
-Aleksey


From shade at redhat.com  Mon Oct 21 16:55:47 2019
From: shade at redhat.com (Aleksey Shipilev)
Date: Mon, 21 Oct 2019 18:55:47 +0200
Subject: RFR (S) 8232730: Shenandoah: Traversal should not CAS the roots
Message-ID: <09c30b70-fd00-d762-1760-5ff32fd01301@redhat.com>

Bug:
  https://bugs.openjdk.java.net/browse/JDK-8232730

Fix:
  https://cr.openjdk.java.net/~shade/8232730/webrev.01/

This is captured by asserts from JDK-8232729 with hotspot_gc_shenandoah on x86_64. See more details
in the bug. The underlying reason for this failure is trying to CAS the roots that are not aligned
to the pointer size, notably code roots. Normal concurrent cycle avoids this by updating the roots
with plain stores, Traversal should do the same.

Testing: {x86_64, x86_32} hotspot_gc_shenandoah; tier1 with Shenandoah (running)

-- 
Thanks,
-Aleksey


From zgu at redhat.com  Mon Oct 21 17:02:20 2019
From: zgu at redhat.com (Zhengyu Gu)
Date: Mon, 21 Oct 2019 13:02:20 -0400
Subject: RFR 8232712: Shenandoah: SIGBUS in load_reference_barrier_native
Message-ID: <59396f6f-6ff8-3ac6-ab51-240f56298ab6@redhat.com>

I missed aarch64 changes for JDK-8232010[1].

On aarch64, native barrier does not setup the second parameter 
(load_addr) for runtime call, therefore, the address to CAS is bogus.

Bug: https://bugs.openjdk.java.net/browse/JDK-8232712
Webrev: http://cr.openjdk.java.net/~zgu/JDK-8232712/webrev.00/

Test:
   hotspot_gc_shenandoah (fastdebug and release) on aarch64 Linux.

[1] https://bugs.openjdk.java.net/browse/JDK-8232010

Thanks,

-Zhengyu


From rkennke at redhat.com  Mon Oct 21 17:18:41 2019
From: rkennke at redhat.com (Roman Kennke)
Date: Mon, 21 Oct 2019 19:18:41 +0200
Subject: RFR (XS) 8232729: Shenandoah: assert ShenandoahHeap::cas_oop
 addresses are aligned
In-Reply-To: <c5b62eb1-ea9f-9f4c-3c8d-49025bf4658b@redhat.com>
References: <c5b62eb1-ea9f-9f4c-3c8d-49025bf4658b@redhat.com>
Message-ID: <5cb6c167-e4e5-07c6-c892-1df1d8700505@redhat.com>

Yup. More asserts are always good. :-)

Roman

> RFE:
>   https://bugs.openjdk.java.net/browse/JDK-8232729
> 
> Fix:
>   https://cr.openjdk.java.net/~shade/8232729/webrev.01/
> 
> Current ShenandoahHeap::cas_oop routines perform CASes on given address, hoping the hardware would
> handle it properly. In most cases, this is guaranteed by callers who pass aligned addresses to it:
> those are aligned narrowOop*/oop* fields or the roots that we can update concurrently.
> 
> However, we should assert the alignment directly to catch bugs. This would fail the asserts with
> proper message rather than obscure SIGBUS on some platforms like AArch64. These new asserts are
> known to legitimately fail with Traversal (JDK-8232730) on x86_64 and with jcstress on AArch64
> (JDK-8232712), so I am going to push this after the fixes land to ensure clean test results.
> 
> Testing: {x86_64, x86_32} hotspot_gc_shenandoah; x86_64 tier1 with Shenandoah (running)
> 


From shade at redhat.com  Mon Oct 21 17:35:01 2019
From: shade at redhat.com (Aleksey Shipilev)
Date: Mon, 21 Oct 2019 19:35:01 +0200
Subject: RFR 8232712: Shenandoah: SIGBUS in load_reference_barrier_native
In-Reply-To: <59396f6f-6ff8-3ac6-ab51-240f56298ab6@redhat.com>
References: <59396f6f-6ff8-3ac6-ab51-240f56298ab6@redhat.com>
Message-ID: <3d3061c5-7ab0-2905-2fe3-dc16ac3dd911@redhat.com>

On 10/21/19 7:02 PM, Zhengyu Gu wrote:
> I missed aarch64 changes for JDK-8232010[1].
> 
> On aarch64, native barrier does not setup the second parameter (load_addr) for runtime call,
> therefore, the address to CAS is bogus.
> 
> Bug: https://bugs.openjdk.java.net/browse/JDK-8232712
> Webrev: http://cr.openjdk.java.net/~zgu/JDK-8232712/webrev.00/

Roman needs to ack this. This patch allows me to pass the subset of jcstress tests that were
previously failing on aarch64.

-- 
Thanks,
-Aleksey


From zgu at redhat.com  Mon Oct 21 17:42:18 2019
From: zgu at redhat.com (Zhengyu Gu)
Date: Mon, 21 Oct 2019 13:42:18 -0400
Subject: RFR (S) 8232730: Shenandoah: Traversal should not CAS the roots
In-Reply-To: <09c30b70-fd00-d762-1760-5ff32fd01301@redhat.com>
References: <09c30b70-fd00-d762-1760-5ff32fd01301@redhat.com>
Message-ID: <3dc51ca9-d26b-6fe7-dcbd-d48169c55993@redhat.com>

Good to me.

Thanks,

-Zhengyu

On 10/21/19 12:55 PM, Aleksey Shipilev wrote:
> Bug:
>    https://bugs.openjdk.java.net/browse/JDK-8232730
> 
> Fix:
>    https://cr.openjdk.java.net/~shade/8232730/webrev.01/
> 
> This is captured by asserts from JDK-8232729 with hotspot_gc_shenandoah on x86_64. See more details
> in the bug. The underlying reason for this failure is trying to CAS the roots that are not aligned
> to the pointer size, notably code roots. Normal concurrent cycle avoids this by updating the roots
> with plain stores, Traversal should do the same.
> 
> Testing: {x86_64, x86_32} hotspot_gc_shenandoah; tier1 with Shenandoah (running)
> 


From rkennke at redhat.com  Mon Oct 21 18:24:17 2019
From: rkennke at redhat.com (Roman Kennke)
Date: Mon, 21 Oct 2019 20:24:17 +0200
Subject: RFR (S) 8232730: Shenandoah: Traversal should not CAS the roots
In-Reply-To: <09c30b70-fd00-d762-1760-5ff32fd01301@redhat.com>
References: <09c30b70-fd00-d762-1760-5ff32fd01301@redhat.com>
Message-ID: <a04e8442-e861-e37e-b920-0f356b4ae801@redhat.com>

Ok!

Thanks,
Roman

> Bug:
>   https://bugs.openjdk.java.net/browse/JDK-8232730
> 
> Fix:
>   https://cr.openjdk.java.net/~shade/8232730/webrev.01/
> 
> This is captured by asserts from JDK-8232729 with hotspot_gc_shenandoah on x86_64. See more details
> in the bug. The underlying reason for this failure is trying to CAS the roots that are not aligned
> to the pointer size, notably code roots. Normal concurrent cycle avoids this by updating the roots
> with plain stores, Traversal should do the same.
> 
> Testing: {x86_64, x86_32} hotspot_gc_shenandoah; tier1 with Shenandoah (running)
> 


From rkennke at redhat.com  Mon Oct 21 18:24:51 2019
From: rkennke at redhat.com (Roman Kennke)
Date: Mon, 21 Oct 2019 20:24:51 +0200
Subject: RFR 8232712: Shenandoah: SIGBUS in load_reference_barrier_native
In-Reply-To: <59396f6f-6ff8-3ac6-ab51-240f56298ab6@redhat.com>
References: <59396f6f-6ff8-3ac6-ab51-240f56298ab6@redhat.com>
Message-ID: <713f6623-17a0-1c6f-90b2-ea398a532bbf@redhat.com>

Ok!

Thanks,
Roman


> I missed aarch64 changes for JDK-8232010[1].
> 
> On aarch64, native barrier does not set up the second parameter
> (load_addr) for runtime call, therefore, the address to CAS is bogus.
> 
> Bug: https://bugs.openjdk.java.net/browse/JDK-8232712
> Webrev: http://cr.openjdk.java.net/~zgu/JDK-8232712/webrev.00/
> 
> Test:
>   hotspot_gc_shenandoah (fastdebug and release) on aarch64 Linux.
> 
> [1] https://bugs.openjdk.java.net/browse/JDK-8232010
> 
> Thanks,
> 
> -Zhengyu
> 


From kim.barrett at oracle.com  Mon Oct 21 23:24:33 2019
From: kim.barrett at oracle.com (Kim Barrett)
Date: Mon, 21 Oct 2019 19:24:33 -0400
Subject: RFR(M): 8220311: Implementation: NUMA-Aware Memory Allocation for
 G1, Survivor (2/3)
In-Reply-To: <9a78e353-7908-b546-8f6a-7acd92eb40ac@oracle.com>
References: <d153b49b-fbb2-0d73-37e0-ff1534a83086@oracle.com>
 <de0f8e9a-27d8-eaf3-99a7-7b57e1b419c1@oracle.com>
 <9a78e353-7908-b546-8f6a-7acd92eb40ac@oracle.com>
Message-ID: <D4FAC1BE-C707-48DA-979F-03502E2651F5@oracle.com>

> On Oct 13, 2019, at 2:00 AM, sangheon.kim at oracle.com wrote:
> 
> Hi all,
> 
> Previous patch conflicts, so I'm posting rebased one.
> 
> Webrev:
> http://cr.openjdk.java.net/~sangheki/8220311/webrev.2
> Testing: hs-tier 1 ~ 5, with/without UseNUMA
> 
> Thanks,
> Sangheon

------------------------------------------------------------------------------
src/hotspot/share/gc/g1/g1ParScanThreadState.hpp
Removed:
 190   // ... State is the original (source) cset state for the object
 191   // that is allocated for. ...

That simple removal doesn't seem right. Now "state" in the next
sentence has no explanation.  Maybe some better rewrite?

------------------------------------------------------------------------------

Looks good, other than that one comment issue.


From kim.barrett at oracle.com  Tue Oct 22 01:20:33 2019
From: kim.barrett at oracle.com (Kim Barrett)
Date: Mon, 21 Oct 2019 21:20:33 -0400
Subject: RFR (L): 8230706: Waiting on completion of strong nmethod
 processing causes long pause times with G1
In-Reply-To: <d0b624a1fc2c7310986b79da1f65f3a8a851d20a.camel@oracle.com>
References: <f19bc69a-9bde-c063-2674-9564721ceede@oracle.com>
 <0F637570-EC97-47C5-B493-B33681133149@oracle.com>
 <5c6b06b1-de44-3cb7-7fc8-0b641df5f353@oracle.com>
 <DE7A950D-A877-4093-AFE8-363E6E079A28@oracle.com>
 <80DA3FD5-C2FA-44BF-83C5-AE0EA6AA3684@oracle.com>
 <d0b624a1fc2c7310986b79da1f65f3a8a851d20a.camel@oracle.com>
Message-ID: <0D820E95-361A-4CAC-9BC3-99C39512D396@oracle.com>

> On Oct 19, 2019, at 9:06 AM, Thomas Schatzl <thomas.schatzl at oracle.com> wrote:
> 
> Hi all,
> 
>  there is a new webrev at 
> 
> http://cr.openjdk.java.net/~tschatzl/8230706/webrev.2/ (full only,
> there is no point in providing a diff)
> 
> since I like this solution a lot as it removes a lot of additional
> post-processing.
> 
> Testing has been a bit of a headache: interference between strong and
> weak processing is extremely rare, so I had to make it pretty common by
> 
> 1) only a single thread doing strong processing
> 2) the weak processing stage has to be moved right after the root
> processing so they overlap with a lot higher probability
> 
> hs-tier 1-5 passes with and without these changes, with a noticeable
> amount of overlap according to additional log messages. That change can
> be looked at at 
> http://cr.openjdk.java.net/~tschatzl/8230706/webrev.2.testing/ .
> Obviously I am not going to push this.
> 
> Surprisingly there had to be no changes to Shenandoah as it does not
> use the claim mechanism changed here, implementing something else.
> Shenandoah also passed vmTestbase/gc with these changes with no
> problem.
> 
> Below this email is a copy of Kim's suggestion about the state machine
> again for reference. I also added documentation about why and how the
> code is supposed to work.

I'm glad the new state machine worked out, and allowed the extra task
to be eliminated. Thanks for going the extra mile with the testing.
And thanks for turning my pseudo-code into something more readable. My
comments here are mostly suggestions for more of that; I don't think I'd
want to have to decipher this in 6 months without some helpful
commentary. :)

------------------------------------------------------------------------------
src/hotspot/share/code/nmethod.cpp
Removed:
1829 #define NMETHOD_SENTINEL ((nmethod*)badAddress)

Yay!

------------------------------------------------------------------------------
src/hotspot/share/code/nmethod.hpp
 118   // SR -> SD: the nmethod has been processed strongly from the beginning.

I think this is just the tail of
 114   // WR -> SR -> SD: during weak processing another thread found that the nmethod
and is not what is needed here. I think what you are really looking
for here is the unclaimed -> SD case. I think the state progressions
are

unclaimed -> WR -> WD
unclaimed -> WR -> SR -> SD
unclaimed -> WR -> WD -> SD
unclaimed -> SD

The first is terminal (at WD) if the nmethod doesn't need strong
processing.

------------------------------------------------------------------------------
src/hotspot/share/code/nmethod.hpp
  95   // We store state and claim information in the _oops_do_mark_link member, using
  96   // the two LSBs for the state and the rest for linking together nmethods that
  97   // were visited.

There's no description of the upper bits in this comment.  In
particular, the self-loop to indicate end of list isn't mentioned.
Also, the specific values for the upper bits in the transitions turned
out to be important, as discussed in the pseudo-code.

So if N is the nmethod and X is the "next" value (which is N at end of
list), then the state progressions might be described as

unclaimed -> WR(N) -> WD(X)
unclaimed -> WR(N) -> SR(N) -> SD(X)
unclaimed -> WR(N) -> WD(X) -> SD(X)
unclaimed -> SD(N) -> SD(X)

(The text descriptions of the progressions seem okay.)

It also might help to indicate which thread performs each step. If C
is the claiming thread, and O is some other thread, then something
like

unclaimed (C)-> WR(N) (C)-> WD(X)
unclaimed (C)-> WR(N) (O)-> SR(N) (C)-> SD(X)
unclaimed (C)-> WR(N) (C)-> WD(X) (O)-> SD(X)
unclaimed (C)-> SD(N) (C)-> SD(X)

(Admittedly, that's pretty dense notation.)

I think the comments describing the various transition functions might
be better if they explicitly state which of the above transitions they
(attempt to) perform, e.g.

  // Attempt unclaimed -> WR(N) transition, returning true if successful.
  bool oops_do_try_claim_weak_request();

I found the existing text descriptions hard to map onto the specific
steps, even though they are (mostly?) one-to-one.  I was finding it
easier to ignore the descriptions and just use the names, though that
isn't trivial either.
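
To make the mapping from transitions to functions concrete, here is a small stand-alone sketch of the four progressions listed above. It is not the nmethod code: the Node type, the function names other than those quoted in this review, and the tag values (WR=0, WD=1, SR=2, SD=3) are assumptions for illustration. A link value of 0 stands for "unclaimed", and each function comment states the transition it attempts.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Assumed tag encoding in the two LSBs of the link word.
enum Tag : std::uintptr_t { WR = 0, WD = 1, SR = 2, SD = 3 };

struct Node {
  std::atomic<std::uintptr_t> link{0};  // 0 means unclaimed

  static std::uintptr_t make(const Node* n, Tag t) {
    return reinterpret_cast<std::uintptr_t>(n) | t;
  }
  static Tag tag_of(std::uintptr_t v) { return static_cast<Tag>(v & 3); }
  static Node* ptr_of(std::uintptr_t v) {
    return reinterpret_cast<Node*>(v & ~std::uintptr_t(3));
  }

  bool cas(std::uintptr_t from, std::uintptr_t to) {
    return link.compare_exchange_strong(from, to);
  }

  // Attempt unclaimed -> WR(N).
  bool try_claim_weak_request() { return cas(0, make(this, WR)); }
  // Attempt WR(N) -> WD(X); fails if another thread added a strong request.
  bool try_mark_weak_done(Node* next) {
    return cas(make(this, WR), make(next, WD));
  }
  // Attempt WR(N) -> SR(N).
  bool try_add_strong_request() {
    return cas(make(this, WR), make(this, SR));
  }
  // Attempt unclaimed -> SD(N).
  bool try_claim_strong_done() { return cas(0, make(this, SD)); }
  // Attempt WD(X) -> SD(X), keeping the upper bits unchanged.
  bool try_upgrade_weak_done() {
    std::uintptr_t cur = link.load();
    return tag_of(cur) == WD && cas(cur, make(ptr_of(cur), SD));
  }
  // SR(N) or SD(N) -> SD(X), performed by the claiming thread only.
  void mark_strong_done(Node* next) { link.store(make(next, SD)); }
};

// Two tag bits require the two low bits of every Node address to be zero.
static_assert(alignof(Node) >= 4, "need two free low bits");
```

Note that WR(N) is distinguishable from "unclaimed" only because N itself is non-null, which is one reason the specific upper-bit values in each state matter.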

------------------------------------------------------------------------------  
src/hotspot/share/code/nmethod.hpp
 160   oops_do_mark_link* oops_do_try_claim_weak_request_as_strong_request(oops_do_mark_link* next);

I think this function is misnamed; it doesn't really claim anything.
Instead it attempts to add a strong request (SR) to a weak request
(WR), and should be called oops_do_try_add_strong_request.

------------------------------------------------------------------------------  
src/hotspot/share/code/nmethod.cpp
1848 bool nmethod::oops_do_try_claim_weak_request() {
1849   assert(SafepointSynchronize::is_at_safepoint(), "only at safepoint");
1850 
1851   if (_oops_do_mark_link != NULL) {
1852     return false;
1853   }
1854   if (!Atomic::replace_if_null(mark_link(this, claim_weak_request_tag), &_oops_do_mark_link)) {
1855     return false;
1856   }
1857   oops_do_log_change("oops_do, mark weak request");
1858   return true;
1859 }

I found the various "!"s and early returns in the above made it hard
to read. I think simpler is the following. YMMV.

bool nmethod::oops_do_try_claim_weak_request() {
  assert(SafepointSynchronize::is_at_safepoint(), "only at safepoint");

  if ((_oops_do_mark_link == NULL) &&
      Atomic::replace_if_null(mark_link(this, claim_weak_request_tag), &_oops_do_mark_link)) {
    oops_do_log_change("oops_do, mark weak request");
    return true;
  }
  return false;
}

That's also more similar to the style of the other functions.

------------------------------------------------------------------------------
src/hotspot/share/code/nmethod.hpp
 130     assert(((uintptr_t)nm & 0x3) == 0, "nmethod pointer must have zero lower two LSB");
 
assert(is_aligned(nm, 2), ...);
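
When swapping a mask test for an alignment test, it is worth checking how the mask maps to the alignment argument. The sketch below uses a hypothetical helper, is_aligned_bytes, that takes the alignment in bytes (a power of two); whether HotSpot's is_aligned has the same convention is an assumption to verify against the actual header. Under that convention, "lower two bits are zero" corresponds to an alignment of 4.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: true if 'p' is a multiple of 'alignment' bytes,
// where 'alignment' must be a power of two.
inline bool is_aligned_bytes(const void* p, std::uintptr_t alignment) {
  return (reinterpret_cast<std::uintptr_t>(p) & (alignment - 1)) == 0;
}

// The open-coded form of the check: both low bits of the pointer are zero.
inline bool low_two_bits_clear(const void* p) {
  return (reinterpret_cast<std::uintptr_t>(p) & 0x3) == 0;
}
```

With these definitions, low_two_bits_clear(p) agrees with is_aligned_bytes(p, 4) for every pointer, while is_aligned_bytes(p, 2) only checks the lowest bit.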

------------------------------------------------------------------------------
src/hotspot/share/code/nmethod.cpp
1906   if (old_head == NULL) {
1907     old_head = this;
1908   }
...
1922   if (old_head == NULL) {
1923     old_head = this;
1924   }
...
2013   } while (cur != next);

In none of these places nor in the header comments is there any
mention of the use of self-loop to indicate the end of the list (nor
why that's being done).
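
For readers unfamiliar with the idiom, a self-loop terminated list can be sketched as below. This is a single-threaded illustration with hypothetical names (Item, MarkList), not the nmethod list itself. The assumed motivation is that a null link can then mean "not on the list", so the link field doubles as the claim flag; the last element points to itself instead of to null.

```cpp
#include <cassert>
#include <vector>

struct Item {
  int id;
  Item* next;  // nullptr: not on any list; self: last element of the list
  explicit Item(int i) : id(i), next(nullptr) {}
};

struct MarkList {
  Item* head = nullptr;

  // Push 'it' at the front. An item that becomes the tail links to itself,
  // so a non-null 'next' always means "already on the list".
  bool push(Item* it) {
    if (it->next != nullptr) return false;  // already claimed
    it->next = (head == nullptr) ? it : head;
    head = it;
    return true;
  }

  // Detach the whole list first, then walk it until the self-loop.
  std::vector<int> drain() {
    std::vector<int> seen;
    Item* next = head;
    head = nullptr;
    if (next == nullptr) return seen;
    Item* cur;
    do {
      cur = next;
      next = cur->next;
      cur->next = nullptr;  // release the item for the next cycle
      seen.push_back(cur->id);
    } while (cur != next);
    return seen;
  }
};
```

The drain loop ends with the same `cur != next` test as the code under review: the walk stops at the element whose link points back to itself.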

------------------------------------------------------------------------------
src/hotspot/share/code/nmethod.cpp
2014   _oops_do_mark_nmethods = NULL;

Maybe move this up to immediately following
1997   nmethod* next = _oops_do_mark_nmethods;

to make it more immediately obvious that we're taking and processing
the whole list.

------------------------------------------------------------------------------
src/hotspot/share/code/nmethod.cpp
1987 void nmethod::oops_do_marking_prologue() {
...
1991   _oops_do_mark_nmethods = NULL;

That assignment ought to be a nop, and could instead be an assert.

------------------------------------------------------------------------------


From sangheon.kim at oracle.com  Tue Oct 22 05:50:26 2019
From: sangheon.kim at oracle.com (sangheon.kim at oracle.com)
Date: Mon, 21 Oct 2019 22:50:26 -0700
Subject: RFR(M): 8220311: Implementation: NUMA-Aware Memory Allocation for
 G1, Survivor (2/3)
In-Reply-To: <4b3026b8-10bd-7439-84c6-d906bacd7774@oracle.com>
References: <d153b49b-fbb2-0d73-37e0-ff1534a83086@oracle.com>
 <de0f8e9a-27d8-eaf3-99a7-7b57e1b419c1@oracle.com>
 <9a78e353-7908-b546-8f6a-7acd92eb40ac@oracle.com>
 <4b3026b8-10bd-7439-84c6-d906bacd7774@oracle.com>
Message-ID: <98d6a618-c780-1ef1-35cd-8117f2af2a0b@oracle.com>

Hi Thomas,

On 10/21/19 4:18 AM, Thomas Schatzl wrote:
> Hi Sangheon,
>
> On 13.10.19 08:00, sangheon.kim at oracle.com wrote:
>> Hi all,
>>
>> Previous patch conflicts, so I'm posting rebased one.
>>
>> Webrev:
>> http://cr.openjdk.java.net/~sangheki/8220311/webrev.2
>> Testing: hs-tier 1 ~ 5, with/without UseNUMA
>>
>
> Looks good to me.
Thanks for your review!

Thanks,
Sangheon


>
> Thanks,
>   Thomas


From sangheon.kim at oracle.com  Tue Oct 22 05:52:05 2019
From: sangheon.kim at oracle.com (sangheon.kim at oracle.com)
Date: Mon, 21 Oct 2019 22:52:05 -0700
Subject: RFR(M): 8220311: Implementation: NUMA-Aware Memory Allocation for
 G1, Survivor (2/3)
In-Reply-To: <D4FAC1BE-C707-48DA-979F-03502E2651F5@oracle.com>
References: <d153b49b-fbb2-0d73-37e0-ff1534a83086@oracle.com>
 <de0f8e9a-27d8-eaf3-99a7-7b57e1b419c1@oracle.com>
 <9a78e353-7908-b546-8f6a-7acd92eb40ac@oracle.com>
 <D4FAC1BE-C707-48DA-979F-03502E2651F5@oracle.com>
Message-ID: <846eb849-8a49-5872-73d7-6bbc8f98369c@oracle.com>

Hi Kim,

Thanks for reviewing this part.

On 10/21/19 4:24 PM, Kim Barrett wrote:
>> On Oct 13, 2019, at 2:00 AM, sangheon.kim at oracle.com wrote:
>>
>> Hi all,
>>
>> Previous patch conflicts, so I'm posting rebased one.
>>
>> Webrev:
>> http://cr.openjdk.java.net/~sangheki/8220311/webrev.2
>> Testing: hs-tier 1 ~ 5, with/without UseNUMA
>>
>> Thanks,
>> Sangheon
> ------------------------------------------------------------------------------
> src/hotspot/share/gc/g1/g1ParScanThreadState.hpp
> Removed:
>   190   // ... State is the original (source) cset state for the object
>   191   // that is allocated for. ...
>
> That simple removal doesn't seem right. Now "state" in the next
> sentence has no explanation.  Maybe some better rewrite?
What do you think about below comment?

   // Tries to allocate word_sz in the PLAB of the next "generation" after trying to
   // allocate into dest. Previous_plab_refill_failed indicates whether previous
   // PLAB refill for the original (source) object was failed.
   // Returns a non-NULL pointer if successful, and updates dest if required.
   // Also determines whether we should continue to try to allocate into the various
   // generations or just end trying to allocate.
   HeapWord* allocate_in_next_plab(G1HeapRegionAttr* dest,
...

Let me post the webrev when we decide. :)

Thanks,
Sangheon


>
> ------------------------------------------------------------------------------
>
> Looks good, other than that one comment issue.
>


From per.liden at oracle.com  Tue Oct 22 06:14:27 2019
From: per.liden at oracle.com (Per Liden)
Date: Tue, 22 Oct 2019 08:14:27 +0200
Subject: RFR: 8232601: ZGC: Parameterize the ZGranuleMap table size
In-Reply-To: <f486753e-cd02-5c22-7390-041cf248f423@oracle.com>
References: <f486753e-cd02-5c22-7390-041cf248f423@oracle.com>
Message-ID: <de172be9-85a3-4016-d26e-07060a96f4e0@oracle.com>

Looks good.

/Per

On 10/21/19 3:00 PM, Stefan Karlsson wrote:
> Hi all,
> 
> Please review this patch to parameterize the ZGranuleMap table size.
> 
> https://cr.openjdk.java.net/~stefank/8232601/webrev.01/
> https://bugs.openjdk.java.net/browse/JDK-8232601
> 
> Previously, the maps were always bound by the range of a virtual address 
> space view (ZAddressOffsetMax). We want to be able to use ZGranuleMap to 
> map against physical memory offsets, so this RFE suggests that we allow 
> users of ZGranuleMap to specify the max offset.
> 
> Thanks,
> StefanK


From per.liden at oracle.com  Tue Oct 22 06:18:37 2019
From: per.liden at oracle.com (Per Liden)
Date: Tue, 22 Oct 2019 08:18:37 +0200
Subject: RFR: 8232602: ZGC: Make ZGranuleMap ZAddress agnostic
In-Reply-To: <4638080b-9f2e-6965-6ed9-a17b32ad3b94@oracle.com>
References: <4638080b-9f2e-6965-6ed9-a17b32ad3b94@oracle.com>
Message-ID: <c6bb324d-9ff8-7fdf-e99e-e4a2b1da0144@oracle.com>

Looks good.

/Per

On 10/21/19 3:09 PM, Stefan Karlsson wrote:
> Hi all,
> 
> Please review this patch to make ZGranuleMap ZAddress agnostic.
> 
> https://cr.openjdk.java.net/~stefank/8232602/webrev.01/
> https://bugs.openjdk.java.net/browse/JDK-8232602
> 
> Currently, the ZGranuleMap get and put functions take an address in the 
> heap as a parameter. The address is then converted into an offset (into 
> a heap view), before being scaled to a granule.
> 
> We want to be able to use the ZGranuleMap for physical memory offsets, 
> and not only heap addresses. Therefore, I propose that we move the 
> conversions from address to offset out from ZGranuleMap, and move it to 
> the current users of ZGranuleMap.
> 
> This patch applies on-top of the patch for JDK-8232601.
> 
> Thanks,
> StefanK
> 
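
Taken together, JDK-8232601 and JDK-8232602 describe a map that is sized by a caller-supplied maximum offset and indexed by offset rather than by heap address. A rough sketch of that shape follows. The class name GranuleMap, the helper address_to_offset, and the 2 MiB granule size are assumptions for illustration and are not taken from the webrevs.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

template <typename T>
class GranuleMap {
  static constexpr std::size_t kGranuleShift = 21;  // assumed 2 MiB granules
  std::vector<T> _map;

public:
  // The caller chooses the range; max_offset is assumed granule-aligned.
  explicit GranuleMap(std::size_t max_offset)
    : _map(max_offset >> kGranuleShift) {}

  std::size_t size() const { return _map.size(); }

  // get/put take an offset; the map knows nothing about heap addresses.
  T get(std::size_t offset) const { return _map.at(offset >> kGranuleShift); }
  void put(std::size_t offset, T value) {
    _map.at(offset >> kGranuleShift) = value;
  }
};

// Hypothetical caller-side conversion, done before the map is consulted.
inline std::size_t address_to_offset(std::size_t addr, std::size_t heap_base) {
  return addr - heap_base;
}
```

A user that tracks heap addresses converts to an offset first, while a user that tracks physical memory offsets can pass them straight through.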


From per.liden at oracle.com  Tue Oct 22 06:19:22 2019
From: per.liden at oracle.com (Per Liden)
Date: Tue, 22 Oct 2019 08:19:22 +0200
Subject: RFR: 8232648: ZGC: Move ATTRIBUTE_ALIGNED to the front of
 declarations
In-Reply-To: <b981daf2-2924-1708-5a2f-7475cef3d85a@oracle.com>
References: <b981daf2-2924-1708-5a2f-7475cef3d85a@oracle.com>
Message-ID: <5a6bf373-bb06-31fc-9493-8c93a7b21ba5@oracle.com>

Looks good.

/Per

On 10/21/19 3:22 PM, Stefan Karlsson wrote:
> Hi all,
> 
> Please review this patch to move ATTRIBUTE_ALIGNED to the front of 
> declarations.
> 
> https://cr.openjdk.java.net/~stefank/8232648/webrev.01/
> https://bugs.openjdk.java.net/browse/JDK-8232648
> 
> This is done because the Windows compiler requires ATTRIBUTE_ALIGNED to 
> be put at the front of declarations. A new macro (ZCACHE_ALIGNED) is 
> introduced, and used, to shorten the affected lines.
> 
> Thanks,
> StefanK
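
The placement rule can be illustrated with standard C++ alignas, used here as a stand-in for ATTRIBUTE_ALIGNED. The macro name CACHE_ALIGNED and the 64-byte line size are assumptions; the actual ZCACHE_ALIGNED definition is in the webrev and is not reproduced here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a cache-line alignment macro; 64 bytes assumed.
#define CACHE_ALIGNED alignas(64)

struct Counters {
  // The alignment specifier comes first, before the type and the name.
  CACHE_ALIGNED std::uint64_t allocated;
  CACHE_ALIGNED std::uint64_t reclaimed;  // starts on its own cache line
};
```

Putting the specifier at the front is accepted by compilers that reject it after the declarator, and the short macro keeps the affected lines readable.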


From per.liden at oracle.com  Tue Oct 22 06:22:13 2019
From: per.liden at oracle.com (Per Liden)
Date: Tue, 22 Oct 2019 08:22:13 +0200
Subject: RFR: 8232650: ZGC: Add initialization hooks for OS specific code
In-Reply-To: <5cdd2722-26a4-8e6c-1262-5d97dfd7f46c@oracle.com>
References: <5cdd2722-26a4-8e6c-1262-5d97dfd7f46c@oracle.com>
Message-ID: <fd674244-27b9-5d4b-664b-5b16ccf503ff@oracle.com>

Looks good.

/Per

On 10/21/19 4:37 PM, Stefan Karlsson wrote:
> Hi all,
> 
> Please review this patch to add initialization hooks for OS specific code.
> 
> https://cr.openjdk.java.net/~stefank/8232650/webrev.01/
> https://bugs.openjdk.java.net/browse/JDK-8232650
> 
> These hooks are needed for a Windows port. ZInitialize allows 
> syscalls to be dynamically resolved. ZVirtualMemory allows callbacks 
> from 8232649 to be initialized.
> 
> Thanks,
> StefanK


From per.liden at oracle.com  Tue Oct 22 06:24:24 2019
From: per.liden at oracle.com (Per Liden)
Date: Tue, 22 Oct 2019 08:24:24 +0200
Subject: RFR: 8232649: ZGC: Add callbacks to ZMemoryManager
In-Reply-To: <8793cda6-bec6-dac7-5164-8fc34454286e@oracle.com>
References: <8793cda6-bec6-dac7-5164-8fc34454286e@oracle.com>
Message-ID: <fb9d0ec1-f6d5-aba9-ac16-2a77f8d71a25@oracle.com>

Looks good.

/Per

On 10/21/19 4:06 PM, Stefan Karlsson wrote:
> Hi all,
> 
> Please review this patch to add callbacks to ZMemoryManager.
> 
> https://cr.openjdk.java.net/~stefank/8232649/webrev.01/
> https://bugs.openjdk.java.net/browse/JDK-8232649
> 
> This allows users of ZMemoryManager to get callbacks when memory regions 
> are inserted, removed, split, and coalesced. This is needed to support 
> Windows' stricter requirements for placeholder reserved memory.
> 
> Thanks,
> StefanK


From kim.barrett at oracle.com  Tue Oct 22 07:19:22 2019
From: kim.barrett at oracle.com (Kim Barrett)
Date: Tue, 22 Oct 2019 03:19:22 -0400
Subject: RFR(M): 8220311: Implementation: NUMA-Aware Memory Allocation for
 G1, Survivor (2/3)
In-Reply-To: <846eb849-8a49-5872-73d7-6bbc8f98369c@oracle.com>
References: <d153b49b-fbb2-0d73-37e0-ff1534a83086@oracle.com>
 <de0f8e9a-27d8-eaf3-99a7-7b57e1b419c1@oracle.com>
 <9a78e353-7908-b546-8f6a-7acd92eb40ac@oracle.com>
 <D4FAC1BE-C707-48DA-979F-03502E2651F5@oracle.com>
 <846eb849-8a49-5872-73d7-6bbc8f98369c@oracle.com>
Message-ID: <56788E04-DC92-461F-B3A7-DEEBC524DB5B@oracle.com>

> On Oct 22, 2019, at 1:52 AM, sangheon.kim at oracle.com wrote:
> What do you think about below comment?
> 
>   // Tries to allocate word_sz in the PLAB of the next "generation" after trying to
>   // allocate into dest. Previous_plab_refill_failed indicates whether previous
>   // PLAB refill for the original (source) object was failed.

Drop "was".  Otherwise looks good.

>   // Returns a non-NULL pointer if successful, and updates dest if required.
>   // Also determines whether we should continue to try to allocate into the various
>   // generations or just end trying to allocate.
>   HeapWord* allocate_in_next_plab(G1HeapRegionAttr* dest,
> ...
> 
> Let me post the webrev when we decide. :)
> 
> Thanks,
> Sangheon
> 
> 
>> 
>> ------------------------------------------------------------------------------
>> 
>> Looks good, other than that one comment issue.


From kishor.kharbas at intel.com  Tue Oct 22 07:22:59 2019
From: kishor.kharbas at intel.com (Kharbas, Kishor)
Date: Tue, 22 Oct 2019 07:22:59 +0000
Subject: RFR(S): 8215893: Add better abstraction for pinning G1
 concurrent marking bitmaps.
In-Reply-To: <2CB4D3B2-02E7-46A8-85D3-CCEA34C0695B@oracle.com>
References: <F89640DCD01A85489FCBA68183A6A0F3CB569D68@ORSMSX116.amr.corp.intel.com>
 <F89640DCD01A85489FCBA68183A6A0F3CB56A055@ORSMSX116.amr.corp.intel.com>
 <b6b879d9-fc88-c719-d939-6d64070ae13f@oracle.com>
 <F89640DCD01A85489FCBA68183A6A0F3CB57D45D@ORSMSX116.amr.corp.intel.com>
 <f9e8443e-9f4a-510f-4ef4-b7356114e929@oracle.com>
 <F89640DCD01A85489FCBA68183A6A0F3CB57DA2E@ORSMSX116.amr.corp.intel.com>
 <c4f9c0ec-f242-82fd-95b9-b640dd389715@oracle.com>
 <F89640DCD01A85489FCBA68183A6A0F3CB57DDDE@ORSMSX116.amr.corp.intel.com>
 <2CB4D3B2-02E7-46A8-85D3-CCEA34C0695B@oracle.com>
Message-ID: <F89640DCD01A85489FCBA68183A6A0F3CB57E9D3@ORSMSX116.amr.corp.intel.com>

Hi Stefan,

> -----Original Message-----
> From: Stefan Johansson [mailto:stefan.johansson at oracle.com]
> Sent: Friday, October 18, 2019 7:32 AM
> To: Kharbas, Kishor <kishor.kharbas at intel.com>
> Cc: sangheon.kim at oracle.com; hotspot-gc-dev at openjdk.java.net
> Subject: Re: RFR(S): 8215893: Add better abstraction for pinning G1
> concurrent marking bitmaps.
> 
> Hi Kishor,
> 
> > 17 okt. 2019 kl. 23:28 skrev Kharbas, Kishor <kishor.kharbas at intel.com>:
> >
> > Hi Stefan,
> >
> >> -----Original Message-----
> >> From: Stefan Johansson [mailto:stefan.johansson at oracle.com]
> >> Sent: Thursday, October 17, 2019 4:34 AM
> >> To: Kharbas, Kishor <kishor.kharbas at intel.com>;
> >> sangheon.kim at oracle.com
> >> Cc: hotspot-gc-dev at openjdk.java.net
> >> Subject: Re: RFR(S): 8215893: Add better abstraction for pinning G1
> >> concurrent marking bitmaps.
> >>
> >> Hi Kishor,
> >>
> >> On 2019-10-17 03:39, Kharbas, Kishor wrote:
> >>> Hi Sangheon,
> >>>
> >>> *From:*sangheon.kim at oracle.com [mailto:sangheon.kim at oracle.com]
> >>> *Sent:* Wednesday, October 16, 2019 11:03 AM
> >>> *To:* Kharbas, Kishor <kishor.kharbas at intel.com>
> >>> *Cc:* hotspot-gc-dev at openjdk.java.net; Stefan Johansson
> >>> <stefan.johansson at oracle.com>
> >>> *Subject:* Re: RFR(S): 8215893: Add better abstraction for pinning
> >>> G1 concurrent marking bitmaps.
> >>>
> >>>> Hi Kishor,
> >>>>
> >>>> Before reviewing webrev.02, could you remind us what was the
> >>>> motivation of pinning the bitmap mappers here?
> >>>> In addition to explanations of the problematic situation, any logs
> >>>> / stack-trace also may help.
> >>>>
> >>>> We think that understanding of the root cause should be considered
> first.
> >>>
> >>> Unfortunately, I do not have log/stack-trace of the problem I had faced.
> >>>
> >>> I am trying to reproduce it by running SPECjbb workload over and
> >>> over
> >> again.
> >>>
> >>> I haven't looked at GC code since end of last year. So I am having a
> >>> difficult time pinning what the problem was.
> >>>
> >>> I am looking at G1ClearBitMapTask which iterates over bitmap for all
> >>> available regions. I am not sure when this task is performed.
> >>>
> >>> There is comment in HeapRegionManager::par_iterate() as shown
> below,
> >>>
> >>>   // This also (potentially) iterates over regions newly allocated
> >>>   // during GC. This is no problem except for some extra work.
> >>>
> >>> This method is eventually called from G1ClearBitMapTask. The comment
> >>> suggests that regions are allocated concurrently when the function
> >>> is run. This also means with AllocateOldGenAt flag enabled, regions
> >>> can also be un-committed.
> >>
> >> I don't understand how AllocateOldGenAt would make any difference,
> >> regions can be un-committed without it as well and there are
> >> mechanisms in place to make sure only the correct parts of the side
> >> structures are un- committed when that happens.
> >
> > In the regular code un-commit is only done by VM thread during safepoint.
> Un-commit of region also causes its corresponding bitmap to be un-
> committed.
> > But it never happens that CM threads are iterating over bitmap while
> regions are being un-committed concurrently.
> >
> > Whereas when AllocateOldGenAt is used, because of the way regions are
> > managed between dram and nvdimms, regions can be un-committed by
> mutator threads and GC threads.
> > 1. Mutator threads - during mutator region allocation and humongous
> region allocation.
> 
> This is the problem, I managed to reproduce this by adding a short sleep in
> the clearing code and force back to back concurrent cycles in SPECjvm2008
> and a 2g heap. I think this is only a problem for humongous allocations,
> because we should never allocate more young regions than we have already
> made available at the end of the previous GC. But the humongous allocations
> can very well happen while we clear the bitmaps in the concurrent cycle, so
> that is probably why the pinning was added.
> 
> Thinking more about this, a different solution would be to not un-commit
> memory in this situation. This all depends on how one sees the amount of
> committed memory when using AllocateOldGenAt, should the amount of
> committed on dram + nvdimm never be more than Xmx or is the important
> thing that the number of regions in use never exceeds Xmx? I think I'm leaning
> towards the latter, but there might be reasons I haven't thought about here.
> This would break the current invariant:
> assert(total_committed_before == total_regions_committed(), "invariant
> not met");
> 
> But that might be ok. If using that approach, instead of un-committing
> (shrink_dram), just remove the same number of regions from the freelist,
> that you expand on nvdimm. The unused removed regions need to be kept
> track of so we can add them again during the GC. To me this is more or less
> the same concept we use when borrowing regions during the GC. There
> might be issues with this approach but I think it would be interesting to
> explore.
> 
[Kharbas, Kishor] Thank you for looking into this and reproducing the bug. I think I follow
your suggestion. I will try to work on a solution using this.

> I also wonder if we ever should need to expand_dram during
> allocate_new_region, I see that it happens now during GC and that is
> probably because we do this at the end of the GC:
> _manager->adjust_dram_regions((uint)young_list_target_length() ?
> 
> If this adjustment included the expected number of survivors as well, we
> should have enough DRAM regions and if we then end up getting an
> NVDIMM region when asking for a survivor we should return NULL signaling
> that survivor is full.
> 
> What do you think about that approach?

 [Kharbas, Kishor] This approach is simpler to implement. I am afraid that it would
change the behavior with respect to default case. Still I will give it a try.

For now, can we close the bug if the abstraction is satisfactory and continue exploration
in a separate issue?

Thanks,
Kishor

> 
> Thanks,
> Stefan
> 
> > 2. GC worker threads - during survivor region and old region allocation.
> > 3. VMThread - heap size adjustment as in default and after full GC to
> allocate enough regions in dram for young gen (may require to un-commit
> some regions from nvdimm).
> >
> > Could any of these be running concurrently when CM threads are iterating
> over the bitmap?
> >
> >>
> >> I want to reiterate what Sangheon said about identifying the root cause.
> >> If we don't know why this is needed and can't reproduce any failures
> >> without the special pinning of the bitmaps, I would rather see that
> >> we remove the pinning code to make things work more like normal G1.
> >
> > I am trying to reproduce but as you can imagine it is very rare and hard-to-
> reproduce bug, if it is.
> >
> > Thanks,
> > Kishor
> >>
> >> Thanks,
> >> Stefan
> >>
> >>
> >>>
> >>> Pardon me if my understanding is incorrect.
> >>>
> >>> Regards,
> >>>
> >>> Kishor


From stefan.karlsson at oracle.com  Tue Oct 22 08:18:52 2019
From: stefan.karlsson at oracle.com (Stefan Karlsson)
Date: Tue, 22 Oct 2019 10:18:52 +0200
Subject: RFR: 8232601: ZGC: Parameterize the ZGranuleMap table size
In-Reply-To: <de172be9-85a3-4016-d26e-07060a96f4e0@oracle.com>
References: <f486753e-cd02-5c22-7390-041cf248f423@oracle.com>
 <de172be9-85a3-4016-d26e-07060a96f4e0@oracle.com>
Message-ID: <1f2c9769-525b-d1d6-5e43-b9567ad6d070@oracle.com>

Thanks, Per.

StefanK

On 2019-10-22 08:14, Per Liden wrote:
> Looks good.
> 
> /Per
> 
> On 10/21/19 3:00 PM, Stefan Karlsson wrote:
>> Hi all,
>>
>> Please review this patch to parameterize the ZGranuleMap table size.
>>
>> https://cr.openjdk.java.net/~stefank/8232601/webrev.01/
>> https://bugs.openjdk.java.net/browse/JDK-8232601
>>
>> Previously, the maps were always bound by the range of a virtual 
>> address space view (ZAddressOffsetMax). We want to be able to use 
>> ZGranuleMap to map against physical memory offsets, so this RFE 
>> suggests that we allow users of ZGranuleMap to specify the max offset.
>>
>> Thanks,
>> StefanK


From stefan.karlsson at oracle.com  Tue Oct 22 08:19:01 2019
From: stefan.karlsson at oracle.com (Stefan Karlsson)
Date: Tue, 22 Oct 2019 10:19:01 +0200
Subject: RFR: 8232602: ZGC: Make ZGranuleMap ZAddress agnostic
In-Reply-To: <c6bb324d-9ff8-7fdf-e99e-e4a2b1da0144@oracle.com>
References: <4638080b-9f2e-6965-6ed9-a17b32ad3b94@oracle.com>
 <c6bb324d-9ff8-7fdf-e99e-e4a2b1da0144@oracle.com>
Message-ID: <d0b63c33-1b3c-a03f-530b-2479a93db041@oracle.com>

Thanks, Per.

StefanK


On 2019-10-22 08:18, Per Liden wrote:
> Looks good.
> 
> /Per
> 
> On 10/21/19 3:09 PM, Stefan Karlsson wrote:
>> Hi all,
>>
>> Please review this patch to make ZGranuleMap ZAddress agnostic.
>>
>> https://cr.openjdk.java.net/~stefank/8232602/webrev.01/
>> https://bugs.openjdk.java.net/browse/JDK-8232602
>>
>> Currently, the ZGranuleMap get and put functions take an address in 
>> the heap as a parameter. The address is then converted into an offset 
>> (into a heap view), before being scaled to a granule.
>>
>> We want to be able to use the ZGranuleMap for physical memory offsets, 
>> and not only heap addresses. Therefore, I propose that we move the 
>> conversions from address to offset out from ZGranuleMap, and move it 
>> to the current users of ZGranuleMap.
>>
>> This patch applies on-top of the patch for JDK-8232601.
>>
>> Thanks,
>> StefanK
>>


From stefan.karlsson at oracle.com  Tue Oct 22 08:19:26 2019
From: stefan.karlsson at oracle.com (Stefan Karlsson)
Date: Tue, 22 Oct 2019 10:19:26 +0200
Subject: RFR: 8232650: ZGC: Add initialization hooks for OS specific code
In-Reply-To: <fd674244-27b9-5d4b-664b-5b16ccf503ff@oracle.com>
References: <5cdd2722-26a4-8e6c-1262-5d97dfd7f46c@oracle.com>
 <fd674244-27b9-5d4b-664b-5b16ccf503ff@oracle.com>
Message-ID: <2c679db1-e2b0-20c6-665b-7d04acd0b03b@oracle.com>

Thanks, Per.

StefanK

On 2019-10-22 08:22, Per Liden wrote:
> Looks good.
> 
> /Per
> 
> On 10/21/19 4:37 PM, Stefan Karlsson wrote:
>> Hi all,
>>
>> Please review this patch to add initialization hooks for OS specific 
>> code.
>>
>> https://cr.openjdk.java.net/~stefank/8232650/webrev.01/
>> https://bugs.openjdk.java.net/browse/JDK-8232650
>>
>> These hooks are needed for a Windows port. ZInitialize allows 
>> syscalls to be dynamically resolved. ZVirtualMemory allows callbacks 
>> from 8232649 to be initialized.
>>
>> Thanks,
>> StefanK


From stefan.karlsson at oracle.com  Tue Oct 22 08:19:12 2019
From: stefan.karlsson at oracle.com (Stefan Karlsson)
Date: Tue, 22 Oct 2019 10:19:12 +0200
Subject: RFR: 8232648: ZGC: Move ATTRIBUTE_ALIGNED to the front of
 declarations
In-Reply-To: <5a6bf373-bb06-31fc-9493-8c93a7b21ba5@oracle.com>
References: <b981daf2-2924-1708-5a2f-7475cef3d85a@oracle.com>
 <5a6bf373-bb06-31fc-9493-8c93a7b21ba5@oracle.com>
Message-ID: <97da9b0b-89aa-6480-1485-bab4ad5c3be1@oracle.com>

Thanks, Per.

StefanK

On 2019-10-22 08:19, Per Liden wrote:
> Looks good.
> 
> /Per
> 
> On 10/21/19 3:22 PM, Stefan Karlsson wrote:
>> Hi all,
>>
>> Please review this patch to move ATTRIBUTE_ALIGNED to the front of 
>> declarations.
>>
>> https://cr.openjdk.java.net/~stefank/8232648/webrev.01/
>> https://bugs.openjdk.java.net/browse/JDK-8232648
>>
>> This is done because the Windows compiler requires ATTRIBUTE_ALIGNED 
>> to be put at the front of declarations. A new macro (ZCACHE_ALIGNED) 
>> is introduced, and used, to shorten the affected lines.
>>
>> Thanks,
>> StefanK
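
The placement rule described above can be sketched as follows. This is only an illustration under assumptions: the two ATTRIBUTE_ALIGNED definitions are assumed compiler mappings, the 64-byte value behind ZCACHE_ALIGNED is a guess at a typical cache line size, and PerWorkerCounters is a hypothetical struct, not code from the webrev.

```cpp
#include <cstddef>
#include <cstdint>

// Assumed definitions: MSVC spells alignment as a __declspec, which has to
// come first in the declaration; GCC and Clang accept the attribute there too.
#if defined(_MSC_VER)
#define ATTRIBUTE_ALIGNED(x) __declspec(align(x))
#else
#define ATTRIBUTE_ALIGNED(x) __attribute__((aligned(x)))
#endif

// Hypothetical cache line size of 64 bytes; the real value is platform specific.
#define ZCACHE_ALIGNED ATTRIBUTE_ALIGNED(64)

// Hypothetical example type: each counter starts on its own cache line.
struct PerWorkerCounters {
  ZCACHE_ALIGNED volatile std::uint64_t _allocated;
  ZCACHE_ALIGNED volatile std::uint64_t _reclaimed;
};
```

Putting the macro first keeps a single spelling that all three compilers accept, and the short macro name keeps the affected lines readable.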


From stefan.karlsson at oracle.com  Tue Oct 22 08:19:37 2019
From: stefan.karlsson at oracle.com (Stefan Karlsson)
Date: Tue, 22 Oct 2019 10:19:37 +0200
Subject: RFR: 8232649: ZGC: Add callbacks to ZMemoryManager
In-Reply-To: <fb9d0ec1-f6d5-aba9-ac16-2a77f8d71a25@oracle.com>
References: <8793cda6-bec6-dac7-5164-8fc34454286e@oracle.com>
 <fb9d0ec1-f6d5-aba9-ac16-2a77f8d71a25@oracle.com>
Message-ID: <2e154c3a-bca6-cf5f-17f0-b9fbc6c079ad@oracle.com>

Thanks, Per.

StefanK

On 2019-10-22 08:24, Per Liden wrote:
> Looks good.
> 
> /Per
> 
> On 10/21/19 4:06 PM, Stefan Karlsson wrote:
>> Hi all,
>>
>> Please review this patch to add callbacks to ZMemoryManager.
>>
>> https://cr.openjdk.java.net/~stefank/8232649/webrev.01/
>> https://bugs.openjdk.java.net/browse/JDK-8232649
>>
>> This allows users of ZMemoryManager to get callbacks when memory 
>> regions are inserted, removed, split, and coalesced. This is needed to 
>> support Windows' stricter requirements for placeholder reserved memory.
>>
>> Thanks,
>> StefanK


From erik.osterlund at oracle.com  Tue Oct 22 09:17:25 2019
From: erik.osterlund at oracle.com (erik.osterlund at oracle.com)
Date: Tue, 22 Oct 2019 11:17:25 +0200
Subject: RFR: 8232601: ZGC: Parameterize the ZGranuleMap table size
In-Reply-To: <f486753e-cd02-5c22-7390-041cf248f423@oracle.com>
References: <f486753e-cd02-5c22-7390-041cf248f423@oracle.com>
Message-ID: <dd9c696e-1216-d4da-01e5-ee80dfb2c726@oracle.com>

Hi Stefan,

Looks good.

Thanks,
/Erik

On 10/21/19 3:00 PM, Stefan Karlsson wrote:
> Hi all,
>
> Please review this patch to parameterize the ZGranuleMap table size.
>
> https://cr.openjdk.java.net/~stefank/8232601/webrev.01/
> https://bugs.openjdk.java.net/browse/JDK-8232601
>
> Previously, the maps were always bound by the range of a virtual 
> address space view (ZAddressOffsetMax). We want to be able to use 
> ZGranuleMap to map against physical memory offsets, so this RFE 
> suggests that we allow users of ZGranuleMap to specify the max offset.
>
> Thanks,
> StefanK


From erik.osterlund at oracle.com  Tue Oct 22 09:17:50 2019
From: erik.osterlund at oracle.com (erik.osterlund at oracle.com)
Date: Tue, 22 Oct 2019 11:17:50 +0200
Subject: RFR: 8232602: ZGC: Make ZGranuleMap ZAddress agnostic
In-Reply-To: <4638080b-9f2e-6965-6ed9-a17b32ad3b94@oracle.com>
References: <4638080b-9f2e-6965-6ed9-a17b32ad3b94@oracle.com>
Message-ID: <fcd8f2c9-4cef-6fc5-a34d-fab4bf486e24@oracle.com>

Hi Stefan,

Looks good.

Thanks,
/Erik

On 10/21/19 3:09 PM, Stefan Karlsson wrote:
> Hi all,
>
> Please review this patch to make ZGranuleMap ZAddress agnostic.
>
> https://cr.openjdk.java.net/~stefank/8232602/webrev.01/
> https://bugs.openjdk.java.net/browse/JDK-8232602
>
> Currently, the ZGranuleMap get and put functions take an address in 
> the heap as a parameter. The address is then converted into an offset 
> (into a heap view), before being scaled to a granule.
>
> We want to be able to use the ZGranuleMap for physical memory offsets, 
> and not only heap addresses. Therefore, I propose that we move the 
> conversions from address to offset out from ZGranuleMap, and move it 
> to the current users of ZGranuleMap.
>
> This patch applies on-top of the patch for JDK-8232601.
>
> Thanks,
> StefanK
>
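
A rough sketch of the idea in the quoted proposal, combined with the max-offset parameter from JDK-8232601. All names and constants here are assumptions for illustration (GranuleMap, address_to_offset, 2 MB granules, a 42-bit offset mask); they are not taken from the webrev.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumption: 2 MB granules.
constexpr unsigned GranuleSizeShift = 21;

// The map only knows about offsets; the caller chooses the maximum offset.
template <typename T>
class GranuleMap {
  std::vector<T> _map;

public:
  explicit GranuleMap(std::uint64_t max_offset)
    : _map(static_cast<std::size_t>(max_offset >> GranuleSizeShift)) {}

  T get(std::uint64_t offset) const {
    return _map.at(static_cast<std::size_t>(offset >> GranuleSizeShift));
  }

  void put(std::uint64_t offset, T value) {
    _map.at(static_cast<std::size_t>(offset >> GranuleSizeShift)) = value;
  }
};

// Caller-side conversion from a heap address to an offset.
// Assumption: the low 42 bits hold the offset, higher bits are view metadata.
constexpr std::uint64_t AddressOffsetMask = (std::uint64_t(1) << 42) - 1;

inline std::uint64_t address_to_offset(std::uint64_t addr) {
  return addr & AddressOffsetMask;
}
```

A user holding heap addresses calls address_to_offset() before get() or put(), while a user tracking physical memory can pass its offsets directly.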


From erik.osterlund at oracle.com  Tue Oct 22 09:18:14 2019
From: erik.osterlund at oracle.com (erik.osterlund at oracle.com)
Date: Tue, 22 Oct 2019 11:18:14 +0200
Subject: RFR: 8232648: ZGC: Move ATTRIBUTE_ALIGNED to the front of
 declarations
In-Reply-To: <b981daf2-2924-1708-5a2f-7475cef3d85a@oracle.com>
References: <b981daf2-2924-1708-5a2f-7475cef3d85a@oracle.com>
Message-ID: <201ba7a6-a371-f4c0-340f-a5af14d0323f@oracle.com>

Hi Stefan,

Looks good.

Thanks,
/Erik

On 10/21/19 3:22 PM, Stefan Karlsson wrote:
> Hi all,
>
> Please review this patch to move ATTRIBUTE_ALIGNED to the front of 
> declarations.
>
> https://cr.openjdk.java.net/~stefank/8232648/webrev.01/
> https://bugs.openjdk.java.net/browse/JDK-8232648
>
> This is done because the Windows compiler requires ATTRIBUTE_ALIGNED 
> to be put at the front of declarations. A new macro (ZCACHE_ALIGNED) 
> is introduced, and used, to shorten the affected lines.
>
> Thanks,
> StefanK


From erik.osterlund at oracle.com  Tue Oct 22 09:18:31 2019
From: erik.osterlund at oracle.com (erik.osterlund at oracle.com)
Date: Tue, 22 Oct 2019 11:18:31 +0200
Subject: RFR: 8232650: ZGC: Add initialization hooks for OS specific code
In-Reply-To: <5cdd2722-26a4-8e6c-1262-5d97dfd7f46c@oracle.com>
References: <5cdd2722-26a4-8e6c-1262-5d97dfd7f46c@oracle.com>
Message-ID: <c200c515-c38a-be2c-05e0-18622418f777@oracle.com>

Hi Stefan,

Looks good.

Thanks,
/Erik

On 10/21/19 4:37 PM, Stefan Karlsson wrote:
> Hi all,
>
> Please review this patch to add initialization hooks for OS specific 
> code.
>
> https://cr.openjdk.java.net/~stefank/8232650/webrev.01/
> https://bugs.openjdk.java.net/browse/JDK-8232650
>
> These hooks are needed for a Windows port. ZInitialize allows 
> syscalls to be dynamically resolved. ZVirtualMemory allows callbacks 
> from 8232649 to be initialized.
>
> Thanks,
> StefanK


From erik.osterlund at oracle.com  Tue Oct 22 09:18:47 2019
From: erik.osterlund at oracle.com (erik.osterlund at oracle.com)
Date: Tue, 22 Oct 2019 11:18:47 +0200
Subject: RFR: 8232649: ZGC: Add callbacks to ZMemoryManager
In-Reply-To: <8793cda6-bec6-dac7-5164-8fc34454286e@oracle.com>
References: <8793cda6-bec6-dac7-5164-8fc34454286e@oracle.com>
Message-ID: <17249488-d0f4-81ae-3a15-b120cac388af@oracle.com>

Hi Stefan,

Looks good.

Thanks,
/Erik

On 10/21/19 4:06 PM, Stefan Karlsson wrote:
> Hi all,
>
> Please review this patch to add callbacks to ZMemoryManager.
>
> https://cr.openjdk.java.net/~stefank/8232649/webrev.01/
> https://bugs.openjdk.java.net/browse/JDK-8232649
>
> This allows users of ZMemoryManager to get callbacks when memory 
> regions are inserted, removed, split, and coalesced. This is needed to 
> support Windows' stricter requirements for placeholder reserved memory.
>
> Thanks,
> StefanK


From thomas.schatzl at oracle.com  Tue Oct 22 09:53:30 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Tue, 22 Oct 2019 11:53:30 +0200
Subject: RFR (XS): 8232771: Revert JDK-8230794 because of environment changes
Message-ID: <31507a84-fb55-25e7-7ead-3955e0c66a31@oracle.com>

Hi all,

   can I have reviews for this small change that reverts JDK-8230794 
because it made the failure reported in JDK-8227695 disappear? Also, there 
were some environment changes that we think fix the issue in JDK-8227695.

I would like the original code to bake again, and see if this hunch is correct.
Later I still want to improve the assert, but first let's see about 
JDK-8227695. Sorry for the back and forth.

This is a straight revert of JDK-8230794 that applied without issues.

CR:
https://bugs.openjdk.java.net/browse/JDK-8232771
Webrev:
http://cr.openjdk.java.net/~tschatzl/8232771/webrev/
Testing:
local compilation

Thanks,
   Thomas


From stefan.johansson at oracle.com  Tue Oct 22 10:04:10 2019
From: stefan.johansson at oracle.com (Stefan Johansson)
Date: Tue, 22 Oct 2019 12:04:10 +0200
Subject: RFR (XS): 8232771: Revert JDK-8230794 because of environment
 changes
In-Reply-To: <31507a84-fb55-25e7-7ead-3955e0c66a31@oracle.com>
References: <31507a84-fb55-25e7-7ead-3955e0c66a31@oracle.com>
Message-ID: <aade15df-7774-4d33-082f-4431b74d2ed2@oracle.com>

Looks good,
Stefan

On 2019-10-22 11:53, Thomas Schatzl wrote:
> Hi all,
> 
>    can I have reviews for this small change that reverts JDK-8230794 
> because it made the failure reported in JDK-8227695 disappear? Also, there 
> were some environment changes that we think fix the issue in JDK-8227695.
> 
> I would like the original code to bake again, and see if this hunch is 
> correct.
> Later I still want to improve the assert, but first let's see about 
> JDK-8227695. Sorry for the back and forth.
> 
> This is a straight revert of JDK-8230794 that applied without issues.
> 
> CR:
> https://bugs.openjdk.java.net/browse/JDK-8232771
> Webrev:
> http://cr.openjdk.java.net/~tschatzl/8232771/webrev/
> Testing:
> local compilation
> 
> Thanks,
>    Thomas


From per.liden at oracle.com  Tue Oct 22 10:12:11 2019
From: per.liden at oracle.com (Per Liden)
Date: Tue, 22 Oct 2019 12:12:11 +0200
Subject: RFR (XS): 8232771: Revert JDK-8230794 because of environment
 changes
In-Reply-To: <31507a84-fb55-25e7-7ead-3955e0c66a31@oracle.com>
References: <31507a84-fb55-25e7-7ead-3955e0c66a31@oracle.com>
Message-ID: <3b0b24e8-be45-6501-f1ce-112585087954@oracle.com>

Looks good.

/Per

On 10/22/19 11:53 AM, Thomas Schatzl wrote:
> Hi all,
> 
>    can I have reviews for this small change that reverts JDK-8230794 
> because it made the failure reported in JDK-8227695 disappear? Also, there 
> were some environment changes that we think fix the issue in JDK-8227695.
> 
> I would like the original code to bake again, and see if this hunch is 
> correct.
> Later I still want to improve the assert, but first let's see about 
> JDK-8227695. Sorry for the back and forth.
> 
> This is a straight revert of JDK-8230794 that applied without issues.
> 
> CR:
> https://bugs.openjdk.java.net/browse/JDK-8232771
> Webrev:
> http://cr.openjdk.java.net/~tschatzl/8232771/webrev/
> Testing:
> local compilation
> 
> Thanks,
>    Thomas


From thomas.schatzl at oracle.com  Tue Oct 22 10:13:11 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Tue, 22 Oct 2019 12:13:11 +0200
Subject: RFR (L): 8230706: Waiting on completion of strong nmethod
 processing causes long pause times with G1
In-Reply-To: <0D820E95-361A-4CAC-9BC3-99C39512D396@oracle.com>
References: <f19bc69a-9bde-c063-2674-9564721ceede@oracle.com>
 <0F637570-EC97-47C5-B493-B33681133149@oracle.com>
 <5c6b06b1-de44-3cb7-7fc8-0b641df5f353@oracle.com>
 <DE7A950D-A877-4093-AFE8-363E6E079A28@oracle.com>
 <80DA3FD5-C2FA-44BF-83C5-AE0EA6AA3684@oracle.com>
 <d0b624a1fc2c7310986b79da1f65f3a8a851d20a.camel@oracle.com>
 <0D820E95-361A-4CAC-9BC3-99C39512D396@oracle.com>
Message-ID: <b8a41ba8-31d0-f9e8-4daa-41e861fb2856@oracle.com>

Hi Kim,

   thanks a lot for taking the time so quickly.

On 22.10.19 03:20, Kim Barrett wrote:
>> On Oct 19, 2019, at 9:06 AM, Thomas Schatzl <thomas.schatzl at oracle.com> wrote:
>>
>> Hi all,
>>
>>   there is a new webrev at
>>
>> http://cr.openjdk.java.net/~tschatzl/8230706/webrev.2/ (full only,
>> there is no point in providing a diff)
>>
>> since I like this solution a lot as it removes a lot of additional
>> post-processing.
>> [...]
>>
> I'm glad the new state machine worked out, and allowed the extra task
> to be eliminated. Thanks for going the extra mile with the testing.
> And thanks for turning my pseudo-code into something more readable. My
> comments here are mostly suggestions for more of that; I don't think I'd
> want to have to decipher this in 6 months without some helpful
> commentary. :)

I think I addressed all your comments, and thanks for your suggestions - 
I agree about having this tricky code well documented.

Changes are currently running through hs-tier1-5 with the changes that 
ease reproduction (the webrev.2.testing changes noted in the last 
email). Since there are no significant code changes apart from 
documentation, I am confident there will be no issues.

Webrevs:
http://cr.openjdk.java.net/~tschatzl/8230706/webrev.2_to_3/ (diff)
http://cr.openjdk.java.net/~tschatzl/8230706/webrev.3/ (full)

Thanks,
   Thomas


From thomas.schatzl at oracle.com  Tue Oct 22 10:13:54 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Tue, 22 Oct 2019 12:13:54 +0200
Subject: RFR (XS): 8232771: Revert JDK-8230794 because of environment
 changes
In-Reply-To: <3b0b24e8-be45-6501-f1ce-112585087954@oracle.com>
References: <31507a84-fb55-25e7-7ead-3955e0c66a31@oracle.com>
 <3b0b24e8-be45-6501-f1ce-112585087954@oracle.com>
Message-ID: <1f05b5f8-960f-99be-d518-63f6f7dfb3c2@oracle.com>

Thanks Per and Stefan for your reviews.

Thomas

On 22.10.19 12:12, Per Liden wrote:
> Looks good.
> 
> /Per
> 
> On 10/22/19 11:53 AM, Thomas Schatzl wrote:
>> Hi all,
>>
>>    can I have reviews for this small change that reverts JDK-8230794 
>> because it made the failure reported in JDK-8227695 disappear? Also, 
>> there were some environment changes that we think fix the issue in 
>> JDK-8227695.
>>
>> I would like the original code to bake again, and see if this hunch is 
>> correct.
>> Later I still want to improve the assert, but first let's see about 
>> JDK-8227695. Sorry for the back and forth.
>>
>> This is a straight revert of JDK-8230794 that applied without issues.
>>
>> CR:
>> https://bugs.openjdk.java.net/browse/JDK-8232771
>> Webrev:
>> http://cr.openjdk.java.net/~tschatzl/8232771/webrev/
>> Testing:
>> local compilation
>>
>> Thanks,
>>    Thomas


From shade at redhat.com  Tue Oct 22 11:48:15 2019
From: shade at redhat.com (Aleksey Shipilev)
Date: Tue, 22 Oct 2019 13:48:15 +0200
Subject: RFR (XS) 8232778: Shenandoah: SBSA::arraycopy_prologue checks wrong
 register
Message-ID: <01ea4f42-e33f-64ee-3514-12d19d2dc820@redhat.com>

Bug:
  https://bugs.openjdk.java.net/browse/JDK-8232778

Fix:

diff -r 24d411cb3a90 src/hotspot/cpu/aarch64/gc/shenandoah/shenandoahBarrierSetAssembler_aarch64.cpp
--- a/src/hotspot/cpu/aarch64/gc/shenandoah/shenandoahBarrierSetAssembler_aarch64.cpp   Tue Oct 22
08:57:41 2019 +0200
+++ b/src/hotspot/cpu/aarch64/gc/shenandoah/shenandoahBarrierSetAssembler_aarch64.cpp   Tue Oct 22
13:39:05 2019 +0200
@@ -58,7 +58,7 @@
       Address gc_state(rthread, in_bytes(ShenandoahThreadLocalData::gc_state_offset()));
       __ ldrb(rscratch1, gc_state);
       if (dest_uninitialized) {
-        __ tbz(rscratch2, ShenandoahHeap::HAS_FORWARDED_BITPOS, done);
+        __ tbz(rscratch1, ShenandoahHeap::HAS_FORWARDED_BITPOS, done);
       } else {
         __ mov(rscratch2, ShenandoahHeap::HAS_FORWARDED | ShenandoahHeap::MARKING);
         __ tst(rscratch1, rscratch2);

The load happens into rscratch1, yet we are testing rscratch2. I think this silently breaks
arraycopy to-space guarantees, as rscratch2 may contain garbage.

Testing: aarch64 hotspot_gc_shenandoah

-- 
Thanks,
-Aleksey


From per.liden at oracle.com  Tue Oct 22 12:01:18 2019
From: per.liden at oracle.com (Per Liden)
Date: Tue, 22 Oct 2019 14:01:18 +0200
Subject: RFR: 8231552: ZGC: Refine address space reservation
In-Reply-To: <c412fdf3-8f74-390e-6c6d-0d6df4e273f5@oracle.com>
References: <5015ca7b-3e3e-b2bd-c3f8-0a83ecdb41d8@oracle.com>
 <c412fdf3-8f74-390e-6c6d-0d6df4e273f5@oracle.com>
Message-ID: <2b79829d-f577-819d-9577-91351c03fddb@oracle.com>

Updated webrev after off-line comments from Stefan and Erik.

Full: http://cr.openjdk.java.net/~pliden/8231552/webrev.3
Diff: http://cr.openjdk.java.net/~pliden/8231552/webrev.3-diff

/Per

On 10/16/19 10:41 AM, Per Liden wrote:
> Latest version of this patch, rebased on today's jdk/jdk:
> 
> http://cr.openjdk.java.net/~pliden/8231552/webrev.2
> 
> /Per
> 
> On 10/3/19 11:45 AM, Per Liden wrote:
>> We could be slightly more sophisticated and do a better job reserving 
>> address space in situations where parts of the address space are 
>> already occupied or when the process is running with address space 
>> limitations.
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8231552
>> Webrev: http://cr.openjdk.java.net/~pliden/8231552/webrev.0
>>
>> /Per


From rkennke at redhat.com  Tue Oct 22 12:04:37 2019
From: rkennke at redhat.com (Roman Kennke)
Date: Tue, 22 Oct 2019 14:04:37 +0200
Subject: RFR (XS) 8232778: Shenandoah: SBSA::arraycopy_prologue checks
 wrong register
In-Reply-To: <01ea4f42-e33f-64ee-3514-12d19d2dc820@redhat.com>
References: <01ea4f42-e33f-64ee-3514-12d19d2dc820@redhat.com>
Message-ID: <8be9fb1d-1f8c-b121-026f-f40a12b2ca09@redhat.com>

Good spot!

Looks good, thanks!
Roman

> Bug:
>   https://bugs.openjdk.java.net/browse/JDK-8232778
> 
> Fix:
> 
> diff -r 24d411cb3a90 src/hotspot/cpu/aarch64/gc/shenandoah/shenandoahBarrierSetAssembler_aarch64.cpp
> --- a/src/hotspot/cpu/aarch64/gc/shenandoah/shenandoahBarrierSetAssembler_aarch64.cpp   Tue Oct 22
> 08:57:41 2019 +0200
> +++ b/src/hotspot/cpu/aarch64/gc/shenandoah/shenandoahBarrierSetAssembler_aarch64.cpp   Tue Oct 22
> 13:39:05 2019 +0200
> @@ -58,7 +58,7 @@
>        Address gc_state(rthread, in_bytes(ShenandoahThreadLocalData::gc_state_offset()));
>        __ ldrb(rscratch1, gc_state);
>        if (dest_uninitialized) {
> -        __ tbz(rscratch2, ShenandoahHeap::HAS_FORWARDED_BITPOS, done);
> +        __ tbz(rscratch1, ShenandoahHeap::HAS_FORWARDED_BITPOS, done);
>        } else {
>          __ mov(rscratch2, ShenandoahHeap::HAS_FORWARDED | ShenandoahHeap::MARKING);
>          __ tst(rscratch1, rscratch2);
> 
> The load happens into rscratch1, yet we are testing rscratch2. I think this silently breaks
> arraycopy to-space guarantees, as rscratch2 may contain garbage.
> 
> Testing: aarch64 hotspot_gc_shenandoah
> 


From shade at redhat.com  Tue Oct 22 12:12:24 2019
From: shade at redhat.com (Aleksey Shipilev)
Date: Tue, 22 Oct 2019 14:12:24 +0200
Subject: RFR (XS) 8232778: Shenandoah: SBSA::arraycopy_prologue checks
 wrong register
In-Reply-To: <8be9fb1d-1f8c-b121-026f-f40a12b2ca09@redhat.com>
References: <01ea4f42-e33f-64ee-3514-12d19d2dc820@redhat.com>
 <8be9fb1d-1f8c-b121-026f-f40a12b2ca09@redhat.com>
Message-ID: <238286cd-c420-59ab-fe61-d9b3169a4df6@redhat.com>

Thanks, I also think it is trivial. Pushed.

-Aleksey

On 10/22/19 2:04 PM, Roman Kennke wrote:
> Good spot!
> 
> Looks good, thanks!
> Roman
> 
>> Bug:
>>   https://bugs.openjdk.java.net/browse/JDK-8232778
>>
>> Fix:
>>
>> diff -r 24d411cb3a90 src/hotspot/cpu/aarch64/gc/shenandoah/shenandoahBarrierSetAssembler_aarch64.cpp
>> --- a/src/hotspot/cpu/aarch64/gc/shenandoah/shenandoahBarrierSetAssembler_aarch64.cpp   Tue Oct 22
>> 08:57:41 2019 +0200
>> +++ b/src/hotspot/cpu/aarch64/gc/shenandoah/shenandoahBarrierSetAssembler_aarch64.cpp   Tue Oct 22
>> 13:39:05 2019 +0200
>> @@ -58,7 +58,7 @@
>>        Address gc_state(rthread, in_bytes(ShenandoahThreadLocalData::gc_state_offset()));
>>        __ ldrb(rscratch1, gc_state);
>>        if (dest_uninitialized) {
>> -        __ tbz(rscratch2, ShenandoahHeap::HAS_FORWARDED_BITPOS, done);
>> +        __ tbz(rscratch1, ShenandoahHeap::HAS_FORWARDED_BITPOS, done);
>>        } else {
>>          __ mov(rscratch2, ShenandoahHeap::HAS_FORWARDED | ShenandoahHeap::MARKING);
>>          __ tst(rscratch1, rscratch2);
>>
>> The load happens into rscratch1, yet we are testing rscratch2. I think this silently breaks
>> arraycopy to-space guarantees, as rscratch2 may contain garbage.
>>
>> Testing: aarch64 hotspot_gc_shenandoah
>>
> 


-- 
Thanks,
-Aleksey


From erik.osterlund at oracle.com  Tue Oct 22 12:39:26 2019
From: erik.osterlund at oracle.com (erik.osterlund at oracle.com)
Date: Tue, 22 Oct 2019 14:39:26 +0200
Subject: RFR: 8231552: ZGC: Refine address space reservation
In-Reply-To: <2b79829d-f577-819d-9577-91351c03fddb@oracle.com>
References: <5015ca7b-3e3e-b2bd-c3f8-0a83ecdb41d8@oracle.com>
 <c412fdf3-8f74-390e-6c6d-0d6df4e273f5@oracle.com>
 <2b79829d-f577-819d-9577-91351c03fddb@oracle.com>
Message-ID: <13fefca7-da7a-5f8b-ab5f-f208bdf33940@oracle.com>

Hi Per,

Looks good.

Thanks,
/Erik

On 10/22/19 2:01 PM, Per Liden wrote:
> Updated webrev after off-line comments from Stefan and Erik.
>
> Full: http://cr.openjdk.java.net/~pliden/8231552/webrev.3
> Diff: http://cr.openjdk.java.net/~pliden/8231552/webrev.3-diff
>
> /Per
>
> On 10/16/19 10:41 AM, Per Liden wrote:
>> Latest version of this patch, rebased on today's jdk/jdk:
>>
>> http://cr.openjdk.java.net/~pliden/8231552/webrev.2
>>
>> /Per
>>
>> On 10/3/19 11:45 AM, Per Liden wrote:
>>> We could be slightly more sophisticated and do a better job 
>>> reserving address space in situations where parts of the address 
>>> space are already occupied or when the process is running with 
>>> address space limitations.
>>>
>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8231552
>>> Webrev: http://cr.openjdk.java.net/~pliden/8231552/webrev.0
>>>
>>> /Per


From per.liden at oracle.com  Tue Oct 22 12:55:24 2019
From: per.liden at oracle.com (Per Liden)
Date: Tue, 22 Oct 2019 14:55:24 +0200
Subject: RFR: 8231552: ZGC: Refine address space reservation
In-Reply-To: <13fefca7-da7a-5f8b-ab5f-f208bdf33940@oracle.com>
References: <5015ca7b-3e3e-b2bd-c3f8-0a83ecdb41d8@oracle.com>
 <c412fdf3-8f74-390e-6c6d-0d6df4e273f5@oracle.com>
 <2b79829d-f577-819d-9577-91351c03fddb@oracle.com>
 <13fefca7-da7a-5f8b-ab5f-f208bdf33940@oracle.com>
Message-ID: <4c6adb69-ce8b-5b35-2bb0-644b55ef229d@oracle.com>

Thanks Erik!

/Per

On 10/22/19 2:39 PM, erik.osterlund at oracle.com wrote:
> Hi Per,
> 
> Looks good.
> 
> Thanks,
> /Erik
> 
> On 10/22/19 2:01 PM, Per Liden wrote:
>> Updated webrev after off-line comments from Stefan and Erik.
>>
>> Full: http://cr.openjdk.java.net/~pliden/8231552/webrev.3
>> Diff: http://cr.openjdk.java.net/~pliden/8231552/webrev.3-diff
>>
>> /Per
>>
>> On 10/16/19 10:41 AM, Per Liden wrote:
>>> Latest version of this patch, rebased on today's jdk/jdk:
>>>
>>> http://cr.openjdk.java.net/~pliden/8231552/webrev.2
>>>
>>> /Per
>>>
>>> On 10/3/19 11:45 AM, Per Liden wrote:
>>>> We could be slightly more sophisticated and do a better job 
>>>> reserving address space in situations where parts of the address 
>>>> space are already occupied or when the process is running with 
>>>> address space limitations.
>>>>
>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8231552
>>>> Webrev: http://cr.openjdk.java.net/~pliden/8231552/webrev.0
>>>>
>>>> /Per
> 


From zgu at redhat.com  Tue Oct 22 13:38:57 2019
From: zgu at redhat.com (Zhengyu Gu)
Date: Tue, 22 Oct 2019 09:38:57 -0400
Subject: RFR 8232747: Shenandoah: Concurrent GC should deactivate SATB before
 processing weak roots
Message-ID: <9c29e649-eb27-be6e-2240-ad3ff99b7462@redhat.com>

This is the counterpart of JDK-8231999[1] for Shenandoah concurrent GC. 
Shenandoah needs to deactivate the SATB barrier before processing weak 
roots, to avoid barrier side-effects on its paths.

Bug: https://bugs.openjdk.java.net/browse/JDK-8232747
Webrev: http://cr.openjdk.java.net/~zgu/JDK-8232747/webrev.00/index.html

Test:
   hotspot_gc_shenandoah (fastdebug and release) on Linux x86_64

Thanks,

-Zhengyu


[1] https://bugs.openjdk.java.net/browse/JDK-8231999


From stefan.johansson at oracle.com  Tue Oct 22 13:41:46 2019
From: stefan.johansson at oracle.com (Stefan Johansson)
Date: Tue, 22 Oct 2019 15:41:46 +0200
Subject: [PATCH] Exploit Empty Regions in Young Gen to Enhance PS Full GC
 Performance
In-Reply-To: <400df998-171a-5bbe-9f3e-01af1781afb4@oracle.com>
References: <CAKSDcxsPcvigLaDTRyJALk9O0r-JsYV1CcwFzb3KaM98+LVLcg@mail.gmail.com>
 <B723074C-94DF-450D-9715-497736E9CD27@oracle.com>
 <CAKSDcxvTzfq1eR0DXi8iLQ4bNP8LcjsV5Px59-EMOgR0zaHvtQ@mail.gmail.com>
 <4F02DD53-EA98-4A1A-B871-C6E9D9610B2C@oracle.com>
 <9B69AFD1-7AE2-4B50-BFCF-C9C6E2594240@oracle.com>
 <CAKSDcxs131XdbHHErgbH1UDYX6_+=CSa-4dt4n5LgbyPVgjO_w@mail.gmail.com>
 <E4274448-26B2-46B8-883C-70FC7AFFB23B@oracle.com>
 <CAKSDcxvHwOWtAEm8TPPvO=C8q9LRwJRUQX6nbh-HSZXBKKwvYg@mail.gmail.com>
 <fb385d33-c420-4c81-320c-9aa1dad64a44@oracle.com>
 <CAKSDcxv5eccaV54NThgviLK+84U-Z62U0CYr3s+8ncNR1cna5w@mail.gmail.com>
 <8ef5b52e-d6fc-3073-5ca7-44c87c1eb981@oracle.com>
 <CAKSDcxsQWJ1tpnsc8UnN3E=XJfVwHEdE9WSd_=6KR_tLSHW6rQ@mail.gmail.com>
 <92277aab-0578-9e2c-3f4f-55ae1e8c94a9@oracle.com>
 <CAKSDcxu_t+Ka0LUU3WSzdR52-_+rsvKo_Vxn8av=duFmS4EPyA@mail.gmail.com>
 <400df998-171a-5bbe-9f3e-01af1781afb4@oracle.com>
Message-ID: <e72f06af-8847-844b-107c-afd15e01f71b@oracle.com>

Hi Haoyu,

I've reviewed the patch now and have some comments and questions.

To simplify the review and have a common base to look at I've created a 
webrev at:
http://cr.openjdk.java.net/~sjohanss/8220465/00/

One general note first: most of the new code uses four-space 
indentation, but the HotSpot standard is two spaces, so please change this. 
Below are some file-by-file comments.

src/hotspot/share/gc/parallel/psCompactionManager.cpp
---
   53 GrowableArray<size_t >* ParCompactionManager::_free_shadow = new 
(ResourceObj::C_HEAP, mtInternal) GrowableArray<size_t >(10, true);
   54 Monitor*                ParCompactionManager::_monitor = NULL;

Set _free_shadow to NULL here like the other statics and then create the 
GrowableArray in initialize(). I also think _shadow_region_array or 
something like that would be a better name and the monitor should also 
be named something that signals that it is used for this array.
---
   70   if (_monitor == NULL) {
   71       _monitor = new Monitor(Mutex::barrier, "CompactionManager 
monitor",
   72                              Mutex::_allow_vm_block_flag, 
Monitor::_safepoint_check_never);
   73   }

Instead of doing the monitor creation here having to check for NULL, do 
it in initialize() below together with the array creation.
---

src/hotspot/share/gc/parallel/psParallelCompact.cpp
---
2974       if (cur->push()) {

Correct me if I'm wrong, if this call to push() returns true it means 
that nobody else has "stolen" it (used a shadow region to prepare it) 
and we mark it as pushed. But when pushed in this code path this is the 
end state for this RegionData? If this is the case I think it would be 
easier to understand the code if we added another function and state for 
when we "steal" it. Haven't thought very much about the names but I 
think you understand what I want to achieve:
Normal path:
UNUSED -> push() -> NORMAL
Steal path:
UNUSED -> steal() -> STOLEN -> fill() -> FILLED -> copy() -> SHADOW

We could then also assert in set_completed() that the state is either 
NORMAL or SHADOW (or if they have a shared end state DONE). As I said 
the names can be improved (both for the states and the functions) but I 
think we should have names and not just numbers.
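
To make the two paths above concrete, here is a minimal sketch of such a named state machine. The state and function names follow the suggestions in this mail and are placeholders; RegionClaim is a hypothetical stand-in for the relevant part of RegionData, and std::atomic is used instead of HotSpot's own atomics to keep the example self-contained.

```cpp
#include <atomic>
#include <cassert>

// Placeholder state names from the suggestion above, not the patch's code.
enum class RegionState : int { Unused, Normal, Stolen, Filled, Shadow };

// Hypothetical stand-in for the claiming part of RegionData.
class RegionClaim {
  std::atomic<RegionState> _state{RegionState::Unused};

  // Succeeds only if the region is currently in state 'from'.
  bool transition(RegionState from, RegionState to) {
    return _state.compare_exchange_strong(from, to);
  }

public:
  // Normal path: UNUSED -> NORMAL.
  bool try_push()    { return transition(RegionState::Unused, RegionState::Normal); }

  // Steal path: UNUSED -> STOLEN -> FILLED -> SHADOW.
  bool try_steal()   { return transition(RegionState::Unused, RegionState::Stolen); }
  bool mark_filled() { return transition(RegionState::Stolen, RegionState::Filled); }
  bool try_copy()    { return transition(RegionState::Filled, RegionState::Shadow); }

  // What set_completed() could assert: one of the two end states was reached.
  bool is_done() const {
    RegionState s = _state.load();
    return s == RegionState::Normal || s == RegionState::Shadow;
  }
};
```

Because every transition is a compare-and-swap from one specific state, a region claimed through try_push() can no longer be stolen, and the reverse holds as well.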
---

3060 template <class T>
3061 void PSParallelCompact::fill_region(ParCompactionManager* cm, 
size_t region_idx, size_t shadow, size_t offset)

As I told you this was a big improvement from the first patch, but I 
think there is room for even more improvements around the way we pass in 
ignored parameters to MoveAndUpdateClosure. Explaining my idea in text 
is harder than code, so I created a patch, what do you think about this?
http://cr.openjdk.java.net/~sjohanss/8220465/00-alt/

This alternative is based on 00 and does not take my other comments into 
consideration. So it might have to be altered a bit if you address some 
of my other comments/questions.
---

3196 void PSParallelCompact::copy_back(HeapWord *region_addr, HeapWord 
*shadow_addr) {

I think the parameters should change places, so that they correspond with 
the copy below.
---

3200 bool PSParallelCompact::steal_shadow_region(ParCompactionManager* 
cm, size_t &region_idx) {
3201     size_t& record = cm->shadow_record();

Did you consider just letting shadow_record() be a simple getter instead 
of returning a reference, and then having a next_shadow_record() which 
advances it by active_workers?
---

3236 void PSParallelCompact::initialize_steal_record(uint which) {

I'm having a hard time understanding the details here. I get that all 
threads should have a separate shadow record, but I'm not sure why it is 
not enough to just do:
size_t record = _summary_data.addr_to_region_idx(
   _space_info[old_space_id].dense_prefix());
cm->set_shadow_record(record + which);

As you can see I'm also suggesting adding a setter for shadow_record.
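
Put together, the getter, setter and next_shadow_record() suggested above could look roughly like this. It is a sketch under assumptions: ShadowRecord is a hypothetical stand-in for the per-thread state in ParCompactionManager, and dense_prefix_region stands for the index computed by addr_to_region_idx() in the snippet above.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the per-thread shadow record state.
class ShadowRecord {
  std::size_t _shadow_record = 0;

public:
  // Simple getter instead of handing out a reference.
  std::size_t shadow_record() const { return _shadow_record; }

  void set_shadow_record(std::size_t record) { _shadow_record = record; }

  // Advance by the number of active workers and return the new record.
  std::size_t next_shadow_record(std::size_t active_workers) {
    assert(active_workers > 0 && "need at least one worker");
    _shadow_record += active_workers;
    return _shadow_record;
  }
};

// Worker 'which' starts at the dense prefix region plus its own index.
inline void initialize_shadow_record(ShadowRecord& cm,
                                     std::size_t dense_prefix_region,
                                     unsigned which) {
  cm.set_shadow_record(dense_prefix_region + which);
}
```

With N active workers, worker i then visits regions i, i + N, i + 2N and so on past the dense prefix, so no two workers start from the same record.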
---

3434 ParMarkBitMapClosure::IterationStatus
3435 ShadowClosure::do_addr(HeapWord* addr, size_t words) {
3436     HeapWord* shadow_destination = destination() + _offset;

Using an offset instead of a given address feels a bit backwards, did 
you consider letting the closure keep and update a _shadow_destination 
instead? Or would it even be possible to just set destination to be the 
shadow region address? In that case it should be possible to just use 
the do_addr and other functions from the MoveAndUpdateClosure.

I see from looking at this particular function that there is one assert 
that would have to change:
3408 
assert(PSParallelCompact::summary_data().calc_new_pointer(source(), 
compaction_manager()) ==
3409          destination(), "wrong destination");

This should be easily fixed by adding a virtual function 
check_destination, that has a special implementation for the ShadowClosure.
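
A sketch of how such a virtual check could look. The class names, the Word type and the constructor arguments are simplified placeholders (the real closures take a ParCompactionManager and work on HeapWord*), so treat this as an illustration of the idea rather than a proposed patch.

```cpp
#include <cassert>
#include <cstddef>

using Word = std::size_t;  // placeholder for HeapWord

// Simplified stand-in for MoveAndUpdateClosure.
class MoveClosure {
protected:
  Word* _destination;

public:
  explicit MoveClosure(Word* destination) : _destination(destination) {}
  virtual ~MoveClosure() = default;

  Word* destination() const { return _destination; }

  // Normal regions: the object is written directly to its new location.
  virtual bool check_destination(Word* new_location) const {
    return new_location == _destination;
  }

  // The shared do_addr() keeps its assert but delegates the comparison.
  void do_addr(Word* new_location, std::size_t words) {
    assert(check_destination(new_location) && "wrong destination");
    _destination += words;  // the actual copy is omitted in this sketch
  }
};

// Simplified stand-in for ShadowClosure: destination() points into the
// shadow region, which sits a fixed number of words away from the real one.
class ShadowMoveClosure : public MoveClosure {
  std::ptrdiff_t _offset;

public:
  ShadowMoveClosure(Word* region_start, Word* shadow_start)
    : MoveClosure(shadow_start), _offset(shadow_start - region_start) {}

  bool check_destination(Word* new_location) const override {
    return new_location + _offset == _destination;
  }
};
```

With this split, only the comparison differs between the two closures, and do_addr() can be shared unchanged.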
---

src/hotspot/share/gc/parallel/psParallelCompact.hpp
---
  333     // Preempt the region to avoid double processes
  334     inline bool push();
  335     // Mark the region as filled and ready to be copied back
  336     inline bool fill();
  337     // Preempt the region to copy the shadow region content back
  338     inline bool copy();

As mentioned, I think there might be better names for those functions 
and the comments. Maybe adding a prefix would make the code more self 
explaining. try_push(), mark_filled(), try_copy() and the new try_steal().
---

Thanks again for providing this patch, I look forward to seeing an updated 
version.

Cheers,
Stefan


On 2019-10-14 15:00, Stefan Johansson wrote:
> Thanks for the quick update Haoyu,
> 
> This is a great improvement and I will try to find time to look into the 
> patch in more detail the coming weeks.
> 
> Thanks,
> Stefan
> 
> On 2019-10-11 14:49, Haoyu Li wrote:
>> Hi Stefan,
>>
>> Thanks for your suggestion! It is indeed redundant that
>> PSParallelCompact::fill_shadow_region() copies most of its code from
>> PSParallelCompact::fill_region(), and therefore I've refactored these
>> two functions to share as much code as possible. The attachment is
>> the updated patch.
>>
>> Specifically, the closure, which moves objects, in
>> PSParallelCompact::fill_region() is now declared as a template of
>> either MoveAndUpdateClosure or ShadowClosure. So by controlling the
>> type of closure when invoking the function, we can decide whether to
>> fill a normal region or a shadow one. Thus, almost all code in
>> PSParallelCompact::fill_region() can be reused.
>>
>> Besides, a virtual function named complete_region() is added in both
>> closures to do some work after the filling, such as setting states and
>> copying the shadow region back.
>>
>> Thanks again for reviewing the patch, looking forward to your insights
>> and suggestions!
>>
>> Best Regards,
>> Haoyu Li
>>
>> 2019-10-10 21:50 GMT+08:00, Stefan Johansson 
>> <stefan.johansson at oracle.com>:
>>> Thanks for the clarification =)
>>>
>>> Moving on to the next part, the code in the patch. So this won't be a
>>> full review of the patch but just an initial comment that I would like
>>> to be addressed first.
>>>
>>> The new function PSParallelCompact::fill_shadow_region() is more or less
>>> a copy of PSParallelCompact::fill_region() and I understand that from a
>>> proof of concept point of view it was the easy (and right) way to do it.
>>> I would prefer if the code could be refactored so that fill_region() and
>>> fill_shadow_region() share more code. There might be reasons that I've
>>> missed that prevent it, but we should at least explore how much code
>>> can be shared.
>>>
>>> Thanks,
>>> Stefan
>>>
>>> On 2019-10-10 15:10, Haoyu Li wrote:
>>>> Hi Stefan,
>>>>
>>>> Thanks for your quick response! As to your concern about the OCA, I am
>>>> the sole author of the patch. And it is the case as what the agreement
>>>> states.
>>>> Best Regards,
>>>> Haoyu Li,
>>>>
>>>>
>>>> Stefan Johansson <stefan.johansson at oracle.com
>>>> <mailto:stefan.johansson at oracle.com>> wrote on Thu, Oct 10, 2019 at 8:37 PM:
>>>>
>>>> ???? Hi,
>>>>
>>>> ???? On 2019-10-10 13:06, Haoyu Li wrote:
>>>> ????? > Hi Stefan,
>>>> ????? >
>>>> ????? > Thanks for your testing! One possible reason for the 
>>>> regressions
>>>> in
>>>> ????? > simple tests is that the region dependencies maybe not heavy
>>>> enough.
>>>> ????? > Because the locality of shadow regions is lower than that of 
>>>> heap
>>>> ????? > regions, writing to shadow regions will be slower than to 
>>>> normal
>>>> ????? > regions, and this is a part of the reason why I reuse shadow
>>>> ???? regions.
>>>> ????? > Therefore, if only a few shadow regions are created and not
>>>> ???? reused, the
>>>> ????? > overhead may not be amortized.
>>>>
>>>> ???? I guess it is something like this. I thought that for "easy" heaps
>>>> the
>>>> ???? shadow regions won't be used at all, and should therefor not 
>>>> really
>>>> ???? cost
>>>> ???? anything.
>>>>
>>>> ????? >
>>>> ????? > As to the OCA, it is the case that I'm the only person 
>>>> signing the
>>>> ????? > agreement. Please let me know if you have any further 
>>>> questions.
>>>> ???? Thanks
>>>> ????? > again!
>>>>
>>>> ???? Ok, so you are the sole author of the patch. The important 
>>>> part, as
>>>> the
>>>> ???? agreement states, is:
>>>> ???? "no other person or entity, including my employer, has or will 
>>>> have
>>>> ???? rights with respect my contributions"
>>>>
>>>> ???? Is that the case?
>>>>
>>>> ???? Thanks,
>>>> ???? Stefan
>>>>
>>>>     >
>>>>     > Best Regards,
>>>>     > Haoyu Li
>>>>     >
>>>>     > Stefan Johansson <stefan.johansson at oracle.com> wrote on
>>>>     > 2019-10-08 at 6:49:
>>>>     >
>>>>     >     Hi Haoyu,
>>>>     >
>>>>     >     I've done some more testing and I haven't seen any issues with
>>>>     >     the patch so far and the performance looks promising in most
>>>>     >     cases. For simple tests I've seen some regressions, but I'm not
>>>>     >     really sure why. Will do some more digging.
>>>>     >
>>>>     >     To move forward with this the first thing we need to do is
>>>>     >     make sure that you being covered by the Oracle Contributor
>>>>     >     Agreement is enough. From what we can see it is only you as an
>>>>     >     individual that has signed the OCA and in that case it is
>>>>     >     important that this statement from the OCA is fulfilled: "no
>>>>     >     other person or entity, including my employer, has or will have
>>>>     >     rights with respect my contributions"
>>>>     >
>>>>     >     Is this the case for this contribution or should we have the
>>>>     >     university sign the OCA as well? For more information regarding
>>>>     >     the OCA please refer to:
>>>>     >     https://www.oracle.com/technetwork/oca-faq-405384.pdf
>>>>     >
>>>>     >     Thanks,
>>>>     >     Stefan
>>>>     >
>>>>     >     On 2019-09-16 16:02, Haoyu Li wrote:
>>>>     >     > FYI, the evaluation results on OpenJDK 14 are plotted in the
>>>>     >     > attachment. I compute the full GC throughput by dividing the
>>>>     >     > heap size before full GC by the GC pause time, and the results
>>>>     >     > are arithmetic mean values of ten runs after a warm-up run.
>>>>     >     > The evaluation is conducted on a machine with dual Intel Xeon
>>>>     >     > E5-2618L v3 CPUs (2 sockets, 16 physical cores with SMT
>>>>     >     > enabled) and 64G DRAM.
>>>>     >     >
>>>>     >     > Best Regards,
>>>>     >     > Haoyu Li,
>>>>     >     > Institute of Parallel and Distributed Systems (IPADS),
>>>>     >     > School of Software,
>>>>     >     > Shanghai Jiao Tong University
>>>>     >     >
>>>>     >     >
>>>>     >     > Stefan Johansson <stefan.johansson at oracle.com> wrote on
>>>>     >     > 2019-09-12 at 5:34:
>>>>     >     >
>>>>     >     >     Hi Haoyu,
>>>>     >     >
>>>>     >     >     I recently came across your patch and I would like to pick
>>>>     >     >     up on some of the things Kim mentioned in his mails. I
>>>>     >     >     especially want to evaluate and investigate if this is a
>>>>     >     >     technique we can use to improve the other GCs as well. To
>>>>     >     >     start that work I want to take the patch for a spin in our
>>>>     >     >     internal performance testing. The patch doesn't apply
>>>>     >     >     cleanly to the latest JDK repository, so if you could
>>>>     >     >     provide an updated patch that would be very helpful.
>>>>     >     >
>>>>     >     >     It would also be great if you could share some more
>>>>     >     >     information around the results presented in the paper. For
>>>>     >     >     example, it would be good to get the full command lines for
>>>>     >     >     the different benchmarks so we can run them locally and
>>>>     >     >     reproduce the results you've seen.
>>>>     >     >
>>>>     >     >     Thanks,
>>>>     >     >     Stefan
>>>>     >     >
>>>>     >     >>     On 12 March 2019 at 03:21, Haoyu Li <leihouyju at gmail.com>
>>>>     >     >>     wrote:
>>>>     >     >>
>>>>     >     >>     Hi Kim,
>>>>     >     >>
>>>>     >     >>     Thanks for reviewing and testing the patch. If there are
>>>>     >     >>     any failures or performance degradation relevant to the
>>>>     >     >>     work, please let me know and I'll be very happy to keep
>>>>     >     >>     improving it. Also, any suggestions about code
>>>>     >     >>     improvements are much appreciated.
>>>>     >     >>
>>>>     >     >>     I'm not quite sure whether G1 and Shenandoah have a
>>>>     >     >>     similar region dependency issue, since I haven't studied
>>>>     >     >>     their GC behaviors before. If they have, I'm also willing
>>>>     >     >>     to propose a more general optimization.
>>>>     >     >>
>>>>     >     >>     As to the memory overhead, I believe it will be low
>>>>     >     >>     because this patch exploits empty regions in the young
>>>>     >     >>     space rather than off-heap memory to allocate shadow
>>>>     >     >>     regions, and also reuses the _source_region field of each
>>>>     >     >>     RegionData to record the corresponding shadow region
>>>>     >     >>     index. We only introduce a new integer field _shadow in
>>>>     >     >>     the RegionData class to indicate the status of a region, a
>>>>     >     >>     global GrowableArray _free_shadow to store the indices of
>>>>     >     >>     shadow regions, and a global Monitor to protect the array.
>>>>     >     >>     This information might help if the memory overhead needs
>>>>     >     >>     to be evaluated.
>>>>     >     >>
>>>>     >     >>     Looking forward to your insight.
>>>>     >     >>
>>>>     >     >>     Best Regards,
>>>>     >     >>     Haoyu Li,
>>>>     >     >>     Institute of Parallel and Distributed Systems (IPADS),
>>>>     >     >>     School of Software,
>>>>     >     >>     Shanghai Jiao Tong University
>>>>     >     >>
>>>>     >     >>
>>>>     >     >>     Kim Barrett <kim.barrett at oracle.com> wrote on
>>>>     >     >>     2019-03-12 at 6:11:
>>>>     >     >>
>>>>     >     >>         > On Mar 11, 2019, at 1:45 AM, Kim Barrett
>>>>     >     >>         > <kim.barrett at oracle.com> wrote:
>>>>     >     >>         >
>>>>     >     >>         >> On Jan 24, 2019, at 3:58 AM, Haoyu Li
>>>>     >     >>         >> <leihouyju at gmail.com> wrote:
>>>>     >     >>         >>
>>>>     >     >>         >> Hi Kim,
>>>>     >     >>         >>
>>>>     >     >>         >> I have ported my patch to OpenJDK 13 according to
>>>>     >     >>         >> your instructions in your last mail, and the patch
>>>>     >     >>         >> is attached in this mail. The patch does not change
>>>>     >     >>         >> much since PSGC is indeed pretty stable.
>>>>     >     >>         >>
>>>>     >     >>         >> Also, I evaluated the correctness and performance of
>>>>     >     >>         >> PS full GC with benchmarks from the DaCapo,
>>>>     >     >>         >> SPECjvm2008, and JOlden suites on a machine with
>>>>     >     >>         >> dual Intel Xeon E5-2618L v3 CPUs (16 physical
>>>>     >     >>         >> cores), 64G DRAM and Linux kernel 4.17. The
>>>>     >     >>         >> evaluation result, indicating 1.9X GC throughput
>>>>     >     >>         >> improvement on average, is attached, too.
>>>>     >     >>         >>
>>>>     >     >>         >> However, I have no idea how to further test this
>>>>     >     >>         >> patch for both correctness and performance. Can I
>>>>     >     >>         >> please get any guidance from you or some sponsor?
>>>>     >     >>         >
>>>>     >     >>         > Sorry I missed that you had sent an updated version
>>>>     >     >>         > of the patch.
>>>>     >     >>         >
>>>>     >     >>         > I've run the full regression suite across
>>>>     >     >>         > Oracle-supported platforms. There are some failures,
>>>>     >     >>         > but there are almost always some failures in the
>>>>     >     >>         > later tiers right now. I'll start looking at them
>>>>     >     >>         > tomorrow to figure out whether any of them are
>>>>     >     >>         > relevant.
>>>>     >     >>         >
>>>>     >     >>         > I'm also planning to run some of our performance
>>>>     >     >>         > benchmarks.
>>>>     >     >>         >
>>>>     >     >>         > I've lightly skimmed the proposed changes. There
>>>>     >     >>         > might be some code improvements to be made.
>>>>     >     >>         >
>>>>     >     >>         > I'm also wondering if this technique applies to other
>>>>     >     >>         > collectors. It seems like both G1 and Shenandoah full
>>>>     >     >>         > gc's might have similar issues? If so, a solution
>>>>     >     >>         > that is ParallelGC-specific is less interesting than
>>>>     >     >>         > one that has broader applicability. Though maybe this
>>>>     >     >>         > optimization is less important for G1 and Shenandoah,
>>>>     >     >>         > since they actively try to avoid full gc's.
>>>>     >     >>         >
>>>>     >     >>         > I'm also not clear on how much additional memory
>>>>     >     >>         > might be temporarily allocated by this mechanism.
>>>>     >     >>
>>>>     >     >>         I've created a CR for this:
>>>>     >     >>         https://bugs.openjdk.java.net/browse/JDK-8220465
>>>>     >     >>
>>>>     >     >
>>>>     >
>>>>
>>>
>>
>>

From kim.barrett at oracle.com  Tue Oct 22 13:44:22 2019
From: kim.barrett at oracle.com (Kim Barrett)
Date: Tue, 22 Oct 2019 09:44:22 -0400
Subject: RFR (L): 8230706: Waiting on completion of strong nmethod
 processing causes long pause times with G1
In-Reply-To: <b8a41ba8-31d0-f9e8-4daa-41e861fb2856@oracle.com>
References: <f19bc69a-9bde-c063-2674-9564721ceede@oracle.com>
 <0F637570-EC97-47C5-B493-B33681133149@oracle.com>
 <5c6b06b1-de44-3cb7-7fc8-0b641df5f353@oracle.com>
 <DE7A950D-A877-4093-AFE8-363E6E079A28@oracle.com>
 <80DA3FD5-C2FA-44BF-83C5-AE0EA6AA3684@oracle.com>
 <d0b624a1fc2c7310986b79da1f65f3a8a851d20a.camel@oracle.com>
 <0D820E95-361A-4CAC-9BC3-99C39512D396@oracle.com>
 <b8a41ba8-31d0-f9e8-4daa-41e861fb2856@oracle.com>
Message-ID: <1898AC1E-0A8C-467C-9CA9-4B02C00A3A07@oracle.com>

> On Oct 22, 2019, at 6:13 AM, Thomas Schatzl <thomas.schatzl at oracle.com> wrote:
> 
> Hi Kim,
> 
>  thanks a lot for taking the time so quickly.
> 
> On 22.10.19 03:20, Kim Barrett wrote:
>>> On Oct 19, 2019, at 9:06 AM, Thomas Schatzl <thomas.schatzl at oracle.com> wrote:
>>> 
>>> Hi all,
>>> 
>>>  there is a new webrev at
>>> 
>>> http://cr.openjdk.java.net/~tschatzl/8230706/webrev.2/ (full only,
>>> there is no point in providing a diff)
>>> 
>>> since I like this solution a lot as it removes a lot of additional
>>> post-processing.
>>> [...]
>>>
>> I'm glad the new state machine worked out, and allowed the extra task
>> to be eliminated. Thanks for going the extra mile with the testing.
>> And thanks for turning my pseudo-code into something more readable. My
>> comments here mostly suggestions for more of that; I don't think I'd
>> want to have to decipher this in 6 months without some helpful
>> commentary. :)
> 
> I think I addressed all your comments, and thanks for your suggestions - I agree about having this tricky code well documented.
> 
> Changes are currently running through hs-tier1-5 with the changes that ease reproduction (the webrev.2.testing changes noted in the last email). Since there are no significant code changes apart from documentation, I am confident there will be no issues.
> 
> Webrevs:
> http://cr.openjdk.java.net/~tschatzl/8230706/webrev.2_to_3/ (diff)
> http://cr.openjdk.java.net/~tschatzl/8230706/webrev.3/ (full)
> 
> Thanks,
>  Thomas

Looks good.


From thomas.schatzl at oracle.com  Tue Oct 22 13:45:52 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Tue, 22 Oct 2019 15:45:52 +0200
Subject: RFR (L): 8230706: Waiting on completion of strong nmethod
 processing causes long pause times with G1
In-Reply-To: <1898AC1E-0A8C-467C-9CA9-4B02C00A3A07@oracle.com>
References: <f19bc69a-9bde-c063-2674-9564721ceede@oracle.com>
 <0F637570-EC97-47C5-B493-B33681133149@oracle.com>
 <5c6b06b1-de44-3cb7-7fc8-0b641df5f353@oracle.com>
 <DE7A950D-A877-4093-AFE8-363E6E079A28@oracle.com>
 <80DA3FD5-C2FA-44BF-83C5-AE0EA6AA3684@oracle.com>
 <d0b624a1fc2c7310986b79da1f65f3a8a851d20a.camel@oracle.com>
 <0D820E95-361A-4CAC-9BC3-99C39512D396@oracle.com>
 <b8a41ba8-31d0-f9e8-4daa-41e861fb2856@oracle.com>
 <1898AC1E-0A8C-467C-9CA9-4B02C00A3A07@oracle.com>
Message-ID: <7f150234-4080-b2f9-a791-b456038af795@oracle.com>

Hi Kim,

On 22.10.19 15:44, Kim Barrett wrote:
>> On Oct 22, 2019, at 6:13 AM, Thomas Schatzl <thomas.schatzl at oracle.com> wrote:
>>
>> Hi Kim,
>>
>>   thanks a lot for taking the time so quickly.
>>
>> On 22.10.19 03:20, Kim Barrett wrote:
>>>> On Oct 19, 2019, at 9:06 AM, Thomas Schatzl <thomas.schatzl at oracle.com> wrote:
>>>>
>>>> Hi all,
>>>>
>>>>   there is a new webrev at
>>>>
>>>> http://cr.openjdk.java.net/~tschatzl/8230706/webrev.2/ (full only,
>>>> there is no point in providing a diff)
>>>>
>>>> since I like this solution a lot as it removes a lot of additional
>>>> post-processing.
>>>> [...]
>>>>
>>> I'm glad the new state machine worked out, and allowed the extra task
>>> to be eliminated. Thanks for going the extra mile with the testing.
>>> And thanks for turning my pseudo-code into something more readable. My
>>> comments here are mostly suggestions for more of that; I don't think I'd
>>> want to have to decipher this in 6 months without some helpful
>>> commentary. :)
>>
>> I think I addressed all your comments, and thanks for your suggestions - I agree about having this tricky code well documented.
>>
>> Changes are currently running through hs-tier1-5 with the changes that ease reproduction (the webrev.2.testing changes noted in the last email). Since there are no significant code changes apart from documentation, I am confident there will be no issues.
>>
>> Webrevs:
>> http://cr.openjdk.java.net/~tschatzl/8230706/webrev.2_to_3/ (diff)
>> http://cr.openjdk.java.net/~tschatzl/8230706/webrev.3/ (full)
>>
>> Thanks,
>>   Thomas
> 
> Looks good.
> 

   thanks for your review.

As expected, the hs-tier1-5 testing found no issues in the meantime.

Thanks,
   Thomas


From shade at redhat.com  Tue Oct 22 13:55:04 2019
From: shade at redhat.com (Aleksey Shipilev)
Date: Tue, 22 Oct 2019 15:55:04 +0200
Subject: RFR 8232747: Shenandoah: Concurrent GC should deactivate SATB
 before processing weak roots
In-Reply-To: <9c29e649-eb27-be6e-2240-ad3ff99b7462@redhat.com>
References: <9c29e649-eb27-be6e-2240-ad3ff99b7462@redhat.com>
Message-ID: <e974f653-e4aa-cc0e-c6ac-142d6c15ca7a@redhat.com>

On 10/22/19 3:38 PM, Zhengyu Gu wrote:
> This is the counterpart of JDK-8231999[1] for Shenandoah concurrent GC. Shenandoah needs to
> deactivate SATB barrier before processing weak roots, to avoid barrier side-effects on its paths.
> 
> Bug: https://bugs.openjdk.java.net/browse/JDK-8232747
> Webrev: http://cr.openjdk.java.net/~zgu/JDK-8232747/webrev.00/index.html

*) Mmm... In ShenandoahConcurrentMark::finish_mark_from_roots, there is a call:
  _heap->parallel_cleaning(full_gc);

  Does it mean new code would perform cleaning twice?

*) This comment relates to keeping has_forwarded_objects set on cancelled path:

   // If we needed to update refs, and concurrent marking has been cancelled,
   // we need to finish updating references.

...current placement loses that connection. Suggestion:

   // If this cycle was updating references and got cancelled, we need to keep
   // the flag on, for subsequent phases to deal with it.

*) Maybe we should inline stop_concurrent_marking everywhere to make the flow more obvious...

-- 
Thanks,
-Aleksey


From shade at redhat.com  Tue Oct 22 14:29:03 2019
From: shade at redhat.com (Aleksey Shipilev)
Date: Tue, 22 Oct 2019 16:29:03 +0200
Subject: RFR (XS) 8232791: Shenandoah: passive mode should disable pacing
Message-ID: <9fb18df5-68fb-43e9-81fb-70318e67d8ba@redhat.com>

RFE:
  https://bugs.openjdk.java.net/browse/JDK-8232791

The rationale is in the RFE description.

Fix:
  https://cr.openjdk.java.net/~shade/8232791/webrev.01/

Testing: hotspot_gc_shenandoah

-- 
Thanks,
-Aleksey


From zgu at redhat.com  Tue Oct 22 14:30:59 2019
From: zgu at redhat.com (Zhengyu Gu)
Date: Tue, 22 Oct 2019 10:30:59 -0400
Subject: RFR (XS) 8232791: Shenandoah: passive mode should disable pacing
In-Reply-To: <9fb18df5-68fb-43e9-81fb-70318e67d8ba@redhat.com>
References: <9fb18df5-68fb-43e9-81fb-70318e67d8ba@redhat.com>
Message-ID: <a1ab53b2-a185-2281-48d5-151479609e97@redhat.com>

Good and trivial.

Thanks,

-Zhengyu

On 10/22/19 10:29 AM, Aleksey Shipilev wrote:
> RFE:
>    https://bugs.openjdk.java.net/browse/JDK-8232791
> 
> The rationale is in the RFE description.
> 
> Fix:
>    https://cr.openjdk.java.net/~shade/8232791/webrev.01/
> 
> Testing: hotspot_gc_shenandoah
> 


From zgu at redhat.com  Tue Oct 22 14:31:51 2019
From: zgu at redhat.com (Zhengyu Gu)
Date: Tue, 22 Oct 2019 10:31:51 -0400
Subject: RFR 8232747: Shenandoah: Concurrent GC should deactivate SATB
 before processing weak roots
In-Reply-To: <e974f653-e4aa-cc0e-c6ac-142d6c15ca7a@redhat.com>
References: <9c29e649-eb27-be6e-2240-ad3ff99b7462@redhat.com>
 <e974f653-e4aa-cc0e-c6ac-142d6c15ca7a@redhat.com>
Message-ID: <3e64dfe9-ea9e-60ed-6a51-1c5c466078c0@redhat.com>

Hi Aleksey,

On 10/22/19 9:55 AM, Aleksey Shipilev wrote:
> On 10/22/19 3:38 PM, Zhengyu Gu wrote:
>> This is the counterpart of JDK-8231999[1] for Shenandoah concurrent GC. Shenandoah needs to
>> deactivate SATB barrier before processing weak roots, to avoid barrier side-effects on its paths.
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8232747
>> Webrev: http://cr.openjdk.java.net/~zgu/JDK-8232747/webrev.00/index.html
> 
> *) Mmm... In ShenandoahConcurrentMark::finish_mark_from_roots, there is a call:
>    _heap->parallel_cleaning(full_gc);

It is removed by the following change:

diff --git a/src/hotspot/share/gc/shenandoah/shenandoahConcurrentMark.cpp b/src/hotspot/share/gc/shenandoah/shenandoahConcurrentMark.cpp
--- a/src/hotspot/share/gc/shenandoah/shenandoahConcurrentMark.cpp
+++ b/src/hotspot/share/gc/shenandoah/shenandoahConcurrentMark.cpp
@@ -442,8 +442,6 @@
      weak_refs_work(full_gc);
    }

-  _heap->parallel_cleaning(full_gc);
-
    assert(task_queues()->is_empty(), "Should be empty");
    TASKQUEUE_STATS_ONLY(task_queues()->print_taskqueue_stats());
    TASKQUEUE_STATS_ONLY(task_queues()->reset_taskqueue_stats());

> 
>    Does it mean new code would perform cleaning twice?
> 
> *) This comment relates to keeping has_forwarded_objects set on cancelled path:
> 
>     // If we needed to update refs, and concurrent marking has been cancelled,
>     // we need to finish updating references.
> 
> ...current placement loses that connection. Suggestion:
> 
>     // If this cycle was updating references and got cancelled, we need to keep
>     // the flag on, for subsequent phases to deal with it.
> 
> *) Maybe we should inline stop_concurrent_marking everywhere to make the flow more obvious...
> 

Updated: http://cr.openjdk.java.net/~zgu/JDK-8232747/webrev.01/index.html

Thanks,

-Zhengyu


From shade at redhat.com  Tue Oct 22 14:46:17 2019
From: shade at redhat.com (Aleksey Shipilev)
Date: Tue, 22 Oct 2019 16:46:17 +0200
Subject: RFR 8232747: Shenandoah: Concurrent GC should deactivate SATB
 before processing weak roots
In-Reply-To: <3e64dfe9-ea9e-60ed-6a51-1c5c466078c0@redhat.com>
References: <9c29e649-eb27-be6e-2240-ad3ff99b7462@redhat.com>
 <e974f653-e4aa-cc0e-c6ac-142d6c15ca7a@redhat.com>
 <3e64dfe9-ea9e-60ed-6a51-1c5c466078c0@redhat.com>
Message-ID: <b36605fe-a16f-26af-0c11-d318616a37ee@redhat.com>

On 10/22/19 4:31 PM, Zhengyu Gu wrote:
> Updated: http://cr.openjdk.java.net/~zgu/JDK-8232747/webrev.01/index.html

Right. Looks much better. Still, a few nits:

*) We don't need to assert these anymore (we never do in other places)

1481     assert(is_concurrent_mark_in_progress(), "How else could we get here?");
...
1585     assert(is_concurrent_mark_in_progress(), "How else could we get here?");

*) Newline between lines here, also capitalize "Marking..."

1479     concurrent_mark()->finish_mark_from_roots(/* full_gc = */ false);
1480     // marking is completed, deactivate SATB barrier

*) This is still awkwardly worded, that's my fault. Let's do this:

     concurrent_mark()->cancel();
     assert(is_concurrent_mark_in_progress(), "How else could we get here?");
     set_concurrent_mark_in_progress(false);

     // If this cycle was updating references, we need to keep the has_forwarded_objects
     // flag on, for subsequent phases to deal with it.

     if (process_references())

*) You tested hotspot_gc_shenandoah to verify that adding parallel_cleaning call in mark-compact
phase1 is safe, right?

Otherwise looks good.

-- 
Thanks,
-Aleksey


From shade at redhat.com  Tue Oct 22 15:14:18 2019
From: shade at redhat.com (Aleksey Shipilev)
Date: Tue, 22 Oct 2019 17:14:18 +0200
Subject: RFR (XS) 8232802: Shenandoah: transition between "cset" and
 "pinned_cset" does not require cancelled gc
Message-ID: <5f3b9301-a873-7166-6416-8ba4cd358039@redhat.com>

Bug:
  https://bugs.openjdk.java.net/browse/JDK-8232802

Fix:
  https://cr.openjdk.java.net/~shade/8232802/webrev.01/

The failure caught in testing says that transition from cset to pinned-cset is invalid when GC was
not cancelled. However, this was only true before JDK-8232575 work. Now, this transition is done in
sync_pinned_region_status that is supposed to work on all paths. In this case, Degenerated GC
dropped the cancelled GC flag already, and thus blows up the check.

The check is excessive and should be removed.
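
To make the intent concrete, here is a minimal sketch of the rule before and
after the change. It is only an illustration: RegionState and the
can_make_pinned_cset_* / make_pinned_cset helpers are hypothetical names made
up for this mail, not the actual ShenandoahHeapRegion code.

```cpp
#include <cassert>

// Hypothetical, simplified set of region states (the real region has more).
enum class RegionState { Regular, CSet, Pinned, PinnedCSet };

// Old rule as described above: cset -> pinned_cset was accepted only
// while the GC was flagged as cancelled.
bool can_make_pinned_cset_old(RegionState state, bool gc_cancelled) {
  return state == RegionState::CSet && gc_cancelled;
}

// Rule without the excessive check: only the current region state matters,
// so the transition also works after Degenerated GC dropped the flag.
bool can_make_pinned_cset_new(RegionState state) {
  return state == RegionState::CSet;
}

// Performs the transition under the relaxed rule and returns the new state.
RegionState make_pinned_cset(RegionState state) {
  assert(can_make_pinned_cset_new(state) && "only cset regions can become pinned_cset");
  return RegionState::PinnedCSet;
}
```

With the old rule, a cset region seen after the cancelled flag was dropped
fails the check; with the relaxed rule the same region transitions normally.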

Testing: hotspot_gc_shenandoah

-- 
Thanks,
-Aleksey


From zgu at redhat.com  Tue Oct 22 15:19:47 2019
From: zgu at redhat.com (Zhengyu Gu)
Date: Tue, 22 Oct 2019 11:19:47 -0400
Subject: RFR (XS) 8232802: Shenandoah: transition between "cset" and
 "pinned_cset" does not require cancelled gc
In-Reply-To: <5f3b9301-a873-7166-6416-8ba4cd358039@redhat.com>
References: <5f3b9301-a873-7166-6416-8ba4cd358039@redhat.com>
Message-ID: <4258260f-4737-077e-fd39-12f880e8fc16@redhat.com>

Good.

Thanks,

-Zhengyu

On 10/22/19 11:14 AM, Aleksey Shipilev wrote:
> Bug:
>    https://bugs.openjdk.java.net/browse/JDK-8232802
> 
> Fix:
>    https://cr.openjdk.java.net/~shade/8232802/webrev.01/
> 
> The failure caught in testing says that transition from cset to pinned-cset is invalid when GC was
> not cancelled. However, this was only true before JDK-8232575 work. Now, this transition is done in
> sync_pinned_region_status that is supposed to work on all paths. In this case, Degenerated GC
> dropped the cancelled GC flag already, and thus blows up the check.
> 
> The check is excessive and should be removed.
> 
> Testing: hotspot_gc_shenandoah
> 


From zgu at redhat.com  Tue Oct 22 16:00:39 2019
From: zgu at redhat.com (Zhengyu Gu)
Date: Tue, 22 Oct 2019 12:00:39 -0400
Subject: RFR 8232747: Shenandoah: Concurrent GC should deactivate SATB
 before processing weak roots
In-Reply-To: <b36605fe-a16f-26af-0c11-d318616a37ee@redhat.com>
References: <9c29e649-eb27-be6e-2240-ad3ff99b7462@redhat.com>
 <e974f653-e4aa-cc0e-c6ac-142d6c15ca7a@redhat.com>
 <3e64dfe9-ea9e-60ed-6a51-1c5c466078c0@redhat.com>
 <b36605fe-a16f-26af-0c11-d318616a37ee@redhat.com>
Message-ID: <2207188f-7d8c-3d3a-b1f6-3f4ead520c33@redhat.com>


On 10/22/19 10:46 AM, Aleksey Shipilev wrote:
> On 10/22/19 4:31 PM, Zhengyu Gu wrote:
>> Updated: http://cr.openjdk.java.net/~zgu/JDK-8232747/webrev.01/index.html
> 
> Right. Looks much better. Still, a few nits:
> 
> *) We don't need to assert these anymore (we never do in other places)
> 
> 1481     assert(is_concurrent_mark_in_progress(), "How else could we get here?");
> ...
> 1585     assert(is_concurrent_mark_in_progress(), "How else could we get here?");
> 
> *) Newline between lines here, also capitalize "Marking..."
> 
> 1479     concurrent_mark()->finish_mark_from_roots(/* full_gc = */ false);
> 1480     // marking is completed, deactivate SATB barrier
> 
> *) This is still awkwardly worded, that's my fault. Let's do this:
> 
>       concurrent_mark()->cancel();
>       assert(is_concurrent_mark_in_progress(), "How else could we get here?");
>       set_concurrent_mark_in_progress(false);
> 
>       // If this cycle was updating references, we need to keep the has_forwarded_objects
>       // flag on, for subsequent phases to deal with it.
> 
>       if (process_references())

All fixed and pushed.

> 
> *) You tested hotspot_gc_shenandoah to verify that adding parallel_cleaning call in mark-compact
> phase1 is safe, right?

Of course. And reran the tests after every iteration.

Thanks,

-Zhengyu

> 
> Otherwise looks good.
> 


From sangheon.kim at oracle.com  Tue Oct 22 16:47:56 2019
From: sangheon.kim at oracle.com (sangheon.kim at oracle.com)
Date: Tue, 22 Oct 2019 09:47:56 -0700
Subject: RFR(M): 8220311: Implementation: NUMA-Aware Memory Allocation for
 G1, Survivor (2/3)
In-Reply-To: <56788E04-DC92-461F-B3A7-DEEBC524DB5B@oracle.com>
References: <d153b49b-fbb2-0d73-37e0-ff1534a83086@oracle.com>
 <de0f8e9a-27d8-eaf3-99a7-7b57e1b419c1@oracle.com>
 <9a78e353-7908-b546-8f6a-7acd92eb40ac@oracle.com>
 <D4FAC1BE-C707-48DA-979F-03502E2651F5@oracle.com>
 <846eb849-8a49-5872-73d7-6bbc8f98369c@oracle.com>
 <56788E04-DC92-461F-B3A7-DEEBC524DB5B@oracle.com>
Message-ID: <3fe39096-43cb-4828-c042-0fc976a0307a@oracle.com>

Hi Kim,

On 10/22/19 12:19 AM, Kim Barrett wrote:
>> On Oct 22, 2019, at 1:52 AM, sangheon.kim at oracle.com wrote:
>> What do you think about below comment?
>>
>>    // Tries to allocate word_sz in the PLAB of the next "generation" after trying to
>>    // allocate into dest. Previous_plab_refill_failed indicates whether previous
>>    // PLAB refill for the original (source) object was failed.
> Drop "was".  Otherwise looks good.
Done.

Webrev:
http://cr.openjdk.java.net/~sangheki/8220311/webrev.3
http://cr.openjdk.java.net/~sangheki/8220311/webrev.3.inc

Thanks,
Sangheon


>
>>    // Returns a non-NULL pointer if successful, and updates dest if required.
>>    // Also determines whether we should continue to try to allocate into the various
>>    // generations or just end trying to allocate.
>>    HeapWord* allocate_in_next_plab(G1HeapRegionAttr* dest,
>> ...
>>
>> Let me post the webrev when we decide. :)
>>
>> Thanks,
>> Sangheon
>>
>>
>>> ------------------------------------------------------------------------------
>>>
>>> Looks good, other than that one comment issue.
>


From thomas.schatzl at oracle.com  Tue Oct 22 17:06:45 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Tue, 22 Oct 2019 19:06:45 +0200
Subject: RFR(M): 8220311: Implementation: NUMA-Aware Memory Allocation for
 G1, Survivor (2/3)
In-Reply-To: <3fe39096-43cb-4828-c042-0fc976a0307a@oracle.com>
References: <d153b49b-fbb2-0d73-37e0-ff1534a83086@oracle.com>
 <de0f8e9a-27d8-eaf3-99a7-7b57e1b419c1@oracle.com>
 <9a78e353-7908-b546-8f6a-7acd92eb40ac@oracle.com>
 <D4FAC1BE-C707-48DA-979F-03502E2651F5@oracle.com>
 <846eb849-8a49-5872-73d7-6bbc8f98369c@oracle.com>
 <56788E04-DC92-461F-B3A7-DEEBC524DB5B@oracle.com>
 <3fe39096-43cb-4828-c042-0fc976a0307a@oracle.com>
Message-ID: <649a42fa-3a31-e86c-90c8-f5a408fcfe39@oracle.com>

Hi,

On 22.10.19 18:47, sangheon.kim at oracle.com wrote:
> Hi Kim,
> 
> On 10/22/19 12:19 AM, Kim Barrett wrote:
>>> On Oct 22, 2019, at 1:52 AM, sangheon.kim at oracle.com wrote:
>>> What do you think about below comment?
>>>
>>>    // Tries to allocate word_sz in the PLAB of the next "generation" after trying to
>>>    // allocate into dest. Previous_plab_refill_failed indicates whether previous
>>>    // PLAB refill for the original (source) object was failed.
>> Drop "was". Otherwise looks good.
> Done.
> 
> Webrev:
> http://cr.openjdk.java.net/~sangheki/8220311/webrev.3
> http://cr.openjdk.java.net/~sangheki/8220311/webrev.3.inc
> 

   still good :)

Thomas


From thomas.schatzl at oracle.com  Tue Oct 22 17:30:15 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Tue, 22 Oct 2019 19:30:15 +0200
Subject: RFR (M): 8228609: G1 copy cost prediction uses used vs. actual copied
 bytes
Message-ID: <edf21992-b654-e3da-3a38-8321e0e5994a@oracle.com>

Hi all,

   can I have reviews for this change that makes G1 calculate and use the 
actual amount of bytes copied for Object Copy phase estimation?

The problem is that the "used" value that is currently used for this can 
differ a lot from the number of actually copied bytes during the 
parallel phases.

Sources for differences are:
  - TLAB sizing
  - TLAB/region fragmentation
  - all of that multiplied by the number of threads

Particularly if the amount of copied data is small compared to the 
number of regions all this can add up and disturb the prediction quite a 
lot, although overall it's not that bad.

It's only that this and other small inaccuracies add up.
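
To illustrate the gap described above, here is a minimal C++ sketch. It is 
not G1 code: ThreadCopyStats, used_bytes() and copied_bytes() are 
hypothetical names, and the sketch assumes each GC thread is handed a 
fixed-size allocation buffer of which only a part ends up holding copied 
objects.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-thread record for illustration; not a HotSpot type.
struct ThreadCopyStats {
  std::size_t buffer_bytes_allocated;  // space handed out to this thread
  std::size_t bytes_copied;            // bytes of objects actually copied
};

// "Used"-style figure: counts all space handed out, including unused tails.
std::size_t used_bytes(const std::vector<ThreadCopyStats>& stats) {
  std::size_t total = 0;
  for (const ThreadCopyStats& s : stats) {
    total += s.buffer_bytes_allocated;
  }
  return total;
}

// Figure based on what was really copied.
std::size_t copied_bytes(const std::vector<ThreadCopyStats>& stats) {
  std::size_t total = 0;
  for (const ThreadCopyStats& s : stats) {
    assert(s.bytes_copied <= s.buffer_bytes_allocated);
    total += s.bytes_copied;
  }
  return total;
}
```

With 8 threads that each hold a 64 KB buffer but copy only 4 KB, 
used_bytes() reports 512 KB while copied_bytes() reports 32 KB, so the 
difference grows with the number of threads.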

CR:
https://bugs.openjdk.java.net/browse/JDK-8228609
Webrev:
http://cr.openjdk.java.net/~tschatzl/8228609/webrev/
Testing:
hs-tier1-5

Thanks,
   Thomas


From thomas.schatzl at oracle.com  Tue Oct 22 17:35:38 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Tue, 22 Oct 2019 19:35:38 +0200
Subject: RFR (S): 8232776: G1 should always take rs_length_diff into account
 when predicting rs_lengths
Message-ID: <2e973399-ce75-7ab4-ce21-58fe63c74f9c@oracle.com>

Hi all,

   can I have reviews for this small change that makes G1 always use the 
error term for rs-length prediction, not only if G1 sees fit.

While rs length prediction is still kind of bad even with this change 
(and seemingly a band-aid), with that change it is a bit better. While 
there is a "real" fix for RS length estimation coming that so far looks 
really good, this change decreases complexity of further changes in 
G1Policy enough while improving the estimation.

CR:
https://bugs.openjdk.java.net/browse/JDK-8232776
Webrev:
http://cr.openjdk.java.net/~tschatzl/8232776/webrev/
Testing:
hs-tier1-5

Thanks,
   Thomas


From sangheon.kim at oracle.com  Tue Oct 22 17:36:54 2019
From: sangheon.kim at oracle.com (sangheon.kim at oracle.com)
Date: Tue, 22 Oct 2019 10:36:54 -0700
Subject: RFR(M): 8220311: Implementation: NUMA-Aware Memory Allocation for
 G1, Survivor (2/3)
In-Reply-To: <649a42fa-3a31-e86c-90c8-f5a408fcfe39@oracle.com>
References: <d153b49b-fbb2-0d73-37e0-ff1534a83086@oracle.com>
 <de0f8e9a-27d8-eaf3-99a7-7b57e1b419c1@oracle.com>
 <9a78e353-7908-b546-8f6a-7acd92eb40ac@oracle.com>
 <D4FAC1BE-C707-48DA-979F-03502E2651F5@oracle.com>
 <846eb849-8a49-5872-73d7-6bbc8f98369c@oracle.com>
 <56788E04-DC92-461F-B3A7-DEEBC524DB5B@oracle.com>
 <3fe39096-43cb-4828-c042-0fc976a0307a@oracle.com>
 <649a42fa-3a31-e86c-90c8-f5a408fcfe39@oracle.com>
Message-ID: <9855fa14-ebf5-4c80-082f-4a26e578ee66@oracle.com>

Thanks, Thomas!

Sangheon


On 10/22/19 10:06 AM, Thomas Schatzl wrote:
> Hi,
>
> On 22.10.19 18:47, sangheon.kim at oracle.com wrote:
>> Hi Kim,
>>
>> On 10/22/19 12:19 AM, Kim Barrett wrote:
>>>> On Oct 22, 2019, at 1:52 AM, sangheon.kim at oracle.com wrote:
>>>> What do you think about below comment?
>>>>
>>>>    // Tries to allocate word_sz in the PLAB of the next "generation" after trying to
>>>>    // allocate into dest. Previous_plab_refill_failed indicates whether previous
>>>>    // PLAB refill for the original (source) object was failed.
>>> Drop "was".  Otherwise looks good.
>> Done.
>>
>> Webrev:
>> http://cr.openjdk.java.net/~sangheki/8220311/webrev.3
>> http://cr.openjdk.java.net/~sangheki/8220311/webrev.3.inc
>>
>
>    still good :)
>
> Thomas


From thomas.schatzl at oracle.com  Tue Oct 22 18:02:27 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Tue, 22 Oct 2019 20:02:27 +0200
Subject: RFR (S): 8232777: Rename G1Policy::_max_rs_length as it is no maximum
Message-ID: <6d966a0f-2f56-4bae-4b55-47eeec7e9d81@oracle.com>

Hi all,

   can I have reviews for this small cleanup that renames 
G1Policy::_max_rs_length to just _rs_length because the contained value 
is simply no maximum. This causes some confusion down the line in its 
use (imo).

CR:
https://bugs.openjdk.java.net/browse/JDK-8232777
Webrev:
http://cr.openjdk.java.net/~tschatzl/8232777/webrev/
Testing:
local compilation

Thanks,
   Thomas


From thomas.schatzl at oracle.com  Tue Oct 22 18:05:13 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Tue, 22 Oct 2019 20:05:13 +0200
Subject: RFR (XS): 8232779: G1 current collection parallel time does not
 include optional evacuation
Message-ID: <15f7fa18-0334-c9ff-be69-c8ecb114e363@oracle.com>

Hi all,

   can I have reviews for this change that fixes the calculation of 
G1GCPhaseTimes::cur_collection_par_time_ms(): we forgot to consider the 
optional evacuation time.

This causes too long Other time, having minor effects on pause time 
prediction.
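
The effect on the Other time can be sketched as follows. This is a 
simplified illustration under stated assumptions, not the G1GCPhaseTimes 
code: PauseTimes and both functions are hypothetical, and the pause is 
assumed to consist only of the initial evacuation, the optional evacuation 
and everything else.

```cpp
#include <cassert>

// Hypothetical per-pause timings in milliseconds; not G1GCPhaseTimes.
struct PauseTimes {
  double total_pause_ms;
  double initial_evacuation_ms;
  double optional_evacuation_ms;
};

// Parallel time, with or without the optional evacuation part.
double parallel_time_ms(const PauseTimes& t, bool include_optional) {
  return t.initial_evacuation_ms +
         (include_optional ? t.optional_evacuation_ms : 0.0);
}

// "Other" is whatever remains of the pause after the parallel time.
double other_time_ms(const PauseTimes& t, bool include_optional) {
  double other = t.total_pause_ms - parallel_time_ms(t, include_optional);
  assert(other >= 0.0);
  return other;
}
```

For a 100 ms pause with 60 ms of initial and 25 ms of optional evacuation, 
leaving out the optional part yields 40 ms of Other time instead of 15 ms.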

CR:
https://bugs.openjdk.java.net/browse/JDK-8232779
Webrev:
http://cr.openjdk.java.net/~tschatzl/8232779/webrev/
Testing:
local compilation

Thanks,
   Thomas


From thomas.schatzl at oracle.com  Tue Oct 22 18:26:22 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Tue, 22 Oct 2019 20:26:22 +0200
Subject: RFR (M): 8227739: Merge cost predictions for scanning cards and log
 buffer entries
Message-ID: <a4a0fcd1-9310-b109-2a46-eac27e112776@oracle.com>

Hi all,

   can I have reviews for this change that aligns the cost predictions 
to the way we do evacuations, i.e. that we first drop all remembered 
sets onto the card table, and only a fraction of that will be scanned as 
introduced by JDK-8213108.

This code adds all the predictions for ratios etc to align to that code 
in our prediction model too.

After this change (and all previous changes just sent out for review, 
mostly JDK-8228609, which is a prerequisite for this change), 
predictions are a bit (noticeably) better than before :)

CR:
https://bugs.openjdk.java.net/browse/JDK-8227739
Webrev:
http://cr.openjdk.java.net/~tschatzl/8227739/webrev/
Testing:
hs-tier1-5, perf testing, pause time keeping improves a little

Thanks,
   Thomas


From kim.barrett at oracle.com  Tue Oct 22 19:08:09 2019
From: kim.barrett at oracle.com (Kim Barrett)
Date: Tue, 22 Oct 2019 15:08:09 -0400
Subject: RFR (XS): 8232779: G1 current collection parallel time does not
 include optional evacuation
In-Reply-To: <15f7fa18-0334-c9ff-be69-c8ecb114e363@oracle.com>
References: <15f7fa18-0334-c9ff-be69-c8ecb114e363@oracle.com>
Message-ID: <800ED894-9A67-4590-8C32-51DCE38E9C47@oracle.com>

> On Oct 22, 2019, at 2:05 PM, Thomas Schatzl <thomas.schatzl at oracle.com> wrote:
> 
> Hi all,
> 
>  can I have reviews for this change that fixes the calculation of G1GCPhaseTimes::cur_collection_par_time_ms(): we forgot to consider the optional evacuation time.
> 
> This causes too long Other time, having minor effects on pause time prediction.
> 
> CR:
> https://bugs.openjdk.java.net/browse/JDK-8232779
> Webrev:
> http://cr.openjdk.java.net/~tschatzl/8232779/webrev/
> Testing:
> local compilation
> 
> Thanks,
>  Thomas

Looks good.


From kim.barrett at oracle.com  Tue Oct 22 19:16:37 2019
From: kim.barrett at oracle.com (Kim Barrett)
Date: Tue, 22 Oct 2019 15:16:37 -0400
Subject: RFR (S): 8232777: Rename G1Policy::_max_rs_length as it is no
 maximum
In-Reply-To: <6d966a0f-2f56-4bae-4b55-47eeec7e9d81@oracle.com>
References: <6d966a0f-2f56-4bae-4b55-47eeec7e9d81@oracle.com>
Message-ID: <B1A3F1C4-27D1-4A74-9E8C-8E91D1BAAF27@oracle.com>

> On Oct 22, 2019, at 2:02 PM, Thomas Schatzl <thomas.schatzl at oracle.com> wrote:
> 
> Hi all,
> 
>  can I have reviews for this small cleanup that renames G1Policy::_max_rs_length to just _rs_length because the contained value is simply no maximum. This causes some confusion down the line in its use (imo).
> 
> CR:
> https://bugs.openjdk.java.net/browse/JDK-8232777
> Webrev:
> http://cr.openjdk.java.net/~tschatzl/8232777/webrev/
> Testing:
> local compilation
> 
> Thanks,
>  Thomas

You missed one in a comment:
src/hotspot/share/gc/g1/g1Policy.cpp
 757     // This is defensive. For a while _max_rs_length could get

Other than that, looks good, and trivial.


From kim.barrett at oracle.com  Tue Oct 22 20:08:06 2019
From: kim.barrett at oracle.com (Kim Barrett)
Date: Tue, 22 Oct 2019 16:08:06 -0400
Subject: RFR (S): 8232776: G1 should always take rs_length_diff into
 account when predicting rs_lengths
In-Reply-To: <2e973399-ce75-7ab4-ce21-58fe63c74f9c@oracle.com>
References: <2e973399-ce75-7ab4-ce21-58fe63c74f9c@oracle.com>
Message-ID: <29DA6617-8933-4184-9892-C55DA13989CF@oracle.com>

> On Oct 22, 2019, at 1:35 PM, Thomas Schatzl <thomas.schatzl at oracle.com> wrote:
> 
> Hi all,
> 
>  can I have reviews for this small change that makes G1 always use the error term for rs-length prediction, not only if G1 sees fit.
> 
> While rs length prediction is still kind of bad even with this change (and seemingly a band-aid), with that change it is a bit better. While there is a "real" fix for RS length estimation coming that so far looks really good, this change decreases complexity of further changes in G1Policy enough while improving the estimation.
> 
> CR:
> https://bugs.openjdk.java.net/browse/JDK-8232776
> Webrev:
> http://cr.openjdk.java.net/~tschatzl/8232776/webrev/
> Testing:
> hs-tier1-5
> 
> Thanks,
>  Thomas

Looks good.


From sangheon.kim at oracle.com  Tue Oct 22 20:46:45 2019
From: sangheon.kim at oracle.com (sangheon.kim at oracle.com)
Date: Tue, 22 Oct 2019 13:46:45 -0700
Subject: RFR(L): 8220312: Implementation: NUMA-Aware Memory Allocation for
 G1, Logging (3/3)
In-Reply-To: <b5f39fc2-3319-a81c-25b4-f979282aef9f@oracle.com>
References: <e7c52f60-a5c7-072a-4e3b-65c608907679@oracle.com>
 <e903223b-90a5-9d01-5421-a47011bd5985@oracle.com>
 <ba8c3fa4-9ee1-6a98-d13f-ffaacc59025c@oracle.com>
 <b5f39fc2-3319-a81c-25b4-f979282aef9f@oracle.com>
Message-ID: <743d16cc-499d-784b-79fc-c006643f9ec5@oracle.com>

Hi Thomas,

Thanks for your review!

On 10/21/19 7:09 AM, Thomas Schatzl wrote:
> Hi,
>
>   some initial comments looking at the log output:
>
> On 13.10.19 08:16, sangheon.kim at oracle.com wrote:
>> Hi all,
>>
>> Previous patch conflicts because of JDK-8220310, I'm posting rebased 
>> one with some refactoring.
>>
>> Webrev:
>> http://cr.openjdk.java.net/~sangheki/8220312/webrev.2
>> Testing: hs-tier 1 ~ 5, with/without UseNUMA
>>
>> Here's the full patch of 8220310, 8220311 and 8220312.
>> http://cr.openjdk.java.net/~sangheki/8220312/webrev.full.2/
>>
>
>   - I did not performance impact test the additional logging yet, but 
> I do not expect issues.
>
>   - that's something from the first NUMA patch:
>
> There is this gc+heap+numa=debug log message "Request memory [address, 
> address] to be numa id (X)." for every region.
>
> First, it seems to be on the wrong level, consider a heap with 
> ten-thousands of regions. This imo clogs the log too much, and I would 
> prefer to move this information to trace level.
Moved to Trace level.

>
> Second, the full stop at the end is not necessary :)
Removed.

>
>   - the G1HRPrinter should be made NUMA aware, i.e. print expected 
> NUMA id for this region
>
>   - the casing of NUMA changes depending on message, i.e. sometimes 
> "NUMA" and other times "numa" in the log messages themselves. I would 
> recommend uniformly use "NUMA".
Changed to "NUMA".

>
> However I think that all the "NUMA id" in these messages should read 
> "node id" as at that level we do not manage the OS level NUMA ids any 
> more.
We don't manage but users may configure OS level NUMA ids (e.g. via 
numactl), so I wanted to print all logs with NUMA id.

>
>   - the "numa id" values in the various messages are formatted 
> differently in the different messages with no apparent guideline: 
> sometimes the code adds the leading zeros, sometimes not. Also the 
> separator between node id and value is sometimes ":" and once "="
>
> E.g.
>
> "NUMA id verification: preferred id (matched #): 00 (32), 01 (32), ..."
> "Region Allocated / Requested: 99% xxxx/yyyy (numa id 0: 99% ..."
>
> I am kind of undecided what is best, but probably simply leaving out 
> the leading zeros is best for the large majority of cases.
Okay, will remove leading zeros.

>
>   - just a suggestion: "Region Allocated / Requested" -> "Placement 
> Match Ratio" or so. Maybe somebody else has a better name.
"Placement match ratio" feels better but to align with below message, 
changed to lower case.

>
> Also in that message I would not print "numa id" at all to make the 
> message shorter.
>
>   - "Worker threads local object process rate" -> "Worker task 
> locality match rate" seems shorter.
Changed to "Worker task locality match ratio"

>
> Again, to make the message shorter I would prefer that "numa id" were 
> not printed at all in the details.
Tried to minimize but not zero occurrence.

>
> Not sure if that rate at this point is extremely interesting since G1 
> won't even try to improve it at this time, but you can leave it in if 
> you want.
Yeah, I know. But this is sort of logging framework for NUMA, so I would 
like to leave as is.

>
>   - I would *probably* like to have most of these messages split into 
> "recent" and "total" statistics. Maybe others think that the totals 
> are okay.
Interesting idea.
Could you expand your suggestion a bit more?
What is "recent"? Or do you mean per GC cycle?

>
>   - Again, to save space I would prefer to have the per-node details 
> in the region summaries in the same line as the original output. I.e. 
> instead of
>
> Eden regions: 28->0 (29)
>   From numa id 0: 18->0
>   From numa id 1: 10->0
>
> the following would be much shorter:
>
> Eden regions: 28->0 (29) (0: 18->0, 1: 10->0)
>
> As with higher node counts you will get lots of lines with little 
> content imho. Maybe others think differently?
I like your suggestion.
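
A minimal sketch of how the single-line form could be produced. The names 
NodeCounts and format_region_summary() are hypothetical and do not come 
from the webrev, the value in the first parentheses is simply passed 
through as `limit`, and the sketch assumes every region belongs to a known 
node so that the per-node counts add up to the totals.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical per-node region counts before and after a collection.
struct NodeCounts {
  unsigned before;
  unsigned after;
};

// Builds e.g. "Eden regions: 28->0 (29) (0: 18->0, 1: 10->0)".
std::string format_region_summary(const std::string& name,
                                  unsigned before, unsigned after,
                                  unsigned limit,
                                  const std::vector<NodeCounts>& per_node) {
  std::ostringstream out;
  out << name << " regions: " << before << "->" << after
      << " (" << limit << ")";
  if (!per_node.empty()) {
    unsigned sum_before = 0;
    unsigned sum_after = 0;
    out << " (";
    for (std::size_t i = 0; i < per_node.size(); ++i) {
      if (i > 0) {
        out << ", ";
      }
      // The node index is printed without leading zeros.
      out << i << ": " << per_node[i].before << "->" << per_node[i].after;
      sum_before += per_node[i].before;
      sum_after += per_node[i].after;
    }
    out << ")";
    // Assumption of this sketch: per-node counts add up to the totals.
    assert(sum_before == before && sum_after == after);
  }
  return out.str();
}
```

Because the per-node details sit on the same line as the totals, they can 
never appear in the log without their context.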

>
> Also, this would "fix" the problem that when you enabled gc+heap+numa 
> but not gc+heap, you will see these "From numa id" numbers in the log 
> without their required context. Alternatively, gc+heap+numa could 
> automatically enable gc+heap at the same level.
Yeah, I know this issue and this is why I like your suggestion! :)

>
> Comments after some superficial look at the changes themselves:
>
>   - G1Regions should be renamed as G1RegionCounts and get a single 
> line comment like: "Contains per Node id region count".
Done.

>
>   - G1NodeTimes::Stat: it would probably be useful to have a "rate()" 
> getter that recalculates the value as needed instead of the member.
I'm okay with your suggestion so I tried. :)
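
A sketch of what such a getter could look like, assuming the statistic is 
a pair of counters. This Stat is a hypothetical stand-in and not the 
actual G1NodeTimes::Stat from the webrev; the field names are made up for 
illustration.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a per-node statistic.
struct Stat {
  std::size_t hit = 0;        // e.g. requests satisfied on the wanted node
  std::size_t requested = 0;  // e.g. all requests for that node

  void update(std::size_t hit_count, std::size_t requested_count) {
    assert(hit_count <= requested_count);
    hit += hit_count;
    requested += requested_count;
  }

  // Percentage computed on demand, so no cached member can go stale.
  double rate() const {
    return requested == 0 ? 0.0 : (100.0 * hit) / requested;
  }
};
```

Computing the value in the getter keeps the struct down to the two 
counters and removes the need to refresh a stored rate after every update.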

>
>   - G1HeapTransition::Data::~Data: the "if (something != NULL)" checks 
> are unnecessary. FREE_C_HEAP_ARRAY does that already.
Done.

>
> Same in G1ParScanThreadState::G1ParscanThreadState.
Done.

>
>   - I do not understand the name "G1NodeTimes" :) What "time" is that 
> referring to?
It meant 'Phase Times' similar to G1GCPhaseTimes or 
ReferenceProcessorPhaseTimes.
G1NUMAPhaseTimes is better?
Or any suggestion for a name?

>
>   - G1NUMA::clear_statistics() seems to be unused.
Removed G1NUMA::clear_statistics().

>
>   - G1NodeTimes::print_mutator_alloc_stat_info() and 
> G1NodeTimes::copy_to_survivor_stat_info() look very similar. Could the 
> code be refactored a bit?
Good catch. Done.

I mostly addressed your comments except below two:
- I would *probably* like to have most of these messages split into 
"recent" and "total" statistics. Maybe others think that the totals are 
okay.
- I do not understand the name "G1NodeTimes" :) What "time" is that 
referring to?

I will post next webrev, if I get other reviews.

Thanks,
Sangheon


>
> Thanks,
>   Thomas


From sangheon.kim at oracle.com  Tue Oct 22 20:49:25 2019
From: sangheon.kim at oracle.com (sangheon.kim at oracle.com)
Date: Tue, 22 Oct 2019 13:49:25 -0700
Subject: RFR (S): 8232776: G1 should always take rs_length_diff into
 account when predicting rs_lengths
In-Reply-To: <2e973399-ce75-7ab4-ce21-58fe63c74f9c@oracle.com>
References: <2e973399-ce75-7ab4-ce21-58fe63c74f9c@oracle.com>
Message-ID: <f1bb2f33-a451-1430-2c70-de66ef433fec@oracle.com>

Hi Thomas,

On 10/22/19 10:35 AM, Thomas Schatzl wrote:
> Hi all,
>
>   can I have reviews for this small change that makes G1 always use 
> the error term for rs-length prediction, not only if G1 sees fit.
>
> While rs length prediction is still kind of bad even with this change 
> (and seemingly a band-aid), with that change it is a bit better. While 
> there is a "real" fix for RS length estimation coming that so far 
> looks really good, this change decreases complexity of further changes 
> in G1Policy enough while improving the estimation.
>
> CR:
> https://bugs.openjdk.java.net/browse/JDK-8232776
> Webrev:
> http://cr.openjdk.java.net/~tschatzl/8232776/webrev/
Looks good.

Thanks,
Sangheon


> Testing:
> hs-tier1-5
>
> Thanks,
>   Thomas


From sangheon.kim at oracle.com  Tue Oct 22 20:50:45 2019
From: sangheon.kim at oracle.com (sangheon.kim at oracle.com)
Date: Tue, 22 Oct 2019 13:50:45 -0700
Subject: RFR (XS): 8232779: G1 current collection parallel time does not
 include optional evacuation
In-Reply-To: <15f7fa18-0334-c9ff-be69-c8ecb114e363@oracle.com>
References: <15f7fa18-0334-c9ff-be69-c8ecb114e363@oracle.com>
Message-ID: <fec08635-993f-4054-62b8-d0ac7e11eeed@oracle.com>

Hi Thomas,

On 10/22/19 11:05 AM, Thomas Schatzl wrote:
> Hi all,
>
>   can I have reviews for this change that fixes the calculation of 
> G1GCPhaseTimes::cur_collection_par_time_ms(): we forgot to consider 
> the optional evacuation time.
>
> This causes too long Other time, having minor effects on pause time 
> prediction.
>
> CR:
> https://bugs.openjdk.java.net/browse/JDK-8232779
> Webrev:
> http://cr.openjdk.java.net/~tschatzl/8232779/webrev/
Looks good.

Thanks,
Sangheon


> Testing:
> local compilation
>
> Thanks,
>   Thomas


From stefan.johansson at oracle.com  Wed Oct 23 06:16:06 2019
From: stefan.johansson at oracle.com (Stefan Johansson)
Date: Wed, 23 Oct 2019 08:16:06 +0200
Subject: RFR (S): 8232777: Rename G1Policy::_max_rs_length as it is no
 maximum
In-Reply-To: <B1A3F1C4-27D1-4A74-9E8C-8E91D1BAAF27@oracle.com>
References: <6d966a0f-2f56-4bae-4b55-47eeec7e9d81@oracle.com>
 <B1A3F1C4-27D1-4A74-9E8C-8E91D1BAAF27@oracle.com>
Message-ID: <1366cea1-9e01-2982-e211-7417e63be46f@oracle.com>


On 2019-10-22 21:16, Kim Barrett wrote:
>> On Oct 22, 2019, at 2:02 PM, Thomas Schatzl <thomas.schatzl at oracle.com> wrote:
>>
>> Hi all,
>>
>>   can I have reviews for this small cleanup that renames G1Policy::_max_rs_length to just _rs_length because the contained value is simply no maximum. This causes some confusion down the line in its use (imo).
>>
>> CR:
>> https://bugs.openjdk.java.net/browse/JDK-8232777
>> Webrev:
>> http://cr.openjdk.java.net/~tschatzl/8232777/webrev/
>> Testing:
>> local compilation
>>
>> Thanks,
>>   Thomas
> 
> You missed one in a comment:
> src/hotspot/share/gc/g1/g1Policy.cpp
>   757     // This is defensive. For a while _max_rs_length could get
> 
> Other than that, looks good, and trivial.
> 
Looks good,
Stefan


From sangheon.kim at oracle.com  Wed Oct 23 06:39:19 2019
From: sangheon.kim at oracle.com (sangheon.kim at oracle.com)
Date: Tue, 22 Oct 2019 23:39:19 -0700
Subject: RFR(L): 8220312: Implementation: NUMA-Aware Memory Allocation for
 G1, Logging (3/3)
In-Reply-To: <743d16cc-499d-784b-79fc-c006643f9ec5@oracle.com>
References: <e7c52f60-a5c7-072a-4e3b-65c608907679@oracle.com>
 <e903223b-90a5-9d01-5421-a47011bd5985@oracle.com>
 <ba8c3fa4-9ee1-6a98-d13f-ffaacc59025c@oracle.com>
 <b5f39fc2-3319-a81c-25b4-f979282aef9f@oracle.com>
 <743d16cc-499d-784b-79fc-c006643f9ec5@oracle.com>
Message-ID: <c55f0f8f-af07-eb42-202d-760f11170aa7@oracle.com>

Hi Thomas,

I am posting the next webrev as Kim is waiting for it.

Webrev:
http://cr.openjdk.java.net/~sangheki/8220312/webrev.3
http://cr.openjdk.java.net/~sangheki/8220312/webrev.3.inc
Testing: hs-tier 1 ~ 4 with/without UseNUMA. hs-tier5 is almost finished 
without new failures.

Thanks,
Sangheon


On 10/22/19 1:46 PM, sangheon.kim at oracle.com wrote:
> Hi Thomas,
>
> Thanks for your review!
>
> On 10/21/19 7:09 AM, Thomas Schatzl wrote:
>> Hi,
>>
>>   some initial comments looking at the log output:
>>
>> On 13.10.19 08:16, sangheon.kim at oracle.com wrote:
>>> Hi all,
>>>
>>> Previous patch conflicts because of JDK-8220310, I'm posting rebased 
>>> one with some refactoring.
>>>
>>> Webrev:
>>> http://cr.openjdk.java.net/~sangheki/8220312/webrev.2
>>> Testing: hs-tier 1 ~ 5, with/without UseNUMA
>>>
>>> Here's the full patch of 8220310, 8220311 and 8220312.
>>> http://cr.openjdk.java.net/~sangheki/8220312/webrev.full.2/
>>>
>>
>>   - I did not performance impact test the additional logging yet, but 
>> I do not expect issues.
>>
>>   - that's something from the first NUMA patch:
>>
>> There is this gc+heap+numa=debug log message "Request memory 
>> [address, address] to be numa id (X)." for every region.
>>
>> First, it seems to be on the wrong level, consider a heap with 
>> ten-thousands of regions. This imo clogs the log too much, and I 
>> would prefer to move this information to trace level.
> Moved to Trace level.
>
>>
>> Second, the full stop at the end is not necessary :)
> Removed.
>
>>
>>   - the G1HRPrinter should be made NUMA aware, i.e. print expected 
>> NUMA id for this region
>>
>>   - the casing of NUMA changes depending on message, i.e. sometimes 
>> "NUMA" and other times "numa" in the log messages themselves. I would 
>> recommend uniformly use "NUMA".
> Changed to "NUMA".
>
>>
>> However I think that all the "NUMA id" in these messages should read 
>> "node id" as at that level we do not manage the OS level NUMA ids any 
>> more.
> We don't manage but users may configure OS level NUMA ids (e.g. via 
> numactl), so I wanted to print all logs with NUMA id.
>
>>
>>   - the "numa id" values in the various messages are formatted 
>> differently in the different messages with no apparent guideline: 
>> sometimes the code adds the leading zeros, sometimes not. Also the 
>> separator between node id and value is sometimes ":" and once "="
>>
>> E.g.
>>
>> "NUMA id verification: preferred id (matched #): 00 (32), 01 (32), ..."
>> "Region Allocated / Requested: 99% xxxx/yyyy (numa id 0: 99% ..."
>>
>> I am kind of undecided what is best, but probably simply leaving out 
>> the leading zeros is best for the large majority of cases.
> Okay, will remove leading zeros.
>
>>
>>   - just a suggestion: "Region Allocated / Requested" -> "Placement 
>> Match Ratio" or so. Maybe somebody else has a better name.
> "Placement match ratio" feels better but to align with below message, 
> changed to lower case.
>
>>
>> Also in that message I would not print "numa id" at all to make the 
>> message shorter.
>>
>>   - "Worker threads local object process rate" -> "Worker task 
>> locality match rate" seems shorter.
> Changed to "Worker task locality match ratio"
>
>>
>> Again, to make the message shorter I would prefer that "numa id" were 
>> not printed at all in the details.
> Tried to minimize but not zero occurrence.
>
>>
>> Not sure if that rate at this point is extremely interesting since G1 
>> won't even try to improve it at this time, but you can leave it in if 
>> you want.
> Yeah, I know. But this is sort of logging framework for NUMA, so I 
> would like to leave as is.
>
>>
>>   - I would *probably* like to have most of these messages split into 
>> "recent" and "total" statistics. Maybe others think that the totals 
>> are okay.
> Interesting idea.
> Could you expand your suggestion a bit more?
> What is "recent"? Or do you mean per GC cycle?
>
>>
>>   - Again, to save space I would prefer to have the per-node details 
>> in the region summaries in the same line as the original output. I.e. 
>> instead of
>>
>> Eden regions: 28->0 (29)
>>   From numa id 0: 18->0
>>   From numa id 1: 10->0
>>
>> the following would be much shorter:
>>
>> Eden regions: 28->0 (29) (0: 18->0, 1: 10->0)
>>
>> As with higher node counts you will get lots of lines with little 
>> content imho. Maybe others think differently?
> I like your suggestion.
>
>>
>> Also, this would "fix" the problem that when you enabled gc+heap+numa 
>> but not gc+heap, you will see these "From numa id" numbers in the log 
>> without their required context. Alternatively, gc+heap+numa could 
>> automatically enable gc+heap at the same level.
> Yeah, I know this issue and this is why I like your suggestion! :)
>
>>
>> Comments after some superficial look at the changes themselves:
>>
>>   - G1Regions should be renamed as G1RegionCounts and get a single 
>> line comment like: "Contains per Node id region count".
> Done.
>
>>
>>   - G1NodeTimes::Stat: it would probably be useful to have a "rate()" 
>> getter that recalculates the value as needed instead of the member.
> I'm okay with your suggestion so I tried. :)
>
>>
>>   - G1HeapTransition::Data::~Data: the "if (something != NULL)" 
>> checks are unnecessary. FREE_C_HEAP_ARRAY does that already.
> Done.
>
>>
>> Same in G1ParScanThreadState::G1ParscanThreadState.
> Done.
>
>>
>>   - I do not understand the name "G1NodeTimes" :) What "time" is that 
>> referring to?
> It meant 'Phase Times' similar to G1GCPhaseTimes or 
> ReferenceProcessorPhaseTimes.
> G1NUMAPhaseTimes is better?
> Or any suggestion for a name?
>
>>
>>   - G1NUMA::clear_statistics() seems to be unused.
> Removed G1NUMA::clear_statistics().
>
>>
>>   - G1NodeTimes::print_mutator_alloc_stat_info() and 
>> G1NodeTimes::copy_to_survivor_stat_info() look very similar. Could 
>> the code be refactored a bit?
> Good catch. Done.
>
> I mostly addressed your comments except below two:
> - I would *probably* like to have most of these messages split into 
> "recent" and "total" statistics. Maybe others think that the totals 
> are okay.
> - I do not understand the name "G1NodeTimes" :) What "time" is that 
> referring to?
>
> I will post next webrev, if I get other reviews.
>
> Thanks,
> Sangheon
>
>
>>
>> Thanks,
>>   Thomas
>


From stefan.johansson at oracle.com  Wed Oct 23 07:05:58 2019
From: stefan.johansson at oracle.com (Stefan Johansson)
Date: Wed, 23 Oct 2019 09:05:58 +0200
Subject: RFR (L): 8230706: Waiting on completion of strong nmethod
 processing causes long pause times with G1
In-Reply-To: <7f150234-4080-b2f9-a791-b456038af795@oracle.com>
References: <f19bc69a-9bde-c063-2674-9564721ceede@oracle.com>
 <0F637570-EC97-47C5-B493-B33681133149@oracle.com>
 <5c6b06b1-de44-3cb7-7fc8-0b641df5f353@oracle.com>
 <DE7A950D-A877-4093-AFE8-363E6E079A28@oracle.com>
 <80DA3FD5-C2FA-44BF-83C5-AE0EA6AA3684@oracle.com>
 <d0b624a1fc2c7310986b79da1f65f3a8a851d20a.camel@oracle.com>
 <0D820E95-361A-4CAC-9BC3-99C39512D396@oracle.com>
 <b8a41ba8-31d0-f9e8-4daa-41e861fb2856@oracle.com>
 <1898AC1E-0A8C-467C-9CA9-4B02C00A3A07@oracle.com>
 <7f150234-4080-b2f9-a791-b456038af795@oracle.com>
Message-ID: <8126d900-714b-585a-f2f0-4ce13f71501c@oracle.com>

Hi Thomas,

On 2019-10-22 15:45, Thomas Schatzl wrote:
> Hi Kim,
> 
> On 22.10.19 15:44, Kim Barrett wrote:
>>> On Oct 22, 2019, at 6:13 AM, Thomas Schatzl 
>>> <thomas.schatzl at oracle.com> wrote:
>>>
>>> Hi Kim,
>>>
>>>   thanks a lot for taking the time so quickly.
>>>
>>> On 22.10.19 03:20, Kim Barrett wrote:
>>>>> On Oct 19, 2019, at 9:06 AM, Thomas Schatzl 
>>>>> <thomas.schatzl at oracle.com> wrote:
>>>>>
>>>>> Hi all,
>>>>>
>>>>>   there is a new webrev at
>>>>>
>>>>> http://cr.openjdk.java.net/~tschatzl/8230706/webrev.2/ (full only,
>>>>> there is no point in providing a diff)
>>>>>
>>>>> since I like this solution a lot as it removes a lot of additional
>>>>> post-processing.
>>>>> [...]
>>>>>
>>>> I'm glad the new state machine worked out, and allowed the extra task
>>>> to be eliminated. Thanks for going the extra mile with the testing.
>>>> And thanks for turning my pseudo-code into something more readable. My
>>>> comments here mostly suggestions for more of that; I don't think I'd
>>>> want to have to decipher this in 6 months without some helpful
>>>> commentary. :)
>>>
>>> I think I addressed all your comments, and thanks for your 
>>> suggestions - I agree about having this tricky code well documented.
>>>
>>> Changes are currently running through hs-tier1-5 with the changes 
>>> that ease reproduction (the webrev.2.testing changes noted in the 
>>> last email). Since there are no significant code changes apart from 
>>> documentation, I am confident there will be no issues.
>>>
>>> Webrevs:
>>> http://cr.openjdk.java.net/~tschatzl/8230706/webrev.2_to_3/ (diff)
>>> http://cr.openjdk.java.net/~tschatzl/8230706/webrev.3/ (full)

This looks good, and well documented :)

One small thing:
src/hotspot/share/gc/g1/g1SharedClosures.hpp
---
  46     _codeblobs(pss->worker_id(), &_oops, Mark == G1MarkFromRoot) {}

What do you think about adding a helper for Mark == G1MarkFromRoot, 
something like need_strong_processing(), and a comment explaining that it 
will be true during initial mark?
---
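
A rough sketch of the kind of helper meant here. It is standalone and 
hypothetical: the G1Mark enum below is a stand-in declared only so the 
example compiles, and SharedClosuresSketch is not the class from 
g1SharedClosures.hpp.

```cpp
#include <cassert>

// Stand-in for the G1Mark enum used by the quoted line; declared here only
// to keep the sketch self-contained.
enum G1Mark { G1MarkNone, G1MarkFromRoot, G1MarkPromotedFromRoot };

template <G1Mark Mark>
struct SharedClosuresSketch {
  // Strong processing is only needed when marking from roots, which is the
  // case during initial mark.
  static constexpr bool need_strong_processing() {
    return Mark == G1MarkFromRoot;
  }
};

static_assert(SharedClosuresSketch<G1MarkFromRoot>::need_strong_processing(),
              "marking from roots requires strong processing");
```

The constructor argument could then read need_strong_processing() instead 
of the bare comparison, with the comment living in one place.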

Thanks,
Stefan

>>>
>>> Thanks,
>>>   Thomas
>>
>> Looks good.
>>
> 
>    thanks for your review.
> 
> As expected, the hs-tier1-5 testing found no issues in the meantime.
> 
> Thanks,
>    Thomas


From per.liden at oracle.com  Wed Oct 23 08:21:40 2019
From: per.liden at oracle.com (Per Liden)
Date: Wed, 23 Oct 2019 10:21:40 +0200
Subject: RFR(XL): 8220310: Implementation: NUMA-Aware Memory Allocation
 for G1, Mutator (1/3)
In-Reply-To: <9219a118-0c1d-2cee-10e5-f9bb87c72eb9@oracle.com>
References: <e4c60a5c-cb08-004a-ce77-e4d20d4d6891@oracle.com>
 <06ACBF87-ADBE-499F-B668-0274E4925B26@oracle.com>
 <2b37edd6-3e0f-013d-1616-9d003f8ac1ed@oracle.com>
 <74ACAF31-8233-482A-892E-0D2E7CA72F4F@oracle.com>
 <4afe9f43-4cfa-9384-f45f-f985399629dd@oracle.com>
 <CD6BDB8C-7777-4872-BCC0-CDFB4978F876@oracle.com>
 <d8758f38-4818-e6b7-c158-118974f0ff1c@oracle.com>
 <CB368332-D86B-40D2-B152-43B726938DD2@oracle.com>
 <77f6c57a-65a6-2727-cbe9-fbc1ed52a015@oracle.com>
 <b3b70e9d-5be9-b069-b631-5733f157c9eb@oracle.com>
 <7C1985BF-A769-49FB-A658-E1B1060B5897@oracle.com>
 <3F549477-A2DF-42CF-A0E5-586F78BBCC47@oracle.com>
 <f348053c-ef9b-df03-ae17-393ace99182b@oracle.com>
 <AB6E68AA-F9AC-4E62-9CF5-1886C469A702@oracle.com>
 <9219a118-0c1d-2cee-10e5-f9bb87c72eb9@oracle.com>
Message-ID: <f6c2bffa-b3e6-dda1-e453-5b01a7214c4d@oracle.com>

Hi Sangheon,

I noticed that this patch adds os::numa_get_address_id(). That name is 
misleading as it doesn't return an "address id", but a "numa node id". 
However, the terminology used in the os class for numa node is "group" 
(for example, numa_get_groups_num, numa_get_group_id, etc). So I'd 
suggest we instead name this os::numa_get_group_id(void* address), i.e. 
an overload of os::numa_get_group_id().

Btw, I think that the numa related names used in the os class are odd, 
but I guess they were brought over from Solaris. We can refine those at 
some later time if we want, but for now I think we should follow the 
naming convention that we have there.

Also, I don't think this function should print warnings, as that's up to 
the caller to decide what to do, what to print, etc.

Furthermore, I suggest we remove os::InvalidNUMAId. Other numa functions 
in the os class return -1 on error, so I think we should do that here too.

Here's a patch with the proposed changes:


diff --git a/src/hotspot/os/linux/os_linux.cpp b/src/hotspot/os/linux/os_linux.cpp
--- a/src/hotspot/os/linux/os_linux.cpp
+++ b/src/hotspot/os/linux/os_linux.cpp
@@ -3007,7 +3007,7 @@
    return 0;
  }

-int os::numa_get_address_id(void* address) {
+int os::numa_get_group_id(void* address) {
  #ifndef MPOL_F_NODE
  #define MPOL_F_NODE     (1<<0)  // Return next IL mode instead of node mask
  #endif
@@ -3016,11 +3016,10 @@
  #define MPOL_F_ADDR     (1<<1)  // Look up VMA using address
  #endif

-  int id = InvalidNUMAId;
+  int id = 0;

    if (syscall(SYS_get_mempolicy, &id, NULL, 0, address, MPOL_F_NODE | 
MPOL_F_ADDR) == -1) {
-    warning("Failed to get numa id at " PTR_FORMAT " with errno=%d", 
p2i(address), errno);
-    return InvalidNUMAId;
+    return -1;
    }
    return id;
  }
diff --git a/src/hotspot/share/gc/g1/g1NUMA.cpp 
b/src/hotspot/share/gc/g1/g1NUMA.cpp
--- a/src/hotspot/share/gc/g1/g1NUMA.cpp
+++ b/src/hotspot/share/gc/g1/g1NUMA.cpp
@@ -164,7 +164,7 @@

  uint G1NUMA::index_of_address(HeapWord *address) const {
    int numa_id = os::numa_get_address_id((void*)address);
-  if (numa_id == os::InvalidNUMAId) {
+  if (numa_id == -1) {
      return UnknownNodeIndex;
    } else {
      return index_of_node_id(numa_id);
@@ -201,7 +201,7 @@
    if (!is_enabled()) {
      return;
    }
-
+
    if (size_in_bytes == 0) {
      return;
    }
diff --git a/src/hotspot/share/runtime/os.hpp 
b/src/hotspot/share/runtime/os.hpp
--- a/src/hotspot/share/runtime/os.hpp
+++ b/src/hotspot/share/runtime/os.hpp
@@ -374,10 +374,7 @@
    static size_t numa_get_leaf_groups(int *ids, size_t size);
    static bool   numa_topology_changed();
    static int    numa_get_group_id();
-
-  static const int InvalidNUMAId = -1;
-
-  static int numa_get_address_id(void* address);
+  static int    numa_get_group_id(void* address);

    // Page manipulation
    struct page_info {
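
The convention proposed in the patch above (return -1 on failure and leave 
any logging to the caller) can be sketched outside HotSpot as follows. This 
is only a minimal standalone sketch, not the code from the webrev: 
numa_group_id_of() and describe_numa_group() are hypothetical names, and 
only the get_mempolicy system call with MPOL_F_NODE | MPOL_F_ADDR is taken 
from the patch. On non-Linux builds the sketch simply reports -1.

```cpp
#include <cassert>
#include <string>
#if defined(__linux__)
#include <sys/syscall.h>
#include <unistd.h>
#endif

#ifndef MPOL_F_NODE
#define MPOL_F_NODE (1 << 0)  // Return the node id instead of a node mask
#endif
#ifndef MPOL_F_ADDR
#define MPOL_F_ADDR (1 << 1)  // Look up the policy for the given address
#endif

// Hypothetical stand-in for os::numa_get_group_id(void* address).
// Returns the NUMA node ("group") id of the page backing 'address',
// or -1 on any failure. It never prints anything itself.
int numa_group_id_of(void* address) {
#if defined(__linux__) && defined(SYS_get_mempolicy)
  int id = 0;
  if (syscall(SYS_get_mempolicy, &id, nullptr, 0UL, address,
              (unsigned long)(MPOL_F_NODE | MPOL_F_ADDR)) == -1) {
    return -1;
  }
  return id;
#else
  (void)address;
  return -1;
#endif
}

// Hypothetical caller: it, not the os-level function, decides how a
// failure is reported.
std::string describe_numa_group(void* address) {
  const int id = numa_group_id_of(address);
  if (id == -1) {
    return "unknown";
  }
  return "group " + std::to_string(id);
}
```

A caller such as G1NUMA::index_of_address() would then only need to compare 
the result against -1, as the g1NUMA.cpp hunk does.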


cheers,
Per


On 10/16/19 7:54 PM, sangheon.kim at oracle.com wrote:
> Hi Kim, Stefan and Thomas,
> 
> Many thanks for the reviews and suggestions!
> 
> Kim,
> I will move page_size() near page_start() before push as you suggested.
> As you know, all 3 patches will be pushed together though.
> 
> Thanks,
> Sangheon
> 
> 
> On 10/16/19 7:00 AM, Kim Barrett wrote:
>>> On Oct 15, 2019, at 10:33 AM, sangheon.kim at oracle.com wrote:
>>>
>>> Hi all,
>>>
>>> Here's revised webrev which addresses:
>>> 1) G1RegionToSpaceMapper checks mtJavaHeap and then conditionally 
>>> calls G1NUMA::request_memory_on_node() (Kim)
>>> 2) The signature of G1NUMA::request_memory_on_node(void* address, ,) 
>>> is changed to have actual address instead of page index. (Stefan)
>>> 3) Some local variable name changes at G1RegionToSpaceMapper. i -> 
>>> region_idx, idx -> page_idx (for local style, used idx instead of index)
>>>
>>> webrev:
>>> http://cr.openjdk.java.net/~sangheki/8220310/webrev.5/
>>> http://cr.openjdk.java.net/~sangheki/8220310/webrev.5.inc/
>>> Testing: hs-tier 1 ~ 5, with/without UseNUMA
>> Looks good.
>>
>> In g1PageBasedVirtualSpace.cpp, could the newly added definition of 
>> page_size()
>> be moved to be near the existing definition of page_start()? I don't 
>> need a new
>> webrev if you move it.
>>
> 


From thomas.schatzl at oracle.com  Wed Oct 23 08:39:22 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Wed, 23 Oct 2019 10:39:22 +0200
Subject: RFR (L): 8230706: Waiting on completion of strong nmethod
 processing causes long pause times with G1
In-Reply-To: <8126d900-714b-585a-f2f0-4ce13f71501c@oracle.com>
References: <f19bc69a-9bde-c063-2674-9564721ceede@oracle.com>
 <0F637570-EC97-47C5-B493-B33681133149@oracle.com>
 <5c6b06b1-de44-3cb7-7fc8-0b641df5f353@oracle.com>
 <DE7A950D-A877-4093-AFE8-363E6E079A28@oracle.com>
 <80DA3FD5-C2FA-44BF-83C5-AE0EA6AA3684@oracle.com>
 <d0b624a1fc2c7310986b79da1f65f3a8a851d20a.camel@oracle.com>
 <0D820E95-361A-4CAC-9BC3-99C39512D396@oracle.com>
 <b8a41ba8-31d0-f9e8-4daa-41e861fb2856@oracle.com>
 <1898AC1E-0A8C-467C-9CA9-4B02C00A3A07@oracle.com>
 <7f150234-4080-b2f9-a791-b456038af795@oracle.com>
 <8126d900-714b-585a-f2f0-4ce13f71501c@oracle.com>
Message-ID: <531dc0fe-236d-110b-65ee-d224ac028130@oracle.com>

Hi Stefan,

On 23.10.19 09:05, Stefan Johansson wrote:
> Hi Thomas,
> 
> On 2019-10-22 15:45, Thomas Schatzl wrote:
>> Hi Kim,
>>
>> On 22.10.19 15:44, Kim Barrett wrote:
>>>> On Oct 22, 2019, at 6:13 AM, Thomas Schatzl 
[...]>>>> Webrevs:
>>>> http://cr.openjdk.java.net/~tschatzl/8230706/webrev.2_to_3/ (diff)
>>>> http://cr.openjdk.java.net/~tschatzl/8230706/webrev.3/ (full)
> 
> This looks good, and well documented :)
> 
> One small thing:
> src/hotspot/share/gc/g1/g1SharedClosures.hpp
> ---
>   46     _codeblobs(pss->worker_id(), &_oops, Mark == G1MarkFromRoot) {}
> 
> What do you think about adding a helper for Mark == G1MarkFromRoot, 
> something like need_strong_processing(), and a comment explaining that it 
> will be true during initial mark?

Something like this?

http://cr.openjdk.java.net/~tschatzl/8230706/webrev.3_to_4/ (diff)
http://cr.openjdk.java.net/~tschatzl/8230706/webrev.4/ (full)

Not completely sure if that is required as searching for G1MarkFromRoot 
shows that it is only used for the strong shared closures in the initial 
mark closure set. But I understand that it is nice to be reminded about 
this.
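
For readers without access to the webrev, the kind of helper being 
discussed could look roughly like this. It is a sketch under assumptions, 
not the content of webrev.4: the enum and class are simplified stand-ins 
with hypothetical names, and only the comparison Mark == G1MarkFromRoot 
and the suggested helper name come from this thread.

```cpp
#include <cassert>

// Simplified stand-in for the G1Mark template parameter.
enum G1MarkSketch { MarkNone, MarkFromRoot, MarkPromotedFromRoot };

template <G1MarkSketch Mark>
class SharedClosuresSketch {
  bool _strong_code_blobs;

public:
  // True only when the closure set is instantiated with MarkFromRoot,
  // which (per this thread) is the variant used during initial mark.
  static bool needs_strong_processing() {
    return Mark == MarkFromRoot;
  }

  // The named helper replaces a bare "Mark == MarkFromRoot" at the
  // point where the code blob closure is set up.
  SharedClosuresSketch() : _strong_code_blobs(needs_strong_processing()) {}

  bool strong_code_blobs() const { return _strong_code_blobs; }
};
```

The benefit is purely documentary: the call site reads as an intent 
rather than as a template-parameter comparison.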

Thanks for your and Kim's reviews.

Thanks,
   Thomas


From stefan.johansson at oracle.com  Wed Oct 23 08:47:45 2019
From: stefan.johansson at oracle.com (Stefan Johansson)
Date: Wed, 23 Oct 2019 10:47:45 +0200
Subject: RFR(M): 8220311: Implementation: NUMA-Aware Memory Allocation for
 G1, Survivor (2/3)
In-Reply-To: <3fe39096-43cb-4828-c042-0fc976a0307a@oracle.com>
References: <d153b49b-fbb2-0d73-37e0-ff1534a83086@oracle.com>
 <de0f8e9a-27d8-eaf3-99a7-7b57e1b419c1@oracle.com>
 <9a78e353-7908-b546-8f6a-7acd92eb40ac@oracle.com>
 <D4FAC1BE-C707-48DA-979F-03502E2651F5@oracle.com>
 <846eb849-8a49-5872-73d7-6bbc8f98369c@oracle.com>
 <56788E04-DC92-461F-B3A7-DEEBC524DB5B@oracle.com>
 <3fe39096-43cb-4828-c042-0fc976a0307a@oracle.com>
Message-ID: <01a9ebcf-34ed-06b2-2da8-18d84feae858@oracle.com>

Hi Sangheon,

On 2019-10-22 18:47, sangheon.kim at oracle.com wrote:
> Hi Kim,
> 
> On 10/22/19 12:19 AM, Kim Barrett wrote:
>>> On Oct 22, 2019, at 1:52 AM, sangheon.kim at oracle.com wrote:
>>> What do you think about below comment?
>>>
>>>    // Tries to allocate word_sz in the PLAB of the next "generation" 
>>> after trying to
>>>    // allocate into dest. Previous_plab_refill_failed indicates 
>>> whether previous
>>>    // PLAB refill for the original (source) object was failed.
>> Drop "was". Otherwise looks good.
> Done.
> 
> Webrev:
> http://cr.openjdk.java.net/~sangheki/8220311/webrev.3
> http://cr.openjdk.java.net/~sangheki/8220311/webrev.3.inc
Looks good in general, just one minor thing, no need for a new webrev 
though:
src/hotspot/share/gc/g1/g1Allocator.cpp
---
144   for (uint nodex_index = 0; nodex_index < _num_alloc_regions; 
nodex_index++) {

The name nodex_index has one too many x:es =) I would prefer node_index.
---

Thanks,
Stefan

> 
> Thanks,
> Sangheon
> 
> 
>>
>>>    // Returns a non-NULL pointer if successful, and updates dest if 
>>> required.
>>>    // Also determines whether we should continue to try to allocate 
>>> into the various
>>>    // generations or just end trying to allocate.
>>>    HeapWord* allocate_in_next_plab(G1HeapRegionAttr* dest,
>>> ...
>>>
>>> Let me post the webrev when we decide. :)
>>>
>>> Thanks,
>>> Sangheon
>>>
>>>
>>>> ------------------------------------------------------------------------------ 
>>>>
>>>>
>>>> Looks good, other than that one comment issue.
>>
> 


From stefan.karlsson at oracle.com  Wed Oct 23 08:56:08 2019
From: stefan.karlsson at oracle.com (Stefan Karlsson)
Date: Wed, 23 Oct 2019 10:56:08 +0200
Subject: RFR: 8232649: ZGC: Add callbacks to ZMemoryManager
In-Reply-To: <17249488-d0f4-81ae-3a15-b120cac388af@oracle.com>
References: <8793cda6-bec6-dac7-5164-8fc34454286e@oracle.com>
 <17249488-d0f4-81ae-3a15-b120cac388af@oracle.com>
Message-ID: <c4f95108-ea72-634d-d687-7dfdb0bc38a1@oracle.com>

Thanks, Erik.

StefanK

On 2019-10-22 11:18, erik.osterlund at oracle.com wrote:
> Hi Stefan,
>
> Looks good.
>
> Thanks,
> /Erik
>
> On 10/21/19 4:06 PM, Stefan Karlsson wrote:
>> Hi all,
>>
>> Please review this patch to add callbacks to ZMemoryManager.
>>
>> https://cr.openjdk.java.net/~stefank/8232649/webrev.01/
>> https://bugs.openjdk.java.net/browse/JDK-8232649
>>
>> This allows users of ZMemoryManager to get callbacks when memory 
>> regions are inserted, removed, split, and coalesced. This is needed 
>> to support Windows' stricter requirements for placeholder reserved 
>> memory.
>>
>> Thanks,
>> StefanK
>
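
To illustrate what "callbacks when memory regions are inserted, removed, 
split, and coalesced" might mean in practice, here is a small free-list 
sketch. Everything in it is an assumption made for illustration: the type 
names, the callback signatures and the list handling are hypothetical and 
are not the ZMemoryManager interface from the webrev.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <iterator>
#include <list>
#include <utility>

// Hypothetical free area: [start, start + size).
struct AreaSketch {
  std::size_t start;
  std::size_t size;
  std::size_t end() const { return start + size; }
};

// Hypothetical hooks, one per event named in the review request.
struct CallbacksSketch {
  std::function<void(const AreaSketch&)> inserted;               // new free area
  std::function<void(const AreaSketch&)> removed;                // area fully consumed
  std::function<void(const AreaSketch&, std::size_t)> split;     // area, bytes cut from front
  std::function<void(const AreaSketch&, const AreaSketch&)> coalesced;  // existing, merged-in
};

class FreeListSketch {
  std::list<AreaSketch> _free;  // kept sorted by start
  CallbacksSketch _callbacks;

public:
  static constexpr std::size_t kNoMemory = ~std::size_t(0);

  explicit FreeListSketch(CallbacksSketch callbacks)
      : _callbacks(std::move(callbacks)) {}

  // Returns a range to the list, merging it into the preceding area
  // when the two are adjacent.
  void release(std::size_t start, std::size_t size) {
    const AreaSketch area{start, size};
    auto it = _free.begin();
    while (it != _free.end() && it->start < start) {
      ++it;
    }
    if (it != _free.begin()) {
      auto prev = std::prev(it);
      if (prev->end() == start) {
        if (_callbacks.coalesced) _callbacks.coalesced(*prev, area);
        prev->size += size;
        return;  // simplification: no merge with the following area
      }
    }
    _free.insert(it, area);
    if (_callbacks.inserted) _callbacks.inserted(area);
  }

  // First-fit allocation from the front of an area.
  std::size_t alloc_from_front(std::size_t size) {
    for (auto it = _free.begin(); it != _free.end(); ++it) {
      if (it->size == size) {
        const AreaSketch area = *it;
        _free.erase(it);
        if (_callbacks.removed) _callbacks.removed(area);
        return area.start;
      }
      if (it->size > size) {
        if (_callbacks.split) _callbacks.split(*it, size);
        const std::size_t start = it->start;
        it->start += size;
        it->size -= size;
        return start;
      }
    }
    return kNoMemory;
  }
};
```

A platform layer that has to keep OS-level placeholder reservations in 
step with the free list could register hooks like these and split or merge 
its placeholders at exactly the same points.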


From stefan.karlsson at oracle.com  Wed Oct 23 08:56:25 2019
From: stefan.karlsson at oracle.com (Stefan Karlsson)
Date: Wed, 23 Oct 2019 10:56:25 +0200
Subject: RFR: 8232650: ZGC: Add initialization hooks for OS specific code
In-Reply-To: <c200c515-c38a-be2c-05e0-18622418f777@oracle.com>
References: <5cdd2722-26a4-8e6c-1262-5d97dfd7f46c@oracle.com>
 <c200c515-c38a-be2c-05e0-18622418f777@oracle.com>
Message-ID: <f883be7f-8579-2a90-f18c-63026a850b3f@oracle.com>

Thanks, Erik.

StefanK

On 2019-10-22 11:18, erik.osterlund at oracle.com wrote:
> Hi Stefan,
>
> Looks good.
>
> Thanks,
> /Erik
>
> On 10/21/19 4:37 PM, Stefan Karlsson wrote:
>> Hi all,
>>
>> Please review this patch to add initialization hooks for OS specific 
>> code.
>>
>> https://cr.openjdk.java.net/~stefank/8232650/webrev.01/
>> https://bugs.openjdk.java.net/browse/JDK-8232650
>>
>> These hooks are needed for a Windows port. ZInitialize allows 
>> syscalls to be dynamically resolved. ZVirtualMemory allows callbacks 
>> from 8232649 to be initialized.
>>
>> Thanks,
>> StefanK
>


From stefan.karlsson at oracle.com  Wed Oct 23 08:56:51 2019
From: stefan.karlsson at oracle.com (Stefan Karlsson)
Date: Wed, 23 Oct 2019 10:56:51 +0200
Subject: RFR: 8232648: ZGC: Move ATTRIBUTE_ALIGNED to the front of
 declarations
In-Reply-To: <201ba7a6-a371-f4c0-340f-a5af14d0323f@oracle.com>
References: <b981daf2-2924-1708-5a2f-7475cef3d85a@oracle.com>
 <201ba7a6-a371-f4c0-340f-a5af14d0323f@oracle.com>
Message-ID: <64c2f0ae-2330-3d60-4e44-3b03878aae9f@oracle.com>

Thanks, Erik.

StefanK

On 2019-10-22 11:18, erik.osterlund at oracle.com wrote:
> Hi Stefan,
>
> Looks good.
>
> Thanks,
> /Erik
>
> On 10/21/19 3:22 PM, Stefan Karlsson wrote:
>> Hi all,
>>
>> Please review this patch to move ATTRIBUTE_ALIGNED to the front of 
>> declarations.
>>
>> https://cr.openjdk.java.net/~stefank/8232648/webrev.01/
>> https://bugs.openjdk.java.net/browse/JDK-8232648
>>
>> This is done because the Windows compiler requires ATTRIBUTE_ALIGNED 
>> to be put at the front of declarations. A new macro (ZCACHE_ALIGNED) 
>> is introduced, and used, to shorten the affected lines.
>>
>> Thanks,
>> StefanK
>
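
As background on why the attribute position matters: the Microsoft 
spelling of the alignment attribute, __declspec(align(x)), has to precede 
the declaration, while GCC's __attribute__((aligned(x))) is accepted in 
front as well. The sketch below shows the pattern with hypothetical macro 
and struct names; the value 64 is an assumed cache line size, and the real 
ATTRIBUTE_ALIGNED and ZCACHE_ALIGNED definitions are in the HotSpot 
sources and the webrev.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for ATTRIBUTE_ALIGNED.
#if defined(_MSC_VER)
#define ATTRIBUTE_ALIGNED_SKETCH(x) __declspec(align(x))
#else
#define ATTRIBUTE_ALIGNED_SKETCH(x) __attribute__((aligned(x)))
#endif

// Hypothetical shorthand in the spirit of ZCACHE_ALIGNED.
#define CACHE_ALIGNED_SKETCH ATTRIBUTE_ALIGNED_SKETCH(64)

// With the attribute placed first, the same declaration is accepted by
// both compiler families. Each counter starts on its own 64-byte line.
struct PaddedCountersSketch {
  CACHE_ALIGNED_SKETCH volatile long first;
  CACHE_ALIGNED_SKETCH volatile long second;
};
```

With this layout, alignof(PaddedCountersSketch) is 64 and the second 
member starts at offset 64.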


From thomas.schatzl at oracle.com  Wed Oct 23 08:57:08 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Wed, 23 Oct 2019 10:57:08 +0200
Subject: RFR (S): 8232777: Rename G1Policy::_max_rs_length as it is no
 maximum
In-Reply-To: <1366cea1-9e01-2982-e211-7417e63be46f@oracle.com>
References: <6d966a0f-2f56-4bae-4b55-47eeec7e9d81@oracle.com>
 <B1A3F1C4-27D1-4A74-9E8C-8E91D1BAAF27@oracle.com>
 <1366cea1-9e01-2982-e211-7417e63be46f@oracle.com>
Message-ID: <3ed4d2f2-ffad-65e1-2d95-27e2d69ecba9@oracle.com>

Hi Kim, Stefan,

   thanks for your reviews.

For reference, I updated the webrev in place.

Thanks,
   Thomas

On 23.10.19 08:16, Stefan Johansson wrote:
> 
> 
> On 2019-10-22 21:16, Kim Barrett wrote:
>>> On Oct 22, 2019, at 2:02 PM, Thomas Schatzl 
>>> <thomas.schatzl at oracle.com> wrote:
>>>
>>> Hi all,
>>>
>>>   can I have reviews for this small cleanup that renames 
>>> G1Policy::_max_rs_length to just _rs_length because the contained 
>>> value is simply no maximum. This causes some confusion down the line 
>>> in its use (imo).
>>>
>>> CR:
>>> https://bugs.openjdk.java.net/browse/JDK-8232777
>>> Webrev:
>>> http://cr.openjdk.java.net/~tschatzl/8232777/webrev/
>>> Testing:
>>> local compilation
>>>
>>> Thanks,
>>>   Thomas
>>
>> You missed one in a comment:
>> src/hotspot/share/gc/g1/g1Policy.cpp
>>   757     // This is defensive. For a while _max_rs_length could get
>>
>> Other than that, looks good, and trivial.
>>
> Looks good,
> Stefan


From stefan.karlsson at oracle.com  Wed Oct 23 08:57:09 2019
From: stefan.karlsson at oracle.com (Stefan Karlsson)
Date: Wed, 23 Oct 2019 10:57:09 +0200
Subject: RFR: 8232602: ZGC: Make ZGranuleMap ZAddress agnostic
In-Reply-To: <fcd8f2c9-4cef-6fc5-a34d-fab4bf486e24@oracle.com>
References: <4638080b-9f2e-6965-6ed9-a17b32ad3b94@oracle.com>
 <fcd8f2c9-4cef-6fc5-a34d-fab4bf486e24@oracle.com>
Message-ID: <0711353f-5c0d-23b1-1ae7-c774aa20014a@oracle.com>

Thanks, Erik.

StefanK

On 2019-10-22 11:17, erik.osterlund at oracle.com wrote:
> Hi Stefan,
>
> Looks good.
>
> Thanks,
> /Erik
>
> On 10/21/19 3:09 PM, Stefan Karlsson wrote:
>> Hi all,
>>
>> Please review this patch to make ZGranuleMap ZAddress agnostic.
>>
>> https://cr.openjdk.java.net/~stefank/8232602/webrev.01/
>> https://bugs.openjdk.java.net/browse/JDK-8232602
>>
>> Currently, the ZGranuleMap get and put functions take an address in 
>> the heap as a parameter. The address is then converted into an offset 
>> (into a heap view), before being scaled to a granule.
>>
>> We want to be able to use the ZGranuleMap for physical memory 
>> offsets, and not only heap addresses. Therefore, I propose that we 
>> move the conversions from address to offset out from ZGranuleMap, and 
>> move it to the current users of ZGranuleMap.
>>
>> This patch applies on-top of the patch for JDK-8232601.
>>
>> Thanks,
>> StefanK
>>
>


From stefan.karlsson at oracle.com  Wed Oct 23 08:57:26 2019
From: stefan.karlsson at oracle.com (Stefan Karlsson)
Date: Wed, 23 Oct 2019 10:57:26 +0200
Subject: RFR: 8232601: ZGC: Parameterize the ZGranuleMap table size
In-Reply-To: <dd9c696e-1216-d4da-01e5-ee80dfb2c726@oracle.com>
References: <f486753e-cd02-5c22-7390-041cf248f423@oracle.com>
 <dd9c696e-1216-d4da-01e5-ee80dfb2c726@oracle.com>
Message-ID: <4a213165-79ab-d285-e305-421d4bf5f27f@oracle.com>

Thanks, Erik.

StefanK

On 2019-10-22 11:17, erik.osterlund at oracle.com wrote:
> Hi Stefan,
>
> Looks good.
>
> Thanks,
> /Erik
>
> On 10/21/19 3:00 PM, Stefan Karlsson wrote:
>> Hi all,
>>
>> Please review this patch to parameterize the ZGranuleMap table size.
>>
>> https://cr.openjdk.java.net/~stefank/8232601/webrev.01/
>> https://bugs.openjdk.java.net/browse/JDK-8232601
>>
>> Previously, the maps were always bound by the range of a virtual 
>> address space view (ZAddressOffsetMax). We want to be able to use 
>> ZGranuleMap to map against physical memory offsets, so this RFE 
>> suggests that we allow users of ZGranuleMap to specify the max offset.
>>
>> Thanks,
>> StefanK
>
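
The two ZGranuleMap changes (this one and 8232602) can be pictured with a 
small sketch: the table size comes from a caller-supplied maximum offset, 
and the map itself only ever sees offsets, so the address-to-offset 
conversion lives with the caller. All names are hypothetical, and the 
granule shift of 21 (2M granules) and the 42-bit offset mask are 
assumptions chosen for the example, not values taken from the webrevs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

template <typename T>
class GranuleMapSketch {
  static constexpr unsigned kGranuleShift = 21;  // assumed 2M granules
  std::vector<T> _map;

  std::size_t index_for_offset(std::uint64_t offset) const {
    const std::size_t index = static_cast<std::size_t>(offset >> kGranuleShift);
    assert(index < _map.size());
    return index;
  }

public:
  // The table size is derived from the maximum offset given by the user
  // of the map, rather than from a fixed address-space constant.
  explicit GranuleMapSketch(std::uint64_t max_offset)
      : _map(static_cast<std::size_t>(max_offset >> kGranuleShift)) {}

  std::size_t granule_count() const { return _map.size(); }

  // get/put take offsets only; no address conversion happens here.
  T get(std::uint64_t offset) const { return _map[index_for_offset(offset)]; }
  void put(std::uint64_t offset, T value) { _map[index_for_offset(offset)] = value; }
};

// Caller-side conversion from a (colored) heap address to an offset.
constexpr std::uint64_t kOffsetMaskSketch = (std::uint64_t(1) << 42) - 1;

inline std::uint64_t offset_of_address_sketch(std::uint64_t address) {
  return address & kOffsetMaskSketch;
}
```

A user that tracks physical memory would pass physical offsets straight 
to get() and put(), while a heap user would call the conversion first.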


From sakamoto.osamu at nttcom.co.jp  Wed Oct 23 09:57:42 2019
From: sakamoto.osamu at nttcom.co.jp (Osamu Sakamoto)
Date: Wed, 23 Oct 2019 18:57:42 +0900
Subject: Segmentation Fault occurs when ClassLoader and Metaspace is
 released in JDK 8
In-Reply-To: <422c9ca2-5053-c761-cb61-f075877bb666@oss.nttdata.com>
References: <fb308571-cbdf-1f4b-177f-aa6bac986a5f@nttcom.co.jp_1>
 <422c9ca2-5053-c761-cb61-f075877bb666@oss.nttdata.com>
Message-ID: <314f9ad2-17df-1082-8816-7af73a96e9fb@nttcom.co.jp_1>

Hi Yasumasa,

Thank you for answering.

 > What JVM options did you pass?
The following are the JVM options I passed.
-----------------------------------------------------------------
-Xmx2048m
-Xms2048m
-XX:NewSize=412m
-XX:MaxNewSize=412m
-XX:SurvivorRatio=8
-XX:MaxTenuringThreshold=15
-XX:+UseConcMarkSweepGC
-XX:+UseCMSInitiatingOccupancyOnly
-XX:CMSInitiatingOccupancyFraction=80
-XX:+CMSClassUnloadingEnabled
-XX:CompressedClassSpaceSize=64m
-XX:+PrintGCDetails
-XX:+PrintGCDateStamps
-XX:+UseGCLogFileRotation
-XX:GCLogFileSize=0
-Xloggc:/var/log/tomcatm0/gc-%p.log
-XX:+HeapDumpOnOutOfMemoryError
-XX:+AlwaysLockClassLoader
-----------------------------------------------------------------


 > I guess you used CMS because this problem seems to occur on CMS only 
[1] [2].
Yes, I used CMS.

 > So it might be work around not to use CMS.
Thank you for suggesting a workaround.
However, it is difficult for us to change the GC algorithm, so we would like 
to solve this issue with CMS GC if possible.


 > I'm not sure root cause of this issue, but it seems to break 
ClassLoaderDataGraph::_unloading.
 > (like double free (delete) of CLD)
I checked whether ClassLoaderDataGraph::_unloading is broken, but I 
could not tell, because the value had either been cleared to NULL or 
optimized out.

Looking at ClassLoaderDataGraph in classLoaderData.cpp [1], it looks like 
the _unloading value is saved to ClassLoaderDataGraph::_saved_unloading.
But _saved_unloading had been cleared to NULL, too.

Is there any other way to check it?

[1]http://hg.openjdk.java.net/jdk8u/jdk8u-dev/hotspot/file/eed8e846c982/src/share/vm/classfile/classLoaderData.cpp#l753

-----------------------------------------------------------------
(gdb) f 10
#10 0x00007f60808c0f0f in ClassLoaderDataGraph::purge () at 
/usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/share/vm/classfile/classLoaderData.cpp:818
818        delete purge_me;
(gdb) list ClassLoaderDataGraph::purge
810    void ClassLoaderDataGraph::purge() {
811      assert(SafepointSynchronize::is_at_safepoint(), "must be at safepoint!");
812      ClassLoaderData* list = _unloading;
813      _unloading = NULL;
814      ClassLoaderData* next = list;
815      while (next != NULL) {
816        ClassLoaderData* purge_me = next;
817        next = purge_me->next();
818        delete purge_me;
819      }
820      Metaspace::purge();
821    }
(gdb) p _unloading
$29 = (ClassLoaderData *) 0x0
(gdb) p list
$30 = <optimized out>
(gdb) p next
$31 = <optimized out>
(gdb) p ClassLoaderDataGraph::_saved_unloading
$32 = (ClassLoaderData *) 0x0
-----------------------------------------------------------------

Thanks,
Osamu

On 10/21/19 22:29, Yasumasa Suenaga wrote:
> Hi Osamu,
>
> What JVM options did you pass?
>
> I guess you used CMS because this problem seems to occur on CMS only 
> [1] [2].
> So it might be work around not to use CMS.
>
> I'm not sure root cause of this issue, but it seems to break 
> ClassLoaderDataGraph::_unloading.
> (like double free (delete) of CLD)
>
>
> Thanks,
>
> Yasumasa
>
>
> [1] 
> http://hg.openjdk.java.net/jdk8u/jdk8u-dev/hotspot/file/eed8e846c982/src/share/vm/classfile/classLoaderData.hpp#l100
> [2] 
> http://hg.openjdk.java.net/jdk8u/jdk8u-dev/hotspot/file/eed8e846c982/src/share/vm/gc_implementation/concurrentMarkSweep/concurrentMarkSweepGeneration.cpp#l6384
>
>
> On 2019/10/21 17:50, Osamu Sakamoto wrote:
>> Hi all,
>>
>> I have a problem with a segmentation fault (SEGV) in GC, and I can't 
>> determine the cause.
>> Could you help me solve the problem?
>>
>> Our system uses OpenJDK 1.8.0.181 and crashed with a SEGV while purging 
>> a ClassLoader at a safepoint.
>> We can't reproduce the problem on demand, but it has happened 4 times in 
>> a few months.
>>
>> The following is the summary of my investigation.
>>
>> ============================================================================= 
>>
>>
>> First I checked the hs_err file, which shows that a SEGV occurred.
>> The VM_Operation is GenCollectForAllocation, at a safepoint.
>>
>> ----------------------------------------------------------------------------- 
>>
>> #
>> # A fatal error has been detected by the Java Runtime Environment:
>> #
>> #  SIGSEGV (0xb) at pc=0x00007f6080c97f88, pid=23931, 
>> tid=0x00007f607c3ed700
>> #
>> # JRE version: OpenJDK Runtime Environment (8.0_181-b13) (build 
>> 1.8.0_181-b13)
>> # Java VM: OpenJDK 64-Bit Server VM (25.181-b13 mixed mode 
>> linux-amd64 compressed oops)
>> # Problematic frame:
>> # V  [libjvm.so+0x84bf88]
>> #
>> # Core dump written. Default location: /opt/tomcate0/core or core.23931
>> #
>> # If you would like to submit a bug report, please visit:
>> #   http://bugreport.java.com/bugreport/crash.jsp
>> #
>>
>> ---------------  T H R E A D  ---------------
>>
>> Current thread (0x00007f6078c00000):  VMThread [stack: 
>> 0x00007f607c2ed000,0x00007f607c3ee000] [id=23939]
>>
>> siginfo: si_signo: 11 (SIGSEGV), si_code: 1 (SEGV_MAPERR), si_addr: 
>> 0x0000000000000018
>>
>> Registers:
>> RAX=0x0000000000000010, RBX=0x00007f5ff800ad30, 
>> RCX=0x0000000000000010, RDX=0x0000000000000000
>> RSP=0x00007f607c3ecb50, RBP=0x00007f607c3ecb80, 
>> RSI=0x0000000000000002, RDI=0x0000000001cfe570
>> R8 =0x00007f5ff80ae320, R9 =0x00007f5ff8052480, 
>> R10=0x0000000000000000, R11=0x0000000000000400
>> R12=0x0000000001cfe570, R13=0x00007f6081419470, 
>> R14=0x0000000000000002, R15=0x00007f6081418640
>> RIP=0x00007f6080c97f88, EFLAGS=0x0000000000010202, 
>> CSGSFS=0x0000000000000033, ERR=0x0000000000000004
>>    TRAPNO=0x000000000000000e
>>
>> Top of Stack: (sp=0x00007f607c3ecb50)
>> 0x00007f607c3ecb50:   00007f607c3ecba0 00007f5ff800ad30
>> 0x00007f607c3ecb60:   00007f5ff800ad00 0000000000000000
>> 0x00007f607c3ecb70:   0000000000000000 0000000000000001
>> 0x00007f607c3ecb80:   00007f607c3ecba0 00007f6080c995fa
>> 0x00007f607c3ecb90:   00007f5ff800ad00 00007f5ff800ac20
>> 0x00007f607c3ecba0:   00007f607c3ecbc0 00007f60808bff5e
>> 0x00007f607c3ecbb0:   00007f5ff800ac20 00007f5ff8052870
>> 0x00007f607c3ecbc0:   00007f607c3ecbe0 00007f60808c0f0f
>> 0x00007f607c3ecbd0:   00007f607c3ecbf0 00007f608140f308
>> 0x00007f607c3ecbe0:   00007f607c3ecc30 00007f6080daa0b7
>> 0x00007f607c3ecbf0:   00007f6069000100 0000000000000000
>> 0x00007f607c3ecc00:   00007f607c3ecc20 00007f6080ed0800
>> 0x00007f607c3ecc10:   00000000000000f9 88e95c3ba257ab00
>> 0x00007f607c3ecc20:   431bde82d7b634db 00007f607800aa00
>> 0x00007f607c3ecc30:   00007f607c3eccc0 00007f6080daa9d5
>> 0x00007f607c3ecc40:   0000000000000000 00007f607803bf20
>> 0x00007f607c3ecc50:   00007f607803be20 00000000000003e8
>> 0x00007f607c3ecc60:   0000000000000001 00007f6078c00000
>> 0x00007f607c3ecc70:   00007f607c3eccc0 0000000000000000
>> 0x00007f607c3ecc80:   00000004000000f9 00007f60813e2b99
>> 0x00007f607c3ecc90:   00007f607803bfa0 00007f6078c00000
>> 0x00007f607c3ecca0:   0000000000000000 0000000000000000
>> 0x00007f607c3eccb0:   00007f6081418bd0 00007f607803bf20
>> 0x00007f607c3eccc0:   00007f607c3ece60 00007f6080f2048a
>> 0x00007f607c3eccd0:   00007f607c3ecd20 00007f607c3ecce0
>> 0x00007f607c3ecce0:   00007f6078c00000 00007f6078c00980
>> 0x00007f607c3eccf0:   00007f6078c009c0 00007f6078c009d0
>> 0x00007f607c3ecd00:   00007f6078c00aa8 00000000000000d8
>> 0x00007f607c3ecd10:   00007f6078c00be0 0000000000000000
>> 0x00007f607c3ecd20:   00007f607c3ecd28 6e69747563657845
>> 0x00007f607c3ecd30:   65706f204d562067 203a6e6f69746172
>> 0x00007f607c3ecd40:   656c6c6f436e6547 6c6c41726f467463
>>
>> Instructions: (pc=0x00007f6080c97f88)
>> 0x00007f6080c97f68:   b6 12 80 fa 00 74 01 f0 48 0f c1 01 31 c9 31 f6
>> 0x00007f6080c97f78:   48 8b 44 0b 10 31 d2 48 85 c0 74 11 0f 1f 40 00
>> 0x00007f6080c97f88:   48 8b 40 08 48 83 c2 01 48 85 c0 75 f3 48 83 c1
>> 0x00007f6080c97f98:   08 48 01 d6 48 83 f9 20 75 d6 8b 7b 08 48 8b 05
>>
>> Register to memory mapping:
>>
>> RAX=0x0000000000000010 is an unknown value
>> RBX=0x00007f5ff800ad30 is an unknown value
>> RCX=0x0000000000000010 is an unknown value
>> RDX=0x0000000000000000 is an unknown value
>> RSP=0x00007f607c3ecb50 is an unknown value
>> RBP=0x00007f607c3ecb80 is an unknown value
>> RSI=0x0000000000000002 is an unknown value
>> RDI=0x0000000001cfe570 is an unknown value
>> R8 =0x00007f5ff80ae320 is an unknown value
>> R9 =0x00007f5ff8052480 is an unknown value
>> R10=0x0000000000000000 is an unknown value
>> R11=0x0000000000000400 is an unknown value
>> R12=0x0000000001cfe570 is an unknown value
>> R13=0x00007f6081419470: <offset 0xfcd470> in 
>> /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/jre/lib/amd64/server/libjvm.so 
>> at 0x00007f608044c000
>> R14=0x0000000000000002 is an unknown value
>> R15=0x00007f6081418640: <offset 0xfcc640> in 
>> /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/jre/lib/amd64/server/libjvm.so 
>> at 0x00007f608044c000
>>
>>
>> Stack: [0x00007f607c2ed000,0x00007f607c3ee000], 
>> sp=0x00007f607c3ecb50, free space=1022k
>> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, 
>> C=native code)
>> V  [libjvm.so+0x84bf88]
>> V  [libjvm.so+0x84d5fa]
>> V  [libjvm.so+0x473f5e]
>> V  [libjvm.so+0x474f0f]
>> V  [libjvm.so+0x95e0b7]
>> V  [libjvm.so+0x95e9d5]
>> V  [libjvm.so+0xad448a]
>> V  [libjvm.so+0xad48f1]
>> V  [libjvm.so+0x8beb82]
>>
>> VM_Operation (0x00007f5fd69e6120): GenCollectForAllocation, mode: 
>> safepoint, requested by thread 0x00007f6079013800
>>
>> ...
>> ----------------------------------------------------------------------------- 
>>
>>
>>
>>
>> Next, I used GDB to check the backtrace of the SEGV thread from the 
>> coredump.
>> The following is the backtrace.
>> The SEGV occurred while a ClassLoader was being purged and its 
>> Metaspace destroyed.
>> Frame #7 shows that the signal (SEGV) handler was called while 
>> SpaceManager::~SpaceManager() was executing.
>>
>> ----------------------------------------------------------------------------- 
>>
>> (gdb) bt
>> #0  0x00007f608146f1f7 in __GI_raise (sig=sig@entry=6) at 
>> ../nptl/sysdeps/unix/sysv/linux/raise.c:56
>> #1  0x00007f60814708e8 in __GI_abort () at abort.c:90
>> #2  0x00007f6080d0bc39 in os::abort (dump_core=<optimized out>) at 
>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/os/linux/vm/os_linux.cpp:1519
>> #3  0x00007f6080f1b816 in VMError::report_and_die 
>> (this=this@entry=0x7f607c3ebd10) at 
>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/share/vm/utilities/vmError.cpp:1060
>> #4  0x00007f6080d15927 in JVM_handle_linux_signal (sig=11, 
>> info=0x7f607c3ebfb0, ucVoid=0x7f607c3ebe80, 
>> abort_if_unrecognized=<optimized out>)
>>      at 
>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/os_cpu/linux_x86/vm/os_linux_x86.cpp:541
>> #5  0x00007f6080d09038 in signalHandler (sig=11, info=0x7f607c3ebfb0, 
>> uc=0x7f607c3ebe80) at 
>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/os/linux/vm/os_linux.cpp:4446
>> #6  <signal handler called>
>> #7  SpaceManager::~SpaceManager (this=0x7f5ff800ad30, 
>> __in_chrg=<optimized out>) at 
>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/share/vm/memory/metaspace.cpp:2028
>> #8  0x00007f6080c995fa in Metaspace::~Metaspace (this=0x7f5ff800ad00, 
>> __in_chrg=<optimized out>)
>>      at 
>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/share/vm/memory/metaspace.cpp:2971
>> #9  0x00007f60808bff5e in ClassLoaderData::~ClassLoaderData 
>> (this=0x7f5ff800ac20, __in_chrg=<optimized out>)
>>      at 
>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/share/vm/classfile/classLoaderData.cpp:383
>> #10 0x00007f60808c0f0f in ClassLoaderDataGraph::purge () at 
>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/share/vm/classfile/classLoaderData.cpp:818
>> #11 0x00007f6080daa0b7 in ClassLoaderDataGraph::purge_if_needed () at 
>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/share/vm/classfile/classLoaderData.hpp:104
>> #12 SafepointSynchronize::do_cleanup_tasks () at 
>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/share/vm/runtime/safepoint.cpp:551
>> #13 0x00007f6080daa9d5 in SafepointSynchronize::begin () at 
>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/share/vm/runtime/safepoint.cpp:402
>> #14 0x00007f6080f2048a in VMThread::loop 
>> (this=this@entry=0x7f6078c00000) at 
>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/share/vm/runtime/vmThread.cpp:501
>> #15 0x00007f6080f208f1 in VMThread::run (this=0x7f6078c00000) at 
>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/share/vm/runtime/vmThread.cpp:276 
>>
>> #16 0x00007f6080d0ab82 in java_start (thread=0x7f6078c00000) at 
>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/os/linux/vm/os_linux.cpp:796 
>>
>> #17 0x00007f6081e2de25 in start_thread (arg=0x7f607c3ed700) at 
>> pthread_create.c:308
>> #18 0x00007f608153234d in clone () at 
>> ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
>> ----------------------------------------------------------------------------- 
>>
>>
>>
>> In Frame #7, Line 2028 (chunk = chunk->next()) is the crash point.
>> The variable "chunk" is defined at Line 2025 (Metachunk* chunk = 
>> chunks_in_use(i);).
>> "chunks_in_use(i)" is defined at Line 648 (Metachunk* 
>> chunks_in_use(ChunkIndex index) const { return _chunks_in_use[index]; 
>> }).
>> So I checked the values of "_chunks_in_use" and found that 
>> "_chunks_in_use[2]" holds the invalid address "0x10".
>> Therefore, I think the SEGV occurred because "chunk = chunk->next()" 
>> dereferenced the invalid address "0x10".
>>
>> ----------------------------------------------------------------------------- 
>>
>> (gdb) f 7
>> #7  SpaceManager::~SpaceManager (this=0x7f5ff800ad30, 
>> __in_chrg=<optimized out>) at 
>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/share/vm/memory/metaspace.cpp:2028
>> 2028        chunk = chunk->next();
>> (gdb) list
>> 2023    size_t SpaceManager::sum_count_in_chunks_in_use(ChunkIndex i) {
>> 2024      size_t count = 0;
>> 2025      Metachunk* chunk = chunks_in_use(i);
>> 2026      while (chunk != NULL) {
>> 2027        count++;
>> 2028        chunk = chunk->next();
>> 2029      }
>> 2030      return count;
>> 2031    }
>> 2032
>> (gdb) list SpaceManager::chunks_in_use
>> 647      // Accessors
>> 648      Metachunk* chunks_in_use(ChunkIndex index) const { return 
>> _chunks_in_use[index]; }
>> ...
>> (gdb) p _chunks_in_use
>> $11 = {0x7f5fcd41c400, 0x7f5fcd41a000, 0x10, 0x0}
>> ----------------------------------------------------------------------------- 
>>
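
The register dump and the instruction bytes in the hs_err file are 
consistent with this reading. The bytes at the faulting pc (48 8b 40 08) 
decode to "mov 0x8(%rax),%rax", and the bytes shortly before it 
(48 8b 44 0b 10) decode to "mov 0x10(%rbx,%rcx,1),%rax". The small check 
below redoes the arithmetic with the values from the dump. Treating offset 
0x10 from "this" as the start of _chunks_in_use and offset 0x8 as the 
field read by chunk->next() is an inference from those instructions and 
the gdb output, not something confirmed in this thread; the function names 
are made up for the example.

```cpp
#include <cassert>
#include <cstdint>

// Values copied from the hs_err register dump.
constexpr std::uint64_t kRbx = 0x00007f5ff800ad30ULL;  // this (SpaceManager*)
constexpr std::uint64_t kRcx = 0x10;                   // byte offset into the array
constexpr std::uint64_t kRax = 0x10;                   // value loaded as "chunk"
constexpr std::uint64_t kSiAddr = 0x18;                // si_addr from siginfo

// From "mov 0x10(%rbx,%rcx,1),%rax": the slot that was read.
constexpr std::uint64_t slot_address() { return kRbx + 0x10 + kRcx; }

// Pointer-sized slots (8 bytes on linux-amd64) give the array index.
constexpr std::uint64_t slot_index() { return kRcx / 8; }

// From "mov 0x8(%rax),%rax": the address the faulting load touched.
constexpr std::uint64_t faulting_address() { return kRax + 0x8; }

static_assert(slot_index() == 2, "the load read _chunks_in_use[2]");
static_assert(faulting_address() == kSiAddr, "0x10 + 0x8 matches si_addr");
```

So %rax holds 0x10 because that is the value stored in the third slot of 
the array, and the reported si_addr of 0x18 is that value plus 8. This 
explains where the 0x10 was read from, but not how it got into the slot.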
>>
>>
>>
>> The following is the disassembly of "SpaceManager::~SpaceManager()".
>> %rax holds 0x10 at "0x00007f6080c97f88 <+200>", but I don't understand 
>> how this "0x10" ended up in %rax.
>>
>> ----------------------------------------------------------------------------- 
>>
>> (gdb) disas
>> Dump of assembler code for function SpaceManager::~SpaceManager():
>>    0x00007f6080c97ec0 <+0>:    push   %rbp
>>    0x00007f6080c97ec1 <+1>:    mov    %rsp,%rbp
>>    0x00007f6080c97ec4 <+4>:    push   %r15
>>    0x00007f6080c97ec6 <+6>:    push   %r14
>>    0x00007f6080c97ec8 <+8>:    push   %r13
>>    0x00007f6080c97eca <+10>:    push   %r12
>>    0x00007f6080c97ecc <+12>:    push   %rbx
>>    0x00007f6080c97ecd <+13>:    mov    %rdi,%rbx
>>    0x00007f6080c97ed0 <+16>:    sub    $0x8,%rsp
>>    0x00007f6080c97ed4 <+20>:    mov    0x780785(%rip),%r12        # 0x7f6081418660 <_ZN12SpaceManager12_expand_lockE>
>>    0x00007f6080c97edb <+27>:    test   %r12,%r12
>>    0x00007f6080c97ede <+30>:    je     0x7f6080c97ee8 <SpaceManager::~SpaceManager()+40>
>>    0x00007f6080c97ee0 <+32>:    mov    %r12,%rdi
>>    0x00007f6080c97ee3 <+35>:    callq  0x7f6080cce2f0 <Monitor::lock_without_safepoint_check()>
>>    0x00007f6080c97ee8 <+40>:    movslq 0x8(%rbx),%rcx
>>    0x00007f6080c97eec <+44>:    lea    0x78075d(%rip),%rdx        # 0x7f6081418650 <_ZN12MetaspaceAux15_capacity_wordsE>
>>    0x00007f6080c97ef3 <+51>:    lea    0x781576(%rip),%r13        # 0x7f6081419470 <_ZN2os16_processor_countE>
>>    0x00007f6080c97efa <+58>:    lea    0x78073f(%rip),%r15        # 0x7f6081418640 <_ZN12MetaspaceAux11_used_wordsE>
>>    0x00007f6080c97f01 <+65>:    mov    (%rdx,%rcx,8),%rax
>>    0x00007f6080c97f05 <+69>:    sub    0x40(%rbx),%rax
>>    0x00007f6080c97f09 <+73>:    mov    %rax,(%rdx,%rcx,8)
>>    0x00007f6080c97f0d <+77>:    mov    0x38(%rbx),%rax
>>    0x00007f6080c97f11 <+81>:    movslq 0x8(%rbx),%rdx
>>    0x00007f6080c97f15 <+85>:    neg    %rax
>>    0x00007f6080c97f18 <+88>:    cmpl   $0x1,0x0(%r13)
>>    0x00007f6080c97f1d <+93>:    lea    (%r15,%rdx,8),%rcx
>>    0x00007f6080c97f21 <+97>:    mov    $0x1,%edx
>>    0x00007f6080c97f26 <+102>:    jne    0x7f6080c97f32 <SpaceManager::~SpaceManager()+114>
>>    0x00007f6080c97f28 <+104>:    lea    0x74acb4(%rip),%rdx        # 0x7f60813e2be3 <AssumeMP>
>>    0x00007f6080c97f2f <+111>:    movzbl (%rdx),%edx
>>    0x00007f6080c97f32 <+114>:    cmp    $0x0,%dl
>>    0x00007f6080c97f35 <+117>:    je     0x7f6080c97f38 <SpaceManager::~SpaceManager()+120>
>>    0x00007f6080c97f37 <+119>:    lock xadd %rax,(%rcx)
>>    0x00007f6080c97f3c <+124>:    mov    0x48(%rbx),%r14
>>    0x00007f6080c97f40 <+128>:    callq  0x7f6080c951a0 <Metachunk::overhead()>
>>    0x00007f6080c97f45 <+133>:    movslq 0x8(%rbx),%rdx
>>    0x00007f6080c97f49 <+137>:    imul   %r14,%rax
>>    0x00007f6080c97f4d <+141>:    lea    (%r15,%rdx,8),%rcx
>>    0x00007f6080c97f51 <+145>:    mov    $0x1,%edx
>>    0x00007f6080c97f56 <+150>:    neg    %rax
>>    0x00007f6080c97f59 <+153>:    cmpl   $0x1,0x0(%r13)
>>    0x00007f6080c97f5e <+158>:    jne    0x7f6080c97f6a <SpaceManager::~SpaceManager()+170>
>>    0x00007f6080c97f60 <+160>:    lea    0x74ac7c(%rip),%rdx        # 0x7f60813e2be3 <AssumeMP>
>>    0x00007f6080c97f67 <+167>:    movzbl (%rdx),%edx
>>    0x00007f6080c97f6a <+170>:    cmp    $0x0,%dl
>>    0x00007f6080c97f6d <+173>:    je     0x7f6080c97f70 <SpaceManager::~SpaceManager()+176>
>>    0x00007f6080c97f6f <+175>:    lock xadd %rax,(%rcx)
>>    0x00007f6080c97f74 <+180>:    xor    %ecx,%ecx
>>    0x00007f6080c97f76 <+182>:    xor    %esi,%esi
>>    0x00007f6080c97f78 <+184>:    mov    0x10(%rbx,%rcx,1),%rax
>>    0x00007f6080c97f7d <+189>:    xor    %edx,%edx
>>    0x00007f6080c97f7f <+191>:    test   %rax,%rax
>>    0x00007f6080c97f82 <+194>:    je     0x7f6080c97f95 <SpaceManager::~SpaceManager()+213>
>>    0x00007f6080c97f84 <+196>:    nopl   0x0(%rax)
>> => 0x00007f6080c97f88 <+200>:    mov    0x8(%rax),%rax
>>    0x00007f6080c97f8c <+204>:    add    $0x1,%rdx
>>    0x00007f6080c97f90 <+208>:    test   %rax,%rax
>> ...
>> (gdb) info registers
>> rax            0x10                16
>> rbx            0x7f5ff800ad30      140050159414576
>> rcx            0x10                16
>> rdx            0x0                 0
>> rsi            0x2                 2
>> rdi            0x1cfe570           30401904
>> rbp            0x7f607c3ecb80      0x7f607c3ecb80
>> rsp            0x7f607c3ecb50      0x7f607c3ecb50
>> r8             0x7f5ff80ae320      140050160083744
>> r9             0x7f5ff8052480      140050159707264
>> r10            0x0                 0
>> r11            0x400               1024
>> r12            0x1cfe570           30401904
>> r13            0x7f6081419470      140052462146672
>> r14            0x2                 2
>> r15            0x7f6081418640      140052462143040
>> rip            0x7f6080c97f88      0x7f6080c97f88 <SpaceManager::~SpaceManager()+200>
>> eflags         0x206               [ PF IF ]
>> cs             0x33                51
>> ss             0x2b                43
>> ds             0x0                 0
>> es             0x0                 0
>> fs             0x0                 0
>> gs             0x0                 0
>> k0             <unavailable>
>> k1             <unavailable>
>> k2             <unavailable>
>> k3             <unavailable>
>> k4             <unavailable>
>> k5             <unavailable>
>> k6             <unavailable>
>> k7             <unavailable>
>> ----------------------------------------------------------------------------- 
>>
>>
>> ============================================================================= 
>>
>>
>>
>>
>> Does anyone know about this case?
>>
>> Thanks, Osamu
>>
>>


From per.liden at oracle.com  Wed Oct 23 10:38:09 2019
From: per.liden at oracle.com (Per Liden)
Date: Wed, 23 Oct 2019 12:38:09 +0200
Subject: RFR: 8231552: ZGC: Refine address space reservation
In-Reply-To: <2b79829d-f577-819d-9577-91351c03fddb@oracle.com>
References: <5015ca7b-3e3e-b2bd-c3f8-0a83ecdb41d8@oracle.com>
 <c412fdf3-8f74-390e-6c6d-0d6df4e273f5@oracle.com>
 <2b79829d-f577-819d-9577-91351c03fddb@oracle.com>
Message-ID: <da5bdc6c-b52c-f536-0ea7-a28c18882126@oracle.com>

Another update after Stefan found an incorrect comparison:

Full: http://cr.openjdk.java.net/~pliden/8231552/webrev.4
Diff: http://cr.openjdk.java.net/~pliden/8231552/webrev.4-diff

/Per

On 10/22/19 2:01 PM, Per Liden wrote:
> Updated webrev after off-line comments from Stefan and Erik.
> 
> Full: http://cr.openjdk.java.net/~pliden/8231552/webrev.3
> Diff: http://cr.openjdk.java.net/~pliden/8231552/webrev.3-diff
> 
> /Per
> 
> On 10/16/19 10:41 AM, Per Liden wrote:
>> Latest version of this patch, rebased on today's jdk/jdk:
>>
>> http://cr.openjdk.java.net/~pliden/8231552/webrev.2
>>
>> /Per
>>
>> On 10/3/19 11:45 AM, Per Liden wrote:
>>> We could be slightly more sophisticated and do a better job reserving 
>>> address space in situations where parts of the address space are 
>>> already occupied or when the process is running with address space 
>>> limitations.
>>>
>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8231552
>>> Webrev: http://cr.openjdk.java.net/~pliden/8231552/webrev.0
>>>
>>> /Per


From shade at redhat.com  Wed Oct 23 10:56:54 2019
From: shade at redhat.com (Aleksey Shipilev)
Date: Wed, 23 Oct 2019 12:56:54 +0200
Subject: RFR (S) 8222766: Shenandoah: streamline post-LRB CAS barrier (x86)
Message-ID: <da6c60f7-fa20-4448-953c-2e3bdae12948@redhat.com>

RFE:
  https://bugs.openjdk.java.net/browse/JDK-8222766

Fix:
  https://cr.openjdk.java.net/~shade/8222766/webrev.07/

I hope the comments in the new code are self-explanatory. This rewrite allows us to ditch
resolve_fwd_ptr and its awkward borrowing scheme. Since it is removing two of three fwdptr resolves,
it also considerably improves the generated code quality for CAS -- which is measurable on
microbenchmarks.
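
For readers who have not looked at the webrev, the general shape of a CAS on
a reference field that may still hold a from-space pointer can be sketched as
below. This is only a minimal illustration under stated assumptions, not the
code from the patch: Obj, resolve() and cas_ref() are hypothetical names, the
forwarding pointer is modelled as a plain field, and concurrent evacuation is
not modelled at all. It shows why a single forwarding-pointer resolve on the
failure path can be enough.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical object model: every object carries a forwarding pointer that
// points to itself until the object has been copied elsewhere.
struct Obj {
  Obj* fwd;
  int payload;
  explicit Obj(int p) : fwd(this), payload(p) {}
};

// Resolve a possibly stale (from-space) reference to the current copy.
inline Obj* resolve(Obj* o) { return o == nullptr ? nullptr : o->fwd; }

// CAS on a reference field. 'expected' and 'desired' are assumed to be
// to-space references already, as they would be after a load barrier.
bool cas_ref(std::atomic<Obj*>& field, Obj* expected, Obj* desired) {
  Obj* witness = expected;
  // Step 1: plain CAS. Succeeds when the field holds the to-space pointer.
  if (field.compare_exchange_strong(witness, desired)) {
    return true;
  }
  // Step 2: the only forwarding-pointer resolve. If the value found in the
  // field is not a stale copy of 'expected', the failure is genuine.
  if (resolve(witness) != expected) {
    return false;
  }
  // Step 3: the failure was a false negative, so retry with the from-space
  // pointer as the expected value.
  return field.compare_exchange_strong(witness, desired);
}
```

With a field that holds a stale copy whose fwd points at the expected object,
cas_ref() succeeds through step 3; with an unrelated value in the field it
fails after step 2 and leaves the field untouched.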

The AArch64 counterpart comes later in JDK-8232782.

Compare:
  https://cr.openjdk.java.net/~shade/8222766/shenandoah-cas-before.perfasm
  https://cr.openjdk.java.net/~shade/8222766/shenandoah-cas-after.perfasm

Testing: {x86_32, x86_64} hotspot_gc_shenandoah; jcstress runs

-- 
Thanks,
-Aleksey


From rkennke at redhat.com  Wed Oct 23 10:59:50 2019
From: rkennke at redhat.com (Roman Kennke)
Date: Wed, 23 Oct 2019 12:59:50 +0200
Subject: RFR (S) 8222766: Shenandoah: streamline post-LRB CAS barrier (x86)
In-Reply-To: <da6c60f7-fa20-4448-953c-2e3bdae12948@redhat.com>
References: <da6c60f7-fa20-4448-953c-2e3bdae12948@redhat.com>
Message-ID: <0794092a-76f4-fee3-c537-aa6533701a1c@redhat.com>

> RFE:
>   https://bugs.openjdk.java.net/browse/JDK-8222766
> 
> Fix:
>   https://cr.openjdk.java.net/~shade/8222766/webrev.07/
> 
> I hope the comments in the new code are self-explanatory. This rewrite allows us to ditch
> resolve_fwd_ptr and its awkward borrowing scheme. Since it is removing two of three fwdptr resolves,
> it also considerably improves the generated code quality for CAS -- which is measurable on
> microbenchmarks.
> 
> The AArch64 counterpart comes later in JDK-8232782.
> 
> Compare:
>   https://cr.openjdk.java.net/~shade/8222766/shenandoah-cas-before.perfasm
>   https://cr.openjdk.java.net/~shade/8222766/shenandoah-cas-after.perfasm
> 
> Testing: {x86_32, x86_64} hotspot_gc_shenandoah; jcstress runs

Looks good to me!

Thanks,
Roman


From erik.osterlund at oracle.com  Wed Oct 23 13:01:04 2019
From: erik.osterlund at oracle.com (erik.osterlund at oracle.com)
Date: Wed, 23 Oct 2019 15:01:04 +0200
Subject: RFR: 8231552: ZGC: Refine address space reservation
In-Reply-To: <da5bdc6c-b52c-f536-0ea7-a28c18882126@oracle.com>
References: <5015ca7b-3e3e-b2bd-c3f8-0a83ecdb41d8@oracle.com>
 <c412fdf3-8f74-390e-6c6d-0d6df4e273f5@oracle.com>
 <2b79829d-f577-819d-9577-91351c03fddb@oracle.com>
 <da5bdc6c-b52c-f536-0ea7-a28c18882126@oracle.com>
Message-ID: <cdfb08fe-c7b3-4d6e-fade-74652356f886@oracle.com>

Hi Per,

Looks good.

Thanks,
/Erik

On 10/23/19 12:38 PM, Per Liden wrote:
> Another update after Stefan found an incorrect comparison:
>
> Full: http://cr.openjdk.java.net/~pliden/8231552/webrev.4
> Diff: http://cr.openjdk.java.net/~pliden/8231552/webrev.4-diff
>
> /Per
>
> On 10/22/19 2:01 PM, Per Liden wrote:
>> Updated webrev after off-line comments from Stefan and Erik.
>>
>> Full: http://cr.openjdk.java.net/~pliden/8231552/webrev.3
>> Diff: http://cr.openjdk.java.net/~pliden/8231552/webrev.3-diff
>>
>> /Per
>>
>> On 10/16/19 10:41 AM, Per Liden wrote:
>>> Latest version of this patch, rebased on today's jdk/jdk:
>>>
>>> http://cr.openjdk.java.net/~pliden/8231552/webrev.2
>>>
>>> /Per
>>>
>>> On 10/3/19 11:45 AM, Per Liden wrote:
>>>> We could be slightly more sophisticated and do a better job 
>>>> reserving address space in situations where parts of the address 
>>>> space are already occupied or when the process is running with 
>>>> address space limitations.
>>>>
>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8231552
>>>> Webrev: http://cr.openjdk.java.net/~pliden/8231552/webrev.0
>>>>
>>>> /Per


From stefan.karlsson at oracle.com  Wed Oct 23 13:06:38 2019
From: stefan.karlsson at oracle.com (Stefan Karlsson)
Date: Wed, 23 Oct 2019 15:06:38 +0200
Subject: RFC: JEP: ZGC on Windows
Message-ID: <9f10be76-8f7f-662f-d4ac-dd96acc1f507@oracle.com>

Hi all,

ZGC is currently available on Linux/x64 and Linux/AArch64. There's a 
Candidate JEP to add macOS support [1]. We would also like to add 
support for ZGC on Windows. I've prepared a JEP draft [2] for that work.

Most of the ZGC code base is platform independent and requires no 
Windows-specific changes. The existing load barrier support for x64 is 
OS agnostic and can also be used on Windows. The platform specific code 
that needs to be ported relates to how address space is reserved and how 
physical memory is mapped into a reserved address space.
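
To make the two platform-specific steps concrete, here is a minimal sketch of
"reserve address space, then map memory into the reservation" using POSIX
mmap(). It is an illustration only: the helper names are hypothetical,
anonymous memory stands in for whatever backing the collector really uses,
and the Windows equivalents that the JEP is about are not shown.

```cpp
#include <sys/mman.h>
#include <cstddef>
#include <cassert>

// Step 1: reserve a range of addresses without making it usable.
// Returns nullptr on failure. (Hypothetical helper, not a HotSpot function.)
void* reserve_address_space(std::size_t size) {
  void* p = mmap(nullptr, size, PROT_NONE,
                 MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
  return p == MAP_FAILED ? nullptr : p;
}

// Step 2: map usable memory at a fixed, page-aligned address that lies
// inside a range previously returned by reserve_address_space().
bool map_into_reservation(void* addr, std::size_t size) {
  void* p = mmap(addr, size, PROT_READ | PROT_WRITE,
                 MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
  return p == addr;
}

// Give the whole reservation back to the operating system.
bool release_address_space(void* addr, std::size_t size) {
  return munmap(addr, size) == 0;
}
```

Touching the reserved range before step 2 faults; after
map_into_reservation() the mapped part can be read and written while the
rest of the reservation stays inaccessible.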

Please see the details in the JEP for more information. Feedback is welcome!

Thanks,
StefanK

[1] https://openjdk.java.net/jeps/364
[2] https://openjdk.java.net/jeps/8232364


From per.liden at oracle.com  Wed Oct 23 13:28:30 2019
From: per.liden at oracle.com (Per Liden)
Date: Wed, 23 Oct 2019 15:28:30 +0200
Subject: RFR: 8231552: ZGC: Refine address space reservation
In-Reply-To: <cdfb08fe-c7b3-4d6e-fade-74652356f886@oracle.com>
References: <5015ca7b-3e3e-b2bd-c3f8-0a83ecdb41d8@oracle.com>
 <c412fdf3-8f74-390e-6c6d-0d6df4e273f5@oracle.com>
 <2b79829d-f577-819d-9577-91351c03fddb@oracle.com>
 <da5bdc6c-b52c-f536-0ea7-a28c18882126@oracle.com>
 <cdfb08fe-c7b3-4d6e-fade-74652356f886@oracle.com>
Message-ID: <8a9929f3-cad7-4f14-6315-dec78135d0bd@oracle.com>

Thanks Erik!

/Per

On 2019-10-23 15:01, erik.osterlund at oracle.com wrote:
> Hi Per,
> 
> Looks good.
> 
> Thanks,
> /Erik
> 
> On 10/23/19 12:38 PM, Per Liden wrote:
>> Another update after Stefan found an incorrect comparison:
>>
>> Full: http://cr.openjdk.java.net/~pliden/8231552/webrev.4
>> Diff: http://cr.openjdk.java.net/~pliden/8231552/webrev.4-diff
>>
>> /Per
>>
>> On 10/22/19 2:01 PM, Per Liden wrote:
>>> Updated webrev after off-line comments from Stefan and Erik.
>>>
>>> Full: http://cr.openjdk.java.net/~pliden/8231552/webrev.3
>>> Diff: http://cr.openjdk.java.net/~pliden/8231552/webrev.3-diff
>>>
>>> /Per
>>>
>>> On 10/16/19 10:41 AM, Per Liden wrote:
>>>> Latest version of this patch, rebased on today's jdk/jdk:
>>>>
>>>> http://cr.openjdk.java.net/~pliden/8231552/webrev.2
>>>>
>>>> /Per
>>>>
>>>> On 10/3/19 11:45 AM, Per Liden wrote:
>>>>> We could be slightly more sophisticated and do a better job 
>>>>> reserving address space in situations where parts of the address 
>>>>> space are already occupied or when the process is running with 
>>>>> address space limitations.
>>>>>
>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8231552
>>>>> Webrev: http://cr.openjdk.java.net/~pliden/8231552/webrev.0
>>>>>
>>>>> /Per
> 


From rkennke at redhat.com  Wed Oct 23 14:29:29 2019
From: rkennke at redhat.com (Roman Kennke)
Date: Wed, 23 Oct 2019 16:29:29 +0200
Subject: [11u] RFR: 8231085: C2/GC: Better GC-interface for expanding clone
Message-ID: <582a0140-7b8e-cdd6-d0d3-d58c0964ccb0@redhat.com>

I would like to backport the recent GC interface for expanding clones to
jdk11u. This is a prerequisite to backport related Shenandoah changes to
11u without making a mess.

The change differs from the original jdk14 change because it basically
skips the intermediate GC interface for the same thing that's been
introduced in jdk12. This one wholly replaces that.

Bug:
https://bugs.openjdk.java.net/browse/JDK-8231085
Original webrev:
http://cr.openjdk.java.net/~rkennke/JDK-8231085/webrev.00/

JDK11u webrev:
http://cr.openjdk.java.net/~rkennke/JDK-8231085/webrev.00.jdk11u/

Testing: tier1 and tier2 no regressions

Good?

Roman


From leihouyju at gmail.com  Wed Oct 23 15:15:52 2019
From: leihouyju at gmail.com (Haoyu Li)
Date: Wed, 23 Oct 2019 23:15:52 +0800
Subject: [PATCH] Exploit Empty Regions in Young Gen to Enhance PS Full GC
 Performance
In-Reply-To: <e72f06af-8847-844b-107c-afd15e01f71b@oracle.com>
References: <CAKSDcxsPcvigLaDTRyJALk9O0r-JsYV1CcwFzb3KaM98+LVLcg@mail.gmail.com>
 <B723074C-94DF-450D-9715-497736E9CD27@oracle.com>
 <CAKSDcxvTzfq1eR0DXi8iLQ4bNP8LcjsV5Px59-EMOgR0zaHvtQ@mail.gmail.com>
 <4F02DD53-EA98-4A1A-B871-C6E9D9610B2C@oracle.com>
 <9B69AFD1-7AE2-4B50-BFCF-C9C6E2594240@oracle.com>
 <CAKSDcxs131XdbHHErgbH1UDYX6_+=CSa-4dt4n5LgbyPVgjO_w@mail.gmail.com>
 <E4274448-26B2-46B8-883C-70FC7AFFB23B@oracle.com>
 <CAKSDcxvHwOWtAEm8TPPvO=C8q9LRwJRUQX6nbh-HSZXBKKwvYg@mail.gmail.com>
 <fb385d33-c420-4c81-320c-9aa1dad64a44@oracle.com>
 <CAKSDcxv5eccaV54NThgviLK+84U-Z62U0CYr3s+8ncNR1cna5w@mail.gmail.com>
 <8ef5b52e-d6fc-3073-5ca7-44c87c1eb981@oracle.com>
 <CAKSDcxsQWJ1tpnsc8UnN3E=XJfVwHEdE9WSd_=6KR_tLSHW6rQ@mail.gmail.com>
 <92277aab-0578-9e2c-3f4f-55ae1e8c94a9@oracle.com>
 <CAKSDcxu_t+Ka0LUU3WSzdR52-_+rsvKo_Vxn8av=duFmS4EPyA@mail.gmail.com>
 <400df998-171a-5bbe-9f3e-01af1781afb4@oracle.com>
 <e72f06af-8847-844b-107c-afd15e01f71b@oracle.com>
Message-ID: <CAKSDcxsm3-6u0arR4KCRGF=R-1sD9XJAS3Fb98NxzcPASEpGwg@mail.gmail.com>

Hi Stefan,

Thanks for your constructive feedback. I've addressed all the issues you
mentioned, and the updated patch is attached in this email.

While refining the patch, a couple of questions came up:
1) MoveAndUpdateClosure and ShadowClosure now assume that the destination
address is the very beginning of a region, rather than an arbitrary address
as before. However, there is an unused function,
PSParallelCompact::move_and_update(), that uses MoveAndUpdateClosure to
process a region from its middle, which conflicts with this assumption. I
noticed that you removed this function in your patch, and so did I in the
updated patch. Does it matter?
2) Using the same do_addr() in MoveAndUpdateClosure and ShadowClosure is
doable, but it does not reuse all the code neatly, because storing the
address of the shadow region in _destination requires extra virtual
functions to handle allocating blocks in the start_array and setting
addresses of deferred objects. In particular, allocate_blocks() and
set_deferred_object_for() are added to both closures. Is it worth it just to
avoid using _offset to calculate the shadow_destination?
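
To illustrate the trade-off in question 2, here is a minimal sketch of the
"shared do_addr() plus virtual hooks" shape. It is not the code in the patch:
the class names, the Word type and the vector standing in for the start_array
are all hypothetical, and only one hook (recording block starts) is shown.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>
#include <cassert>

using Word = long;  // stand-in for a heap word

class MoveClosureSketch {
protected:
  Word* _destination;                // where the next object is written
  std::vector<Word*> _block_starts;  // stand-in for the start_array
  // Hook: record where the object that was just written will finally live.
  virtual void allocate_block(Word* written_at) {
    _block_starts.push_back(written_at);
  }
public:
  explicit MoveClosureSketch(Word* destination) : _destination(destination) {}
  virtual ~MoveClosureSketch() = default;
  // Shared by both closures: copy the object, then call the hook.
  void do_addr(const Word* source, std::size_t words) {
    std::memcpy(_destination, source, words * sizeof(Word));
    allocate_block(_destination);
    _destination += words;
  }
  const std::vector<Word*>& block_starts() const { return _block_starts; }
};

class ShadowClosureSketch : public MoveClosureSketch {
  Word* _shadow_base;  // start of the shadow region (where data is written)
  Word* _real_base;    // start of the region the data is copied back to
  // The object sits in the shadow region for now, but bookkeeping must
  // refer to its final address in the real region.
  void allocate_block(Word* written_at) override {
    _block_starts.push_back(_real_base + (written_at - _shadow_base));
  }
public:
  ShadowClosureSketch(Word* shadow_base, Word* real_base)
    : MoveClosureSketch(shadow_base),
      _shadow_base(shadow_base), _real_base(real_base) {}
};
```

The copy loop is written once; the price is one virtual call per object and
an address translation in every hook that the shadow closure overrides.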

If there are any problems with this patch, please contact me anytime. I'm
more than happy to keep improving the code. Thanks again for reviewing.

Best,
Haoyu Li


Stefan Johansson <stefan.johansson at oracle.com> wrote on 2019-10-22 at 9:42:

> Hi Haoyu,
>
> I've reviewed the patch now and have some comments and questions.
>
> To simplify the review and have a common base to look at I've created a
> webrev at:
> http://cr.openjdk.java.net/~sjohanss/8220465/00/
>
> One general note first, most of the new code uses four space
> indentation, in hotspot the standard is two spaces, please change this.
> Below are some file by file comments.
>
> src/hotspot/share/gc/parallel/psCompactionManager.cpp
> ---
>    53 GrowableArray<size_t >* ParCompactionManager::_free_shadow = new
> (ResourceObj::C_HEAP, mtInternal) GrowableArray<size_t >(10, true);
>    54 Monitor*                ParCompactionManager::_monitor = NULL;
>
> Set _free_shadow to NULL here like the other statics and then create the
> GrowableArray in initialize(). I also think _shadow_region_array or
> something like that would be a better name and the monitor should also
> be named something that signals that it is used for this array.
> ---
>    70   if (_monitor == NULL) {
>    71       _monitor = new Monitor(Mutex::barrier, "CompactionManager
> monitor",
>    72                              Mutex::_allow_vm_block_flag,
> Monitor::_safepoint_check_never);
>    73   }
>
> Instead of doing the monitor creation here having to check for NULL, do
> it in initialize() below together with the array creation.
> ---
>
> src/hotspot/share/gc/parallel/psParallelCompact.cpp
> ---
> 2974       if (cur->push()) {
>
> Correct me if I'm wrong, if this call to push() returns true it means
> that nobody else has "stolen" it (used a shadow region to prepare it)
> and we mark it as pushed. But when pushed in this code path this is the
> end state for this RegionData? If this is the case I think it would be
> easier to understand the code if we added another function and state for
> when we "steal" it. Haven't thought very much about the names but I
> think you understand what I want to achieve:
> Normal path:
> UNUSED -> push() -> NORMAL
> Steal path:
> UNUSED -> steal() -> STOLEN -> fill() -> FILLED -> copy() -> SHADOW
>
> We could then also assert in set_completed() that the state is either
> NORMAL or SHADOW (or if they have a shared end state DONE). As I said
> the names can be improved (both for the states and the functions) but I
> think we should have names and not just numbers.
> ---
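
The named states suggested above could look roughly like the following
sketch. All identifiers here are hypothetical and only follow the naming
idea from the review; the transitions use a compare-and-swap so that only
one worker can win each step.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical state names; not taken from the patch.
enum class ShadowState : int { Unused, Normal, Stolen, Filled, Shadow };

class RegionStateSketch {
  std::atomic<ShadowState> _state{ShadowState::Unused};

  // Move from 'from' to 'to' only if the region is currently in 'from'.
  bool transition(ShadowState from, ShadowState to) {
    return _state.compare_exchange_strong(from, to);
  }

public:
  // Normal path: UNUSED -> NORMAL.
  bool try_push()    { return transition(ShadowState::Unused, ShadowState::Normal); }
  // Steal path: UNUSED -> STOLEN -> FILLED -> SHADOW.
  bool try_steal()   { return transition(ShadowState::Unused, ShadowState::Stolen); }
  bool mark_filled() { return transition(ShadowState::Stolen, ShadowState::Filled); }
  bool try_copy()    { return transition(ShadowState::Filled, ShadowState::Shadow); }

  // What an assert in set_completed() could check.
  bool is_done() const {
    ShadowState s = _state.load();
    return s == ShadowState::Normal || s == ShadowState::Shadow;
  }
};
```

A region that was pushed normally can no longer be stolen, and a stolen
region has to pass through the filled state before it can be copied back.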
>
> 3060 template <class T>
> 3061 void PSParallelCompact::fill_region(ParCompactionManager* cm,
> size_t region_idx, size_t shadow, size_t offset)
>
> As I told you this was a big improvement from the first patch, but I
> think there is room for even more improvements around the way we pass in
> ignored parameters to MoveAndUpdateClosure. Explaining my idea in text
> is harder than code, so I created a patch, what do you think about this?
> http://cr.openjdk.java.net/~sjohanss/8220465/00-alt/
>
> This alternative is based on 00 and does not take my other comments into
> consideration. So it might have to be altered a bit if you address some
> of my other comments/questions.
> ---
>
> 3196 void PSParallelCompact::copy_back(HeapWord *region_addr, HeapWord
> *shadow_addr) {
>
> I think the parameters should swap places, so that they correspond with
> the copy below.
> ---
>
> 3200 bool PSParallelCompact::steal_shadow_region(ParCompactionManager*
> cm, size_t &region_idx) {
> 3201     size_t& record = cm->shadow_record();
>
> Did you consider just letting shadow_record() be a simple getter instead
> of returning a reference, and then adding a next_shadow_record() which
> advances it by active_workers?
> ---
>
> 3236 void PSParallelCompact::initialize_steal_record(uint which) {
>
> I'm having a hard time understanding the details here. I get that all
> threads should have a separate shadow record, but I'm not sure why it is
> not enough to just do:
> size_t record = _summary_data.addr_to_region_idx(
>    _space_info[old_space_id].dense_prefix());
> cm->set_shadow_record(record + which);
>
> As you can see I'm also suggesting adding a setter for shadow_record.
> ---
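
A minimal sketch of the getter/setter/advance interface discussed in the two
comments above, assuming each worker starts at the first region after the
dense prefix plus its own worker id and then strides by the number of active
workers. The class and function names are hypothetical stand-ins, and the
dense prefix lookup is replaced by a plain parameter.

```cpp
#include <cstddef>
#include <cassert>

class CompactionManagerSketch {
  std::size_t _shadow_record = 0;  // next region index this worker may steal
public:
  std::size_t shadow_record() const { return _shadow_record; }
  void set_shadow_record(std::size_t record) { _shadow_record = record; }
  // Advance by the number of active workers, so that workers with different
  // starting points never visit the same region index.
  std::size_t next_shadow_record(std::size_t active_workers) {
    _shadow_record += active_workers;
    return _shadow_record;
  }
};

// 'dense_prefix_region' stands in for the region index of the dense prefix
// end in the old space; 'which' is the worker id.
void initialize_steal_record(CompactionManagerSketch& cm,
                             std::size_t dense_prefix_region,
                             std::size_t which) {
  cm.set_shadow_record(dense_prefix_region + which);
}
```

With two workers and a dense prefix ending at region 10, worker 0 visits
regions 10, 12, 14 and worker 1 visits 11, 13, 15.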
>
> 3434 ParMarkBitMapClosure::IterationStatus
> 3435 ShadowClosure::do_addr(HeapWord* addr, size_t words) {
> 3436     HeapWord* shadow_destination = destination() + _offset;
>
> Using an offset instead of a given address feels a bit backwards, did
> you consider letting the closure keep and update a _shadow_destination
> instead? Or would it even be possible to just set destination to be the
> shadow region address? In that case it should be possible to just use
> the do_addr and other functions from the MoveAndUpdateClosure.
>
> I see from looking at this particular function that there is one assert
> that would have to change:
> 3408
> assert(PSParallelCompact::summary_data().calc_new_pointer(source(),
> compaction_manager()) ==
> 3409          destination(), "wrong destination");
>
> This should be easily fixed by adding a virtual function
> check_destination, that has a special implementation for the ShadowClosure.
> ---
>
> src/hotspot/share/gc/parallel/psParallelCompact.hpp
> ---
>   333     // Preempt the region to avoid double processes
>   334     inline bool push();
>   335     // Mark the region as filled and ready to be copied back
>   336     inline bool fill();
>   337     // Preempt the region to copy the shadow region content back
>   338     inline bool copy();
>
> As mentioned, I think there might be better names for those functions
> and the comments. Maybe adding a prefix would make the code more
> self-explanatory: try_push(), mark_filled(), try_copy() and the new try_steal().
> ---
>
> Thanks again for providing this patch, I look forward to seeing an updated
> version.
>
> Cheers,
> Stefan
>
>
> On 2019-10-14 15:00, Stefan Johansson wrote:
> > Thanks for the quick update Haoyu,
> >
> > This is a great improvement and I will try to find time to look into the
> > patch in more detail the coming weeks.
> >
> > Thanks,
> > Stefan
> >
> > On 2019-10-11 14:49, Haoyu Li wrote:
> >> Hi Stefan,
> >>
> >> Thanks for your suggestion! PSParallelCompact::fill_shadow_region()
> >> did redundantly copy most of its code from
> >> PSParallelCompact::fill_region(), so I've refactored these two
> >> functions to share as much code as possible. The updated patch is
> >> attached.
> >>
> >> Specifically, the closure, which moves objects, in
> >> PSParallelCompact::fill_region() is now declared as a template of
> >> either MoveAndUpdateClosure or ShadowClosure. So by controlling the
> >> type of closure when invoking the function, we can decide whether to
> >> fill a normal region or a shadow one. Thus, almost all code in
> >> PSParallelCompact::fill_region() can be reused.
> >>
> >> In addition, a virtual function named complete_region() is added to both
> >> closures to do some work after the filling, such as setting states and
> >> copying the shadow region back.
> >>
> >> Thanks again for reviewing the patch, looking forward to your insights
> >> and suggestions!
> >>
> >> Best Regards,
> >> Haoyu Li
> >>
> >> 2019-10-10 21:50 GMT+08:00, Stefan Johansson
> >> <stefan.johansson at oracle.com>:
> >>> Thanks for the clarification =)
> >>>
> >>> Moving on to the next part, the code in the patch. So this won't be a
> >>> full review of the patch but just an initial comment that I would like
> >>> to be addressed first.
> >>>
> >>> The new function PSParallelCompact::fill_shadow_region() is more or
> less
> >>> a copy of PSParallelCompact::fill_region() and I understand that from a
> >>> proof of concept point of view it was the easy (and right) way to do
> it.
> >>> I would prefer if the code could be refactored so that fill_region()
> and
> >>> fill_shadow_region() share more code. There might be reasons that I've
> >>> missed, that prevents it, but we should at least explore how much code
> >>> can be shared.
> >>>
> >>> Thanks,
> >>> Stefan
> >>>
> >>> On 2019-10-10 15:10, Haoyu Li wrote:
> >>>> Hi Stefan,
> >>>>
> >>>> Thanks for your quick response! As to your concern about the OCA, I am
> >>>> the sole author of the patch, and the situation is as the agreement
> >>>> states.
> >>>> Best Regards,
> >>>> Haoyu Li,
> >>>>
> >>>>
> >>>> Stefan Johansson <stefan.johansson at oracle.com
> >>>> <mailto:stefan.johansson at oracle.com>> wrote on 2019-10-10 at 8:37:
> >>>>
> >>>>      Hi,
> >>>>
> >>>>      On 2019-10-10 13:06, Haoyu Li wrote:
> >>>>       > Hi Stefan,
> >>>>       >
> >>>>       > Thanks for your testing! One possible reason for the
> >>>> regressions
> >>>> in
> >>>>       > simple tests is that the region dependencies may not be heavy
> >>>> enough.
> >>>>       > Because the locality of shadow regions is lower than that of
> >>>> heap
> >>>>       > regions, writing to shadow regions will be slower than to
> >>>> normal
> >>>>       > regions, and this is a part of the reason why I reuse shadow
> >>>>      regions.
> >>>>       > Therefore, if only a few shadow regions are created and not
> >>>>      reused, the
> >>>>       > overhead may not be amortized.
> >>>>
> >>>>      I guess it is something like this. I thought that for "easy"
> heaps
> >>>> the
> >>>>      shadow regions won't be used at all, and should therefore not
> >>>> really
> >>>>      cost
> >>>>      anything.
> >>>>
> >>>>       >
> >>>>       > As to the OCA, it is the case that I'm the only person
> >>>> signing the
> >>>>       > agreement. Please let me know if you have any further
> >>>> questions.
> >>>>      Thanks
> >>>>       > again!
> >>>>
> >>>>      Ok, so you are the sole author of the patch. The important
> >>>> part, as
> >>>> the
> >>>>      agreement states, is:
> >>>>      "no other person or entity, including my employer, has or will
> >>>> have
> >>>>      rights with respect my contributions"
> >>>>
> >>>>      Is that the case?
> >>>>
> >>>>      Thanks,
> >>>>      Stefan
> >>>>
> >>>>       >
> >>>>       > Best Regards,
> >>>>       > Haoyu Li
> >>>>       >
> >>>>       > Stefan Johansson <stefan.johansson at oracle.com
> >>>>      <mailto:stefan.johansson at oracle.com>
> >>>>       > <mailto:stefan.johansson at oracle.com
> >>>>      <mailto:stefan.johansson at oracle.com>>> wrote on 2019-10-08 at 6:49:
> >>>>       >
> >>>>       >     Hi Haoyu,
> >>>>       >
> >>>>       >     I've done some more testing and I haven't seen any issues
> >>>>      with the
> >>>>       >     patch
> >>>>       >     so far and the performance looks promising in most
> >>>> cases. For
> >>>>      simple
> >>>>       >     tests I've seen some regressions, but I'm not really sure
> >>>>      why. Will do
> >>>>       >     some more digging.
> >>>>       >
> >>>>       >     To move forward with this the first thing we need to do is
> >>>>      making sure
> >>>>       >     that you being covered by the Oracle Contributor
> >>>> Agreement is
> >>>>      enough.
> >>>>       >       From what we can see it is only you as an individual
> that
> >>>>      has signed
> >>>>       >     the OCA and in that case it is important that this
> >>>> statement
> >>>>      from the
> >>>>       >     OCA is fulfilled: "no other person or entity, including my
> >>>>      employer,
> >>>>       >     has
> >>>>       >     or will have rights with respect my contributions"
> >>>>       >
> >>>>       >     Is this the case for this contribution or should we have
> >>>> the
> >>>>      university
> >>>>       >     sign the OCA as well? For more information regarding the
> >>>> OCA
> >>>>      please
> >>>>       >     refer to:
> >>>>       > https://www.oracle.com/technetwork/oca-faq-405384.pdf
> >>>>       >
> >>>>       >     Thanks,
> >>>>       >     Stefan
> >>>>       >
> >>>>       >     On 2019-09-16 16:02, Haoyu Li wrote:
> >>>>       >      > FYI, the evaluation results on OpenJDK 14 are plotted
> in
> >>>> the
> >>>>       >     attachment.
> >>>>       >      > I compute the full GC throughput by dividing the heap
> >>>> size
> >>>>      before
> >>>>       >     full
> >>>>       >      > GC by the GC pause time, and the results are arithmetic
> >>>> mean
> >>>>       >     values of
> >>>>       >      > ten runs after a warm-up run. The evaluation is
> >>>> conducted on
> >>>> a
> >>>>       >     machine
> >>>>       >      > with dual Intel Xeon E5-2618L v3 CPUs (2 sockets, 16
> >>>>      physical
> >>>>       >     cores
> >>>>       >      > with SMT enabled) and 64G DRAM.
> >>>>       >      >
> >>>>       >      > Best Regards,
> >>>>       >      > Haoyu Li,
> >>>>       >      > Institute of Parallel and Distributed Systems(IPADS),
> >>>>       >      > School of Software,
> >>>>       >      > Shanghai Jiao Tong University
> >>>>       >      >
> >>>>       >      >
> >>>>       >      > Stefan Johansson <stefan.johansson at oracle.com
> >>>>      <mailto:stefan.johansson at oracle.com>
> >>>>       >     <mailto:stefan.johansson at oracle.com
> >>>>      <mailto:stefan.johansson at oracle.com>>
> >>>>       >      > <mailto:stefan.johansson at oracle.com
> >>>>      <mailto:stefan.johansson at oracle.com>
> >>>>       >     <mailto:stefan.johansson at oracle.com
> >>>>      <mailto:stefan.johansson at oracle.com>>>> wrote on 2019-09-12 at 5:34:
> >>>>       >      >
> >>>>       >      >     Hi Haoyu,
> >>>>       >      >
> >>>>       >      >     I recently came across your patch and I would
> >>>> like to
> >>>>      pick up on
> >>>>       >      >     some of the things Kim mentioned in his mails. I
> >>>>      especially want
> >>>>       >      >     evaluate and investigate if this is a technique
> >>>> we can
> >>>>      use to
> >>>>       >      >     improve the other GCs as well. To start that work I
> >>>>      want to
> >>>>       >     take the
> >>>>       >      >     patch for a spin in our internal performance
> >>>> testing.
> >>>>      The patch
> >>>>       >      >     doesn't apply cleanly to the latest JDK repository,
> so
> >>>>      if you could
> >>>>       >      >     provide an updated patch that would be very
> helpful.
> >>>>       >      >
> >>>>       >      >     It would also be great if you could share some more
> >>>>      information
> >>>>       >      >     around the results presented in the paper. For
> >>>> example,
> >>>> it
> >>>>       >     would be
> >>>>       >      >     good to get the full command lines for the
> different
> >>>>       >     benchmarks so
> >>>>       >      >     we can run them locally and reproduce the
> >>>>      results you've seen.
> >>>>       >      >
> >>>>       >      >     Thanks,
> >>>>       >      >     Stefan
> >>>>       >      >
> >>>>       >      >>     On 12 March 2019 at 03:21, Haoyu Li
> >>>>      <leihouyju at gmail.com <mailto:leihouyju at gmail.com>
> >>>>       >     <mailto:leihouyju at gmail.com <mailto:leihouyju at gmail.com>>
> >>>>       >      >>     <mailto:leihouyju at gmail.com
> >>>>      <mailto:leihouyju at gmail.com> <mailto:leihouyju at gmail.com
> >>>>      <mailto:leihouyju at gmail.com>>>>:
> >>>>       >      >>
> >>>>       >      >>     Hi Kim,
> >>>>       >      >>
> >>>>       >      >>     Thanks for reviewing and testing the patch. If
> >>>> there
> >>>>      are any
> >>>>       >      >>     failures or performance degradation relevant to
> the
> >>>>      work, please
> >>>>       >      >>     let me know and I'll be very happy to keep
> >>>> improving
> >>>> it.
> >>>>       >     Also, any
> >>>>       >      >>     suggestions about code improvements are well
> >>>> appreciated.
> >>>>       >      >>
> >>>>       >      >>     I'm not quite sure if both G1 and Shenandoah
> >>>> have the
> >>>>      similar
> >>>>       >      >>     region dependency issue, since I haven't studied
> >>>> their
> >>>> GC
> >>>>       >      >>     behaviors before. If they have, I'm also willing
> to
> >>>>      propose
> >>>>       >     a more
> >>>>       >      >>     general optimization.
> >>>>       >      >>
> >>>>       >      >>     As to the memory overhead, I believe it will be
> low
> >>>>      because this
> >>>>       >      >>     patch exploits empty regions in the young space
> >>>>      rather than
> >>>>       >      >>     off-heap memory to allocate shadow regions, and
> >>>> also
> >>>>      reuses the
> >>>>       >      >>     /_source_region/ field of each /RegionData /to
> >>>> record
> >>>> the
> >>>>       >      >>     corresponding shadow region index. We only
> >>>> introduce
> >>>>      a new
> >>>>       >      >>     integer field /_shadow/ in the RegionData class to
> >>>>      indicate the
> >>>>       >      >>     status of a region, a global /GrowableArray
> >>>>      _free_shadow/ to
> >>>>       >     store
> >>>>       >      >>     the indices of shadow regions, and a global
> >>>>      /Monitor/ to protect
> >>>>       >      >>     the array. This information might help if the
> >>>> memory
> >>>>      overhead
> >>>>       >      >>     needs to be evaluated.
> >>>>       >      >>
> >>>>       >      >>     Looking forward to your insight.
> >>>>       >      >>
> >>>>       >      >>     Best Regards,
> >>>>       >      >>     Haoyu Li,
> >>>>       >      >>     Institute of Parallel and Distributed
> >>>> Systems(IPADS),
> >>>>       >      >>     School of Software,
> >>>>       >      >>     Shanghai Jiao Tong University
> >>>>       >      >>
> >>>>       >      >>
> >>>>       >      >>     Kim Barrett <kim.barrett at oracle.com>
> >>>>       >      >>     wrote on 2019-03-12 at 6:11:
> >>>>       >      >>
> >>>>       >      >>         > On Mar 11, 2019, at 1:45 AM, Kim Barrett
> >>>>       >      >>         <kim.barrett at oracle.com> wrote:
> >>>>       >      >>         >
> >>>>       >      >>         >> On Jan 24, 2019, at 3:58 AM, Haoyu Li
> >>>>       >      >>         <leihouyju at gmail.com> wrote:
> >>>>       >      >>         >>
> >>>>       >      >>         >> Hi Kim,
> >>>>       >      >>         >>
> >>>>       >      >>         >> I have ported my patch to OpenJDK 13
> >>>> according
> >>>>      to your
> >>>>       >      >>         instructions in your last mail, and the
> >>>> patch is
> >>>>      attached in
> >>>>       >      >>         this mail. The patch does not change much
> since
> >>>>      PSGC is
> >>>>       >     indeed
> >>>>       >      >>         pretty stable.
> >>>>       >      >>         >>
> >>>>       >      >>         >> Also, I evaluate the correctness and
> >>>>      performance of
> >>>>       >     PS full
> >>>>       >      >>         GC with benchmarks from DaCapo, SPECjvm2008,
> >>>> and
> >>>>      JOlden
> >>>>       >     suites
> >>>>       >      >>         on a machine with dual Intel Xeon E5-2618L v3
> >>>> CPUs(16
> >>>>       >     physical
> >>>>       >      >>         cores), 64G DRAM and linux kernel 4.17. The
> >>>>      evaluation
> >>>>       >     result,
> >>>>       >      >>         indicating 1.9X GC throughput improvement on
> >>>>      average, is
> >>>>       >      >>         attached, too.
> >>>>       >      >>         >>
> >>>>       >      >>         >> However, I have no idea how to further test
> >>>> this
> >>>>       >     patch for
> >>>>       >      >>         both correctness and performance. Can I please
> >>>>      get any
> >>>>       >      >>         guidance from you or some sponsor?
> >>>>       >      >>         >
> >>>>       >      >>         > Sorry I missed that you had sent an updated
> >>>>      version of the
> >>>>       >      >>         patch.
> >>>>       >      >>         >
> >>>>       >      >>         > I've run the full regression suite across
> >>>>      Oracle-supported
> >>>>       >      >>         platforms.  There are some
> >>>>       >      >>         > failures, but there are almost always some
> >>>>      failures in the
> >>>>       >      >>         later tiers right now.  I'll start
> >>>>       >      >>         > looking at them tomorrow to figure out
> >>>> whether
> >>>>      any of them
> >>>>       >      >>         are relevant.
> >>>>       >      >>         >
> >>>>       >      >>         > I'm also planning to run some of our
> >>>> performance
> >>>>       >     benchmarks.
> >>>>       >      >>         >
> >>>>       >      >>         > I've lightly skimmed the proposed changes.
> >>>>      There might be
> >>>>       >      >>         some code improvements
> >>>>       >      >>         > to be made.
> >>>>       >      >>         >
> >>>>       >      >>         > I'm also wondering if this technique
> >>>> applies to
> >>>>      other
> >>>>       >      >>         collectors.  It seems like both G1 and
> >>>>       >      >>         > Shenandoah full gc's might have similar
> >>>>      issues?  If so, a
> >>>>       >      >>         solution that is ParallelGC-specific
> >>>>       >      >>         > is less interesting than one that has
> broader
> >>>>       >      >>         applicability.  Though maybe this optimization
> >>>>       >      >>         > is less important for G1 and Shenandoah,
> >>>> since
> >>>> they
> >>>>       >     actively
> >>>>       >      >>         try to avoid full gc's.
> >>>>       >      >>         >
> >>>>       >      >>         > I'm also not clear on how much additional
> >>>>      memory might be
> >>>>       >      >>         temporarily allocated by this
> >>>>       >      >>         > mechanism.
> >>>>       >      >>
> >>>>       >      >>         I've created a CR for this:
> >>>>       >      >> https://bugs.openjdk.java.net/browse/JDK-8220465
> >>>>       >      >>
> >>>>       >      >
> >>>>       >
> >>>>
> >>>
> >>
> >>
>


From sangheon.kim at oracle.com  Wed Oct 23 16:20:47 2019
From: sangheon.kim at oracle.com (sangheon.kim at oracle.com)
Date: Wed, 23 Oct 2019 09:20:47 -0700
Subject: RFR(XL): 8220310: Implementation: NUMA-Aware Memory Allocation
 for G1, Mutator (1/3)
In-Reply-To: <f6c2bffa-b3e6-dda1-e453-5b01a7214c4d@oracle.com>
References: <e4c60a5c-cb08-004a-ce77-e4d20d4d6891@oracle.com>
 <06ACBF87-ADBE-499F-B668-0274E4925B26@oracle.com>
 <2b37edd6-3e0f-013d-1616-9d003f8ac1ed@oracle.com>
 <74ACAF31-8233-482A-892E-0D2E7CA72F4F@oracle.com>
 <4afe9f43-4cfa-9384-f45f-f985399629dd@oracle.com>
 <CD6BDB8C-7777-4872-BCC0-CDFB4978F876@oracle.com>
 <d8758f38-4818-e6b7-c158-118974f0ff1c@oracle.com>
 <CB368332-D86B-40D2-B152-43B726938DD2@oracle.com>
 <77f6c57a-65a6-2727-cbe9-fbc1ed52a015@oracle.com>
 <b3b70e9d-5be9-b069-b631-5733f157c9eb@oracle.com>
 <7C1985BF-A769-49FB-A658-E1B1060B5897@oracle.com>
 <3F549477-A2DF-42CF-A0E5-586F78BBCC47@oracle.com>
 <f348053c-ef9b-df03-ae17-393ace99182b@oracle.com>
 <AB6E68AA-F9AC-4E62-9CF5-1886C469A702@oracle.com>
 <9219a118-0c1d-2cee-10e5-f9bb87c72eb9@oracle.com>
 <f6c2bffa-b3e6-dda1-e453-5b01a7214c4d@oracle.com>
Message-ID: <521b3b8a-70e6-6fef-cb67-b6327fa08c03@oracle.com>

Hi Per,

Thanks for taking a look at this.

I agree with all your comments, and here's the updated webrev.
- All comments from Per.
- Move G1PageBasedVirtualSpace::page_size() near to page_start() from Kim.

Webrev:
http://cr.openjdk.java.net/~sangheki/8220310/webrev.6
http://cr.openjdk.java.net/~sangheki/8220310/webrev.6.inc
Testing: build test for linux, solaris, windows and mac.

FYI, since I think the existing NUMA-related API names and the use of -1
are not good, I had planned to refine those after pushing. But as you said,
following the existing convention now and refining everything together
later seems better.
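
As a rough illustration of the convention discussed here (the lookup returns -1 on error and leaves any logging to the caller), a minimal standalone sketch might look like the following. It is not the code in the webrev: the names numa_group_id_of, node_index_or_unknown and kUnknownNodeIndex are hypothetical, and only the get_mempolicy system call with MPOL_F_NODE | MPOL_F_ADDR is taken from the diff quoted below. On non-Linux platforms the sketch simply reports -1.

```cpp
#include <cassert>

#if defined(__linux__)
#include <sys/syscall.h>
#include <unistd.h>
#endif

// Flag values as used in the quoted patch (same as the Linux uapi values).
#ifndef MPOL_F_NODE
#define MPOL_F_NODE (1 << 0)
#endif
#ifndef MPOL_F_ADDR
#define MPOL_F_ADDR (1 << 1)
#endif

// Hypothetical helper: returns the NUMA node id of 'address', or -1 on
// error. It deliberately prints nothing; the caller decides how to react.
static int numa_group_id_of(void* address) {
#if defined(__linux__) && defined(SYS_get_mempolicy)
  int id = 0;
  const unsigned long flags = MPOL_F_NODE | MPOL_F_ADDR;
  if (syscall(SYS_get_mempolicy, &id, nullptr, 0UL, address, flags) == -1) {
    return -1;
  }
  return id;
#else
  (void)address;
  return -1;  // Assumption: unsupported platforms are treated as an error.
#endif
}

// Hypothetical caller-side constant for "node could not be determined".
static const unsigned kUnknownNodeIndex = ~0u;

// Hypothetical caller: maps the -1 error value to its own "unknown" index
// instead of relying on a shared InvalidNUMAId constant.
static unsigned node_index_or_unknown(void* address) {
  int id = numa_group_id_of(address);
  if (id == -1) {
    return kUnknownNodeIndex;
  }
  return static_cast<unsigned>(id);
}
```

With this shape, a caller such as a heap-region mapper would compare against -1 (or its own unknown index) and choose whether to warn, retry, or ignore.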

Thanks,
Sangheon


On 10/23/19 1:21 AM, Per Liden wrote:
> Hi Sangheon,
>
> I noticed that this patch adds os::numa_get_address_id(). That name is 
> misleading as it doesn't return an "address id", but a "numa node id". 
> However, the terminology used in the os class for numa node is "group" 
> (for example, numa_get_groups_num, numa_get_group_id, etc). So I'd 
> suggest we instead name this os::numa_get_group_id(void* address), 
> i.e. an overload of os::numa_get_group_id().
>
> Btw, I think that the numa related names used in the os class are odd, 
> but I guess they were brought over from Solaris. We can refine those at
> some later time if we want, but for now I think we should follow the 
> naming convention that we have there.
>
> Also, I don't think this function should print warnings, as that's up 
> to the caller to decide what to do, what to print, etc.
>
> Furthermore, I suggest we remove os::InvalidNUMAId. Other numa 
> functions in the os class return -1 on error, so I think we should do
> that here too.
>
> Here's a patch with the proposed changes:
>
>
> diff --git a/src/hotspot/os/linux/os_linux.cpp b/src/hotspot/os/linux/os_linux.cpp
> --- a/src/hotspot/os/linux/os_linux.cpp
> +++ b/src/hotspot/os/linux/os_linux.cpp
> @@ -3007,7 +3007,7 @@
>    return 0;
>  }
>
> -int os::numa_get_address_id(void* address) {
> +int os::numa_get_group_id(void* address) {
>  #ifndef MPOL_F_NODE
>  #define MPOL_F_NODE     (1<<0)  // Return next IL mode instead of node mask
>  #endif
> @@ -3016,11 +3016,10 @@
>  #define MPOL_F_ADDR     (1<<1)  // Look up VMA using address
>  #endif
>
> -  int id = InvalidNUMAId;
> +  int id = 0;
>
>    if (syscall(SYS_get_mempolicy, &id, NULL, 0, address, MPOL_F_NODE | MPOL_F_ADDR) == -1) {
> -    warning("Failed to get numa id at " PTR_FORMAT " with errno=%d", p2i(address), errno);
> -    return InvalidNUMAId;
> +    return -1;
>    }
>    return id;
>  }
> diff --git a/src/hotspot/share/gc/g1/g1NUMA.cpp b/src/hotspot/share/gc/g1/g1NUMA.cpp
> --- a/src/hotspot/share/gc/g1/g1NUMA.cpp
> +++ b/src/hotspot/share/gc/g1/g1NUMA.cpp
> @@ -164,7 +164,7 @@
>
>  uint G1NUMA::index_of_address(HeapWord *address) const {
>    int numa_id = os::numa_get_address_id((void*)address);
> -  if (numa_id == os::InvalidNUMAId) {
> +  if (numa_id == -1) {
>      return UnknownNodeIndex;
>    } else {
>      return index_of_node_id(numa_id);
> @@ -201,7 +201,7 @@
>    if (!is_enabled()) {
>      return;
>    }
> -
> +
>    if (size_in_bytes == 0) {
>      return;
>    }
> diff --git a/src/hotspot/share/runtime/os.hpp b/src/hotspot/share/runtime/os.hpp
> --- a/src/hotspot/share/runtime/os.hpp
> +++ b/src/hotspot/share/runtime/os.hpp
> @@ -374,10 +374,7 @@
>    static size_t numa_get_leaf_groups(int *ids, size_t size);
>    static bool   numa_topology_changed();
>    static int    numa_get_group_id();
> -
> -  static const int InvalidNUMAId = -1;
> -
> -  static int numa_get_address_id(void* address);
> +  static int    numa_get_group_id(void* address);
>
>    // Page manipulation
>    struct page_info {
>
>
> cheers,
> Per
>
>
> On 10/16/19 7:54 PM, sangheon.kim at oracle.com wrote:
>> Hi Kim, Stefan and Thomas,
>>
>> Many thanks for the reviews and suggestions!
>>
>> Kim,
>> I will move page_size() near page_start() before push as you suggested.
>> As you know, all 3 patches will be pushed together though.
>>
>> Thanks,
>> Sangheon
>>
>>
>> On 10/16/19 7:00 AM, Kim Barrett wrote:
>>>> On Oct 15, 2019, at 10:33 AM, sangheon.kim at oracle.com wrote:
>>>>
>>>> Hi all,
>>>>
>>>> Here's revised webrev which addresses:
>>>> 1) G1RegionToSpaceMapper checks mtJavaHeap and then conditionally 
>>>> calls G1NUMA::request_memory_on_node() (Kim)
>>>> 2) The signature of G1NUMA::request_memory_on_node(void* address, 
>>>> ,) is changed to have actual address instead of page index. (Stefan)
>>>> 3) Some local variable name changes at G1RegionToSpaceMapper. i -> 
>>>> region_idx, idx -> page_idx (for local style, used idx instead of 
>>>> index)
>>>>
>>>> webrev:
>>>> http://cr.openjdk.java.net/~sangheki/8220310/webrev.5/
>>>> http://cr.openjdk.java.net/~sangheki/8220310/webrev.5.inc/
>>>> Testing: hs-tier 1 ~ 5, with/without UseNUMA
>>> Looks good.
>>>
>>> In g1PageBasedVirtualSpace.cpp, could the newly added definition of 
>>> page_size()
>>> be moved to be near the existing definition of page_start()? I don't
>>> need a new
>>> webrev if you move it.
>>>
>>


From shade at redhat.com  Wed Oct 23 18:17:45 2019
From: shade at redhat.com (Aleksey Shipilev)
Date: Wed, 23 Oct 2019 20:17:45 +0200
Subject: RFR (XS) 8232908: Shenandoah: compact heuristics has incorrect
 trigger "Free is lower than allocated recently"
Message-ID: <97e90b6e-524b-99d0-9d15-464c23553d28@redhat.com>

Bug:
  https://bugs.openjdk.java.net/browse/JDK-8232908

See the discussion in the bug. The fix is to remove the offending trigger:

diff -r da4578a0f73d src/hotspot/share/gc/shenandoah/heuristics/shenandoahCompactHeuristics.cpp
--- a/src/hotspot/share/gc/shenandoah/heuristics/shenandoahCompactHeuristics.cpp        Mon Sep 30 22:39:11 2019 +0200
+++ b/src/hotspot/share/gc/shenandoah/heuristics/shenandoahCompactHeuristics.cpp        Wed Oct 23 20:14:48 2019 +0200
@@ -66,11 +66,4 @@
   }

-  if (available < threshold_bytes_allocated) {
-    log_info(gc)("Trigger: Free (" SIZE_FORMAT "%s) is lower than allocated recently (" SIZE_FORMAT "%s)",
-                 byte_size_in_proper_unit(available), proper_unit_for_byte_size(available),
-                 byte_size_in_proper_unit(threshold_bytes_allocated), proper_unit_for_byte_size(threshold_bytes_allocated));
-    return true;
-  }
-
   size_t bytes_allocated = heap->bytes_allocated_since_gc_start();
   if (bytes_allocated > threshold_bytes_allocated) {


Testing: hotspot_gc_shenandoah

-- 
Thanks,
-Aleksey


From hohensee at amazon.com  Wed Oct 23 20:37:35 2019
From: hohensee at amazon.com (Hohensee, Paul)
Date: Wed, 23 Oct 2019 20:37:35 +0000
Subject: [11u] RFR: 8231085: C2/GC: Better GC-interface for expanding clone
In-Reply-To: <582a0140-7b8e-cdd6-d0d3-d58c0964ccb0@redhat.com>
References: <582a0140-7b8e-cdd6-d0d3-d58c0964ccb0@redhat.com>
Message-ID: <360A0C79-6CBF-467A-AF49-EB9F9CD003AC@amazon.com>

Ok. Still a tiny skipped change.

Paul

On 10/23/19, 7:30 AM, "hotspot-compiler-dev on behalf of Roman Kennke" <hotspot-compiler-dev-bounces at openjdk.java.net on behalf of rkennke at redhat.com> wrote:

    I would like to backport the recent GC interface for expanding clones to
    jdk11u. This is a prerequisite to backport related Shenandoah changes to
    11u without making a mess.
    
    The change differs from the original jdk14 change because it basically
    skips the intermediate GC interface for the same thing that's been
    introduced in jdk12. This one wholly replaces that.
    
    Bug:
    https://bugs.openjdk.java.net/browse/JDK-8231085
    Original webrev:
    http://cr.openjdk.java.net/~rkennke/JDK-8231085/webrev.00/
    
    JDK11u webrev:
    http://cr.openjdk.java.net/~rkennke/JDK-8231085/webrev.00.jdk11u/
    
    Testing: tier1 and tier2 no regressions
    
    Good?
    
    Roman
    
    
From stefan.johansson at oracle.com  Wed Oct 23 20:48:14 2019
From: stefan.johansson at oracle.com (Stefan Johansson)
Date: Wed, 23 Oct 2019 22:48:14 +0200
Subject: RFR (L): 8230706: Waiting on completion of strong nmethod
 processing causes long pause times with G1
In-Reply-To: <531dc0fe-236d-110b-65ee-d224ac028130@oracle.com>
References: <f19bc69a-9bde-c063-2674-9564721ceede@oracle.com>
 <0F637570-EC97-47C5-B493-B33681133149@oracle.com>
 <5c6b06b1-de44-3cb7-7fc8-0b641df5f353@oracle.com>
 <DE7A950D-A877-4093-AFE8-363E6E079A28@oracle.com>
 <80DA3FD5-C2FA-44BF-83C5-AE0EA6AA3684@oracle.com>
 <d0b624a1fc2c7310986b79da1f65f3a8a851d20a.camel@oracle.com>
 <0D820E95-361A-4CAC-9BC3-99C39512D396@oracle.com>
 <b8a41ba8-31d0-f9e8-4daa-41e861fb2856@oracle.com>
 <1898AC1E-0A8C-467C-9CA9-4B02C00A3A07@oracle.com>
 <7f150234-4080-b2f9-a791-b456038af795@oracle.com>
 <8126d900-714b-585a-f2f0-4ce13f71501c@oracle.com>
 <531dc0fe-236d-110b-65ee-d224ac028130@oracle.com>
Message-ID: <3BD075B5-12B8-4516-AB0C-5CFAEC10BF30@oracle.com>


> On 23 Oct 2019, at 10:39, Thomas Schatzl <thomas.schatzl at oracle.com> wrote:
> 
> Hi Stefan,
> 
> On 23.10.19 09:05, Stefan Johansson wrote:
>> Hi Thomas,
>> On 2019-10-22 15:45, Thomas Schatzl wrote:
>>> Hi Kim,
>>> 
>>> On 22.10.19 15:44, Kim Barrett wrote:
>>>>> On Oct 22, 2019, at 6:13 AM, Thomas Schatzl 
> [...]
>>>>> Webrevs:
>>>>> http://cr.openjdk.java.net/~tschatzl/8230706/webrev.2_to_3/ (diff)
>>>>> http://cr.openjdk.java.net/~tschatzl/8230706/webrev.3/ (full)
>> This looks good, and well documented :)
>> One small thing:
>> src/hotspot/share/gc/g1/g1SharedClosures.hpp
>> ---
>>  46     _codeblobs(pss->worker_id(), &_oops, Mark == G1MarkFromRoot) {}
>> What do you think about adding a helper for Mark == G1MarkFromRoot, something like need_strong_processing(), and a comment explaining that it will be true during initial mark?
> 
> Something like this?
> 
> http://cr.openjdk.java.net/~tschatzl/8230706/webrev.3_to_4/ (diff)
> http://cr.openjdk.java.net/~tschatzl/8230706/webrev.4/ (full)
> 
> Not completely sure if that is required as searching for G1MarkFromRoot shows that it is only used for the strong shared closures in the initial mark closure set. But I understand that it is nice to be reminded about this.
> 
Thanks for addressing it, looks good!

Stefan

> Thanks for your and Kim's reviews.
> 
> Thanks,
>  Thomas


From rkennke at redhat.com  Wed Oct 23 20:50:11 2019
From: rkennke at redhat.com (Roman Kennke)
Date: Wed, 23 Oct 2019 22:50:11 +0200
Subject: RFR (XS) 8232908: Shenandoah: compact heuristics has incorrect
 trigger "Free is lower than allocated recently"
In-Reply-To: <97e90b6e-524b-99d0-9d15-464c23553d28@redhat.com>
References: <97e90b6e-524b-99d0-9d15-464c23553d28@redhat.com>
Message-ID: <21225bdb-381b-44b7-030d-0b147ca1b5fd@redhat.com>

Ok, good. Thanks,
Roman

> Bug:
>   https://bugs.openjdk.java.net/browse/JDK-8232908
> 
> See the discussion in the bug. The fix is to remove the offending trigger:
> 
> diff -r da4578a0f73d src/hotspot/share/gc/shenandoah/heuristics/shenandoahCompactHeuristics.cpp
> --- a/src/hotspot/share/gc/shenandoah/heuristics/shenandoahCompactHeuristics.cpp        Mon Sep 30 22:39:11 2019 +0200
> +++ b/src/hotspot/share/gc/shenandoah/heuristics/shenandoahCompactHeuristics.cpp        Wed Oct 23 20:14:48 2019 +0200
> @@ -66,11 +66,4 @@
>    }
> 
> -  if (available < threshold_bytes_allocated) {
> -    log_info(gc)("Trigger: Free (" SIZE_FORMAT "%s) is lower than allocated recently (" SIZE_FORMAT "%s)",
> -                 byte_size_in_proper_unit(available), proper_unit_for_byte_size(available),
> -                 byte_size_in_proper_unit(threshold_bytes_allocated), proper_unit_for_byte_size(threshold_bytes_allocated));
> -    return true;
> -  }
> -
>    size_t bytes_allocated = heap->bytes_allocated_since_gc_start();
>    if (bytes_allocated > threshold_bytes_allocated) {
> 
> 
> Testing: hotspot_gc_shenandoah
> 


From suenaga at oss.nttdata.com  Thu Oct 24 00:49:55 2019
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Thu, 24 Oct 2019 09:49:55 +0900
Subject: Segmentation Fault occurs when ClassLoader and Metaspace is
 released in JDK 8
In-Reply-To: <314f9ad2-17df-1082-8816-7af73a96e9fb@nttcom.co.jp_1>
References: <fb308571-cbdf-1f4b-177f-aa6bac986a5f@nttcom.co.jp_1>
 <422c9ca2-5053-c761-cb61-f075877bb666@oss.nttdata.com>
 <314f9ad2-17df-1082-8816-7af73a96e9fb@nttcom.co.jp_1>
Message-ID: <1ccb4f35-7f21-4aa2-4cbb-b75244b6d12d@oss.nttdata.com>

Hi Osamu,

I guess this is a bug in the combination of Metaspace and CMS.
However, current jdk/jdk has a different implementation, so it might not occur in a modern JDK.
I would like to hear comments from others.

My comments are below:

On 2019/10/23 18:57, Osamu Sakamoto wrote:
> Hi Yasumasa,
> 
> Thank you for answering.
> 
>  > What JVM options did you pass?
> The following are the JVM options I passed.
> -----------------------------------------------------------------
> -Xmx2048m
> -Xms2048m
> -XX:NewSize=412m
> -XX:MaxNewSize=412m
> -XX:SurvivorRatio=8
> -XX:MaxTenuringThreshold=15
> -XX:+UseConcMarkSweepGC
> -XX:+UseCMSInitiatingOccupancyOnly
> -XX:CMSInitiatingOccupancyFraction=80
> -XX:+CMSClassUnloadingEnabled
> -XX:CompressedClassSpaceSize=64m
> -XX:+PrintGCDetails
> -XX:+PrintGCDateStamps
> -XX:+UseGCLogFileRotation
> -XX:GCLogFileSize=0
> -Xloggc:/var/log/tomcatm0/gc-%p.log
> -XX:+HeapDumpOnOutOfMemoryError
> -XX:+AlwaysLockClassLoader
> -----------------------------------------------------------------
> 
> 
>  > I guess you used CMS because this problem seems to occur on CMS only [1] [2].
> Yes, I used CMS.
> 
>  > So a workaround might be not to use CMS.
> Thank you for telling me about the workaround.
> But it is difficult for us to change the GC algorithm, so we would like to solve this issue with CMS if possible.
> 
> 
>  > I'm not sure of the root cause of this issue, but it seems to break ClassLoaderDataGraph::_unloading.
>  > (like a double free (delete) of a CLD)
> I checked whether ClassLoaderDataGraph::_unloading is broken, but I could not tell because the value has been cleared to NULL or optimized out.
> 
> Referring to classLoaderData.cpp [1], it looks like the _unloading value is saved to ClassLoaderDataGraph::_saved_unloading.
> But _saved_unloading had been cleared to NULL, too.
> 
> Is there any other way to check it?
> 
> [1] http://hg.openjdk.java.net/jdk8u/jdk8u-dev/hotspot/file/eed8e846c982/src/share/vm/classfile/classLoaderData.cpp#l753
> 
> -----------------------------------------------------------------
> (gdb) f 10
> #10 0x00007f60808c0f0f in ClassLoaderDataGraph::purge () at /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/share/vm/classfile/classLoaderData.cpp:818
> 818        delete purge_me;
> (gdb) list ClassLoaderDataGraph::purge
> 810    void ClassLoaderDataGraph::purge() {
> 811      assert(SafepointSynchronize::is_at_safepoint(), "must be at safepoint!");
> 812      ClassLoaderData* list = _unloading;
> 813      _unloading = NULL;
> 814      ClassLoaderData* next = list;
> 815      while (next != NULL) {
> 816        ClassLoaderData* purge_me = next;
> 817        next = purge_me->next();
> 818        delete purge_me;
> 819      }
> 820      Metaspace::purge();
> 821    }
> (gdb) p _unloading
> $29 = (ClassLoaderData *) 0x0
> (gdb) p list
> $30 = <optimized out>
> (gdb) p next
> $31 = <optimized out>
> (gdb) p ClassLoaderDataGraph::_saved_unloading
> $32 = (ClassLoaderData *) 0x0
> -----------------------------------------------------------------

AFAICS you cannot find the head of _unloading at this point.
However, you can traverse the CLD list via purge_me->_next.
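
To illustrate why that works: purge() walks a singly linked list, so the nodes after the one being deleted stay reachable through its next pointer even though the head pointer has already been set to NULL. The following is a simplified model in plain C++, not HotSpot code; the Node type and the helper reachable_from are hypothetical stand-ins for ClassLoaderData and for a manual walk in the debugger.

```cpp
#include <cassert>
#include <cstddef>

// Simplified stand-in for ClassLoaderData: only the link matters here.
struct Node {
  int   id;
  Node* next;
};

// Counts the nodes reachable from 'from' (inclusive) by following 'next'.
// This mirrors following purge_me->_next by hand until NULL is reached.
static std::size_t reachable_from(const Node* from) {
  std::size_t count = 0;
  for (const Node* n = from; n != nullptr; n = n->next) {
    ++count;
  }
  return count;
}
```

In the debugger the same walk is done by printing the next pointer of each node in turn, starting from the node in the crashing frame.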


BTW, a CLD holds the OOP of its class loader in ClassLoaderData::_class_loader.
If you inspect it with (CL)HSDB, you might get some hints from it.
For example, use the system class loader instead of a custom class loader from the framework.


Thanks,

Yasumasa


> Thanks,
> Osamu
> 
> On 10/21/19 22:29, Yasumasa Suenaga wrote:
>> Hi Osamu,
>>
>> What JVM options did you pass?
>>
>> I guess you used CMS because this problem seems to occur on CMS only [1] [2].
>> So a workaround might be not to use CMS.
>>
>> I'm not sure of the root cause of this issue, but it seems to break ClassLoaderDataGraph::_unloading.
>> (like a double free (delete) of a CLD)
>>
>>
>> Thanks,
>>
>> Yasumasa
>>
>>
>> [1] http://hg.openjdk.java.net/jdk8u/jdk8u-dev/hotspot/file/eed8e846c982/src/share/vm/classfile/classLoaderData.hpp#l100
>> [2] http://hg.openjdk.java.net/jdk8u/jdk8u-dev/hotspot/file/eed8e846c982/src/share/vm/gc_implementation/concurrentMarkSweep/concurrentMarkSweepGeneration.cpp#l6384
>>
>>
>> On 2019/10/21 17:50, Osamu Sakamoto wrote:
>>> Hi all,
>>>
>>> I have a problem with a segmentation fault (SEGV) in GC, and I can't determine the cause.
>>> Could you help me solve the problem?
>>>
>>> Our system uses OpenJDK 1.8.0.181, and it crashed with a SEGV while purging a ClassLoader at a safepoint.
>>> We cannot reproduce the problem on demand, but it has happened 4 times in a few months.
>>>
>>> The following is the summary of my investigation.
>>>
>>> =============================================================================
>>>
>>> First I checked hs_err, and that shows that the SEGV occurred.
>>> VM_Operation is GenCollectForAllocation at safepoint.
>>>
>>> -----------------------------------------------------------------------------
>>> #
>>> # A fatal error has been detected by the Java Runtime Environment:
>>> #
>>> #  SIGSEGV (0xb) at pc=0x00007f6080c97f88, pid=23931, tid=0x00007f607c3ed700
>>> #
>>> # JRE version: OpenJDK Runtime Environment (8.0_181-b13) (build 1.8.0_181-b13)
>>> # Java VM: OpenJDK 64-Bit Server VM (25.181-b13 mixed mode linux-amd64 compressed oops)
>>> # Problematic frame:
>>> # V  [libjvm.so+0x84bf88]
>>> #
>>> # Core dump written. Default location: /opt/tomcate0/core or core.23931
>>> #
>>> # If you would like to submit a bug report, please visit:
>>> #   http://bugreport.java.com/bugreport/crash.jsp
>>> #
>>>
>>> ---------------  T H R E A D  ---------------
>>>
>>> Current thread (0x00007f6078c00000):  VMThread [stack: 0x00007f607c2ed000,0x00007f607c3ee000] [id=23939]
>>>
>>> siginfo: si_signo: 11 (SIGSEGV), si_code: 1 (SEGV_MAPERR), si_addr: 0x0000000000000018
>>>
>>> Registers:
>>> RAX=0x0000000000000010, RBX=0x00007f5ff800ad30, RCX=0x0000000000000010, RDX=0x0000000000000000
>>> RSP=0x00007f607c3ecb50, RBP=0x00007f607c3ecb80, RSI=0x0000000000000002, RDI=0x0000000001cfe570
>>> R8 =0x00007f5ff80ae320, R9 =0x00007f5ff8052480, R10=0x0000000000000000, R11=0x0000000000000400
>>> R12=0x0000000001cfe570, R13=0x00007f6081419470, R14=0x0000000000000002, R15=0x00007f6081418640
>>> RIP=0x00007f6080c97f88, EFLAGS=0x0000000000010202, CSGSFS=0x0000000000000033, ERR=0x0000000000000004
>>>    TRAPNO=0x000000000000000e
>>>
>>> Top of Stack: (sp=0x00007f607c3ecb50)
>>> 0x00007f607c3ecb50:   00007f607c3ecba0 00007f5ff800ad30
>>> 0x00007f607c3ecb60:   00007f5ff800ad00 0000000000000000
>>> 0x00007f607c3ecb70:   0000000000000000 0000000000000001
>>> 0x00007f607c3ecb80:   00007f607c3ecba0 00007f6080c995fa
>>> 0x00007f607c3ecb90:   00007f5ff800ad00 00007f5ff800ac20
>>> 0x00007f607c3ecba0:   00007f607c3ecbc0 00007f60808bff5e
>>> 0x00007f607c3ecbb0:   00007f5ff800ac20 00007f5ff8052870
>>> 0x00007f607c3ecbc0:   00007f607c3ecbe0 00007f60808c0f0f
>>> 0x00007f607c3ecbd0:   00007f607c3ecbf0 00007f608140f308
>>> 0x00007f607c3ecbe0:   00007f607c3ecc30 00007f6080daa0b7
>>> 0x00007f607c3ecbf0:   00007f6069000100 0000000000000000
>>> 0x00007f607c3ecc00:   00007f607c3ecc20 00007f6080ed0800
>>> 0x00007f607c3ecc10:   00000000000000f9 88e95c3ba257ab00
>>> 0x00007f607c3ecc20:   431bde82d7b634db 00007f607800aa00
>>> 0x00007f607c3ecc30:   00007f607c3eccc0 00007f6080daa9d5
>>> 0x00007f607c3ecc40:   0000000000000000 00007f607803bf20
>>> 0x00007f607c3ecc50:   00007f607803be20 00000000000003e8
>>> 0x00007f607c3ecc60:   0000000000000001 00007f6078c00000
>>> 0x00007f607c3ecc70:   00007f607c3eccc0 0000000000000000
>>> 0x00007f607c3ecc80:   00000004000000f9 00007f60813e2b99
>>> 0x00007f607c3ecc90:   00007f607803bfa0 00007f6078c00000
>>> 0x00007f607c3ecca0:   0000000000000000 0000000000000000
>>> 0x00007f607c3eccb0:   00007f6081418bd0 00007f607803bf20
>>> 0x00007f607c3eccc0:   00007f607c3ece60 00007f6080f2048a
>>> 0x00007f607c3eccd0:   00007f607c3ecd20 00007f607c3ecce0
>>> 0x00007f607c3ecce0:   00007f6078c00000 00007f6078c00980
>>> 0x00007f607c3eccf0:   00007f6078c009c0 00007f6078c009d0
>>> 0x00007f607c3ecd00:   00007f6078c00aa8 00000000000000d8
>>> 0x00007f607c3ecd10:   00007f6078c00be0 0000000000000000
>>> 0x00007f607c3ecd20:   00007f607c3ecd28 6e69747563657845
>>> 0x00007f607c3ecd30:   65706f204d562067 203a6e6f69746172
>>> 0x00007f607c3ecd40:   656c6c6f436e6547 6c6c41726f467463
>>>
>>> Instructions: (pc=0x00007f6080c97f88)
>>> 0x00007f6080c97f68:   b6 12 80 fa 00 74 01 f0 48 0f c1 01 31 c9 31 f6
>>> 0x00007f6080c97f78:   48 8b 44 0b 10 31 d2 48 85 c0 74 11 0f 1f 40 00
>>> 0x00007f6080c97f88:   48 8b 40 08 48 83 c2 01 48 85 c0 75 f3 48 83 c1
>>> 0x00007f6080c97f98:   08 48 01 d6 48 83 f9 20 75 d6 8b 7b 08 48 8b 05
>>>
>>> Register to memory mapping:
>>>
>>> RAX=0x0000000000000010 is an unknown value
>>> RBX=0x00007f5ff800ad30 is an unknown value
>>> RCX=0x0000000000000010 is an unknown value
>>> RDX=0x0000000000000000 is an unknown value
>>> RSP=0x00007f607c3ecb50 is an unknown value
>>> RBP=0x00007f607c3ecb80 is an unknown value
>>> RSI=0x0000000000000002 is an unknown value
>>> RDI=0x0000000001cfe570 is an unknown value
>>> R8 =0x00007f5ff80ae320 is an unknown value
>>> R9 =0x00007f5ff8052480 is an unknown value
>>> R10=0x0000000000000000 is an unknown value
>>> R11=0x0000000000000400 is an unknown value
>>> R12=0x0000000001cfe570 is an unknown value
>>> R13=0x00007f6081419470: <offset 0xfcd470> in /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/jre/lib/amd64/server/libjvm.so at 0x00007f608044c000
>>> R14=0x0000000000000002 is an unknown value
>>> R15=0x00007f6081418640: <offset 0xfcc640> in /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/jre/lib/amd64/server/libjvm.so at 0x00007f608044c000
>>>
>>>
>>> Stack: [0x00007f607c2ed000,0x00007f607c3ee000], sp=0x00007f607c3ecb50, free space=1022k
>>> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
>>> V  [libjvm.so+0x84bf88]
>>> V  [libjvm.so+0x84d5fa]
>>> V  [libjvm.so+0x473f5e]
>>> V  [libjvm.so+0x474f0f]
>>> V  [libjvm.so+0x95e0b7]
>>> V  [libjvm.so+0x95e9d5]
>>> V  [libjvm.so+0xad448a]
>>> V  [libjvm.so+0xad48f1]
>>> V  [libjvm.so+0x8beb82]
>>>
>>> VM_Operation (0x00007f5fd69e6120): GenCollectForAllocation, mode: safepoint, requested by thread 0x00007f6079013800
>>>
>>> ...
>>> -----------------------------------------------------------------------------
>>>
>>>
>>>
>>> Next, I used GDB to check the backtrace of the crashing thread from the core dump.
>>> The following is the backtrace.
>>> The SEGV occurred while a ClassLoader was being purged and its Metaspace destroyed.
>>> Frame #7 shows that the signal (SEGV) handler was invoked while SpaceManager::~SpaceManager() was executing.
>>>
>>> -----------------------------------------------------------------------------
>>> (gdb) bt
>>> #0  0x00007f608146f1f7 in __GI_raise (sig=sig at entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
>>> #1  0x00007f60814708e8 in __GI_abort () at abort.c:90
>>> #2  0x00007f6080d0bc39 in os::abort (dump_core=<optimized out>) at /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/os/linux/vm/os_linux.cpp:1519
>>> #3  0x00007f6080f1b816 in VMError::report_and_die (this=this at entry=0x7f607c3ebd10) at /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/share/vm/utilities/vmError.cpp:1060
>>> #4  0x00007f6080d15927 in JVM_handle_linux_signal (sig=11, info=0x7f607c3ebfb0, ucVoid=0x7f607c3ebe80, abort_if_unrecognized=<optimized out>)
>>>      at /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/os_cpu/linux_x86/vm/os_linux_x86.cpp:541
>>> #5  0x00007f6080d09038 in signalHandler (sig=11, info=0x7f607c3ebfb0, uc=0x7f607c3ebe80) at /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/os/linux/vm/os_linux.cpp:4446
>>> #6  <signal handler called>
>>> #7  SpaceManager::~SpaceManager (this=0x7f5ff800ad30, __in_chrg=<optimized out>) at /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/share/vm/memory/metaspace.cpp:2028
>>> #8  0x00007f6080c995fa in Metaspace::~Metaspace (this=0x7f5ff800ad00, __in_chrg=<optimized out>)
>>>      at /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/share/vm/memory/metaspace.cpp:2971
>>> #9  0x00007f60808bff5e in ClassLoaderData::~ClassLoaderData (this=0x7f5ff800ac20, __in_chrg=<optimized out>)
>>>      at /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/share/vm/classfile/classLoaderData.cpp:383
>>> #10 0x00007f60808c0f0f in ClassLoaderDataGraph::purge () at /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/share/vm/classfile/classLoaderData.cpp:818
>>> #11 0x00007f6080daa0b7 in ClassLoaderDataGraph::purge_if_needed () at /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/share/vm/classfile/classLoaderData.hpp:104
>>> #12 SafepointSynchronize::do_cleanup_tasks () at /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/share/vm/runtime/safepoint.cpp:551
>>> #13 0x00007f6080daa9d5 in SafepointSynchronize::begin () at /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/share/vm/runtime/safepoint.cpp:402
>>> #14 0x00007f6080f2048a in VMThread::loop (this=this at entry=0x7f6078c00000) at /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/share/vm/runtime/vmThread.cpp:501
>>> #15 0x00007f6080f208f1 in VMThread::run (this=0x7f6078c00000) at /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/share/vm/runtime/vmThread.cpp:276
>>> #16 0x00007f6080d0ab82 in java_start (thread=0x7f6078c00000) at /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/os/linux/vm/os_linux.cpp:796
>>> #17 0x00007f6081e2de25 in start_thread (arg=0x7f607c3ed700) at pthread_create.c:308
>>> #18 0x00007f608153234d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
>>> -----------------------------------------------------------------------------
>>>
>>>
>>> In Frame #7, Line 2028 (chunk = chunk->next()) is the crash point.
>>> The variable "chunk" is defined at Line 2025 (Metachunk* chunk = chunks_in_use(i);).
>>> "chunks_in_use(i)" is defined at Line 648 (Metachunk* chunks_in_use(ChunkIndex index) const { return _chunks_in_use[index]; }).
>>> So I checked the values of "_chunks_in_use" and found that "_chunks_in_use[2]" holds the invalid address "0x10".
>>> Therefore, I think the SEGV occurred because "chunk = chunk->next()" dereferenced the invalid address "0x10".
>>>
>>> -----------------------------------------------------------------------------
>>> (gdb) f 7
>>> #7  SpaceManager::~SpaceManager (this=0x7f5ff800ad30, __in_chrg=<optimized out>) at /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/share/vm/memory/metaspace.cpp:2028
>>> 2028        chunk = chunk->next();
>>> (gdb) list
>>> 2023    size_t SpaceManager::sum_count_in_chunks_in_use(ChunkIndex i) {
>>> 2024      size_t count = 0;
>>> 2025      Metachunk* chunk = chunks_in_use(i);
>>> 2026      while (chunk != NULL) {
>>> 2027        count++;
>>> 2028        chunk = chunk->next();
>>> 2029      }
>>> 2030      return count;
>>> 2031    }
>>> 2032
>>> (gdb) list SpaceManager::chunks_in_use
>>> 647      // Accessors
>>> 648      Metachunk* chunks_in_use(ChunkIndex index) const { return _chunks_in_use[index]; }
>>> ...
>>> (gdb) p _chunks_in_use
>>> $11 = {0x7f5fcd41c400, 0x7f5fcd41a000, 0x10, 0x0}
>>> -----------------------------------------------------------------------------
>>>
>>>
>>>
>>> The following is the disassembly of "SpaceManager::~SpaceManager()".
>>> %rax holds 0x10 at "0x00007f6080c97f88 <+200>", but I don't understand how this "0x10" got into %rax.
>>>
>>> -----------------------------------------------------------------------------
>>> (gdb) disas
>>> Dump of assembler code for function SpaceManager::~SpaceManager():
>>>    0x00007f6080c97ec0 <+0>:     push   %rbp
>>>    0x00007f6080c97ec1 <+1>:     mov    %rsp,%rbp
>>>    0x00007f6080c97ec4 <+4>:     push   %r15
>>>    0x00007f6080c97ec6 <+6>:     push   %r14
>>>    0x00007f6080c97ec8 <+8>:     push   %r13
>>>    0x00007f6080c97eca <+10>:    push   %r12
>>>    0x00007f6080c97ecc <+12>:    push   %rbx
>>>    0x00007f6080c97ecd <+13>:    mov    %rdi,%rbx
>>>    0x00007f6080c97ed0 <+16>:    sub    $0x8,%rsp
>>>    0x00007f6080c97ed4 <+20>:    mov    0x780785(%rip),%r12        # 0x7f6081418660 <_ZN12SpaceManager12_expand_lockE>
>>>    0x00007f6080c97edb <+27>:    test   %r12,%r12
>>>    0x00007f6080c97ede <+30>:    je     0x7f6080c97ee8 <SpaceManager::~SpaceManager()+40>
>>>    0x00007f6080c97ee0 <+32>:    mov    %r12,%rdi
>>>    0x00007f6080c97ee3 <+35>:    callq  0x7f6080cce2f0 <Monitor::lock_without_safepoint_check()>
>>>    0x00007f6080c97ee8 <+40>:    movslq 0x8(%rbx),%rcx
>>>    0x00007f6080c97eec <+44>:    lea    0x78075d(%rip),%rdx        # 0x7f6081418650 <_ZN12MetaspaceAux15_capacity_wordsE>
>>>    0x00007f6080c97ef3 <+51>:    lea    0x781576(%rip),%r13        # 0x7f6081419470 <_ZN2os16_processor_countE>
>>>    0x00007f6080c97efa <+58>:    lea    0x78073f(%rip),%r15        # 0x7f6081418640 <_ZN12MetaspaceAux11_used_wordsE>
>>>    0x00007f6080c97f01 <+65>:    mov    (%rdx,%rcx,8),%rax
>>>    0x00007f6080c97f05 <+69>:    sub    0x40(%rbx),%rax
>>>    0x00007f6080c97f09 <+73>:    mov    %rax,(%rdx,%rcx,8)
>>>    0x00007f6080c97f0d <+77>:    mov    0x38(%rbx),%rax
>>>    0x00007f6080c97f11 <+81>:    movslq 0x8(%rbx),%rdx
>>>    0x00007f6080c97f15 <+85>:    neg    %rax
>>>    0x00007f6080c97f18 <+88>:    cmpl   $0x1,0x0(%r13)
>>>    0x00007f6080c97f1d <+93>:    lea    (%r15,%rdx,8),%rcx
>>>    0x00007f6080c97f21 <+97>:    mov    $0x1,%edx
>>>    0x00007f6080c97f26 <+102>:   jne    0x7f6080c97f32 <SpaceManager::~SpaceManager()+114>
>>>    0x00007f6080c97f28 <+104>:   lea    0x74acb4(%rip),%rdx        # 0x7f60813e2be3 <AssumeMP>
>>>    0x00007f6080c97f2f <+111>:   movzbl (%rdx),%edx
>>>    0x00007f6080c97f32 <+114>:   cmp    $0x0,%dl
>>>    0x00007f6080c97f35 <+117>:   je     0x7f6080c97f38 <SpaceManager::~SpaceManager()+120>
>>>    0x00007f6080c97f37 <+119>:   lock xadd %rax,(%rcx)
>>>    0x00007f6080c97f3c <+124>:   mov    0x48(%rbx),%r14
>>>    0x00007f6080c97f40 <+128>:   callq  0x7f6080c951a0 <Metachunk::overhead()>
>>>    0x00007f6080c97f45 <+133>:   movslq 0x8(%rbx),%rdx
>>>    0x00007f6080c97f49 <+137>:   imul   %r14,%rax
>>>    0x00007f6080c97f4d <+141>:   lea    (%r15,%rdx,8),%rcx
>>>    0x00007f6080c97f51 <+145>:   mov    $0x1,%edx
>>>    0x00007f6080c97f56 <+150>:   neg    %rax
>>>    0x00007f6080c97f59 <+153>:   cmpl   $0x1,0x0(%r13)
>>>    0x00007f6080c97f5e <+158>:   jne    0x7f6080c97f6a <SpaceManager::~SpaceManager()+170>
>>>    0x00007f6080c97f60 <+160>:   lea    0x74ac7c(%rip),%rdx        # 0x7f60813e2be3 <AssumeMP>
>>>    0x00007f6080c97f67 <+167>:   movzbl (%rdx),%edx
>>>    0x00007f6080c97f6a <+170>:   cmp    $0x0,%dl
>>>    0x00007f6080c97f6d <+173>:   je     0x7f6080c97f70 <SpaceManager::~SpaceManager()+176>
>>>    0x00007f6080c97f6f <+175>:   lock xadd %rax,(%rcx)
>>>    0x00007f6080c97f74 <+180>:   xor    %ecx,%ecx
>>>    0x00007f6080c97f76 <+182>:   xor    %esi,%esi
>>>    0x00007f6080c97f78 <+184>:   mov    0x10(%rbx,%rcx,1),%rax
>>>    0x00007f6080c97f7d <+189>:   xor    %edx,%edx
>>>    0x00007f6080c97f7f <+191>:   test   %rax,%rax
>>>    0x00007f6080c97f82 <+194>:   je     0x7f6080c97f95 <SpaceManager::~SpaceManager()+213>
>>>    0x00007f6080c97f84 <+196>:   nopl   0x0(%rax)
>>> => 0x00007f6080c97f88 <+200>:   mov    0x8(%rax),%rax
>>>    0x00007f6080c97f8c <+204>:   add    $0x1,%rdx
>>>    0x00007f6080c97f90 <+208>:   test   %rax,%rax
>>> ...
>>> (gdb) info registers
>>> rax            0x10               16
>>> rbx            0x7f5ff800ad30     140050159414576
>>> rcx            0x10               16
>>> rdx            0x0                0
>>> rsi            0x2                2
>>> rdi            0x1cfe570          30401904
>>> rbp            0x7f607c3ecb80     0x7f607c3ecb80
>>> rsp            0x7f607c3ecb50     0x7f607c3ecb50
>>> r8             0x7f5ff80ae320     140050160083744
>>> r9             0x7f5ff8052480     140050159707264
>>> r10            0x0                0
>>> r11            0x400              1024
>>> r12            0x1cfe570          30401904
>>> r13            0x7f6081419470     140052462146672
>>> r14            0x2                2
>>> r15            0x7f6081418640     140052462143040
>>> rip            0x7f6080c97f88     0x7f6080c97f88 <SpaceManager::~SpaceManager()+200>
>>> eflags         0x206              [ PF IF ]
>>> cs             0x33               51
>>> ss             0x2b               43
>>> ds             0x0                0
>>> es             0x0                0
>>> fs             0x0                0
>>> gs             0x0                0
>>> k0             <unavailable>
>>> k1             <unavailable>
>>> k2             <unavailable>
>>> k3             <unavailable>
>>> k4             <unavailable>
>>> k5             <unavailable>
>>> k6             <unavailable>
>>> k7             <unavailable>
>>> -----------------------------------------------------------------------------
>>>
>>> =============================================================================
>>>
>>>
>>>
>>> Does anyone know about this case?
>>>
>>> Thanks, Osamu
>>>
>>>
> 
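
For readers following the quoted crash analysis, here is a minimal sketch of why a list head of 0x10 passes the "chunk != NULL" test and then faults on the next load. FakeChunk, looks_like_chunk() and count_chunks_checked() are hypothetical names for illustration only, not HotSpot code, and the layout (a size word followed by the link, which would put the link at offset 8 on a 64-bit build and so fit the "mov 0x8(%rax),%rax" instruction) is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for Metachunk: a size word followed by the link.
// This layout is assumed for illustration, not taken from HotSpot.
struct FakeChunk {
  size_t     word_size;
  FakeChunk* next;
};

// Heuristic only: treat addresses inside the first page, or addresses that
// are not aligned for FakeChunk, as corrupted list links.
inline bool looks_like_chunk(const FakeChunk* p) {
  uintptr_t v = reinterpret_cast<uintptr_t>(p);
  return v >= 4096 && (v % alignof(FakeChunk)) == 0;
}

// Same loop shape as the quoted sum_count_in_chunks_in_use(), except that it
// stops at a suspicious link instead of dereferencing it.
// Returns -1 if a corrupted link is found.
inline long count_chunks_checked(const FakeChunk* head) {
  long count = 0;
  for (const FakeChunk* c = head; c != nullptr; c = c->next) {
    if (!looks_like_chunk(c)) {
      return -1;
    }
    count++;
  }
  return count;
}
```

A plain NULL check cannot catch a value like 0x10, so the unchecked loop dereferences it. The sketch only illustrates the failure mode; it says nothing about how the bad value got into _chunks_in_use[2].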


From kim.barrett at oracle.com  Thu Oct 24 04:26:04 2019
From: kim.barrett at oracle.com (Kim Barrett)
Date: Thu, 24 Oct 2019 00:26:04 -0400
Subject: RFR (L): 8230706: Waiting on completion of strong nmethod
 processing causes long pause times with G1
In-Reply-To: <531dc0fe-236d-110b-65ee-d224ac028130@oracle.com>
References: <f19bc69a-9bde-c063-2674-9564721ceede@oracle.com>
 <0F637570-EC97-47C5-B493-B33681133149@oracle.com>
 <5c6b06b1-de44-3cb7-7fc8-0b641df5f353@oracle.com>
 <DE7A950D-A877-4093-AFE8-363E6E079A28@oracle.com>
 <80DA3FD5-C2FA-44BF-83C5-AE0EA6AA3684@oracle.com>
 <d0b624a1fc2c7310986b79da1f65f3a8a851d20a.camel@oracle.com>
 <0D820E95-361A-4CAC-9BC3-99C39512D396@oracle.com>
 <b8a41ba8-31d0-f9e8-4daa-41e861fb2856@oracle.com>
 <1898AC1E-0A8C-467C-9CA9-4B02C00A3A07@oracle.com>
 <7f150234-4080-b2f9-a791-b456038af795@oracle.com>
 <8126d900-714b-585a-f2f0-4ce13f71501c@oracle.com>
 <531dc0fe-236d-110b-65ee-d224ac028130@oracle.com>
Message-ID: <36DA8B60-2E22-4E91-8CD9-35AA0B3A53B3@oracle.com>

> On Oct 23, 2019, at 4:39 AM, Thomas Schatzl <thomas.schatzl at oracle.com> wrote:
> 
> Hi Stefan,
> 
> On 23.10.19 09:05, Stefan Johansson wrote:
>> Hi Thomas,
>> On 2019-10-22 15:45, Thomas Schatzl wrote:
>>> Hi Kim,
>>> 
>>> On 22.10.19 15:44, Kim Barrett wrote:
>>>>> On Oct 22, 2019, at 6:13 AM, Thomas Schatzl 
> [...]>>>> Webrevs:
>>>>> http://cr.openjdk.java.net/~tschatzl/8230706/webrev.2_to_3/ (diff)
>>>>> http://cr.openjdk.java.net/~tschatzl/8230706/webrev.3/ (full)
>> This looks good, and well documented :)
>> One small thing:
>> src/hotspot/share/gc/g1/g1SharedClosures.hpp
>> ---
>>  46     _codeblobs(pss->worker_id(), &_oops, Mark == G1MarkFromRoot) {}
>> What do you think about adding a helper for Mark == G1MarkFromRoot, something like need_strong_processing() and a comment explaining that it will be true during initial mark.
> 
> Something like this?
> 
> http://cr.openjdk.java.net/~tschatzl/8230706/webrev.3_to_4/ (diff)
> http://cr.openjdk.java.net/~tschatzl/8230706/webrev.4/ (full)
> 
> Not completely sure if that is required as searching for G1MarkFromRoot shows that it is only used for the strong shared closures in the initial mark closure set. But I understand that it is nice to be reminded about this.
> 
> Thanks for your and Kim's reviews.
> 
> Thanks,
>  Thomas

Still good.
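
For the archive, a minimal sketch of the kind of helper Stefan suggested in the quoted review. The types below are simplified stand-ins (SharedClosuresSketch and CodeBlobClosureSketch are hypothetical names), and the helper name follows Stefan's suggestion rather than whatever webrev.4 actually uses.

```cpp
#include <cassert>

// Simplified stand-in for the G1Mark template parameter.
enum G1Mark { G1MarkNone, G1MarkFromRoot, G1MarkPromotedFromRoot };

// Hypothetical reduction of the code blob closure: it only records whether
// strong nmethod processing was requested.
struct CodeBlobClosureSketch {
  bool strong;
  explicit CodeBlobClosureSketch(bool need_strong) : strong(need_strong) {}
};

template <G1Mark Mark>
class SharedClosuresSketch {
public:
  // True only for the closure set instantiated with G1MarkFromRoot, i.e. the
  // strong closures used during initial mark.
  static bool need_strong_processing() { return Mark == G1MarkFromRoot; }

  CodeBlobClosureSketch _codeblobs;

  SharedClosuresSketch() : _codeblobs(need_strong_processing()) {}
};
```

The named helper gives the comment about initial mark a place to live, instead of leaving a bare "Mark == G1MarkFromRoot" in the initializer list.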


From erik.osterlund at oracle.com  Thu Oct 24 10:38:37 2019
From: erik.osterlund at oracle.com (erik.osterlund at oracle.com)
Date: Thu, 24 Oct 2019 12:38:37 +0200
Subject: RFR: 8224817: Implementation of JEP 364: ZGC on macOS
Message-ID: <adc7cecf-80ae-01e6-fcef-f8588fcb940a@oracle.com>

Hi,

Now that some curling has been performed, paving way for this patch:

    8229027: Improve how JNIHandleBlock::oops_do distinguishes oops from non-oops
    8229278: Improve hs_err location printing to assume less about GC internals
    8229189: Improve JFR leak profiler tracing to deal with discontiguous heaps
    8224815: Remove non-GC uses of CollectedHeap::is_in_reserved()
    8224820: ZGC: Support discontiguous heap reservations

...the remaining thing to do is plugging in a few platform specific ZGC 
files. This patch does that.
Decided to go with mach_vm_map/mach_vm_remap to implement multi-mapping. 
Previously I didn't want to do that as I couldn't figure out how to 
mach_vm_remap memory on top of reserved VA (acquired using mmap). But 
apparently VM_FLAGS_OVERWRITE was the missing ingredient there. With 
that in place, dodging the terrible ftruncate implementation on macOS 
seemed like a good idea. That also implies this port supports large 
pages (unlike other GCs on macOS today). Yay!

CR:
http://cr.openjdk.java.net/~eosterlund/8224817/webrev.00/

Bug:
https://bugs.openjdk.java.net/browse/JDK-8229358

Thanks,
/Erik


From thomas.schatzl at oracle.com  Thu Oct 24 10:44:39 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Thu, 24 Oct 2019 12:44:39 +0200
Subject: RFR (L): 8230706: Waiting on completion of strong nmethod
 processing causes long pause times with G1
In-Reply-To: <36DA8B60-2E22-4E91-8CD9-35AA0B3A53B3@oracle.com>
References: <f19bc69a-9bde-c063-2674-9564721ceede@oracle.com>
 <0F637570-EC97-47C5-B493-B33681133149@oracle.com>
 <5c6b06b1-de44-3cb7-7fc8-0b641df5f353@oracle.com>
 <DE7A950D-A877-4093-AFE8-363E6E079A28@oracle.com>
 <80DA3FD5-C2FA-44BF-83C5-AE0EA6AA3684@oracle.com>
 <d0b624a1fc2c7310986b79da1f65f3a8a851d20a.camel@oracle.com>
 <0D820E95-361A-4CAC-9BC3-99C39512D396@oracle.com>
 <b8a41ba8-31d0-f9e8-4daa-41e861fb2856@oracle.com>
 <1898AC1E-0A8C-467C-9CA9-4B02C00A3A07@oracle.com>
 <7f150234-4080-b2f9-a791-b456038af795@oracle.com>
 <8126d900-714b-585a-f2f0-4ce13f71501c@oracle.com>
 <531dc0fe-236d-110b-65ee-d224ac028130@oracle.com>
 <36DA8B60-2E22-4E91-8CD9-35AA0B3A53B3@oracle.com>
Message-ID: <56554e4c-d511-73d2-e29d-3b7b51260e51@oracle.com>

Hi Stefan, Kim,

   thanks for your reviews. The change has finally been pushed :)

Thomas

On 24.10.19 06:26, Kim Barrett wrote:
>> On Oct 23, 2019, at 4:39 AM, Thomas Schatzl <thomas.schatzl at oracle.com> wrote:
>>
>> Hi Stefan,
>>
>> On 23.10.19 09:05, Stefan Johansson wrote:
>>> Hi Thomas,
>>> On 2019-10-22 15:45, Thomas Schatzl wrote:
>>>> Hi Kim,
>>>>
>>>> On 22.10.19 15:44, Kim Barrett wrote:
>>>>>> On Oct 22, 2019, at 6:13 AM, Thomas Schatzl
>> [...]>>>> Webrevs:
>>>>>> http://cr.openjdk.java.net/~tschatzl/8230706/webrev.2_to_3/ (diff)
>>>>>> http://cr.openjdk.java.net/~tschatzl/8230706/webrev.3/ (full)
>>> This looks good, and well documented :)
>>> One small thing:
>>> src/hotspot/share/gc/g1/g1SharedClosures.hpp
>>> ---
>>>   46     _codeblobs(pss->worker_id(), &_oops, Mark == G1MarkFromRoot) {}
>>> What do you think about adding a helper for Mark == G1MarkFromRoot, something like need_strong_processing() and a comment explaining that it will be true during initial mark.
>>
>> Something like this?
>>
>> http://cr.openjdk.java.net/~tschatzl/8230706/webrev.3_to_4/ (diff)
>> http://cr.openjdk.java.net/~tschatzl/8230706/webrev.4/ (full)
>>
>> Not completely sure if that is required as searching for G1MarkFromRoot shows that it is only used for the strong shared closures in the initial mark closure set. But I understand that it is nice to be reminded about this.
>>
>> Thanks for your and Kim's reviews.
>>
>> Thanks,
>>   Thomas
> 
> Still good.
> 


From thomas.schatzl at oracle.com  Thu Oct 24 11:50:27 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Thu, 24 Oct 2019 13:50:27 +0200
Subject: RFR (XS): 8232951: TestG1ParallelPhases.java fails with phase
 NonYoungFreeCSet not found
Message-ID: <27fa21ab-d8ac-95ea-3485-7a72116c22f2@oracle.com>

Hi all,

   can I have reviews for this small fix to the 
TestG1ParallelPhases.java test so that it is more robust?

As far as I can tell from the failure and program execution the test 
tries to force mixed gcs expecting particular JFR events.

In particular, the test fails because it never receives the 
"NonYoungFreeCSet" parallel phase event.

This event is sent when freeing the remembered sets of an old region 
(during mixed gc).

However, due to how the test is set up, it successfully forces mixed gcs 
but does not make sure that there is ever any waste in old regions (even 
though the test sets the threshold to 0.0). In my unsuccessful attempts 
to reproduce the failure I noticed that the actual waste at the start of 
mixed gc is very close to 0.0 (or even exactly 0.0) in all mixed gcs 
(often only something like 8 bytes to reclaim in total). So while the 
threshold forces a mixed gc, sometimes no old gen regions are selected 
during region selection, so no old gen region is freed and the JFR event 
is never sent.

This behavior seems correct to me, so I changed the test a little to 
make sure it actually generates waste.

The alternative would be to send a fake "NonYoungFreeCSet" parallel 
phase jfr event in the collector, but I do not like sending fake events 
for the sake of a test.

I also added some useful logging options in case this occurs again in 
CI, and curbed the amount of (young only) GCs performed.

Note that I did not manage to reproduce the issue myself - the only 
occurrence was a month ago and was linked to the wrong bug. 
Obviously the failure (that we do not get any reclaimable old gen 
region) depends on a lot of other factors.

CR:
https://bugs.openjdk.java.net/browse/JDK-8232951
Webrev:
http://cr.openjdk.java.net/~tschatzl/8232951/webrev/
Testing:
400 runs of the changed test without issues

Thanks,
   Thomas


From per.liden at oracle.com  Thu Oct 24 12:00:28 2019
From: per.liden at oracle.com (Per Liden)
Date: Thu, 24 Oct 2019 14:00:28 +0200
Subject: RFR(XL): 8220310: Implementation: NUMA-Aware Memory Allocation
 for G1, Mutator (1/3)
In-Reply-To: <521b3b8a-70e6-6fef-cb67-b6327fa08c03@oracle.com>
References: <e4c60a5c-cb08-004a-ce77-e4d20d4d6891@oracle.com>
 <06ACBF87-ADBE-499F-B668-0274E4925B26@oracle.com>
 <2b37edd6-3e0f-013d-1616-9d003f8ac1ed@oracle.com>
 <74ACAF31-8233-482A-892E-0D2E7CA72F4F@oracle.com>
 <4afe9f43-4cfa-9384-f45f-f985399629dd@oracle.com>
 <CD6BDB8C-7777-4872-BCC0-CDFB4978F876@oracle.com>
 <d8758f38-4818-e6b7-c158-118974f0ff1c@oracle.com>
 <CB368332-D86B-40D2-B152-43B726938DD2@oracle.com>
 <77f6c57a-65a6-2727-cbe9-fbc1ed52a015@oracle.com>
 <b3b70e9d-5be9-b069-b631-5733f157c9eb@oracle.com>
 <7C1985BF-A769-49FB-A658-E1B1060B5897@oracle.com>
 <3F549477-A2DF-42CF-A0E5-586F78BBCC47@oracle.com>
 <f348053c-ef9b-df03-ae17-393ace99182b@oracle.com>
 <AB6E68AA-F9AC-4E62-9CF5-1886C469A702@oracle.com>
 <9219a118-0c1d-2cee-10e5-f9bb87c72eb9@oracle.com>
 <f6c2bffa-b3e6-dda1-e453-5b01a7214c4d@oracle.com>
 <521b3b8a-70e6-6fef-cb67-b6327fa08c03@oracle.com>
Message-ID: <1fa725ed-7cc9-68f5-0976-5d588ccfec68@oracle.com>

Hi Sangheon,

On 10/23/19 6:20 PM, sangheon.kim at oracle.com wrote:
> Hi Per,
> 
> Thanks for taking a look at this.
> 
> I agree with all your comments and here's the webrev.
> - All comments from Per.
> - Move G1PageBasedVirtualSpace::page_size() near to page_start() from Kim.
> 
> Webrev:
> http://cr.openjdk.java.net/~sangheki/8220310/webrev.6
> http://cr.openjdk.java.net/~sangheki/8220310/webrev.6.inc
> Testing: build test for linux, solaris, windows and mac.

Thanks for fixing. os changes look good to me.

/Per

> 
> FYI, as I think the existing numa related API names and the -1 convention 
> are not good, I had planned to refine those after pushing. But as you said, 
> following the existing convention now and refining everything together later seems better.
> 
> Thanks,
> Sangheon
> 
> 
> On 10/23/19 1:21 AM, Per Liden wrote:
>> Hi Sangheon,
>>
>> I noticed that this patch adds os::numa_get_address_id(). That name is 
>> misleading as it doesn't return an "address id", but a "numa node id". 
>> However, the terminology used in the os class for numa node is "group" 
>> (for example, numa_get_groups_num, numa_get_group_id, etc). So I'd 
>> suggest we instead name this os::numa_get_group_id(void* address), 
>> i.e. an overload of os::numa_get_group_id().
>>
>> Btw, I think that the numa related names used in the os class are odd, 
>> but I guess they were brought over from Solaris. We can refine those at 
>> some later time if we want, but for now I think we should follow the 
>> naming convention that we have there.
>>
>> Also, I don't think this function should print warnings, as that's up 
>> to the caller to decide what to do, what to print, etc.
>>
>> Furthermore, I suggest we remove os::InvalidNUMAId. Other numa 
>> functions in the os class return -1 on error, so I think we should do 
>> that here too.
>>
>> Here's a patch with the proposed changes:
>>
>>
>> diff --git a/src/hotspot/os/linux/os_linux.cpp b/src/hotspot/os/linux/os_linux.cpp
>> --- a/src/hotspot/os/linux/os_linux.cpp
>> +++ b/src/hotspot/os/linux/os_linux.cpp
>> @@ -3007,7 +3007,7 @@
>>    return 0;
>>  }
>>
>> -int os::numa_get_address_id(void* address) {
>> +int os::numa_get_group_id(void* address) {
>>  #ifndef MPOL_F_NODE
>>  #define MPOL_F_NODE     (1<<0)  // Return next IL mode instead of node mask
>>  #endif
>> @@ -3016,11 +3016,10 @@
>>  #define MPOL_F_ADDR     (1<<1)  // Look up VMA using address
>>  #endif
>>
>> -  int id = InvalidNUMAId;
>> +  int id = 0;
>>
>>    if (syscall(SYS_get_mempolicy, &id, NULL, 0, address, MPOL_F_NODE | MPOL_F_ADDR) == -1) {
>> -    warning("Failed to get numa id at " PTR_FORMAT " with errno=%d", p2i(address), errno);
>> -    return InvalidNUMAId;
>> +    return -1;
>>    }
>>    return id;
>>  }
>> diff --git a/src/hotspot/share/gc/g1/g1NUMA.cpp b/src/hotspot/share/gc/g1/g1NUMA.cpp
>> --- a/src/hotspot/share/gc/g1/g1NUMA.cpp
>> +++ b/src/hotspot/share/gc/g1/g1NUMA.cpp
>> @@ -164,7 +164,7 @@
>>
>>  uint G1NUMA::index_of_address(HeapWord *address) const {
>>    int numa_id = os::numa_get_address_id((void*)address);
>> -  if (numa_id == os::InvalidNUMAId) {
>> +  if (numa_id == -1) {
>>      return UnknownNodeIndex;
>>    } else {
>>      return index_of_node_id(numa_id);
>> @@ -201,7 +201,7 @@
>>    if (!is_enabled()) {
>>      return;
>>    }
>> -
>> +
>>    if (size_in_bytes == 0) {
>>      return;
>>    }
>> diff --git a/src/hotspot/share/runtime/os.hpp b/src/hotspot/share/runtime/os.hpp
>> --- a/src/hotspot/share/runtime/os.hpp
>> +++ b/src/hotspot/share/runtime/os.hpp
>> @@ -374,10 +374,7 @@
>>    static size_t numa_get_leaf_groups(int *ids, size_t size);
>>    static bool   numa_topology_changed();
>>    static int    numa_get_group_id();
>> -
>> -  static const int InvalidNUMAId = -1;
>> -
>> -  static int numa_get_address_id(void* address);
>> +  static int    numa_get_group_id(void* address);
>>
>>    // Page manipulation
>>    struct page_info {
>>
>>
>> cheers,
>> Per
>>
>>
>> On 10/16/19 7:54 PM, sangheon.kim at oracle.com wrote:
>>> Hi Kim, Stefan and Thomas,
>>>
>>> Many thanks for the reviews and suggestions!
>>>
>>> Kim,
>>> I will move page_size() near page_start() before push as you suggested.
>>> As you know, all 3 patches will be pushed together though.
>>>
>>> Thanks,
>>> Sangheon
>>>
>>>
>>> On 10/16/19 7:00 AM, Kim Barrett wrote:
>>>>> On Oct 15, 2019, at 10:33 AM, sangheon.kim at oracle.com wrote:
>>>>>
>>>>> Hi all,
>>>>>
>>>>> Here's revised webrev which addresses:
>>>>> 1) G1RegionToSpaceMapper checks mtJavaHeap and then conditionally 
>>>>> calls G1NUMA::request_memory_on_node() (Kim)
>>>>> 2) The signature of G1NUMA::request_memory_on_node(void* address, 
>>>>> ,) is changed to have actual address instead of page index. (Stefan)
>>>>> 3) Some local variable name changes at G1RegionToSpaceMapper. i -> 
>>>>> region_idx, idx -> page_idx (for local style, used idx instead of 
>>>>> index)
>>>>>
>>>>> webrev:
>>>>> http://cr.openjdk.java.net/~sangheki/8220310/webrev.5/
>>>>> http://cr.openjdk.java.net/~sangheki/8220310/webrev.5.inc/
>>>>> Testing: hs-tier 1 ~ 5, with/without UseNUMA
>>>> Looks good.
>>>>
>>>> In g1PageBasedVirtualSpace.cpp, could the newly added definition of 
>>>> page_size()
>>>> be moved to be near the existing definition of page_start()? I don't 
>>>> need a new
>>>> webrev if you move it.
>>>>
>>>
> 
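
For reference, the convention proposed in the patch quoted above (return -1 on failure, print nothing, let the caller decide) can be sketched outside HotSpot roughly as follows. This is Linux-only and illustrative: numa_group_id_of() and index_of_address_sketch() are hypothetical free functions, UnknownNodeIndexSketch is a made-up constant, and the direct cast of the node id to an index is a simplification of what G1NUMA::index_of_node_id() would do.

```cpp
#include <cassert>
#include <sys/syscall.h>
#include <unistd.h>

#ifndef MPOL_F_NODE
#define MPOL_F_NODE (1 << 0)  // Return next IL mode instead of node mask
#endif
#ifndef MPOL_F_ADDR
#define MPOL_F_ADDR (1 << 1)  // Look up VMA using address
#endif

// Returns the NUMA node ("group") id backing the given address, or -1 on
// error. No warning is printed here; reporting is left to the caller.
int numa_group_id_of(void* address) {
  int id = 0;
  long rc = syscall(SYS_get_mempolicy, &id, static_cast<unsigned long*>(nullptr),
                    0UL, address,
                    static_cast<unsigned long>(MPOL_F_NODE | MPOL_F_ADDR));
  if (rc == -1) {
    return -1;
  }
  return id;
}

// Hypothetical "unknown" marker for the caller side.
const unsigned UnknownNodeIndexSketch = ~0u;

// Caller-side handling: map the -1 error value to an "unknown" index.
// A real implementation would translate node ids to indices via a table.
unsigned index_of_address_sketch(void* address) {
  int id = numa_group_id_of(address);
  return id == -1 ? UnknownNodeIndexSketch : static_cast<unsigned>(id);
}
```

Note that the call can legitimately fail (for example when the syscall is filtered in a container), which is one more reason to keep the warning decision with the caller.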


From stefan.johansson at oracle.com  Thu Oct 24 12:16:03 2019
From: stefan.johansson at oracle.com (Stefan Johansson)
Date: Thu, 24 Oct 2019 14:16:03 +0200
Subject: [PATCH] Exploit Empty Regions in Young Gen to Enhance PS Full GC
 Performance
In-Reply-To: <CAKSDcxsm3-6u0arR4KCRGF=R-1sD9XJAS3Fb98NxzcPASEpGwg@mail.gmail.com>
References: <CAKSDcxsPcvigLaDTRyJALk9O0r-JsYV1CcwFzb3KaM98+LVLcg@mail.gmail.com>
 <CAKSDcxvTzfq1eR0DXi8iLQ4bNP8LcjsV5Px59-EMOgR0zaHvtQ@mail.gmail.com>
 <4F02DD53-EA98-4A1A-B871-C6E9D9610B2C@oracle.com>
 <9B69AFD1-7AE2-4B50-BFCF-C9C6E2594240@oracle.com>
 <CAKSDcxs131XdbHHErgbH1UDYX6_+=CSa-4dt4n5LgbyPVgjO_w@mail.gmail.com>
 <E4274448-26B2-46B8-883C-70FC7AFFB23B@oracle.com>
 <CAKSDcxvHwOWtAEm8TPPvO=C8q9LRwJRUQX6nbh-HSZXBKKwvYg@mail.gmail.com>
 <fb385d33-c420-4c81-320c-9aa1dad64a44@oracle.com>
 <CAKSDcxv5eccaV54NThgviLK+84U-Z62U0CYr3s+8ncNR1cna5w@mail.gmail.com>
 <8ef5b52e-d6fc-3073-5ca7-44c87c1eb981@oracle.com>
 <CAKSDcxsQWJ1tpnsc8UnN3E=XJfVwHEdE9WSd_=6KR_tLSHW6rQ@mail.gmail.com>
 <92277aab-0578-9e2c-3f4f-55ae1e8c94a9@oracle.com>
 <CAKSDcxu_t+Ka0LUU3WSzdR52-_+rsvKo_Vxn8av=duFmS4EPyA@mail.gmail.com>
 <400df998-171a-5bbe-9f3e-01af1781afb4@oracle.com>
 <e72f06af-8847-844b-107c-afd15e01f71b@oracle.com>
 <CAKSDcxsm3-6u0arR4KCRGF=R-1sD9XJAS3Fb98NxzcPASEpGwg@mail.gmail.com>
Message-ID: <a1eeabca-f70e-f01a-9459-12bf913688d4@oracle.com>

Hi Haoyu,

On 2019-10-23 17:15, Haoyu Li wrote:
> Hi Stefan,
> 
> Thanks for your constructive feedback. I've addressed all the issues you 
> mentioned, and the updated patch is attached in this email.
Nice, I will look at the patch next week, but I'll shortly answer your 
questions right away.

> 
> While refining the patch, I came across a couple of questions:
> 1) Now the MoveAndUpdateClosure and ShadowClosure assume the destination 
> address is the very beginning of a region, instead of an arbitrary 
> address like what it used to be. However, there is an unused function 
> named PSParallelCompact::move_and_update() uses the MoveAndUpdateClosure 
> to process a region from its middle, which conflicts with the 
> assumption. I notice that you removed this function in your patch, and 
> so did I in the updated patch. Does it matter?
Yes, I found this function during my code review and it should be 
removed, but I think that should be handled as a separate issue. We can 
do this removal before this patch goes in.

> 2) Using the same do_addr() in MoveAndUpdateClosure and ShadowClosure is 
> doable, but it does not reuse all the code neatly. Because storing the 
> address of the shadow region in _destination requires extra virtual 
> functions to handle allocating blocks in the start_array and setting 
> addresses of deferred objects. In particular, allocate_blocks() and 
> set_deferred_object_for() in both closures are added. Is it worth 
> avoiding the use of _offset to calculate the shadow_destination?
Ok, sounds like it might be better to have specific do_addr() functions 
then. I'll think some more around this when reviewing the new patch in 
depth.

> 
> If there are any problems with this patch, please contact me anytime. 
> I'm more than happy to keep improving the code. Thanks again for reviewing.
>
Sounds good, thanks,
Stefan
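
To make the state naming idea from my earlier review (quoted further down) a bit more concrete, here is a rough sketch. It assumes each transition is a single compare-and-swap on a per-region state word; RegionDataSketch and may_complete() are hypothetical names, and the state and function names are the placeholders from the review, not final.

```cpp
#include <atomic>
#include <cassert>

// Placeholder state names from the review:
//   Normal path: UNUSED -> try_push()  -> NORMAL
//   Steal path:  UNUSED -> try_steal() -> STOLEN -> mark_filled() -> FILLED
//                       -> try_copy()  -> SHADOW
enum RegionState { UNUSED, NORMAL, STOLEN, FILLED, SHADOW };

class RegionDataSketch {
  std::atomic<int> _state{UNUSED};

  // Succeeds for exactly one caller that observes the expected 'from' state.
  bool transition(int from, int to) {
    return _state.compare_exchange_strong(from, to);
  }

public:
  bool try_push()    { return transition(UNUSED, NORMAL); }
  bool try_steal()   { return transition(UNUSED, STOLEN); }
  bool mark_filled() { return transition(STOLEN, FILLED); }
  bool try_copy()    { return transition(FILLED, SHADOW); }

  // What an assert in set_completed() could check.
  bool may_complete() const {
    int s = _state.load();
    return s == NORMAL || s == SHADOW;
  }
};
```

With named states, a region that was pushed normally can never be stolen afterwards and vice versa, and set_completed() can assert that it only sees one of the two end states.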

> Best,
> Haoyu Li
> 
> 
> Stefan Johansson <stefan.johansson at oracle.com 
> <mailto:stefan.johansson at oracle.com>> wrote on 2019-10-22 at 9:42:
> 
>     Hi Haoyu,
> 
>     I've reviewed the patch now and have some comments and questions.
> 
>     To simplify the review and have a common base to look at I've created a
>     webrev at:
>     http://cr.openjdk.java.net/~sjohanss/8220465/00/
> 
>     One general note first, most of the new code uses four space
>     indentation, in hotspot the standard is two spaces, please change this.
>     Below are some file by file comments.
> 
>     src/hotspot/share/gc/parallel/psCompactionManager.cpp
>     ---
>       53 GrowableArray<size_t >* ParCompactionManager::_free_shadow = new
>     (ResourceObj::C_HEAP, mtInternal) GrowableArray<size_t >(10, true);
>       54 Monitor*                ParCompactionManager::_monitor = NULL;
> 
>     Set _free_shadow to NULL here like the other statics and then create
>     the
>     GrowableArray in initialize(). I also think _shadow_region_array or
>     something like that would be a better name and the monitor should also
>     be named something that signals that it is used for this array.
>     ---
>       70   if (_monitor == NULL) {
>       71       _monitor = new Monitor(Mutex::barrier, "CompactionManager monitor",
>       72                              Mutex::_allow_vm_block_flag, Monitor::_safepoint_check_never);
>       73   }
> 
>     Instead of doing the monitor creation here having to check for NULL, do
>     it in initialize() below together with the array creation.
>     ---
> 
>     src/hotspot/share/gc/parallel/psParallelCompact.cpp
>     ---
>     2974       if (cur->push()) {
> 
>     Correct me if I'm wrong, if this call to push() returns true it means
>     that nobody else has "stolen" it (used a shadow region to prepare it)
>     and we mark it as pushed. But when pushed in this code path this is the
>     end state for this RegionData? If this is the case I think it would be
>     easier to understand the code if we added another function and state
>     for
>     when we "steal" it. Haven't thought very much about the names but I
>     think you understand what I want to achieve:
>     Normal path:
>     UNUSED -> push() -> NORMAL
>     Steal path:
>     UNUSED -> steal() -> STOLEN -> fill() -> FILLED -> copy() -> SHADOW
> 
>     We could then also assert in set_completed() that the state is either
>     NORMAL or SHADOW (or if they have a shared end state DONE). As I said
>     the names can be improved (both for the states and the functions) but I
>     think we should have names and not just numbers.
>     ---
> 
>     3060 template <class T>
>     3061 void PSParallelCompact::fill_region(ParCompactionManager* cm,
>     size_t region_idx, size_t shadow, size_t offset)
> 
>     As I told you this was a big improvement from the first patch, but I
>     think there is room for even more improvements around the way we
>     pass in
>     ignored parameters to MoveAndUpdateClosure. Explaining my idea in text
>     is harder than code, so I created a patch, what do you think about this?
>     http://cr.openjdk.java.net/~sjohanss/8220465/00-alt/
> 
>     This alternative is based on 00 and does not take my other comments
>     into
>     consideration. So it might have to be altered a bit if you address some
>     of my other comments/questions.
>     ---
> 
>     3196 void PSParallelCompact::copy_back(HeapWord *region_addr, HeapWord
>     *shadow_addr) {
> 
>     I think the parameters should swap places, so that they correspond with
>     the copy below.
>     ---
> 
>     3200 bool PSParallelCompact::steal_shadow_region(ParCompactionManager*
>     cm, size_t &region_idx) {
>     3201     size_t& record = cm->shadow_record();
> 
>     Did you consider to just let shadow_record() be a simple getter instead
>     of getting a reference and then have a next_shadow_record() which
>     advances it by active_workers?
>     ---
> 
>     3236 void PSParallelCompact::initialize_steal_record(uint which) {
> 
>     I'm having a hard time understanding the details here, or I get that
>     all
>     threads should have a separate shadow record, but I'm not sure why
>     it is
>     not enough to just do:
>     size_t record = _summary_data.addr_to_region_idx(
>        _space_info[old_space_id].dense_prefix());
>     cm->set_shadow_record(record + which);
> 
>     As you can see I'm also suggesting adding a setter for shadow_record.
>     ---
> 
>     3434 ParMarkBitMapClosure::IterationStatus
>     3435 ShadowClosure::do_addr(HeapWord* addr, size_t words) {
>     3436     HeapWord* shadow_destination = destination() + _offset;
> 
>     Using an offset instead of a given address feels a bit backwards, did
>     you consider letting the closure keep and update a _shadow_destination
>     instead? Or would it even be possible to just set destination to be the
>     shadow region address? In that case it should be possible to just use
>     the do_addr and other functions from the MoveAndUpdateClosure.
> 
>     I see from looking at this particular function that there is one assert
>     that would have to change:
>     3408     assert(PSParallelCompact::summary_data().calc_new_pointer(source(), compaction_manager()) ==
>     3409            destination(), "wrong destination");
> 
>     This should be easily fixed by adding a virtual function
>     check_destination, that has a special implementation for the
>     ShadowClosure.
>     ---
> 
>     src/hotspot/share/gc/parallel/psParallelCompact.hpp
>     ---
>      333     // Preempt the region to avoid double processes
>      334     inline bool push();
>      335     // Mark the region as filled and ready to be copied back
>      336     inline bool fill();
>      337     // Preempt the region to copy the shadow region content back
>      338     inline bool copy();
> 
>     As mentioned, I think there might be better names for those functions
>     and the comments. Maybe adding a prefix would make the code more self
>     explaining. try_push(), mark_filled(), try_copy() and the new
>     try_steal().
>     ---
> 
>     Thanks again for providing this patch, I look forward to seeing an
>     updated version.
> 
>     Cheers,
>     Stefan
> 
> 
>     On 2019-10-14 15:00, Stefan Johansson wrote:
>      > Thanks for the quick update Haoyu,
>      >
>      > This is a great improvement and I will try to find time to look
>     into the
>      > patch in more detail the coming weeks.
>      >
>      > Thanks,
>      > Stefan
>      >
>      > On 2019-10-11 14:49, Haoyu Li wrote:
>      >> Hi Stefan,
>      >>
>      >> Thanks for your suggestion! It is indeed redundant that
>      >> PSParallelCompact::fill_shadow_region() copies most of its code from
>      >> PSParallelCompact::fill_region(), so I've refactored these two
>      >> functions to share as much code as possible. The updated patch is
>      >> attached.
>      >>
>      >> Specifically, the closure, which moves objects, in
>      >> PSParallelCompact::fill_region() is now declared as a template of
>      >> either MoveAndUpdateClosure or ShadowClosure. So by controlling the
>      >> type of closure when invoking the function, we can decide whether to
>      >> fill a normal region or a shadow one. Thus, almost all code in
>      >> PSParallelCompact::fill_region() can be reused.
>      >>
>      >> Besides, a virtual function named complete_region() is added in both
>      >> closures to do some work after the filling, such as setting states and
>      >> copying the shadow region back.
>      >>
>      >> Thanks again for reviewing the patch, looking forward to your
>     insights
>      >> and suggestions!
>      >>
>      >> Best Regards,
>      >> Haoyu Li
>      >>
>      >> 2019-10-10 21:50 GMT+08:00, Stefan Johansson
>      >> <stefan.johansson at oracle.com <mailto:stefan.johansson at oracle.com>>:
>      >>> Thanks for the clarification =)
>      >>>
>      >>> Moving on to the next part, the code in the patch. So this
>     won't be a
>      >>> full review of the patch but just an initial comment that I
>     would like
>      >>> to be addressed first.
>      >>>
>      >>> The new function PSParallelCompact::fill_shadow_region() is
>     more or less
>      >>> a copy of PSParallelCompact::fill_region() and I understand
>     that from a
>      >>> proof of concept point of view it was the easy (and right) way
>     to do it.
>      >>> I would prefer if the code could be refactored so that
>     fill_region() and
>      >>> fill_shadow_region() share more code. There might be reasons
>     that I've
>      >>> missed, that prevents it, but we should at least explore how
>     much code
>      >>> can be shared.
>      >>>
>      >>> Thanks,
>      >>> Stefan
>      >>>
>      >>> On 2019-10-10 15:10, Haoyu Li wrote:
>      >>>> Hi Stefan,
>      >>>>
>      >>>> Thanks for your quick response! As to your concern about the OCA,
>      >>>> I am the sole author of the patch, and it is the case as the
>      >>>> agreement states.
>      >>>> Best Regards,
>      >>>> Haoyu Li,
>      >>>>
>      >>>>
>      >>>> Stefan Johansson <stefan.johansson at oracle.com> wrote on Thu, Oct 10, 2019 at 8:37 PM:
>      >>>>
>      >>>> ???? Hi,
>      >>>>
>      >>>> ???? On 2019-10-10 13:06, Haoyu Li wrote:
>      >>>> ????? > Hi Stefan,
>      >>>> ????? >
>      >>>>      > Thanks for your testing! One possible reason for the regressions
>      >>>>      > in simple tests is that the region dependencies may not be heavy
>      >>>>      > enough. Because the locality of shadow regions is lower than that
>      >>>>      > of heap regions, writing to shadow regions will be slower than to
>      >>>>      > normal regions, and this is part of the reason why I reuse shadow
>      >>>>      > regions. Therefore, if only a few shadow regions are created and
>      >>>>      > not reused, the overhead may not be amortized.
>      >>>>
>      >>>>     I guess it is something like this. I thought that for "easy" heaps
>      >>>>     the shadow regions won't be used at all, and should therefore not
>      >>>>     really cost anything.
>      >>>>
>      >>>> ????? >
>      >>>>      > As to the OCA, it is the case that I'm the only person signing
>      >>>>      > the agreement. Please let me know if you have any further
>      >>>>      > questions. Thanks again!
>      >>>>
>      >>>>     Ok, so you are the sole author of the patch. The important part,
>      >>>>     as the agreement states, is:
>      >>>>     "no other person or entity, including my employer, has or will
>      >>>>     have rights with respect my contributions"
>      >>>>
>      >>>> ???? Is that the case?
>      >>>>
>      >>>> ???? Thanks,
>      >>>> ???? Stefan
>      >>>>
>      >>>> ????? >
>      >>>>      > Best Regards,
>      >>>> ????? > Haoyu Li
>      >>>> ????? >
>      >>>>      > Stefan Johansson <stefan.johansson at oracle.com> wrote on Tue, Oct 8, 2019 at 6:49 PM:
>      >>>> ????? >
>      >>>> ????? >???? Hi Haoyu,
>      >>>> ????? >
>      >>>>      >     I've done some more testing and I haven't seen any issues with
>      >>>>      >     the patch so far and the performance looks promising in most
>      >>>>      >     cases. For simple tests I've seen some regressions, but I'm not
>      >>>>      >     really sure why. Will do some more digging.
>      >>>>      >
>      >>>>      >     To move forward with this the first thing we need to do is
>      >>>>      >     making sure that you being covered by the Oracle Contributor
>      >>>>      >     Agreement is enough. From what we can see it is only you as an
>      >>>>      >     individual that has signed the OCA and in that case it is
>      >>>>      >     important that this statement from the OCA is fulfilled: "no
>      >>>>      >     other person or entity, including my employer, has or will have
>      >>>>      >     rights with respect my contributions"
>      >>>>      >
>      >>>>      >     Is this the case for this contribution or should we have the
>      >>>>      >     university sign the OCA as well? For more information regarding
>      >>>>      >     the OCA please refer to:
>      >>>>      >     https://www.oracle.com/technetwork/oca-faq-405384.pdf
>      >>>> ????? >
>      >>>> ????? >???? Thanks,
>      >>>> ????? >???? Stefan
>      >>>> ????? >
>      >>>> ????? >???? On 2019-09-16 16:02, Haoyu Li wrote:
>      >>>>      >      > FYI, the evaluation results on OpenJDK 14 are plotted in the
>      >>>>      >      > attachment. I compute the full GC throughput by dividing the
>      >>>>      >      > heap size before full GC by the GC pause time, and the results
>      >>>>      >      > are arithmetic mean values of ten runs after a warm-up run. The
>      >>>>      >      > evaluation is conducted on a machine with dual Intel Xeon
>      >>>>      >      > E5-2618L v3 CPUs (2 sockets, 16 physical cores with SMT enabled)
>      >>>>      >      > and 64G DRAM.
>      >>>>      >      >
>      >>>>      >      > Best Regards,
>      >>>> ????? >????? > Haoyu Li,
>      >>>> ????? >????? > Institute of Parallel and Distributed
>     Systems(IPADS),
>      >>>> ????? >????? > School of Software,
>      >>>> ????? >????? > Shanghai Jiao Tong University
>      >>>> ????? >????? >
>      >>>> ????? >????? >
>      >>>>      >      > Stefan Johansson <stefan.johansson at oracle.com> wrote on Thu, Sep 12, 2019 at 5:34 PM:
>      >>>> ????? >????? >
>      >>>> ????? >????? >???? Hi Haoyu,
>      >>>> ????? >????? >
>      >>>>      >      >     I recently came across your patch and I would like to pick up
>      >>>>      >      >     on some of the things Kim mentioned in his mails. I especially
>      >>>>      >      >     want to evaluate and investigate if this is a technique we can
>      >>>>      >      >     use to improve the other GCs as well. To start that work I want
>      >>>>      >      >     to take the patch for a spin in our internal performance
>      >>>>      >      >     testing. The patch doesn't apply cleanly to the latest JDK
>      >>>>      >      >     repository, so if you could provide an updated patch that would
>      >>>>      >      >     be very helpful.
>      >>>>      >      >
>      >>>>      >      >     It would also be great if you could share some more information
>      >>>>      >      >     around the results presented in the paper. For example, it
>      >>>>      >      >     would be good to get the full command lines for the different
>      >>>>      >      >     benchmarks so we can run them locally and reproduce the results
>      >>>>      >      >     you've seen.
>      >>>> ????? >????? >
>      >>>> ????? >????? >???? Thanks,
>      >>>> ????? >????? >???? Stefan
>      >>>> ????? >????? >
>      >>>>      >      >>     On 12 Mar 2019, at 03:21, Haoyu Li <leihouyju at gmail.com> wrote:
>      >>>> ????? >????? >>
>      >>>> ????? >????? >>???? Hi Kim,
>      >>>> ????? >????? >>
>      >>>>      >      >>     Thanks for reviewing and testing the patch. If there are any
>      >>>>      >      >>     failures or performance degradation relevant to the work,
>      >>>>      >      >>     please let me know and I'll be very happy to keep improving it.
>      >>>>      >      >>     Also, any suggestions about code improvements are much
>      >>>>      >      >>     appreciated.
>      >>>>      >      >>
>      >>>>      >      >>     I'm not quite sure whether G1 and Shenandoah have a similar
>      >>>>      >      >>     region dependency issue, since I haven't studied their GC
>      >>>>      >      >>     behaviors before. If they do, I'm also willing to propose a
>      >>>>      >      >>     more general optimization.
>      >>>> ????? >????? >>
>      >>>>      >      >>     As to the memory overhead, I believe it will be low because
>      >>>>      >      >>     this patch exploits empty regions in the young space rather
>      >>>>      >      >>     than off-heap memory to allocate shadow regions, and also
>      >>>>      >      >>     reuses the /_source_region/ field of each /RegionData/ to
>      >>>>      >      >>     record the corresponding shadow region index. We only introduce
>      >>>>      >      >>     a new integer field /_shadow/ in the RegionData class to
>      >>>>      >      >>     indicate the status of a region, a global /GrowableArray
>      >>>>      >      >>     _free_shadow/ to store the indices of shadow regions, and a
>      >>>>      >      >>     global /Monitor/ to protect the array. This information might
>      >>>>      >      >>     help if the memory overhead needs to be evaluated.
>      >>>>      >      >>
>      >>>>      >      >>     Looking forward to your insight.
>      >>>>      >      >>
>      >>>>      >      >>     Best Regards,
>      >>>> ????? >????? >>???? Haoyu Li,
>      >>>> ????? >????? >>???? Institute of Parallel and Distributed
>      >>>> Systems(IPADS),
>      >>>> ????? >????? >>???? School of Software,
>      >>>> ????? >????? >>???? Shanghai Jiao Tong University
>      >>>> ????? >????? >>
>      >>>> ????? >????? >>
>      >>>>      >      >>     Kim Barrett <kim.barrett at oracle.com> wrote on Tue, Mar 12, 2019 at 6:11 AM:
>      >>>> ????? >????? >>
>      >>>>      >      >>         > On Mar 11, 2019, at 1:45 AM, Kim Barrett <kim.barrett at oracle.com> wrote:
>      >>>>      >      >>         >
>      >>>>      >      >>         >> On Jan 24, 2019, at 3:58 AM, Haoyu Li <leihouyju at gmail.com> wrote:
>      >>>> ????? >????? >>???????? >>
>      >>>> ????? >????? >>???????? >> Hi Kim,
>      >>>> ????? >????? >>???????? >>
>      >>>>      >      >>         >> I have ported my patch to OpenJDK 13 according to your
>      >>>>      >      >>         >> instructions in your last mail, and the patch is attached in
>      >>>>      >      >>         >> this mail. The patch does not change much since PSGC is
>      >>>>      >      >>         >> indeed pretty stable.
>      >>>>      >      >>         >>
>      >>>>      >      >>         >> Also, I evaluate the correctness and performance of PS full
>      >>>>      >      >>         >> GC with benchmarks from the DaCapo, SPECjvm2008, and JOlden
>      >>>>      >      >>         >> suites on a machine with dual Intel Xeon E5-2618L v3 CPUs
>      >>>>      >      >>         >> (16 physical cores), 64G DRAM and Linux kernel 4.17. The
>      >>>>      >      >>         >> evaluation result, indicating a 1.9X GC throughput
>      >>>>      >      >>         >> improvement on average, is attached, too.
>      >>>>      >      >>         >>
>      >>>>      >      >>         >> However, I have no idea how to further test this patch for
>      >>>>      >      >>         >> both correctness and performance. Can I please get any
>      >>>>      >      >>         >> guidance from you or some sponsor?
>      >>>> ????? >????? >>???????? >
>      >>>>      >      >>         > Sorry I missed that you had sent an updated version of the
>      >>>>      >      >>         > patch.
>      >>>>      >      >>         >
>      >>>>      >      >>         > I've run the full regression suite across Oracle-supported
>      >>>>      >      >>         > platforms. There are some failures, but there are almost
>      >>>>      >      >>         > always some failures in the later tiers right now. I'll
>      >>>>      >      >>         > start looking at them tomorrow to figure out whether any of
>      >>>>      >      >>         > them are relevant.
>      >>>>      >      >>         >
>      >>>>      >      >>         > I'm also planning to run some of our performance benchmarks.
>      >>>>      >      >>         >
>      >>>>      >      >>         > I've lightly skimmed the proposed changes. There might be
>      >>>>      >      >>         > some code improvements to be made.
>      >>>>      >      >>         >
>      >>>>      >      >>         > I'm also wondering if this technique applies to other
>      >>>>      >      >>         > collectors. It seems like both G1 and Shenandoah full gc's
>      >>>>      >      >>         > might have similar issues? If so, a solution that is
>      >>>>      >      >>         > ParallelGC-specific is less interesting than one that has
>      >>>>      >      >>         > broader applicability. Though maybe this optimization is
>      >>>>      >      >>         > less important for G1 and Shenandoah, since they actively
>      >>>>      >      >>         > try to avoid full gc's.
>      >>>>      >      >>         >
>      >>>>      >      >>         > I'm also not clear on how much additional memory might be
>      >>>>      >      >>         > temporarily allocated by this mechanism.
>      >>>>      >      >>
>      >>>>      >      >>         I've created a CR for this:
>      >>>>      >      >>         https://bugs.openjdk.java.net/browse/JDK-8220465
>      >>>> ????? >????? >>
>      >>>> ????? >????? >
>      >>>> ????? >
>      >>>>
>      >>>
>      >>
>      >>
> 

From per.liden at oracle.com  Thu Oct 24 15:47:38 2019
From: per.liden at oracle.com (Per Liden)
Date: Thu, 24 Oct 2019 17:47:38 +0200
Subject: RFR: 8224817: Implementation of JEP 364: ZGC on macOS
In-Reply-To: <adc7cecf-80ae-01e6-fcef-f8588fcb940a@oracle.com>
References: <adc7cecf-80ae-01e6-fcef-f8588fcb940a@oracle.com>
Message-ID: <fc142b1b-57f1-8b1d-3562-4719037c7bb1@oracle.com>

Hi,

On 10/24/19 12:38 PM, erik.osterlund at oracle.com wrote:
> Hi,
> 
> Now that some curling has been performed, paving way for this patch:
> 
>     8229027: Improve how JNIHandleBlock::oops_do distinguishes oops from non-oops
>     8229278: Improve hs_err location printing to assume less about GC internals
>     8229189: Improve JFR leak profiler tracing to deal with discontiguous heaps
>     8224815: Remove non-GC uses of CollectedHeap::is_in_reserved()
>     8224820: ZGC: Support discontiguous heap reservations
> 
> ...the remaining thing to do is plugging in a few platform specific ZGC 
> files. This patch does that.
> Decided to go with mach_vm_map/mach_vm_remap to implement multi-mapping. 
> Previously I didn't want to do that as I couldn't figure out how to 
> mach_vm_remap memory on top of reserved VA (acquired using mmap). But 
> apparently VM_FLAGS_OVERWRITE was the missing ingredient there. With 
> that in place, dodging the terrible ftruncate implementation on macOS 
> seemed like a good idea. That also implies this port supports large 
> pages (unlike other GCs on macOS today). Yay!
> 
> CR:
> http://cr.openjdk.java.net/~eosterlund/8224817/webrev.00/

As I've pre-reviewed this code, all my comments have already been 
addressed. Looks super!

/Per

> 
> Bug:
> https://bugs.openjdk.java.net/browse/JDK-8229358
> 
> Thanks,
> /Erik


From thomas.schatzl at oracle.com  Thu Oct 24 15:57:12 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Thu, 24 Oct 2019 17:57:12 +0200
Subject: RFR: 8224817: Implementation of JEP 364: ZGC on macOS
In-Reply-To: <fc142b1b-57f1-8b1d-3562-4719037c7bb1@oracle.com>
References: <adc7cecf-80ae-01e6-fcef-f8588fcb940a@oracle.com>
 <fc142b1b-57f1-8b1d-3562-4719037c7bb1@oracle.com>
Message-ID: <788a7929-28bc-23c3-d89e-0ced7286fc82@oracle.com>

Hi,

On 24.10.19 17:47, Per Liden wrote:
> Hi,
> 
> On 10/24/19 12:38 PM, erik.osterlund at oracle.com wrote:
>> Hi,
>>
>> Now that some curling has been performed, paving way for this patch:
>>
>>     8229027: Improve how JNIHandleBlock::oops_do distinguishes oops from non-oops
>>     8229278: Improve hs_err location printing to assume less about GC internals
>>     8229189: Improve JFR leak profiler tracing to deal with discontiguous heaps
>>     8224815: Remove non-GC uses of CollectedHeap::is_in_reserved()
>>     8224820: ZGC: Support discontiguous heap reservations
>>
>> ...the remaining thing to do is plugging in a few platform specific 
>> ZGC files. This patch does that.
>> Decided to go with mach_vm_map/mach_vm_remap to implement 
>> multi-mapping. Previously I didn't want to do that as I couldn't 
>> figure out how to mach_vm_remap memory on top of reserved VA (acquired 
>> using mmap). But apparently VM_FLAGS_OVERWRITE was the missing 
>> ingredient there. With that in place, dodging the terrible ftruncate 
>> implementation on macOS seemed like a good idea. That also implies 
>> this port supports large pages (unlike other GCs on macOS today). Yay!

Not completely related and not a review:

Please file an RFE with a link to this mechanism. It would be nice to do 
such changes in a generic way so that all collectors benefit in the 
future, not just one.

It's confusing enough as it already is with one collector supporting this 
and other collectors supporting that; adding to that is not nice.

Thanks,
   Thomas


From erik.osterlund at oracle.com  Thu Oct 24 16:27:21 2019
From: erik.osterlund at oracle.com (Erik Osterlund)
Date: Thu, 24 Oct 2019 18:27:21 +0200
Subject: RFR: 8224817: Implementation of JEP 364: ZGC on macOS
In-Reply-To: <fc142b1b-57f1-8b1d-3562-4719037c7bb1@oracle.com>
References: <fc142b1b-57f1-8b1d-3562-4719037c7bb1@oracle.com>
Message-ID: <C8D49EAA-0955-4B4E-9E7F-1400A68C4164@oracle.com>

Hi Per,

Thanks for the review.

/Erik

> On 24 Oct 2019, at 17:47, Per Liden <per.liden at oracle.com> wrote:
> 
> Hi,
> 
>> On 10/24/19 12:38 PM, erik.osterlund at oracle.com wrote:
>> Hi,
>> Now that some curling has been performed, paving way for this patch:
>>     8229027: Improve how JNIHandleBlock::oops_do distinguishes oops from non-oops
>>     8229278: Improve hs_err location printing to assume less about GC internals
>>     8229189: Improve JFR leak profiler tracing to deal with discontiguous heaps
>>     8224815: Remove non-GC uses of CollectedHeap::is_in_reserved()
>>     8224820: ZGC: Support discontiguous heap reservations
>> ...the remaining thing to do is plugging in a few platform specific ZGC files. This patch does that.
>> Decided to go with mach_vm_map/mach_vm_remap to implement multi-mapping. Previously I didn't want to do that as I couldn't figure out how to mach_vm_remap memory on top of reserved VA (acquired using mmap). But apparently VM_FLAGS_OVERWRITE was the missing ingredient there. With that in place, dodging the terrible ftruncate implementation on macOS seemed like a good idea. That also implies this port supports large pages (unlike other GCs on macOS today). Yay!
>> CR:
>> http://cr.openjdk.java.net/~eosterlund/8224817/webrev.00/
> 
> As I've pre-reviewed this code, all my comments have already been addressed. Looks super!
> 
> /Per
> 
>> Bug:
>> https://bugs.openjdk.java.net/browse/JDK-8229358
>> Thanks,
>> /Erik


From stefan.karlsson at oracle.com  Thu Oct 24 16:36:42 2019
From: stefan.karlsson at oracle.com (Stefan Karlsson)
Date: Thu, 24 Oct 2019 18:36:42 +0200
Subject: RFR: 8232604: ZGC: Make ZVerifyViews mapping and unmapping precise
Message-ID: <813a25f8-4540-859d-502d-8ffdfc503f72@oracle.com>

Hi all,

Please review this patch to make the ZVerifyViews mapping and unmapping 
precise.

https://cr.openjdk.java.net/~stefank/8232604/webrev.01/
https://bugs.openjdk.java.net/browse/JDK-8232604

Today, when the ZVerifyViews flag is turned on, we unmap all bad views. 
The intention is to catch stray-pointer bugs.

The current implementation takes a short-cut and unmaps all memory en 
masse. This works on Linux, but not on Windows, where we must be 
precise in what we unmap.

There are three places where allocated pages are registered today:
- In the page table - actively used
- In the page cache - free pages waiting to be used
- In-flight from the alloc queue

The proposed patch registers all satisfied alloc requests, lets the 
requesting threads deregister the satisfied request when the page is 
received, and makes sure that the GC visits all in-flight satisfied 
alloc requests when it performs the ZVerifyViews flip.
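
The register/deregister/visit protocol described above can be sketched roughly as follows. This is a minimal standalone illustration, not the code in the webrev: Page, SatisfiedRegistry, and all member names are hypothetical stand-ins, and a std::mutex stands in for whatever locking the real page allocator uses.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <unordered_set>

// Hypothetical stand-in for an allocated page; not the real ZPage.
struct Page {
  int id;
};

// Hypothetical registry of pages that belong to satisfied alloc requests
// which the requesting thread has not yet received.
class SatisfiedRegistry {
  mutable std::mutex _lock;
  std::unordered_set<const Page*> _in_flight;

public:
  // Allocator side: called when a queued request has been satisfied.
  void register_satisfied(const Page* page) {
    std::lock_guard<std::mutex> guard(_lock);
    _in_flight.insert(page);
  }

  // Requesting thread: called once the page has been received.
  void deregister_satisfied(const Page* page) {
    std::lock_guard<std::mutex> guard(_lock);
    _in_flight.erase(page);
  }

  // GC side: visit every in-flight page, e.g. while flipping views.
  // Returns the number of pages visited.
  template <typename Visitor>
  std::size_t pages_do(Visitor visit) const {
    std::lock_guard<std::mutex> guard(_lock);
    for (const Page* page : _in_flight) {
      visit(page);
    }
    return _in_flight.size();
  }
};
```

With something like this, the flip would walk the page table, the page cache, and the registry, so a page cannot be missed just because it is between the allocator and its requester.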

Thanks,
StefanK


From erik.osterlund at oracle.com  Thu Oct 24 16:39:34 2019
From: erik.osterlund at oracle.com (Erik Osterlund)
Date: Thu, 24 Oct 2019 18:39:34 +0200
Subject: RFR: 8224817: Implementation of JEP 364: ZGC on macOS
In-Reply-To: <788a7929-28bc-23c3-d89e-0ced7286fc82@oracle.com>
References: <788a7929-28bc-23c3-d89e-0ced7286fc82@oracle.com>
Message-ID: <27583414-08E0-4AEE-A779-F6F8CE2C0F0B@oracle.com>

Hi Thomas,

Sure, I can file an RFE. For anonymous mmapped memory, a seemingly undocumented feature is that you can pass superpage flags to the Mach VM system via the file descriptor parameter. Anyway, I will detail it in the RFE.

/Erik

> On 24 Oct 2019, at 17:57, Thomas Schatzl <thomas.schatzl at oracle.com> wrote:
> 
> Hi,
> 
>> On 24.10.19 17:47, Per Liden wrote:
>> Hi,
>>> On 10/24/19 12:38 PM, erik.osterlund at oracle.com wrote:
>>> Hi,
>>> 
>>> Now that some curling has been performed, paving way for this patch:
>>> 
>>>      8229027: Improve how JNIHandleBlock::oops_do distinguishes oops from non-oops
>>>      8229278: Improve hs_err location printing to assume less about GC internals
>>>      8229189: Improve JFR leak profiler tracing to deal with discontiguous heaps
>>>      8224815: Remove non-GC uses of CollectedHeap::is_in_reserved()
>>>      8224820: ZGC: Support discontiguous heap reservations
>>> 
>>> ...the remaining thing to do is plugging in a few platform specific ZGC files. This patch does that.
>>> Decided to go with mach_vm_map/mach_vm_remap to implement multi-mapping. Previously I didn't want to do that as I couldn't figure out how to mach_vm_remap memory on top of reserved VA (acquired using mmap). But apparently VM_FLAGS_OVERWRITE was the missing ingredient there. With that in place, dodging the terrible ftruncate implementation on macOS seemed like a good idea. That also implies this port supports large pages (unlike other GCs on macOS today). Yay!
> 
> Not completely related and not a review:
> 
> Please file an RFE with a link to this mechanism. It would be nice to do such changes in a generic way so that all collectors benefit in the future, not just one.
> 
> It's confusing enough as it already is with one collector supporting this and other collectors supporting that; adding to that is not nice.
> 
> Thanks,
>  Thomas


From per.liden at oracle.com  Thu Oct 24 19:40:18 2019
From: per.liden at oracle.com (Per Liden)
Date: Thu, 24 Oct 2019 21:40:18 +0200
Subject: RFR: 8232604: ZGC: Make ZVerifyViews mapping and unmapping precise
In-Reply-To: <813a25f8-4540-859d-502d-8ffdfc503f72@oracle.com>
References: <813a25f8-4540-859d-502d-8ffdfc503f72@oracle.com>
Message-ID: <cf456042-2691-ea8d-09d7-e5947936e360@oracle.com>

Looks good! Just one minor nit:

   ZVerifyViewsFlip(ZPageAllocator* allocator);

could become:

   ZVerifyViewsFlip(const ZPageAllocator* allocator);

I don't need to see a new webrev.

cheers,
Per

On 10/24/19 6:36 PM, Stefan Karlsson wrote:
> Hi all,
> 
> Please review this patch to make the ZVerifyViews mapping and unmapping 
> precise.
> 
> https://cr.openjdk.java.net/~stefank/8232604/webrev.01/
> https://bugs.openjdk.java.net/browse/JDK-8232604
> 
> Today, when the ZVerifyViews flag is turned on, we unmap all bad views. 
> The intention is to catch stray-pointer bugs.
> 
> The current implementation takes a short-cut and unmaps all memory en 
> masse. This works on Linux, but not on Windows, where we must be 
> precise in what we unmap.
> 
> There are three places where allocated pages are registered today:
> - In the page table - actively used
> - In the page cache - free pages waiting to be used
> - In-flight from the alloc queue
> 
> The proposed patch registers all satisfied alloc requests, lets the 
> requesting threads deregister the satisfied request when the page is 
> received, and makes sure that the GC visits all in-flight satisfied 
> alloc requests when it performs the ZVerifyViews flip.
> 
> Thanks,
> StefanK
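
The bookkeeping described in the quoted patch summary (register a satisfied request, let the requesting thread deregister it once the page is received, and have the GC visit whatever is still in flight) can be sketched roughly as below. This is only an illustration under assumptions: Page, InFlightRegistry and all member names are hypothetical stand-ins, not the types or functions used in the webrev.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <unordered_set>

// Hypothetical stand-in for a heap page; not the real ZPage.
struct Page {
  bool mapped_in_bad_view = true;
};

// Hypothetical registry of allocation requests that have been satisfied
// but whose page has not yet been received by the requesting thread.
class InFlightRegistry {
  std::mutex _lock;
  std::unordered_set<Page*> _satisfied;

public:
  // Allocator side: a request was satisfied with this page.
  void register_satisfied(Page* page) {
    assert(page != nullptr);
    std::lock_guard<std::mutex> guard(_lock);
    _satisfied.insert(page);
  }

  // Requesting thread: the page has been received and is tracked elsewhere.
  void deregister(Page* page) {
    std::lock_guard<std::mutex> guard(_lock);
    _satisfied.erase(page);
  }

  // GC side: visit every page that is still in flight.
  // Returns the number of pages visited.
  template <typename Visitor>
  std::size_t visit_all(Visitor visit) {
    std::lock_guard<std::mutex> guard(_lock);
    for (Page* page : _satisfied) {
      visit(page);
    }
    return _satisfied.size();
  }
};
```

With a registry like this, a flip can be precise: it walks the page table, the page cache, and then visit_all() for the in-flight pages, rather than unmapping everything en masse.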


From stefan.karlsson at oracle.com  Thu Oct 24 19:41:31 2019
From: stefan.karlsson at oracle.com (Stefan Karlsson)
Date: Thu, 24 Oct 2019 21:41:31 +0200
Subject: RFR: 8232604: ZGC: Make ZVerifyViews mapping and unmapping precise
In-Reply-To: <cf456042-2691-ea8d-09d7-e5947936e360@oracle.com>
References: <813a25f8-4540-859d-502d-8ffdfc503f72@oracle.com>
 <cf456042-2691-ea8d-09d7-e5947936e360@oracle.com>
Message-ID: <84f91120-aa7c-1981-01ad-b7edb9cf643d@oracle.com>

OK. Thanks!

StefanK

On 2019-10-24 21:40, Per Liden wrote:
> Looks good! Just one minor nit:
>
>   ZVerifyViewsFlip(ZPageAllocator* allocator);
>
> could become:
>
>   ZVerifyViewsFlip(const ZPageAllocator* allocator);
>
> I don't need to see a new webrev.
>
> cheers,
> Per
>
> On 10/24/19 6:36 PM, Stefan Karlsson wrote:
>> Hi all,
>>
>> Please review this patch to make the ZVerifyViews mapping and 
>> unmapping precise.
>>
>> https://cr.openjdk.java.net/~stefank/8232604/webrev.01/
>> https://bugs.openjdk.java.net/browse/JDK-8232604
>>
>> Today, when the ZVerifyViews flag is turned on, we unmap all bad 
>> views. The intention is to catch stray-pointer bugs.
>>
>> The current implementation takes a short-cut and unmaps all memory en 
>> masse. This works on Linux, but not on Windows, where we must be 
>> precise in what we unmap.
>>
>> There are three places where allocated pages are registered today:
>> - In the page table - actively used
>> - In the page cache - free pages waiting to be used
>> - In-flight from the alloc queue
>>
>> The proposed patch registers all satisfied alloc requests, lets the 
>> requesting threads deregister the satisfied request when the page is 
>> received, and makes sure that the GC visits all in-flight satisfied 
>> alloc requests when it performs the ZVerifyViews flip.
>>
>> Thanks,
>> StefanK


From kim.barrett at oracle.com  Thu Oct 24 23:05:59 2019
From: kim.barrett at oracle.com (Kim Barrett)
Date: Thu, 24 Oct 2019 19:05:59 -0400
Subject: RFR(XL): 8220310: Implementation: NUMA-Aware Memory Allocation
 for G1, Mutator (1/3)
In-Reply-To: <521b3b8a-70e6-6fef-cb67-b6327fa08c03@oracle.com>
References: <e4c60a5c-cb08-004a-ce77-e4d20d4d6891@oracle.com>
 <06ACBF87-ADBE-499F-B668-0274E4925B26@oracle.com>
 <2b37edd6-3e0f-013d-1616-9d003f8ac1ed@oracle.com>
 <74ACAF31-8233-482A-892E-0D2E7CA72F4F@oracle.com>
 <4afe9f43-4cfa-9384-f45f-f985399629dd@oracle.com>
 <CD6BDB8C-7777-4872-BCC0-CDFB4978F876@oracle.com>
 <d8758f38-4818-e6b7-c158-118974f0ff1c@oracle.com>
 <CB368332-D86B-40D2-B152-43B726938DD2@oracle.com>
 <77f6c57a-65a6-2727-cbe9-fbc1ed52a015@oracle.com>
 <b3b70e9d-5be9-b069-b631-5733f157c9eb@oracle.com>
 <7C1985BF-A769-49FB-A658-E1B1060B5897@oracle.com>
 <3F549477-A2DF-42CF-A0E5-586F78BBCC47@oracle.com>
 <f348053c-ef9b-df03-ae17-393ace99182b@oracle.com>
 <AB6E68AA-F9AC-4E62-9CF5-1886C469A702@oracle.com>
 <9219a118-0c1d-2cee-10e5-f9bb87c72eb9@oracle.com>
 <f6c2bffa-b3e6-dda1-e453-5b01a7214c4d@oracle.com>
 <521b3b8a-70e6-6fef-cb67-b6327fa08c03@oracle.com>
Message-ID: <0A9D98F3-479D-421D-A5E0-0AB8BB203717@oracle.com>

> On Oct 23, 2019, at 12:20 PM, sangheon.kim at oracle.com wrote:
> 
> Hi Per,
> 
> Thanks for taking a look at this.
> 
> I agree with all your comments and here's the webrev.
> - All comments from Per.
> - Move G1PageBasedVirtualSpace::page_size() near to page_start() from Kim.
> 
> Webrev:
> http://cr.openjdk.java.net/~sangheki/8220310/webrev.6
> http://cr.openjdk.java.net/~sangheki/8220310/webrev.6.inc
> Testing: build test for linux, solaris, windows and mac.
> 
> FYI, since I think the existing NUMA-related API names and the -1 convention are not good, I had planned to refine them after pushing. But as you said, following the existing convention now and refining everything together later seems better.

The type of the argument for numa_get_group_id(void* address) should
be "const void*".  Sorry I didn't notice that earlier.  Of course,
this will require a const_cast to remove the const qualifier when
calling get_mempolicy, but it is better to isolate the workaround for
that missing qualifier to that one place.

I'm not sure I like the overload for os::numa_get_group_id.  While
both are getting the numa id associated with something, the associations
involved seem pretty different to me.

Spelling them out, they could be

numa_get_group_id_for_current_thread()
numa_get_group_id_for_address(const void* address)

Those seem semantically unrelated to me, so violate the usual guidance
of only overloading operations that are roughly equivalent (*).  Or put
another way, one should not need to determine which overload is selected
to understand a call site.

Of course, "roughly equivalent" is in the eye of the beholder.

(*) Operator overloading sometimes violates this on the basis that the
syntactic concision of using operators is more important, and there
are a limited set of operators.  Such violations are often used as an
argument against using operator overloading at all.
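
To make the two suggestions above concrete, here is a minimal sketch, assuming a legacy query function whose pointer parameter lacks a const qualifier. legacy_query_node() and current_thread_node() are hypothetical stand-ins that return fake ids; they are not get_mempolicy or any real os:: function. The point is only the shape: two distinctly named entry points, and a single const_cast confined to the one wrapper that needs it.

```cpp
#include <cassert>

// Hypothetical stand-in for a legacy C API whose parameter is missing a
// const qualifier. It performs no real NUMA query; it returns a fake id.
static int legacy_query_node(void* address) {
  return address != nullptr ? 0 : -1;
}

// Hypothetical stand-in for "which node is the current thread running on".
static int current_thread_node() {
  return 0;
}

// Distinct names instead of overloads: the call site says which
// association is being asked for.
int numa_get_group_id_for_current_thread() {
  return current_thread_node();
}

int numa_get_group_id_for_address(const void* address) {
  // The only place the missing qualifier is worked around. This is safe
  // only because the legacy function does not write through the pointer.
  return legacy_query_node(const_cast<void*>(address));
}
```

Callers can then pass pointers to const data without casting themselves, and a reader of the call site does not need overload resolution to know which query is made.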


From dean.long at oracle.com  Fri Oct 25 00:37:56 2019
From: dean.long at oracle.com (dean.long at oracle.com)
Date: Thu, 24 Oct 2019 17:37:56 -0700
Subject: RFR: 8231955: ARM32: Address displacement is 0 for volatile field
 access because of Unsafe field access.
In-Reply-To: <20191015073212.7FCCA319074@aojmv0009>
References: <20191010143426.BA4B6319F46@aojmv0009>
 <20191015073212.7FCCA319074@aojmv0009>
Message-ID: <f40cbf84-aef3-f235-4861-403ce30dc03d@oracle.com>

The shared code used to call generate_address(), which correctly handles 
various displacements, but I guess it got lost in the barrier 
refactoring in jdk11.  I think the correct fix is for the caller to use 
generate_address() again.  CCing GC alias. Alternatively, the arm code 
could call generate_address rather than changing the shared code.

dl

On 10/15/19 12:30 AM, christoph.goettschkes at microdoc.com wrote:
> Is there anyone who could take a look at this change and give feedback
> please?
>
> Thanks,
> Christoph
>
> "hotspot-compiler-dev" <hotspot-compiler-dev-bounces at openjdk.java.net>
> wrote on 2019-10-10 16:29:11:
>
>> From: christoph.goettschkes at microdoc.com
>> To: hotspot-compiler-dev at openjdk.java.net
>> Date: 2019-10-10 16:35
>> Subject: RFR: 8231955: ARM32: Address displacement is 0 for volatile
> field
>> access because of Unsafe field access.
>> Sent by: "hotspot-compiler-dev"
> <hotspot-compiler-dev-bounces at openjdk.java.net>
>> Hi,
>>
>> please review the following changeset. This patch fixes the volatile
> field
>> access for 32-bit ARM. The functions LIRGenerator::volatile_field_store
>> and LIRGenerator::volatile_field_load both assume that the displacement
>> for the given address is always 0. Both use the given address and pass
> the
>> values to add_large_constant() [1], which asserts that the given
>> displacement is not 0. The change does not call add_large_constant if
> the
>> given displacement is 0. The displacement can be 0, because of the
>> implementation of the unsafe intrinsics. This happens, because the
> offset
>> into the object from which the field is accessed is not a constant
> value.
>> This fixes the hotspot tier1 tests mentioned in the issue.
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8231955
>> Webrev: https://cr.openjdk.java.net/~bulasevich/8231955/webrev.00/
>>
>> Thanks,
>> Christoph
>>
>> [1]
>>
> https://hg.openjdk.java.net/jdk/jdk/file/30a9612a657d/src/hotspot/cpu/arm/
>> c1_LIRGenerator_arm.cpp#l166
>>


From dean.long at oracle.com  Fri Oct 25 00:55:52 2019
From: dean.long at oracle.com (dean.long at oracle.com)
Date: Thu, 24 Oct 2019 17:55:52 -0700
Subject: RFR: 8231955: ARM32: Address displacement is 0 for volatile field
 access because of Unsafe field access.
In-Reply-To: <f40cbf84-aef3-f235-4861-403ce30dc03d@oracle.com>
References: <20191010143426.BA4B6319F46@aojmv0009>
 <20191015073212.7FCCA319074@aojmv0009>
 <f40cbf84-aef3-f235-4861-403ce30dc03d@oracle.com>
Message-ID: <587f6363-bbdc-da12-9e50-82acc5bc5853@oracle.com>

I see now that BarrierSetC1::resolve_address() is calling 
generate_address(), at least when access isn't patched.  So now I'm 
thinking that the address passed to 
volatile_field_load/volatile_field_store should be correct, and the call 
to add_large_constant() isn't necessary.

dl

On 10/24/19 5:37 PM, dean.long at oracle.com wrote:
> The shared code used to call generate_address(), which correctly 
> handles various displacements, but I guess it got lost in the barrier 
> refactoring in jdk11.  I think the correct fix is for the caller to 
> use generate_address() again.  CCing GC alias. Alternatively, the arm 
> code could call generate_address rather than changing the shared code.
>
> dl
>
> On 10/15/19 12:30 AM, christoph.goettschkes at microdoc.com wrote:
>> Is there anyone who could take a look at this change and give feedback
>> please?
>>
>> Thanks,
>> Christoph
>>
>> "hotspot-compiler-dev" <hotspot-compiler-dev-bounces at openjdk.java.net>
>> wrote on 2019-10-10 16:29:11:
>>
>>> From: christoph.goettschkes at microdoc.com
>>> To: hotspot-compiler-dev at openjdk.java.net
>>> Date: 2019-10-10 16:35
>>> Subject: RFR: 8231955: ARM32: Address displacement is 0 for volatile
>> field
>>> access because of Unsafe field access.
>>> Sent by: "hotspot-compiler-dev"
>> <hotspot-compiler-dev-bounces at openjdk.java.net>
>>> Hi,
>>>
>>> please review the following changeset. This patch fixes the volatile
>> field
>>> access for 32-bit ARM. The functions LIRGenerator::volatile_field_store
>>> and LIRGenerator::volatile_field_load both assume that the displacement
>>> for the given address is always 0. Both use the given address and pass
>> the
>>> values to add_large_constant() [1], which asserts that the given
>>> displacement is not 0. The change does not call add_large_constant if
>> the
>>> given displacement is 0. The displacement can be 0, because of the
>>> implementation of the unsafe intrinsics. This happens, because the
>> offset
>>> into the object from which the field is accessed is not a constant
>> value.
>>> This fixes the hotspot tier1 tests mentioned in the issue.
>>>
>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8231955
>>> Webrev: https://cr.openjdk.java.net/~bulasevich/8231955/webrev.00/
>>>
>>> Thanks,
>>> Christoph
>>>
>>> [1]
>>>
>> https://hg.openjdk.java.net/jdk/jdk/file/30a9612a657d/src/hotspot/cpu/arm/ 
>>
>>> c1_LIRGenerator_arm.cpp#l166
>>>
>


From sakamoto.osamu at nttcom.co.jp  Fri Oct 25 08:53:35 2019
From: sakamoto.osamu at nttcom.co.jp (Osamu Sakamoto)
Date: Fri, 25 Oct 2019 17:53:35 +0900
Subject: Segmentation Fault occurs when ClassLoader and Metaspace is
 released in JDK 8
In-Reply-To: <1ccb4f35-7f21-4aa2-4cbb-b75244b6d12d@oss.nttdata.com>
References: <fb308571-cbdf-1f4b-177f-aa6bac986a5f@nttcom.co.jp_1>
 <422c9ca2-5053-c761-cb61-f075877bb666@oss.nttdata.com>
 <314f9ad2-17df-1082-8816-7af73a96e9fb@nttcom.co.jp_1>
 <1ccb4f35-7f21-4aa2-4cbb-b75244b6d12d@oss.nttdata.com>
Message-ID: <af915a45-9d20-c637-9ee8-ca76e3006967@nttcom.co.jp_1>

Hi Yasumasa,


 > I guess this is a bug in combination of Metaspace and CMS.
 > However current jdk/jdk has different implementation, so it might not 
 > be occur in modern JDK.
 > I want to hear the comments from others.
Thank you for your comment.
I would like to hear from others, too.


 > AFAICS you cannot find head of _unloading at this point.
 > However you can traverse CLD list with purge_me->_next .
Thank you for telling me how to traverse CLD list.
I was able to start traversing the CLD list, but it is too long to 
traverse manually.
I followed _next -> _next -> _next ... about 500 times with the GDB 
print command, but have not yet found a NULL terminator or an address loop.
I'll try to find a good way to traverse the CLD list to the end.
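
One way to decide between "NULL-terminated" and "address loop" without following the chain by hand is Floyd's tortoise-and-hare walk. The sketch below is only an illustration under assumptions: Node is a hypothetical stand-in for a list element with a next pointer, not the real ClassLoaderData, and the code works on live in-process memory. Against a core dump the same logic would have to be scripted in the debugger, and on a corrupted list any step may hit an unmapped address.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a list element; not the real ClassLoaderData.
struct Node {
  Node* next = nullptr;
};

enum class ListShape { NullTerminated, Cyclic };

struct WalkResult {
  ListShape shape;
  std::size_t steps;  // slow-pointer steps taken before deciding
};

// Floyd's cycle detection: the fast pointer advances two links per step,
// the slow pointer one. They meet if and only if the list has a loop.
WalkResult classify_list(const Node* head) {
  const Node* slow = head;
  const Node* fast = head;
  std::size_t steps = 0;
  while (fast != nullptr && fast->next != nullptr) {
    slow = slow->next;
    fast = fast->next->next;
    ++steps;
    if (slow == fast) {
      return {ListShape::Cyclic, steps};
    }
  }
  return {ListShape::NullTerminated, steps};
}
```

The walk uses constant extra memory and terminates on both kinds of list, so it does not matter whether the list has 500 elements or many more.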


 > BTW, CLD has OOP for class loader in ClassLoaderData::_class_loader .
 > If you check it on (CL)HSDB, you might get any hints from it.
 > For example, use system class loader instead of custom class loader 
 > from framework.
I checked the CLD oop, but I don't understand what type of ClassLoader it is.
The result is below.
It looks like this ClassLoaderData::_class_loader oop points to a 
character array.
Is that normal?
If so, what is this class loader? (Bootstrap ClassLoader?)

---------------------------------------------------
(gdb) p ClassLoaderData::_class_loader
$21 = (oop) 0xa3afc1f0

hsdb> inspect 0xa3afc1f0
instance of [C @ 0x00000000a3afc1f0 @ 0x00000000a3afc1f0 (size = 72)
_mark: 1
_metadata._compressed_klass: TypeArrayKlass for [C
0: 'c'
1: 'o'
2: 'l'
3: 'u'
4: 'm'
5: 'n'
6: '1'
7: '5'
8: '6'
9: '5'
10: '7'
11: '5'
12: '5'
13: '9'
14: '8'
15: '6'
16: '3'
17: '3'
18: '1'
19: '_'
20: '8'
21: '0'
22: '0'
23: '3'
---------------------------------------------------


Thanks,

Osamu


On 10/24/19 09:49, Yasumasa Suenaga wrote:
> Hi Osamu,
>
> I guess this is a bug in combination of Metaspace and CMS.
> However current jdk/jdk has different implementation, so it might not 
> be occur in modern JDK.
> I want to hear the comments from others.
>
> My comments is below:
>
> On 2019/10/23 18:57, Osamu Sakamoto wrote:
>> Hi Yasumasa,
>>
>> Thank you for answering.
>>
>>  > What JVM options did you pass?
>> The following is the JVM options I passed.
>> -----------------------------------------------------------------
>> -Xmx2048m
>> -Xms2048m
>> -XX:NewSize=412m
>> -XX:MaxNewSize=412m
>> -XX:SurvivorRatio=8
>> -XX:MaxTenuringThreshold=15
>> -XX:+UseConcMarkSweepGC
>> -XX:+UseCMSInitiatingOccupancyOnly
>> -XX:CMSInitiatingOccupancyFraction=80
>> -XX:+CMSClassUnloadingEnabled
>> -XX:CompressedClassSpaceSize=64m
>> -XX:+PrintGCDetails
>> -XX:+PrintGCDateStamps
>> -XX:+UseGCLogFileRotation
>> -XX:GCLogFileSize=0
>> -Xloggc:/var/log/tomcatm0/gc-%p.log
>> -XX:+HeapDumpOnOutOfMemoryError
>> -XX:+AlwaysLockClassLoader
>> -----------------------------------------------------------------
>>
>>
>>  > I guess you used CMS because this problem seems to occur on CMS 
>> only [1] [2].
>> Yes, I used CMS.
>>
>>  > So it might be work around not to use CMS.
>> Thank you for telling me the workaround.
>> But it is difficult to change the GC method, so we would like to 
>> solve this issue with CMS GC if possible.
>>
>>
>>  > I'm not sure root cause of this issue, but it seems to break 
>> ClassLoaderDataGraph::_unloading.
>>  > (like double free (delete) of CLD)
>> I checked whether ClassLoaderDataGraph::_unloading is broken or 
>> not, but I couldn't tell because the value had been cleared to NULL 
>> or optimized out.
>>
>> Referring ClassLoaderDataGraph[1].cpp, it looks like that _unloading 
>> value is saved to ClassLoaderDataGraph::_saved_unloading.
>> But _saved_unloading had been cleared by NULL, too.
>>
>> Is there any other way to check it?
>>
>> [1]http://hg.openjdk.java.net/jdk8u/jdk8u-dev/hotspot/file/eed8e846c982/src/share/vm/classfile/classLoaderData.cpp#l753 
>>
>>
>> -----------------------------------------------------------------
>> (gdb) f 10
>> #10 0x00007f60808c0f0f in ClassLoaderDataGraph::purge () at 
>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/share/vm/classfile/classLoaderData.cpp:818
>> 818         delete purge_me;
>> (gdb) list ClassLoaderDataGraph::purge
>> 810     void ClassLoaderDataGraph::purge() {
>> 811       assert(SafepointSynchronize::is_at_safepoint(), "must be at 
>> safepoint!");
>> 812       ClassLoaderData* list = _unloading;
>> 813       _unloading = NULL;
>> 814       ClassLoaderData* next = list;
>> 815       while (next != NULL) {
>> 816         ClassLoaderData* purge_me = next;
>> 817         next = purge_me->next();
>> 818         delete purge_me;
>> 819       }
>> 820       Metaspace::purge();
>> 821     }
>> (gdb) p _unloading
>> $29 = (ClassLoaderData *) 0x0
>> (gdb) p list
>> $30 = <optimized out>
>> (gdb) p next
>> $31 = <optimized out>
>> (gdb) p ClassLoaderDataGraph::_saved_unloading
>> $32 = (ClassLoaderData *) 0x0
>> -----------------------------------------------------------------
>
> AFAICS you cannot find head of _unloading at this point.
> However you can traverse CLD list with purge_me->_next .
>
>
> BTW, CLD has OOP for class loader in ClassLoaderData::_class_loader .
> If you check it on (CL)HSDB, you might get any hints from it.
> For example, use system class loader instead of custom class loader 
> from framework.
>
>
> Thanks,
>
> Yasumasa
>
>
>> Thanks,
>> Osamu
>>
>> On 10/21/19 22:29, Yasumasa Suenaga wrote:
>>> Hi Osamu,
>>>
>>> What JVM options did you pass?
>>>
>>> I guess you used CMS because this problem seems to occur on CMS only 
>>> [1] [2].
>>> So it might be work around not to use CMS.
>>>
>>> I'm not sure root cause of this issue, but it seems to break 
>>> ClassLoaderDataGraph::_unloading.
>>> (like double free (delete) of CLD)
>>>
>>>
>>> Thanks,
>>>
>>> Yasumasa
>>>
>>>
>>> [1] 
>>> http://hg.openjdk.java.net/jdk8u/jdk8u-dev/hotspot/file/eed8e846c982/src/share/vm/classfile/classLoaderData.hpp#l100
>>> [2] 
>>> http://hg.openjdk.java.net/jdk8u/jdk8u-dev/hotspot/file/eed8e846c982/src/share/vm/gc_implementation/concurrentMarkSweep/concurrentMarkSweepGeneration.cpp#l6384
>>>
>>>
>>> On 2019/10/21 17:50, Osamu Sakamoto wrote:
>>>> Hi all,
>>>>
>>>> I have a problem about Segmentation Fault(SEGV) in GC and I can't 
>>>> make the cause clear.
>>>> Could you help me solve the problem?
>>>>
>>>> Our System uses OpenJDK 1.8.0.181, and crashed by SEGV when purging 
>>>> ClassLoader at safepoint.
>>>> This problem can't be reproduced, but this has happened 4 times in 
>>>> a few months.
>>>>
>>>> The following is the summary of my investigation.
>>>>
>>>> ============================================================================= 
>>>>
>>>>
>>>> First I checked hs_err, and that shows that the SEGV occurred.
>>>> VM_Operation is GenCollectForAllocation at safepoint.
>>>>
>>>> ----------------------------------------------------------------------------- 
>>>>
>>>> #
>>>> # A fatal error has been detected by the Java Runtime Environment:
>>>> #
>>>> #  SIGSEGV (0xb) at pc=0x00007f6080c97f88, pid=23931, 
>>>> tid=0x00007f607c3ed700
>>>> #
>>>> # JRE version: OpenJDK Runtime Environment (8.0_181-b13) (build 
>>>> 1.8.0_181-b13)
>>>> # Java VM: OpenJDK 64-Bit Server VM (25.181-b13 mixed mode 
>>>> linux-amd64 compressed oops)
>>>> # Problematic frame:
>>>> # V  [libjvm.so+0x84bf88]
>>>> #
>>>> # Core dump written. Default location: /opt/tomcate0/core or 
>>>> core.23931
>>>> #
>>>> # If you would like to submit a bug report, please visit:
>>>> #   http://bugreport.java.com/bugreport/crash.jsp
>>>> #
>>>>
>>>> ---------------  T H R E A D  ---------------
>>>>
>>>> Current thread (0x00007f6078c00000):  VMThread [stack: 
>>>> 0x00007f607c2ed000,0x00007f607c3ee000] [id=23939]
>>>>
>>>> siginfo: si_signo: 11 (SIGSEGV), si_code: 1 (SEGV_MAPERR), si_addr: 
>>>> 0x0000000000000018
>>>>
>>>> Registers:
>>>> RAX=0x0000000000000010, RBX=0x00007f5ff800ad30, 
>>>> RCX=0x0000000000000010, RDX=0x0000000000000000
>>>> RSP=0x00007f607c3ecb50, RBP=0x00007f607c3ecb80, 
>>>> RSI=0x0000000000000002, RDI=0x0000000001cfe570
>>>> R8 =0x00007f5ff80ae320, R9 =0x00007f5ff8052480, 
>>>> R10=0x0000000000000000, R11=0x0000000000000400
>>>> R12=0x0000000001cfe570, R13=0x00007f6081419470, 
>>>> R14=0x0000000000000002, R15=0x00007f6081418640
>>>> RIP=0x00007f6080c97f88, EFLAGS=0x0000000000010202, 
>>>> CSGSFS=0x0000000000000033, ERR=0x0000000000000004
>>>>    TRAPNO=0x000000000000000e
>>>>
>>>> Top of Stack: (sp=0x00007f607c3ecb50)
>>>> 0x00007f607c3ecb50:   00007f607c3ecba0 00007f5ff800ad30
>>>> 0x00007f607c3ecb60:   00007f5ff800ad00 0000000000000000
>>>> 0x00007f607c3ecb70:   0000000000000000 0000000000000001
>>>> 0x00007f607c3ecb80:   00007f607c3ecba0 00007f6080c995fa
>>>> 0x00007f607c3ecb90:   00007f5ff800ad00 00007f5ff800ac20
>>>> 0x00007f607c3ecba0:   00007f607c3ecbc0 00007f60808bff5e
>>>> 0x00007f607c3ecbb0:   00007f5ff800ac20 00007f5ff8052870
>>>> 0x00007f607c3ecbc0:   00007f607c3ecbe0 00007f60808c0f0f
>>>> 0x00007f607c3ecbd0:   00007f607c3ecbf0 00007f608140f308
>>>> 0x00007f607c3ecbe0:   00007f607c3ecc30 00007f6080daa0b7
>>>> 0x00007f607c3ecbf0:   00007f6069000100 0000000000000000
>>>> 0x00007f607c3ecc00:   00007f607c3ecc20 00007f6080ed0800
>>>> 0x00007f607c3ecc10:   00000000000000f9 88e95c3ba257ab00
>>>> 0x00007f607c3ecc20:   431bde82d7b634db 00007f607800aa00
>>>> 0x00007f607c3ecc30:   00007f607c3eccc0 00007f6080daa9d5
>>>> 0x00007f607c3ecc40:   0000000000000000 00007f607803bf20
>>>> 0x00007f607c3ecc50:   00007f607803be20 00000000000003e8
>>>> 0x00007f607c3ecc60:   0000000000000001 00007f6078c00000
>>>> 0x00007f607c3ecc70:   00007f607c3eccc0 0000000000000000
>>>> 0x00007f607c3ecc80:   00000004000000f9 00007f60813e2b99
>>>> 0x00007f607c3ecc90:   00007f607803bfa0 00007f6078c00000
>>>> 0x00007f607c3ecca0:   0000000000000000 0000000000000000
>>>> 0x00007f607c3eccb0:   00007f6081418bd0 00007f607803bf20
>>>> 0x00007f607c3eccc0:   00007f607c3ece60 00007f6080f2048a
>>>> 0x00007f607c3eccd0:   00007f607c3ecd20 00007f607c3ecce0
>>>> 0x00007f607c3ecce0:   00007f6078c00000 00007f6078c00980
>>>> 0x00007f607c3eccf0:   00007f6078c009c0 00007f6078c009d0
>>>> 0x00007f607c3ecd00:   00007f6078c00aa8 00000000000000d8
>>>> 0x00007f607c3ecd10:   00007f6078c00be0 0000000000000000
>>>> 0x00007f607c3ecd20:   00007f607c3ecd28 6e69747563657845
>>>> 0x00007f607c3ecd30:   65706f204d562067 203a6e6f69746172
>>>> 0x00007f607c3ecd40:   656c6c6f436e6547 6c6c41726f467463
>>>>
>>>> Instructions: (pc=0x00007f6080c97f88)
>>>> 0x00007f6080c97f68:   b6 12 80 fa 00 74 01 f0 48 0f c1 01 31 c9 31 f6
>>>> 0x00007f6080c97f78:   48 8b 44 0b 10 31 d2 48 85 c0 74 11 0f 1f 40 00
>>>> 0x00007f6080c97f88:   48 8b 40 08 48 83 c2 01 48 85 c0 75 f3 48 83 c1
>>>> 0x00007f6080c97f98:   08 48 01 d6 48 83 f9 20 75 d6 8b 7b 08 48 8b 05
>>>>
>>>> Register to memory mapping:
>>>>
>>>> RAX=0x0000000000000010 is an unknown value
>>>> RBX=0x00007f5ff800ad30 is an unknown value
>>>> RCX=0x0000000000000010 is an unknown value
>>>> RDX=0x0000000000000000 is an unknown value
>>>> RSP=0x00007f607c3ecb50 is an unknown value
>>>> RBP=0x00007f607c3ecb80 is an unknown value
>>>> RSI=0x0000000000000002 is an unknown value
>>>> RDI=0x0000000001cfe570 is an unknown value
>>>> R8 =0x00007f5ff80ae320 is an unknown value
>>>> R9 =0x00007f5ff8052480 is an unknown value
>>>> R10=0x0000000000000000 is an unknown value
>>>> R11=0x0000000000000400 is an unknown value
>>>> R12=0x0000000001cfe570 is an unknown value
>>>> R13=0x00007f6081419470: <offset 0xfcd470> in 
>>>> /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/jre/lib/amd64/server/libjvm.so 
>>>> at 0x00007f608044c000
>>>> R14=0x0000000000000002 is an unknown value
>>>> R15=0x00007f6081418640: <offset 0xfcc640> in 
>>>> /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/jre/lib/amd64/server/libjvm.so 
>>>> at 0x00007f608044c000
>>>>
>>>>
>>>> Stack: [0x00007f607c2ed000,0x00007f607c3ee000], 
>>>> sp=0x00007f607c3ecb50, free space=1022k
>>>> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, 
>>>> C=native code)
>>>> V  [libjvm.so+0x84bf88]
>>>> V  [libjvm.so+0x84d5fa]
>>>> V  [libjvm.so+0x473f5e]
>>>> V  [libjvm.so+0x474f0f]
>>>> V  [libjvm.so+0x95e0b7]
>>>> V  [libjvm.so+0x95e9d5]
>>>> V  [libjvm.so+0xad448a]
>>>> V  [libjvm.so+0xad48f1]
>>>> V  [libjvm.so+0x8beb82]
>>>>
>>>> VM_Operation (0x00007f5fd69e6120): GenCollectForAllocation, mode: 
>>>> safepoint, requested by thread 0x00007f6079013800
>>>>
>>>> ...
>>>> ----------------------------------------------------------------------------- 
>>>>
>>>>
>>>>
>>>>
>>>> Next, I used GDB to check the backtrace of the SEGV thread from the 
>>>> coredump.
>>>> The following is the backtrace.
>>>> The SEGV occurred when ClassLoader is purged and Metaspace is 
>>>> destructed.
>>>> And frame #7 shows that a signal(SEGV) handler is called after 
>>>> SpaceManager::~SpaceManager() is executed.
>>>>
>>>> ----------------------------------------------------------------------------- 
>>>>
>>>> (gdb) bt
>>>> #0  0x00007f608146f1f7 in __GI_raise (sig=sig@entry=6) at 
>>>> ../nptl/sysdeps/unix/sysv/linux/raise.c:56
>>>> #1  0x00007f60814708e8 in __GI_abort () at abort.c:90
>>>> #2  0x00007f6080d0bc39 in os::abort (dump_core=<optimized out>) at 
>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/os/linux/vm/os_linux.cpp:1519
>>>> #3  0x00007f6080f1b816 in VMError::report_and_die 
>>>> (this=this@entry=0x7f607c3ebd10) at 
>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/share/vm/utilities/vmError.cpp:1060
>>>> #4  0x00007f6080d15927 in JVM_handle_linux_signal (sig=11, 
>>>> info=0x7f607c3ebfb0, ucVoid=0x7f607c3ebe80, 
>>>> abort_if_unrecognized=<optimized out>)
>>>>      at 
>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/os_cpu/linux_x86/vm/os_linux_x86.cpp:541
>>>> #5  0x00007f6080d09038 in signalHandler (sig=11, 
>>>> info=0x7f607c3ebfb0, uc=0x7f607c3ebe80) at 
>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/os/linux/vm/os_linux.cpp:4446
>>>> #6  <signal handler called>
>>>> #7  SpaceManager::~SpaceManager (this=0x7f5ff800ad30, 
>>>> __in_chrg=<optimized out>) at 
>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/share/vm/memory/metaspace.cpp:2028
>>>> #8  0x00007f6080c995fa in Metaspace::~Metaspace 
>>>> (this=0x7f5ff800ad00, __in_chrg=<optimized out>)
>>>>      at 
>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/share/vm/memory/metaspace.cpp:2971
>>>> #9  0x00007f60808bff5e in ClassLoaderData::~ClassLoaderData 
>>>> (this=0x7f5ff800ac20, __in_chrg=<optimized out>)
>>>>      at 
>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/share/vm/classfile/classLoaderData.cpp:383
>>>> #10 0x00007f60808c0f0f in ClassLoaderDataGraph::purge () at 
>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/share/vm/classfile/classLoaderData.cpp:818 
>>>>
>>>> #11 0x00007f6080daa0b7 in ClassLoaderDataGraph::purge_if_needed () 
>>>> at 
>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/share/vm/classfile/classLoaderData.hpp:104
>>>> #12 SafepointSynchronize::do_cleanup_tasks () at 
>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/share/vm/runtime/safepoint.cpp:551
>>>> #13 0x00007f6080daa9d5 in SafepointSynchronize::begin () at 
>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/share/vm/runtime/safepoint.cpp:402 
>>>>
>>>> #14 0x00007f6080f2048a in VMThread::loop 
>>>> (this=this@entry=0x7f6078c00000) at 
>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/share/vm/runtime/vmThread.cpp:501
>>>> #15 0x00007f6080f208f1 in VMThread::run (this=0x7f6078c00000) at 
>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/share/vm/runtime/vmThread.cpp:276
>>>> #16 0x00007f6080d0ab82 in java_start (thread=0x7f6078c00000) at 
>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/os/linux/vm/os_linux.cpp:796
>>>> #17 0x00007f6081e2de25 in start_thread (arg=0x7f607c3ed700) at 
>>>> pthread_create.c:308
>>>> #18 0x00007f608153234d in clone () at 
>>>> ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
>>>> ----------------------------------------------------------------------------- 
>>>>
>>>>
>>>>
>>>> In Frame #7, Line 2028 (chunk = chunk->next()) is the crash point.
>>>> The variable "chunk" is defined at Line 2025 (Metachunk* chunk = 
>>>> chunks_in_use(i);).
>>>> "chunks_in_use(i)" is defined at Line 648 (Metachunk* 
>>>> chunks_in_use(ChunkIndex index) const { return 
>>>> _chunks_in_use[index]; }).
>>>> So I checked values of "_chunks_in_use", and understood that 
>>>> "_chunks_in_use[2]" has Illegal Address "0x10".
>>>> Therefore, I think that the SEGV occurred because of referencing 
>>>> Illegal Address "0x10" at "chunk = chunk->next()".
>>>>
>>>> ----------------------------------------------------------------------------- 
>>>>
>>>> (gdb) f 7
>>>> #7  SpaceManager::~SpaceManager (this=0x7f5ff800ad30, 
>>>> __in_chrg=<optimized out>) at 
>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/share/vm/memory/metaspace.cpp:2028
>>>> 2028        chunk = chunk->next();
>>>> (gdb) list
>>>> 2023    size_t SpaceManager::sum_count_in_chunks_in_use(ChunkIndex 
>>>> i) {
>>>> 2024      size_t count = 0;
>>>> 2025      Metachunk* chunk = chunks_in_use(i);
>>>> 2026      while (chunk != NULL) {
>>>> 2027        count++;
>>>> 2028        chunk = chunk->next();
>>>> 2029      }
>>>> 2030      return count;
>>>> 2031    }
>>>> 2032
>>>> (gdb) list SpaceManager::chunks_in_use
>>>> 647      // Accessors
>>>> 648      Metachunk* chunks_in_use(ChunkIndex index) const { return 
>>>> _chunks_in_use[index]; }
>>>> ...
>>>> (gdb) p _chunks_in_use
>>>> $11 = {0x7f5fcd41c400, 0x7f5fcd41a000, 0x10, 0x0}
>>>> ----------------------------------------------------------------------------- 
>>>>
>>>>
>>>>
>>>>
>>>> The following is disassemble code of "SpaceManager::~SpaceManager()".
>>>> %rax has 0x10 at "0x00007f6080c97f88 <+200>", but I don't 
>>>> understand why this "0x10" is inserted to %rax.
>>>>
>>>> ----------------------------------------------------------------------------- 
>>>>
>>>> (gdb) disas
>>>> Dump of assembler code for function SpaceManager::~SpaceManager():
>>>>    0x00007f6080c97ec0 <+0>:     push   %rbp
>>>>    0x00007f6080c97ec1 <+1>:     mov    %rsp,%rbp
>>>>    0x00007f6080c97ec4 <+4>:     push   %r15
>>>>    0x00007f6080c97ec6 <+6>:     push   %r14
>>>>    0x00007f6080c97ec8 <+8>:     push   %r13
>>>>    0x00007f6080c97eca <+10>:    push   %r12
>>>>    0x00007f6080c97ecc <+12>:    push   %rbx
>>>>    0x00007f6080c97ecd <+13>:    mov    %rdi,%rbx
>>>>    0x00007f6080c97ed0 <+16>:    sub    $0x8,%rsp
>>>>    0x00007f6080c97ed4 <+20>:    mov    0x780785(%rip),%r12        # 0x7f6081418660 <_ZN12SpaceManager12_expand_lockE>
>>>>    0x00007f6080c97edb <+27>:    test   %r12,%r12
>>>>    0x00007f6080c97ede <+30>:    je     0x7f6080c97ee8 <SpaceManager::~SpaceManager()+40>
>>>>    0x00007f6080c97ee0 <+32>:    mov    %r12,%rdi
>>>>    0x00007f6080c97ee3 <+35>:    callq  0x7f6080cce2f0 <Monitor::lock_without_safepoint_check()>
>>>>    0x00007f6080c97ee8 <+40>:    movslq 0x8(%rbx),%rcx
>>>>    0x00007f6080c97eec <+44>:    lea    0x78075d(%rip),%rdx        # 0x7f6081418650 <_ZN12MetaspaceAux15_capacity_wordsE>
>>>>    0x00007f6080c97ef3 <+51>:    lea    0x781576(%rip),%r13        # 0x7f6081419470 <_ZN2os16_processor_countE>
>>>>    0x00007f6080c97efa <+58>:    lea    0x78073f(%rip),%r15        # 0x7f6081418640 <_ZN12MetaspaceAux11_used_wordsE>
>>>>    0x00007f6080c97f01 <+65>:    mov    (%rdx,%rcx,8),%rax
>>>>    0x00007f6080c97f05 <+69>:    sub    0x40(%rbx),%rax
>>>>    0x00007f6080c97f09 <+73>:    mov    %rax,(%rdx,%rcx,8)
>>>>    0x00007f6080c97f0d <+77>:    mov    0x38(%rbx),%rax
>>>>    0x00007f6080c97f11 <+81>:    movslq 0x8(%rbx),%rdx
>>>>    0x00007f6080c97f15 <+85>:    neg    %rax
>>>>    0x00007f6080c97f18 <+88>:    cmpl   $0x1,0x0(%r13)
>>>>    0x00007f6080c97f1d <+93>:    lea    (%r15,%rdx,8),%rcx
>>>>    0x00007f6080c97f21 <+97>:    mov    $0x1,%edx
>>>>    0x00007f6080c97f26 <+102>:   jne    0x7f6080c97f32 <SpaceManager::~SpaceManager()+114>
>>>>    0x00007f6080c97f28 <+104>:   lea    0x74acb4(%rip),%rdx        # 0x7f60813e2be3 <AssumeMP>
>>>>    0x00007f6080c97f2f <+111>:   movzbl (%rdx),%edx
>>>>    0x00007f6080c97f32 <+114>:   cmp    $0x0,%dl
>>>>    0x00007f6080c97f35 <+117>:   je     0x7f6080c97f38 <SpaceManager::~SpaceManager()+120>
>>>>    0x00007f6080c97f37 <+119>:   lock xadd %rax,(%rcx)
>>>>    0x00007f6080c97f3c <+124>:   mov    0x48(%rbx),%r14
>>>>    0x00007f6080c97f40 <+128>:   callq  0x7f6080c951a0 <Metachunk::overhead()>
>>>>    0x00007f6080c97f45 <+133>:   movslq 0x8(%rbx),%rdx
>>>>    0x00007f6080c97f49 <+137>:   imul   %r14,%rax
>>>>    0x00007f6080c97f4d <+141>:   lea    (%r15,%rdx,8),%rcx
>>>>    0x00007f6080c97f51 <+145>:   mov    $0x1,%edx
>>>>    0x00007f6080c97f56 <+150>:   neg    %rax
>>>>    0x00007f6080c97f59 <+153>:   cmpl   $0x1,0x0(%r13)
>>>>    0x00007f6080c97f5e <+158>:   jne    0x7f6080c97f6a <SpaceManager::~SpaceManager()+170>
>>>>    0x00007f6080c97f60 <+160>:   lea    0x74ac7c(%rip),%rdx        # 0x7f60813e2be3 <AssumeMP>
>>>>    0x00007f6080c97f67 <+167>:   movzbl (%rdx),%edx
>>>>    0x00007f6080c97f6a <+170>:   cmp    $0x0,%dl
>>>>    0x00007f6080c97f6d <+173>:   je     0x7f6080c97f70 <SpaceManager::~SpaceManager()+176>
>>>>    0x00007f6080c97f6f <+175>:   lock xadd %rax,(%rcx)
>>>>    0x00007f6080c97f74 <+180>:   xor    %ecx,%ecx
>>>>    0x00007f6080c97f76 <+182>:   xor    %esi,%esi
>>>>    0x00007f6080c97f78 <+184>:   mov    0x10(%rbx,%rcx,1),%rax
>>>>    0x00007f6080c97f7d <+189>:   xor    %edx,%edx
>>>>    0x00007f6080c97f7f <+191>:   test   %rax,%rax
>>>>    0x00007f6080c97f82 <+194>:   je     0x7f6080c97f95 <SpaceManager::~SpaceManager()+213>
>>>>    0x00007f6080c97f84 <+196>:   nopl   0x0(%rax)
>>>> => 0x00007f6080c97f88 <+200>:   mov    0x8(%rax),%rax
>>>>    0x00007f6080c97f8c <+204>:   add    $0x1,%rdx
>>>>    0x00007f6080c97f90 <+208>:   test   %rax,%rax
>>>> ...
>>>> (gdb) info registers
>>>> rax            0x10    16
>>>> rbx            0x7f5ff800ad30    140050159414576
>>>> rcx            0x10    16
>>>> rdx            0x0    0
>>>> rsi            0x2    2
>>>> rdi            0x1cfe570    30401904
>>>> rbp            0x7f607c3ecb80    0x7f607c3ecb80
>>>> rsp            0x7f607c3ecb50    0x7f607c3ecb50
>>>> r8             0x7f5ff80ae320    140050160083744
>>>> r9             0x7f5ff8052480    140050159707264
>>>> r10            0x0    0
>>>> r11            0x400    1024
>>>> r12            0x1cfe570    30401904
>>>> r13            0x7f6081419470    140052462146672
>>>> r14            0x2    2
>>>> r15            0x7f6081418640    140052462143040
>>>> rip            0x7f6080c97f88    0x7f6080c97f88 <SpaceManager::~SpaceManager()+200>
>>>> eflags         0x206    [ PF IF ]
>>>> cs             0x33    51
>>>> ss             0x2b    43
>>>> ds             0x0    0
>>>> es             0x0    0
>>>> fs             0x0    0
>>>> gs             0x0    0
>>>> k0             <unavailable>
>>>> k1             <unavailable>
>>>> k2             <unavailable>
>>>> k3             <unavailable>
>>>> k4             <unavailable>
>>>> k5             <unavailable>
>>>> k6             <unavailable>
>>>> k7             <unavailable>
>>>> ----------------------------------------------------------------------------- 
>>>>
>>>>
>>>> ============================================================================= 
>>>>
>>>>
>>>>
>>>>
>>>> Does anyone know about this case?
>>>>
>>>> Thanks, Osamu
>>>>
>>>>
>>


From thomas.schatzl at oracle.com  Fri Oct 25 10:20:03 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Fri, 25 Oct 2019 12:20:03 +0200
Subject: RFR (S): 8232776: G1 should always take rs_length_diff into
 account when predicting rs_lengths
In-Reply-To: <f1bb2f33-a451-1430-2c70-de66ef433fec@oracle.com>
References: <2e973399-ce75-7ab4-ce21-58fe63c74f9c@oracle.com>
 <f1bb2f33-a451-1430-2c70-de66ef433fec@oracle.com>
Message-ID: <2dd515f2-225a-65fb-9939-0c75d1a9f09f@oracle.com>

Hi Kim, Sangheon,

On 22.10.19 22:49, sangheon.kim at oracle.com wrote:
> Hi Thomas,
> 
> On 10/22/19 10:35 AM, Thomas Schatzl wrote:
>> Hi all,
>>
>>   can I have reviews for this small change that makes G1 always use 
>> the error term for rs-length prediction, not only if G1 sees fit.
>>
>> While rs length prediction is still kind of bad even with this change 
>> (and seemingly a band-aid), with that change it is a bit better. While 
>> there is a "real" fix for RS length estimation coming that so far 
>> looks really good, this change decreases complexity of further changes 
>> in G1Policy enough while improving the estimation.
>>
>> CR:
>> https://bugs.openjdk.java.net/browse/JDK-8232776
>> Webrev:
>> http://cr.openjdk.java.net/~tschatzl/8232776/webrev/
> Looks good.
> 

   thanks for your review.

Thomas


From thomas.schatzl at oracle.com  Fri Oct 25 10:19:24 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Fri, 25 Oct 2019 12:19:24 +0200
Subject: RFR (XS): 8232779: G1 current collection parallel time does not
 include optional evacuation
In-Reply-To: <fec08635-993f-4054-62b8-d0ac7e11eeed@oracle.com>
References: <15f7fa18-0334-c9ff-be69-c8ecb114e363@oracle.com>
 <fec08635-993f-4054-62b8-d0ac7e11eeed@oracle.com>
Message-ID: <6202aeaa-f40d-eaed-c46c-244a6bf0d7bc@oracle.com>

Hi Sangheon, Kim

On 22.10.19 22:50, sangheon.kim at oracle.com wrote:
> Hi Thomas,
> 
> On 10/22/19 11:05 AM, Thomas Schatzl wrote:
>> Hi all,
>>
>>   can I have reviews for this change that fixes the calculation of 
>> G1GCPhaseTimes::cur_collection_par_time_ms(): we forgot to consider 
>> the optional evacuation time.
>>
>> This causes too long Other time, having minor effects on pause time 
>> prediction.
>>
>> CR:
>> https://bugs.openjdk.java.net/browse/JDK-8232779
>> Webrev:
>> http://cr.openjdk.java.net/~tschatzl/8232779/webrev/
> Looks good.
> 

   thanks for your review.

Thomas


From suenaga at oss.nttdata.com  Fri Oct 25 12:20:20 2019
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Fri, 25 Oct 2019 21:20:20 +0900
Subject: Segmentation Fault occurs when ClassLoader and Metaspace is
 released in JDK 8
In-Reply-To: <af915a45-9d20-c637-9ee8-ca76e3006967@nttcom.co.jp_1>
References: <fb308571-cbdf-1f4b-177f-aa6bac986a5f@nttcom.co.jp_1>
 <422c9ca2-5053-c761-cb61-f075877bb666@oss.nttdata.com>
 <314f9ad2-17df-1082-8816-7af73a96e9fb@nttcom.co.jp_1>
 <1ccb4f35-7f21-4aa2-4cbb-b75244b6d12d@oss.nttdata.com>
 <af915a45-9d20-c637-9ee8-ca76e3006967@nttcom.co.jp_1>
Message-ID: <ea8fc9d7-bab4-a812-2ed7-19a722e298b1@oss.nttdata.com>

This looks like a bug.
Does anyone have any suggestions about this?

> (gdb) p ClassLoaderData::_class_loader
> $21 = (oop) 0xa3afc1f0

(CLD::_class_loader is not a static member, so this command should fail.)

> hsdb> inspect 0xa3afc1f0
> instance of [C @ 0x00000000a3afc1f0 @ 0x00000000a3afc1f0 (size = 72)
> _mark: 1
> _metadata._compressed_klass: TypeArrayKlass for [C
> 0: 'c'

I believe CLD::_class_loader should be the OOP for the class loader.
I guess memory corruption occurred for some reason - is it a bug in HotSpot?
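
One more observation that fits the corruption guess. The crashing instruction in the quoted disassembly is `mov 0x8(%rax),%rax` with RAX=0x10, and hs_err reports si_addr 0x18. If the link read by chunk->next() sits at offset 8, as that instruction suggests, the fault address is just 0x10 + 0x8. Below is a minimal sketch of that arithmetic, not HotSpot code: `FakeChunk` and `next_field_address` are hypothetical names for illustration only, and the layout assumes an LP64 platform.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a chunk header. Only "a pointer-sized link at
// offset 8" is taken from the disassembly; the first field is a placeholder.
struct FakeChunk {
  void*      header;  // whatever occupies the first 8 bytes (assumption)
  FakeChunk* next;    // the link followed by the counting loop
};

static_assert(offsetof(FakeChunk, next) == 8, "sketch assumes an LP64 layout");

// Returns the address that a load of p->next would touch.
// The pointer value is only used for arithmetic and is never dereferenced.
constexpr std::uintptr_t next_field_address(std::uintptr_t p) {
  return p + offsetof(FakeChunk, next);
}
```

With the bogus pointer value from the core, `next_field_address(0x10)` evaluates to 0x18, which matches the reported si_addr.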

I checked 8u222 on Fedora 30, and my guess seems correct.

* GDB
```
(gdb) p ClassLoaderDataGraph::_head->_class_loader
$2 = (oop) 0xd67d0900
```

* CLHSDB
```
hsdb> inspect 0xd67d0900
instance of Oop for sun/misc/Launcher$AppClassLoader @ 0x00000000d67d0900 @ 0x00000000d67d0900 (size = 96)
_mark: 436443282689
_metadata._compressed_klass: InstanceKlass for sun/misc/Launcher$AppClassLoader
parent: Oop for sun/misc/Launcher$ExtClassLoader @ 0x00000000d67bb348 Oop for sun/misc/Launcher$ExtClassLoader @ 0x00000000d67bb348
         :
```
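
About the CLD list being too long to walk by hand: the check itself is mechanical, so it could be scripted. Below is a minimal sketch (not HotSpot code) of a bounded walk over a `_next`-style chain that reports whether the chain ends in NULL, loops back on itself, or exceeds a step limit. It uses Floyd's cycle detection. `Node`, `WalkResult`, `WalkReport` and `walk` are hypothetical names, and the logic would still have to be ported to whatever debugger scripting is available in order to run against the core.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for ClassLoaderData; only the link matters here.
struct Node {
  Node* next;
};

enum class WalkResult { Terminated, Cycle, LimitReached };

struct WalkReport {
  WalkResult  result;
  std::size_t steps;  // number of loop iterations performed
};

// Floyd's tortoise-and-hare: 'fast' advances two links per iteration and
// 'slow' one. If the chain loops, the two pointers eventually meet; if it
// is NULL-terminated, 'fast' reaches the end first.
WalkReport walk(const Node* head, std::size_t max_steps) {
  const Node* slow = head;
  const Node* fast = head;
  std::size_t steps = 0;
  while (fast != nullptr && fast->next != nullptr) {
    if (steps >= max_steps) {
      return {WalkResult::LimitReached, steps};
    }
    slow = slow->next;
    fast = fast->next->next;
    ++steps;
    if (slow == fast) {
      return {WalkResult::Cycle, steps};
    }
  }
  return {WalkResult::Terminated, steps};
}
```

Note that this only distinguishes a clean end from a loop. A corrupted link such as 0x10 would still fault when dereferenced, so each pointer would need a sanity check before it is followed.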


Yasumasa


On 2019/10/25 17:53, Osamu Sakamoto wrote:
> Hi Yasumasa,
> 
> 
>  > I guess this is a bug in the combination of Metaspace and CMS.
>  > However, current jdk/jdk has a different implementation, so it might not occur in a modern JDK.
>  > I want to hear the comments from others.
> Thank you for your comment.
> I want to hear from others, too.
> 
> 
>  > AFAICS you cannot find the head of _unloading at this point.
>  > However, you can traverse the CLD list with purge_me->_next .
> Thank you for telling me how to traverse the CLD list.
> I started traversing the CLD list, but it is too long to traverse manually.
> I followed _next -> _next -> _next ... about 500 times with the GDB print command, but I have not yet found a NULL terminator or an address loop.
> I'll try to find a good way to traverse the CLD list to the end.
> 
> 
>  > BTW, CLD has the OOP for the class loader in ClassLoaderData::_class_loader .
>  > If you check it in (CL)HSDB, you might get some hints from it.
>  > For example, use system class loader instead of custom class loader from framework.
> I checked the CLD oop, but I don't understand what type of ClassLoader it is.
> The result is below.
> It looks like this ClassLoaderData::_class_loader oop points to a character array.
> Is that normal?
> If so, what is this class loader? (Bootstrap ClassLoader?)
> 
> ---------------------------------------------------
> (gdb) p ClassLoaderData::_class_loader
> $21 = (oop) 0xa3afc1f0
> 
> hsdb> inspect 0xa3afc1f0
> instance of [C @ 0x00000000a3afc1f0 @ 0x00000000a3afc1f0 (size = 72)
> _mark: 1
> _metadata._compressed_klass: TypeArrayKlass for [C
> 0: 'c'
> 1: 'o'
> 2: 'l'
> 3: 'u'
> 4: 'm'
> 5: 'n'
> 6: '1'
> 7: '5'
> 8: '6'
> 9: '5'
> 10: '7'
> 11: '5'
> 12: '5'
> 13: '9'
> 14: '8'
> 15: '6'
> 16: '3'
> 17: '3'
> 18: '1'
> 19: '_'
> 20: '8'
> 21: '0'
> 22: '0'
> 23: '3'
> ---------------------------------------------------
> 
> 
> Thanks,
> 
> Osamu
> 
> 
> On 10/24/19 09:49, Yasumasa Suenaga wrote:
>> Hi Osamu,
>>
>> I guess this is a bug in the combination of Metaspace and CMS.
>> However, current jdk/jdk has a different implementation, so it might not occur in a modern JDK.
>> I want to hear the comments from others.
>>
>> My comments are below:
>>
>> On 2019/10/23 18:57, Osamu Sakamoto wrote:
>>> Hi Yasumasa,
>>>
>>> Thank you for answering.
>>>
>>>  > What JVM options did you pass?
>>> The following are the JVM options I passed.
>>> -----------------------------------------------------------------
>>> -Xmx2048m
>>> -Xms2048m
>>> -XX:NewSize=412m
>>> -XX:MaxNewSize=412m
>>> -XX:SurvivorRatio=8
>>> -XX:MaxTenuringThreshold=15
>>> -XX:+UseConcMarkSweepGC
>>> -XX:+UseCMSInitiatingOccupancyOnly
>>> -XX:CMSInitiatingOccupancyFraction=80
>>> -XX:+CMSClassUnloadingEnabled
>>> -XX:CompressedClassSpaceSize=64m
>>> -XX:+PrintGCDetails
>>> -XX:+PrintGCDateStamps
>>> -XX:+UseGCLogFileRotation
>>> -XX:GCLogFileSize=0
>>> -Xloggc:/var/log/tomcatm0/gc-%p.log
>>> -XX:+HeapDumpOnOutOfMemoryError
>>> -XX:+AlwaysLockClassLoader
>>> -----------------------------------------------------------------
>>>
>>>
>>>  > I guess you used CMS because this problem seems to occur on CMS only [1] [2].
>>> Yes, I used CMS.
>>>
>>>  > So not using CMS might be a workaround.
>>> Thank you for telling me about the workaround.
>>> But it is difficult for us to change the GC algorithm, so we would like to solve this issue with CMS GC if possible.
>>>
>>>
>>>  > I'm not sure of the root cause of this issue, but it seems that ClassLoaderDataGraph::_unloading gets broken.
>>>  > (like a double free (delete) of a CLD)
>>> I checked whether ClassLoaderDataGraph::_unloading is broken, but I couldn't tell because the value had been cleared to NULL or optimized out.
>>>
>>> According to classLoaderData.cpp [1], it looks like the _unloading value is saved to ClassLoaderDataGraph::_saved_unloading.
>>> But _saved_unloading had been cleared to NULL, too.
>>>
>>> Is there any other way to check it?
>>>
>>> [1]http://hg.openjdk.java.net/jdk8u/jdk8u-dev/hotspot/file/eed8e846c982/src/share/vm/classfile/classLoaderData.cpp#l753
>>>
>>> -----------------------------------------------------------------
>>> (gdb) f 10
>>> #10 0x00007f60808c0f0f in ClassLoaderDataGraph::purge () at /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/share/vm/classfile/classLoaderData.cpp:818
>>> 818        delete purge_me;
>>> (gdb) list ClassLoaderDataGraph::purge
>>> 810    void ClassLoaderDataGraph::purge() {
>>> 811      assert(SafepointSynchronize::is_at_safepoint(), "must be at safepoint!");
>>> 812      ClassLoaderData* list = _unloading;
>>> 813      _unloading = NULL;
>>> 814      ClassLoaderData* next = list;
>>> 815      while (next != NULL) {
>>> 816        ClassLoaderData* purge_me = next;
>>> 817        next = purge_me->next();
>>> 818        delete purge_me;
>>> 819      }
>>> 820      Metaspace::purge();
>>> 821    }
>>> (gdb) p _unloading
>>> $29 = (ClassLoaderData *) 0x0
>>> (gdb) p list
>>> $30 = <optimized out>
>>> (gdb) p next
>>> $31 = <optimized out>
>>> (gdb) p ClassLoaderDataGraph::_saved_unloading
>>> $32 = (ClassLoaderData *) 0x0
>>> -----------------------------------------------------------------
>>
>> AFAICS you cannot find the head of _unloading at this point.
>> However, you can traverse the CLD list with purge_me->_next .
>>
>>
>> BTW, CLD has the OOP for the class loader in ClassLoaderData::_class_loader .
>> If you check it in (CL)HSDB, you might get some hints from it.
>> For example, use system class loader instead of custom class loader from framework.
>>
>>
>> Thanks,
>>
>> Yasumasa
>>
>>
>>> Thanks,
>>> Osamu
>>>
>>> On 10/21/19 22:29, Yasumasa Suenaga wrote:
>>>> Hi Osamu,
>>>>
>>>> What JVM options did you pass?
>>>>
>>>> I guess you used CMS because this problem seems to occur on CMS only [1] [2].
>>>> So not using CMS might be a workaround.
>>>>
>>>> I'm not sure of the root cause of this issue, but it seems that ClassLoaderDataGraph::_unloading gets broken.
>>>> (like a double free (delete) of a CLD)
>>>>
>>>>
>>>> Thanks,
>>>>
>>>> Yasumasa
>>>>
>>>>
>>>> [1] http://hg.openjdk.java.net/jdk8u/jdk8u-dev/hotspot/file/eed8e846c982/src/share/vm/classfile/classLoaderData.hpp#l100
>>>> [2] http://hg.openjdk.java.net/jdk8u/jdk8u-dev/hotspot/file/eed8e846c982/src/share/vm/gc_implementation/concurrentMarkSweep/concurrentMarkSweepGeneration.cpp#l6384
>>>>
>>>>
>>>> On 2019/10/21 17:50, Osamu Sakamoto wrote:
>>>>> Hi all,
>>>>>
>>>>> I have a problem with a segmentation fault (SEGV) in GC, and I can't determine the cause.
>>>>> Could you help me solve the problem?
>>>>>
>>>>> Our system uses OpenJDK 1.8.0.181, and it crashed with a SEGV when purging a ClassLoader at a safepoint.
>>>>> We can't reproduce this problem on demand, but it has happened 4 times in a few months.
>>>>>
>>>>> The following is the summary of my investigation.
>>>>>
>>>>> =============================================================================
>>>>>
>>>>> First I checked hs_err, and that shows that the SEGV occurred.
>>>>> VM_Operation is GenCollectForAllocation at safepoint.
>>>>>
>>>>> -----------------------------------------------------------------------------
>>>>> #
>>>>> # A fatal error has been detected by the Java Runtime Environment:
>>>>> #
>>>>> #  SIGSEGV (0xb) at pc=0x00007f6080c97f88, pid=23931, tid=0x00007f607c3ed700
>>>>> #
>>>>> # JRE version: OpenJDK Runtime Environment (8.0_181-b13) (build 1.8.0_181-b13)
>>>>> # Java VM: OpenJDK 64-Bit Server VM (25.181-b13 mixed mode linux-amd64 compressed oops)
>>>>> # Problematic frame:
>>>>> # V  [libjvm.so+0x84bf88]
>>>>> #
>>>>> # Core dump written. Default location: /opt/tomcate0/core or core.23931
>>>>> #
>>>>> # If you would like to submit a bug report, please visit:
>>>>> #   http://bugreport.java.com/bugreport/crash.jsp
>>>>> #
>>>>>
>>>>> ---------------  T H R E A D  ---------------
>>>>>
>>>>> Current thread (0x00007f6078c00000):  VMThread [stack: 0x00007f607c2ed000,0x00007f607c3ee000] [id=23939]
>>>>>
>>>>> siginfo: si_signo: 11 (SIGSEGV), si_code: 1 (SEGV_MAPERR), si_addr: 0x0000000000000018
>>>>>
>>>>> Registers:
>>>>> RAX=0x0000000000000010, RBX=0x00007f5ff800ad30, RCX=0x0000000000000010, RDX=0x0000000000000000
>>>>> RSP=0x00007f607c3ecb50, RBP=0x00007f607c3ecb80, RSI=0x0000000000000002, RDI=0x0000000001cfe570
>>>>> R8 =0x00007f5ff80ae320, R9 =0x00007f5ff8052480, R10=0x0000000000000000, R11=0x0000000000000400
>>>>> R12=0x0000000001cfe570, R13=0x00007f6081419470, R14=0x0000000000000002, R15=0x00007f6081418640
>>>>> RIP=0x00007f6080c97f88, EFLAGS=0x0000000000010202, CSGSFS=0x0000000000000033, ERR=0x0000000000000004
>>>>>    TRAPNO=0x000000000000000e
>>>>>
>>>>> Top of Stack: (sp=0x00007f607c3ecb50)
>>>>> 0x00007f607c3ecb50:   00007f607c3ecba0 00007f5ff800ad30
>>>>> 0x00007f607c3ecb60:   00007f5ff800ad00 0000000000000000
>>>>> 0x00007f607c3ecb70:   0000000000000000 0000000000000001
>>>>> 0x00007f607c3ecb80:   00007f607c3ecba0 00007f6080c995fa
>>>>> 0x00007f607c3ecb90:   00007f5ff800ad00 00007f5ff800ac20
>>>>> 0x00007f607c3ecba0:   00007f607c3ecbc0 00007f60808bff5e
>>>>> 0x00007f607c3ecbb0:   00007f5ff800ac20 00007f5ff8052870
>>>>> 0x00007f607c3ecbc0:   00007f607c3ecbe0 00007f60808c0f0f
>>>>> 0x00007f607c3ecbd0:   00007f607c3ecbf0 00007f608140f308
>>>>> 0x00007f607c3ecbe0:   00007f607c3ecc30 00007f6080daa0b7
>>>>> 0x00007f607c3ecbf0:   00007f6069000100 0000000000000000
>>>>> 0x00007f607c3ecc00:   00007f607c3ecc20 00007f6080ed0800
>>>>> 0x00007f607c3ecc10:   00000000000000f9 88e95c3ba257ab00
>>>>> 0x00007f607c3ecc20:   431bde82d7b634db 00007f607800aa00
>>>>> 0x00007f607c3ecc30:   00007f607c3eccc0 00007f6080daa9d5
>>>>> 0x00007f607c3ecc40:   0000000000000000 00007f607803bf20
>>>>> 0x00007f607c3ecc50:   00007f607803be20 00000000000003e8
>>>>> 0x00007f607c3ecc60:   0000000000000001 00007f6078c00000
>>>>> 0x00007f607c3ecc70:   00007f607c3eccc0 0000000000000000
>>>>> 0x00007f607c3ecc80:   00000004000000f9 00007f60813e2b99
>>>>> 0x00007f607c3ecc90:   00007f607803bfa0 00007f6078c00000
>>>>> 0x00007f607c3ecca0:   0000000000000000 0000000000000000
>>>>> 0x00007f607c3eccb0:   00007f6081418bd0 00007f607803bf20
>>>>> 0x00007f607c3eccc0:   00007f607c3ece60 00007f6080f2048a
>>>>> 0x00007f607c3eccd0:   00007f607c3ecd20 00007f607c3ecce0
>>>>> 0x00007f607c3ecce0:   00007f6078c00000 00007f6078c00980
>>>>> 0x00007f607c3eccf0:   00007f6078c009c0 00007f6078c009d0
>>>>> 0x00007f607c3ecd00:   00007f6078c00aa8 00000000000000d8
>>>>> 0x00007f607c3ecd10:   00007f6078c00be0 0000000000000000
>>>>> 0x00007f607c3ecd20:   00007f607c3ecd28 6e69747563657845
>>>>> 0x00007f607c3ecd30:   65706f204d562067 203a6e6f69746172
>>>>> 0x00007f607c3ecd40:   656c6c6f436e6547 6c6c41726f467463
>>>>>
>>>>> Instructions: (pc=0x00007f6080c97f88)
>>>>> 0x00007f6080c97f68:   b6 12 80 fa 00 74 01 f0 48 0f c1 01 31 c9 31 f6
>>>>> 0x00007f6080c97f78:   48 8b 44 0b 10 31 d2 48 85 c0 74 11 0f 1f 40 00
>>>>> 0x00007f6080c97f88:   48 8b 40 08 48 83 c2 01 48 85 c0 75 f3 48 83 c1
>>>>> 0x00007f6080c97f98:   08 48 01 d6 48 83 f9 20 75 d6 8b 7b 08 48 8b 05
>>>>>
>>>>> Register to memory mapping:
>>>>>
>>>>> RAX=0x0000000000000010 is an unknown value
>>>>> RBX=0x00007f5ff800ad30 is an unknown value
>>>>> RCX=0x0000000000000010 is an unknown value
>>>>> RDX=0x0000000000000000 is an unknown value
>>>>> RSP=0x00007f607c3ecb50 is an unknown value
>>>>> RBP=0x00007f607c3ecb80 is an unknown value
>>>>> RSI=0x0000000000000002 is an unknown value
>>>>> RDI=0x0000000001cfe570 is an unknown value
>>>>> R8 =0x00007f5ff80ae320 is an unknown value
>>>>> R9 =0x00007f5ff8052480 is an unknown value
>>>>> R10=0x0000000000000000 is an unknown value
>>>>> R11=0x0000000000000400 is an unknown value
>>>>> R12=0x0000000001cfe570 is an unknown value
>>>>> R13=0x00007f6081419470: <offset 0xfcd470> in /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/jre/lib/amd64/server/libjvm.so at 0x00007f608044c000
>>>>> R14=0x0000000000000002 is an unknown value
>>>>> R15=0x00007f6081418640: <offset 0xfcc640> in /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/jre/lib/amd64/server/libjvm.so at 0x00007f608044c000
>>>>>
>>>>>
>>>>> Stack: [0x00007f607c2ed000,0x00007f607c3ee000], sp=0x00007f607c3ecb50, free space=1022k
>>>>> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
>>>>> V  [libjvm.so+0x84bf88]
>>>>> V  [libjvm.so+0x84d5fa]
>>>>> V  [libjvm.so+0x473f5e]
>>>>> V  [libjvm.so+0x474f0f]
>>>>> V  [libjvm.so+0x95e0b7]
>>>>> V  [libjvm.so+0x95e9d5]
>>>>> V  [libjvm.so+0xad448a]
>>>>> V  [libjvm.so+0xad48f1]
>>>>> V  [libjvm.so+0x8beb82]
>>>>>
>>>>> VM_Operation (0x00007f5fd69e6120): GenCollectForAllocation, mode: safepoint, requested by thread 0x00007f6079013800
>>>>>
>>>>> ...
>>>>> -----------------------------------------------------------------------------
>>>>>
>>>>>
>>>>>
>>>>> Next, I used GDB to check the backtrace of the SEGV thread from the core dump.
>>>>> The following is the backtrace.
>>>>> The SEGV occurred while a ClassLoader was being purged and its Metaspace destroyed.
>>>>> Frame #7 shows that the signal (SEGV) handler was called while SpaceManager::~SpaceManager() was executing.
>>>>>
>>>>> -----------------------------------------------------------------------------
>>>>> (gdb) bt
>>>>> #0  0x00007f608146f1f7 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
>>>>> #1  0x00007f60814708e8 in __GI_abort () at abort.c:90
>>>>> #2  0x00007f6080d0bc39 in os::abort (dump_core=<optimized out>) at /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/os/linux/vm/os_linux.cpp:1519
>>>>> #3  0x00007f6080f1b816 in VMError::report_and_die (this=this@entry=0x7f607c3ebd10) at /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/share/vm/utilities/vmError.cpp:1060
>>>>> #4  0x00007f6080d15927 in JVM_handle_linux_signal (sig=11, info=0x7f607c3ebfb0, ucVoid=0x7f607c3ebe80, abort_if_unrecognized=<optimized out>)
>>>>>      at /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/os_cpu/linux_x86/vm/os_linux_x86.cpp:541
>>>>> #5  0x00007f6080d09038 in signalHandler (sig=11, info=0x7f607c3ebfb0, uc=0x7f607c3ebe80) at /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/os/linux/vm/os_linux.cpp:4446
>>>>> #6  <signal handler called>
>>>>> #7  SpaceManager::~SpaceManager (this=0x7f5ff800ad30, __in_chrg=<optimized out>) at /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/share/vm/memory/metaspace.cpp:2028
>>>>> #8  0x00007f6080c995fa in Metaspace::~Metaspace (this=0x7f5ff800ad00, __in_chrg=<optimized out>)
>>>>>      at /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/share/vm/memory/metaspace.cpp:2971
>>>>> #9  0x00007f60808bff5e in ClassLoaderData::~ClassLoaderData (this=0x7f5ff800ac20, __in_chrg=<optimized out>)
>>>>>      at /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/share/vm/classfile/classLoaderData.cpp:383
>>>>> #10 0x00007f60808c0f0f in ClassLoaderDataGraph::purge () at /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/share/vm/classfile/classLoaderData.cpp:818
>>>>> #11 0x00007f6080daa0b7 in ClassLoaderDataGraph::purge_if_needed () at /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/share/vm/classfile/classLoaderData.hpp:104
>>>>> #12 SafepointSynchronize::do_cleanup_tasks () at /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/share/vm/runtime/safepoint.cpp:551
>>>>> #13 0x00007f6080daa9d5 in SafepointSynchronize::begin () at /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/share/vm/runtime/safepoint.cpp:402
>>>>> #14 0x00007f6080f2048a in VMThread::loop (this=this@entry=0x7f6078c00000) at /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/share/vm/runtime/vmThread.cpp:501
>>>>> #15 0x00007f6080f208f1 in VMThread::run (this=0x7f6078c00000) at /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/share/vm/runtime/vmThread.cpp:276
>>>>> #16 0x00007f6080d0ab82 in java_start (thread=0x7f6078c00000) at /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/os/linux/vm/os_linux.cpp:796
>>>>> #17 0x00007f6081e2de25 in start_thread (arg=0x7f607c3ed700) at pthread_create.c:308
>>>>> #18 0x00007f608153234d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
>>>>> -----------------------------------------------------------------------------
>>>>>
>>>>>
>>>>> In frame #7, line 2028 (chunk = chunk->next()) is the crash point.
>>>>> The variable "chunk" is defined at line 2025 (Metachunk* chunk = chunks_in_use(i);).
>>>>> "chunks_in_use(i)" is defined at line 648 (Metachunk* chunks_in_use(ChunkIndex index) const { return _chunks_in_use[index]; }).
>>>>> So I checked the values of "_chunks_in_use" and found that "_chunks_in_use[2]" holds the illegal address "0x10".
>>>>> Therefore, I think the SEGV occurred because "chunk = chunk->next()" dereferenced the illegal address "0x10".
>>>>>
>>>>> -----------------------------------------------------------------------------
>>>>> (gdb) f 7
>>>>> #7  SpaceManager::~SpaceManager (this=0x7f5ff800ad30, __in_chrg=<optimized out>) at /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/share/vm/memory/metaspace.cpp:2028
>>>>> 2028        chunk = chunk->next();
>>>>> (gdb) list
>>>>> 2023    size_t SpaceManager::sum_count_in_chunks_in_use(ChunkIndex i) {
>>>>> 2024      size_t count = 0;
>>>>> 2025      Metachunk* chunk = chunks_in_use(i);
>>>>> 2026      while (chunk != NULL) {
>>>>> 2027        count++;
>>>>> 2028        chunk = chunk->next();
>>>>> 2029      }
>>>>> 2030      return count;
>>>>> 2031    }
>>>>> 2032
>>>>> (gdb) list SpaceManager::chunks_in_use
>>>>> 647      // Accessors
>>>>> 648      Metachunk* chunks_in_use(ChunkIndex index) const { return _chunks_in_use[index]; }
>>>>> ...
>>>>> (gdb) p _chunks_in_use
>>>>> $11 = {0x7f5fcd41c400, 0x7f5fcd41a000, 0x10, 0x0}
>>>>> -----------------------------------------------------------------------------
>>>>>
>>>>>
>>>>>
>>>>> The following is the disassembly of "SpaceManager::~SpaceManager()".
>>>>> %rax holds 0x10 at "0x00007f6080c97f88 <+200>", but I don't understand how this "0x10" ended up in %rax.
>>>>>
>>>>> -----------------------------------------------------------------------------
>>>>> (gdb) disas
>>>>> Dump of assembler code for function SpaceManager::~SpaceManager():
>>>>>    0x00007f6080c97ec0 <+0>:     push   %rbp
>>>>>    0x00007f6080c97ec1 <+1>:     mov    %rsp,%rbp
>>>>>    0x00007f6080c97ec4 <+4>:     push   %r15
>>>>>    0x00007f6080c97ec6 <+6>:     push   %r14
>>>>>    0x00007f6080c97ec8 <+8>:     push   %r13
>>>>>    0x00007f6080c97eca <+10>:    push   %r12
>>>>>    0x00007f6080c97ecc <+12>:    push   %rbx
>>>>>    0x00007f6080c97ecd <+13>:    mov    %rdi,%rbx
>>>>>    0x00007f6080c97ed0 <+16>:    sub    $0x8,%rsp
>>>>>    0x00007f6080c97ed4 <+20>:    mov    0x780785(%rip),%r12        # 0x7f6081418660 <_ZN12SpaceManager12_expand_lockE>
>>>>>    0x00007f6080c97edb <+27>:    test   %r12,%r12
>>>>>    0x00007f6080c97ede <+30>:    je     0x7f6080c97ee8 <SpaceManager::~SpaceManager()+40>
>>>>>    0x00007f6080c97ee0 <+32>:    mov    %r12,%rdi
>>>>>    0x00007f6080c97ee3 <+35>:    callq  0x7f6080cce2f0 <Monitor::lock_without_safepoint_check()>
>>>>>    0x00007f6080c97ee8 <+40>:    movslq 0x8(%rbx),%rcx
>>>>>    0x00007f6080c97eec <+44>:    lea    0x78075d(%rip),%rdx        # 0x7f6081418650 <_ZN12MetaspaceAux15_capacity_wordsE>
>>>>>    0x00007f6080c97ef3 <+51>:    lea    0x781576(%rip),%r13        # 0x7f6081419470 <_ZN2os16_processor_countE>
>>>>>    0x00007f6080c97efa <+58>:    lea    0x78073f(%rip),%r15        # 0x7f6081418640 <_ZN12MetaspaceAux11_used_wordsE>
>>>>>    0x00007f6080c97f01 <+65>:    mov    (%rdx,%rcx,8),%rax
>>>>>    0x00007f6080c97f05 <+69>:    sub    0x40(%rbx),%rax
>>>>>    0x00007f6080c97f09 <+73>:    mov    %rax,(%rdx,%rcx,8)
>>>>>    0x00007f6080c97f0d <+77>:    mov    0x38(%rbx),%rax
>>>>>    0x00007f6080c97f11 <+81>:    movslq 0x8(%rbx),%rdx
>>>>>    0x00007f6080c97f15 <+85>:    neg    %rax
>>>>>    0x00007f6080c97f18 <+88>:    cmpl   $0x1,0x0(%r13)
>>>>>    0x00007f6080c97f1d <+93>:    lea    (%r15,%rdx,8),%rcx
>>>>>    0x00007f6080c97f21 <+97>:    mov    $0x1,%edx
>>>>>    0x00007f6080c97f26 <+102>:   jne    0x7f6080c97f32 <SpaceManager::~SpaceManager()+114>
>>>>>    0x00007f6080c97f28 <+104>:   lea    0x74acb4(%rip),%rdx        # 0x7f60813e2be3 <AssumeMP>
>>>>>    0x00007f6080c97f2f <+111>:   movzbl (%rdx),%edx
>>>>>    0x00007f6080c97f32 <+114>:   cmp    $0x0,%dl
>>>>>    0x00007f6080c97f35 <+117>:   je     0x7f6080c97f38 <SpaceManager::~SpaceManager()+120>
>>>>>    0x00007f6080c97f37 <+119>:   lock xadd %rax,(%rcx)
>>>>>    0x00007f6080c97f3c <+124>:   mov    0x48(%rbx),%r14
>>>>>    0x00007f6080c97f40 <+128>:   callq  0x7f6080c951a0 <Metachunk::overhead()>
>>>>>    0x00007f6080c97f45 <+133>:   movslq 0x8(%rbx),%rdx
>>>>>    0x00007f6080c97f49 <+137>:   imul   %r14,%rax
>>>>>    0x00007f6080c97f4d <+141>:   lea    (%r15,%rdx,8),%rcx
>>>>>    0x00007f6080c97f51 <+145>:   mov    $0x1,%edx
>>>>>    0x00007f6080c97f56 <+150>:   neg    %rax
>>>>>    0x00007f6080c97f59 <+153>:   cmpl   $0x1,0x0(%r13)
>>>>>    0x00007f6080c97f5e <+158>:   jne    0x7f6080c97f6a <SpaceManager::~SpaceManager()+170>
>>>>>    0x00007f6080c97f60 <+160>:   lea    0x74ac7c(%rip),%rdx        # 0x7f60813e2be3 <AssumeMP>
>>>>>    0x00007f6080c97f67 <+167>:   movzbl (%rdx),%edx
>>>>>    0x00007f6080c97f6a <+170>:   cmp    $0x0,%dl
>>>>>    0x00007f6080c97f6d <+173>:   je     0x7f6080c97f70 <SpaceManager::~SpaceManager()+176>
>>>>>    0x00007f6080c97f6f <+175>:   lock xadd %rax,(%rcx)
>>>>>    0x00007f6080c97f74 <+180>:   xor    %ecx,%ecx
>>>>>    0x00007f6080c97f76 <+182>:   xor    %esi,%esi
>>>>>    0x00007f6080c97f78 <+184>:   mov    0x10(%rbx,%rcx,1),%rax
>>>>>    0x00007f6080c97f7d <+189>:   xor    %edx,%edx
>>>>>    0x00007f6080c97f7f <+191>:   test   %rax,%rax
>>>>>    0x00007f6080c97f82 <+194>:   je     0x7f6080c97f95 <SpaceManager::~SpaceManager()+213>
>>>>>    0x00007f6080c97f84 <+196>:   nopl   0x0(%rax)
>>>>> => 0x00007f6080c97f88 <+200>:   mov    0x8(%rax),%rax
>>>>>    0x00007f6080c97f8c <+204>:   add    $0x1,%rdx
>>>>>    0x00007f6080c97f90 <+208>:   test   %rax,%rax
>>>>> ...
>>>>> (gdb) info registers
>>>>> rax            0x10    16
>>>>> rbx            0x7f5ff800ad30    140050159414576
>>>>> rcx            0x10    16
>>>>> rdx            0x0    0
>>>>> rsi            0x2    2
>>>>> rdi            0x1cfe570    30401904
>>>>> rbp            0x7f607c3ecb80    0x7f607c3ecb80
>>>>> rsp            0x7f607c3ecb50    0x7f607c3ecb50
>>>>> r8             0x7f5ff80ae320    140050160083744
>>>>> r9             0x7f5ff8052480    140050159707264
>>>>> r10            0x0    0
>>>>> r11            0x400    1024
>>>>> r12            0x1cfe570    30401904
>>>>> r13            0x7f6081419470    140052462146672
>>>>> r14            0x2    2
>>>>> r15            0x7f6081418640    140052462143040
>>>>> rip            0x7f6080c97f88    0x7f6080c97f88 <SpaceManager::~SpaceManager()+200>
>>>>> eflags         0x206    [ PF IF ]
>>>>> cs             0x33    51
>>>>> ss             0x2b    43
>>>>> ds             0x0    0
>>>>> es             0x0    0
>>>>> fs             0x0    0
>>>>> gs             0x0    0
>>>>> k0             <unavailable>
>>>>> k1             <unavailable>
>>>>> k2             <unavailable>
>>>>> k3             <unavailable>
>>>>> k4             <unavailable>
>>>>> k5             <unavailable>
>>>>> k6             <unavailable>
>>>>> k7             <unavailable>
>>>>> -----------------------------------------------------------------------------
>>>>>
>>>>> =============================================================================
>>>>>
>>>>>
>>>>>
>>>>> Does anyone know about this case?
>>>>>
>>>>> Thanks, Osamu
>>>>>
>>>>>
>>>
> 


From sangheon.kim at oracle.com  Fri Oct 25 14:02:23 2019
From: sangheon.kim at oracle.com (sangheon.kim at oracle.com)
Date: Fri, 25 Oct 2019 07:02:23 -0700
Subject: RFR(M): 8220311: Implementation: NUMA-Aware Memory Allocation for
 G1, Survivor (2/3)
In-Reply-To: <01a9ebcf-34ed-06b2-2da8-18d84feae858@oracle.com>
References: <d153b49b-fbb2-0d73-37e0-ff1534a83086@oracle.com>
 <de0f8e9a-27d8-eaf3-99a7-7b57e1b419c1@oracle.com>
 <9a78e353-7908-b546-8f6a-7acd92eb40ac@oracle.com>
 <D4FAC1BE-C707-48DA-979F-03502E2651F5@oracle.com>
 <846eb849-8a49-5872-73d7-6bbc8f98369c@oracle.com>
 <56788E04-DC92-461F-B3A7-DEEBC524DB5B@oracle.com>
 <3fe39096-43cb-4828-c042-0fc976a0307a@oracle.com>
 <01a9ebcf-34ed-06b2-2da8-18d84feae858@oracle.com>
Message-ID: <196b55d5-01f4-0202-effb-4495ae409df0@oracle.com>

Hi Stefan,

On 10/23/19 1:47 AM, Stefan Johansson wrote:
> Hi Sangheon,
>
> On 2019-10-22 18:47, sangheon.kim at oracle.com wrote:
>> Hi Kim,
>>
>> On 10/22/19 12:19 AM, Kim Barrett wrote:
>>>> On Oct 22, 2019, at 1:52 AM, sangheon.kim at oracle.com wrote:
>>>> What do you think about below comment?
>>>>
>>>>    // Tries to allocate word_sz in the PLAB of the next
>>>> "generation" after trying to
>>>>    // allocate into dest. Previous_plab_refill_failed indicates
>>>> whether previous
>>>>    // PLAB refill for the original (source) object was failed.
>>> Drop "was".  Otherwise looks good.
>> Done.
>>
>> Webrev:
>> http://cr.openjdk.java.net/~sangheki/8220311/webrev.3
>> http://cr.openjdk.java.net/~sangheki/8220311/webrev.3.inc
> Looks good in general, just one minor thing, no need for a new webrev 
> though:
> src/hotspot/share/gc/g1/g1Allocator.cpp
> ---
> 144   for (uint nodex_index = 0; nodex_index < _num_alloc_regions;
> nodex_index++) {
>
> The name nodex_index has one too many x:es =) I would prefer node_index.
Ouch!
Fixed..

In addition, Stefan, Thomas and I discussed making PLABs NUMA-aware
(only for survivor).
Stefan provided a patch for it, and it is simple enough to include under
this CR.

Webrev:
http://cr.openjdk.java.net/~sangheki/8220311/webrev.4
http://cr.openjdk.java.net/~sangheki/8220311/webrev.4.inc

Testing: hs-tier 1 ~ 3, with/without UseNUMA

Thanks,
Sangheon


> ---
>
> Thanks,
> Stefan
>
>>
>> Thanks,
>> Sangheon
>>
>>
>>>
>>>>    // Returns a non-NULL pointer if successful, and updates dest if
>>>> required.
>>>>    // Also determines whether we should continue to try to allocate
>>>> into the various
>>>>    // generations or just end trying to allocate.
>>>>    HeapWord* allocate_in_next_plab(G1HeapRegionAttr* dest,
>>>> ...
>>>>
>>>> Let me post the webrev when we decide. :)
>>>>
>>>> Thanks,
>>>> Sangheon
>>>>
>>>>
>>>>> ------------------------------------------------------------------------------ 
>>>>>
>>>>>
>>>>> Looks good, other than that one comment issue.
>>>
>>


From zgu at redhat.com  Fri Oct 25 14:29:08 2019
From: zgu at redhat.com (Zhengyu Gu)
Date: Fri, 25 Oct 2019 10:29:08 -0400
Subject: RFR 8232992: Shenandoah: Implement self-fixing interpreter LRB
Message-ID: <1648aef7-6df9-6f54-6601-fde9d7251187@redhat.com>

Please review this patch that implements self-fixing interpreter LRB.


Bug: https://bugs.openjdk.java.net/browse/JDK-8232992
Webrev: http://cr.openjdk.java.net/~zgu/JDK-8232992/webrev.00/

Test:
   hotspot_gc_shenandoah (fastdebug and release)
   x86_64 and x86_32 on Linux
   aarch64 Linux
   Windows x86_64

Thanks,

-Zhengyu


From shade at redhat.com  Fri Oct 25 14:48:17 2019
From: shade at redhat.com (Aleksey Shipilev)
Date: Fri, 25 Oct 2019 16:48:17 +0200
Subject: RFR 8232992: Shenandoah: Implement self-fixing interpreter LRB
In-Reply-To: <1648aef7-6df9-6f54-6601-fde9d7251187@redhat.com>
References: <1648aef7-6df9-6f54-6601-fde9d7251187@redhat.com>
Message-ID: <6ba6da68-c48f-24b1-7ca3-d2bd8a46c4b8@redhat.com>

On 10/25/19 4:29 PM, Zhengyu Gu wrote:
> Please review this patch that implements self-fixing interpreter LRB.
> 
> 
> Bug: https://bugs.openjdk.java.net/browse/JDK-8232992
> Webrev: http://cr.openjdk.java.net/~zgu/JDK-8232992/webrev.00/

*) I believe we can drop "fixup" from ShenandoahRuntime::load_reference_barrier_fixup(_narrow), as
there are no non-fixup versions left.

*) shenandoahBarrierSetAssembler_x86.cpp nit: space before parenthesis

 346   if(setup_addr_first) {

*) shenandoahBarrierSetAssembler_x86.cpp: why is pop(thread) in the middle here?

 315   __ testb(gc_state, ShenandoahHeap::HAS_FORWARDED);
 316 #ifndef _LP64
 317   __ pop(thread);
 318 #endif
 319   __ jccb(Assembler::zero, done);

*) shenandoahBarrierSetAssembler_x86.cpp: seems to me it is cleaner to initialize the boolean
variable first, and then use it. Also, suggestion for name: "need_addr_setup".

    // Use rsi for src address
    const Register src_addr = rsi;
    bool need_addr_setup = (src_addr != dst);

    if (need_addr_setup) {
      ...
    } else {
      ...
    }

    __ call(RuntimeAddress(CAST_FROM_FN_PTR(...);

    if (need_addr_setup) {
      ...

*) shenandoahBarrierSetAssembler_x86.cpp, shenandoahBarrierSetAssembler_aarch64.cpp: since this code
now uses rscratch1, it has to assert that registers do not clash. For example with:

   assert_different_registers(dst, rscratch1, rscratch2);

*) shenandoahBarrierSetC2.cpp: this change looks like a bug fix for matching _narrow. Please RFR it
separately, it should go in sooner.
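
On the register-clash point above: a minimal stand-alone sketch of what such an assertion
checks, namely that no two registers in a set are the same. The Reg type and the
all_different() helper are hypothetical names for illustration only; this is not HotSpot's
assert_different_registers, just the same idea in plain C++.

```cpp
#include <cassert>
#include <initializer_list>

// Hypothetical register handle, identified only by its encoding.
// HotSpot's real Register type is different.
struct Reg {
  int encoding;
};

// Returns true when no two registers in the list share an encoding.
// Pairwise comparison is fine here because the lists are tiny.
static bool all_different(std::initializer_list<Reg> regs) {
  for (const Reg* a = regs.begin(); a != regs.end(); ++a) {
    for (const Reg* b = a + 1; b != regs.end(); ++b) {
      if (a->encoding == b->encoding) {
        return false;
      }
    }
  }
  return true;
}
```

A caller would write something like assert(all_different({dst, scratch1, scratch2})) before
emitting code that clobbers the scratch registers.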

-- 
Thanks,
-Aleksey


From shade at redhat.com  Fri Oct 25 15:24:03 2019
From: shade at redhat.com (Aleksey Shipilev)
Date: Fri, 25 Oct 2019 17:24:03 +0200
Subject: RFR (XS) 8233021: Shenandoah: SBSC2::is_shenandoah_lrb_call should
 match all LRB shapes
Message-ID: <e82661f9-65ad-ecf0-01b1-fbcf53a833eb@redhat.com>

Bug:
  https://bugs.openjdk.java.net/browse/JDK-8233021

See bug for explanation.

Fix:
  http://cr.openjdk.java.net/~shade/8233021/webrev.01/

Testing: hotspot_gc_shenandoah; specjvm2008 with C2 verification turned on

-- 
Thanks,
-Aleksey


From sangheon.kim at oracle.com  Fri Oct 25 21:56:30 2019
From: sangheon.kim at oracle.com (sangheon.kim at oracle.com)
Date: Fri, 25 Oct 2019 14:56:30 -0700
Subject: RFR(XL): 8220310: Implementation: NUMA-Aware Memory Allocation
 for G1, Mutator (1/3)
In-Reply-To: <0A9D98F3-479D-421D-A5E0-0AB8BB203717@oracle.com>
References: <e4c60a5c-cb08-004a-ce77-e4d20d4d6891@oracle.com>
 <06ACBF87-ADBE-499F-B668-0274E4925B26@oracle.com>
 <2b37edd6-3e0f-013d-1616-9d003f8ac1ed@oracle.com>
 <74ACAF31-8233-482A-892E-0D2E7CA72F4F@oracle.com>
 <4afe9f43-4cfa-9384-f45f-f985399629dd@oracle.com>
 <CD6BDB8C-7777-4872-BCC0-CDFB4978F876@oracle.com>
 <d8758f38-4818-e6b7-c158-118974f0ff1c@oracle.com>
 <CB368332-D86B-40D2-B152-43B726938DD2@oracle.com>
 <77f6c57a-65a6-2727-cbe9-fbc1ed52a015@oracle.com>
 <b3b70e9d-5be9-b069-b631-5733f157c9eb@oracle.com>
 <7C1985BF-A769-49FB-A658-E1B1060B5897@oracle.com>
 <3F549477-A2DF-42CF-A0E5-586F78BBCC47@oracle.com>
 <f348053c-ef9b-df03-ae17-393ace99182b@oracle.com>
 <AB6E68AA-F9AC-4E62-9CF5-1886C469A702@oracle.com>
 <9219a118-0c1d-2cee-10e5-f9bb87c72eb9@oracle.com>
 <f6c2bffa-b3e6-dda1-e453-5b01a7214c4d@oracle.com>
 <521b3b8a-70e6-6fef-cb67-b6327fa08c03@oracle.com>
 <0A9D98F3-479D-421D-A5E0-0AB8BB203717@oracle.com>
Message-ID: <1615ad5b-6be7-7e7d-6815-68cfc338fd6f@oracle.com>

Hi Kim,

On 10/24/19 4:05 PM, Kim Barrett wrote:
>> On Oct 23, 2019, at 12:20 PM, sangheon.kim at oracle.com wrote:
>>
>> Hi Per,
>>
>> Thanks for taking a look at this.
>>
>> I agree all your comments and here's the webrev.
>> - All comments from Per.
>> - Move G1PageBasedVirtualSpace::page_size() near to page_start() from Kim.
>>
>> Webrev:
>> http://cr.openjdk.java.net/~sangheki/8220310/webrev.6
>> http://cr.openjdk.java.net/~sangheki/8220310/webrev.6.inc
>> Testing: build test for linux, solaris, windows and mac.
>>
>> FYI, as I think existing numa related API names and -1 stuff seem not good, I planned to refine those later after pushing. But as you said following existing rule and then refine all together later seems better.
> The type of the argument for numa_get_group_id(void* address) should
> be "const void*".  Sorry I didn't notice that earlier.  Of course,
> this will require a const_cast to remove the const qualifier when
> calling get_mempolicy, but it is better to isolate the workaround for
> that missing qualifier to that one place.
>
> I'm not sure I like the overload for os::numa_get_group_id.  While
> both are getting the numa id associated with something, the associations
> involved seem pretty different to me.
>
> Spelling them out, they could be
>
> numa_get_group_id_for_current_thread()
> numa_get_group_id_for_address(const void* address)
>
> Those seem semantically unrelated to me, so violate the usual guidance
> of only overloading operations that are roughly equivalent (*).  Or put
> another way, one should not need to determine which overload is selected
> to understand a call site.
>
> Of course, "roughly equivalent" is in the eye of the beholder.
>
> (*) Operator overloading sometimes violates this on the basis that the
> syntactic concision of using operators is more important, and there
> are a limited set of operators.  Such violations are often used as an
> argument against using operator overloading at all.
I think the overload looks okay to me.
But as you are not sure about it, I renamed the newly added one.

- static int numa_get_group_id(void* address);
+ static int numa_get_group_id_for_address(const void* address);
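
To illustrate the two points above (distinct names instead of an overload, and keeping the
const_cast in one place), here is a minimal sketch. It assumes a legacy C-style function
whose pointer parameter lacks const; legacy_lookup_node() is a hypothetical stand-in and
not the real get_mempolicy call, and the function bodies are placeholders.

```cpp
#include <cassert>

// Hypothetical stand-in for a legacy API that only reads the address but
// declares its parameter as plain void*. NOT the real system call.
static int legacy_lookup_node(void* addr) {
  return addr == nullptr ? -1 : 0;
}

// Placeholder: a real version would ask the OS which node the calling
// thread is running on.
static int numa_get_group_id_for_current_thread() {
  return 0;
}

// Callers pass const void*; the missing qualifier in the legacy API is
// worked around here and nowhere else.
static int numa_get_group_id_for_address(const void* address) {
  return legacy_lookup_node(const_cast<void*>(address));
}
```

With separate names, a call site says which association it asks for without the reader
having to work out which overload was selected.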


webrev:
http://cr.openjdk.java.net/~sangheki/8220310/webrev.7
http://cr.openjdk.java.net/~sangheki/8220310/webrev.7.inc

Testing: hs-tier1

Thanks,
Sangheon


>


From zgu at redhat.com  Sat Oct 26 00:34:38 2019
From: zgu at redhat.com (Zhengyu Gu)
Date: Fri, 25 Oct 2019 20:34:38 -0400
Subject: RFR 8232992: Shenandoah: Implement self-fixing interpreter LRB
In-Reply-To: <6ba6da68-c48f-24b1-7ca3-d2bd8a46c4b8@redhat.com>
References: <1648aef7-6df9-6f54-6601-fde9d7251187@redhat.com>
 <6ba6da68-c48f-24b1-7ca3-d2bd8a46c4b8@redhat.com>
Message-ID: <942c5c5d-fa2b-e14b-3319-0092d782da24@redhat.com>


On 10/25/19 10:48 AM, Aleksey Shipilev wrote:
> On 10/25/19 4:29 PM, Zhengyu Gu wrote:
>> Please review this patch that implements self-fixing interpreter LRB.
>>
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8232992
>> Webrev: http://cr.openjdk.java.net/~zgu/JDK-8232992/webrev.00/
> 
> *) I believe we can drop "fixup" from ShenandoahRuntime::load_reference_barrier_fixup(_narrow), as
> there are no non-fixup versions left.

Sure.

> 
> *) shenandoahBarrierSetAssembler_x86.cpp nit: space before parenthesis
> 
>   346   if(setup_addr_first) {

Fixed

> 
> *) shenandoahBarrierSetAssembler_x86.cpp: why is pop(thread) in the middle here?
> 
>   315   __ testb(gc_state, ShenandoahHeap::HAS_FORWARDED);
>   316 #ifndef _LP64
>   317   __ pop(thread);
>   318 #endif
>   319   __ jccb(Assembler::zero, done);
> 

I was worried about having to track 'thread', so I just popped it after
use and forgot about it.

But, yes, there is nothing to worry about, so I reverted it.

> *) shenandoahBarrierSetAssembler_x86.cpp: seems to me it is cleaner to initialize the boolean
> variable first, and then use it. Also, suggestion for name: "need_addr_setup".
> 
>      // Use rsi for src address
>      const Register src_addr = rsi;
>      bool need_addr_setup = (src_addr != dst);
> 
>      if (need_addr_setup) {
>        ...
>      } else {
>        ...
>      }
> 
>      __ call(RuntimeAddress(CAST_FROM_FN_PTR(...);
> 
>      if (need_addr_setup) {
>        ...
> 
Fixed

> *) shenandoahBarrierSetAssembler_x86.cpp, shenandoahBarrierSetAssembler_aarch64.cpp: since this code
> now uses rscratch1, it has to assert that registers do not clash. For example with:
> 
>     assert_different_registers(dst, rscratch1, rscratch2);

We only need to use rscratch1 when dst == r1, and it is possible that
dst comes in as rscratch1 (see the SBSA::load_at() method), so I think
the current assertion (dst != rscratch2) is sufficient.

However, we do need to ensure that the scratch registers are not used by
load_addr, so I added:

   assert_different_registers(load_addr.base(), load_addr.index(), rscratch1);
   assert_different_registers(load_addr.base(), load_addr.index(), rscratch2);


Updated: http://cr.openjdk.java.net/~zgu/JDK-8232992/webrev.01/

Reran hotspot_gc_shenandoah tests (fastdebug and release)
   x86_64 and x86_32 on Linux
   aarch64 on Linux

Thanks,

-Zhengyu

> 
> *) shenandoahBarrierSetC2.cpp: this change looks like a bug fix for matching _narrow. Please RFR it
> separately, it should go in sooner.
> 


From kim.barrett at oracle.com  Sat Oct 26 01:51:43 2019
From: kim.barrett at oracle.com (Kim Barrett)
Date: Fri, 25 Oct 2019 21:51:43 -0400
Subject: RFR (XS): 8232951: TestG1ParallelPhases.java fails with phase
 NonYoungFreeCSet not found
In-Reply-To: <27fa21ab-d8ac-95ea-3485-7a72116c22f2@oracle.com>
References: <27fa21ab-d8ac-95ea-3485-7a72116c22f2@oracle.com>
Message-ID: <951EB603-F273-4787-9D8D-32D8194ECFAA@oracle.com>

> On Oct 24, 2019, at 7:50 AM, Thomas Schatzl <thomas.schatzl at oracle.com> wrote:
> [...]
> CR:
> https://bugs.openjdk.java.net/browse/JDK-8232951
> Webrev:
> http://cr.openjdk.java.net/~tschatzl/8232951/webrev/
> Testing:
> 400 runs of the changed test without issues
> 
> Thanks,
>  Thomas

I'd not previously noticed the AlwaysTenure and NeverTenure options.
So many options...

Those options are documented as being ParallelGC only.  But it looks
like setting either of them forces a value for MaxTenuringThreshold,
so it seems okay to change the test to use AlwaysTenure.  The
documentation for the options should be updated though.  (That can be
a separate RFE.)

Please put the new -Xlog option on a separate line.  I know we don't
have an official line length limit, but 152 chars seems excessive to
me, and forced me to scroll to see some of it.

Other than that, looks good.  I don't need a new webrev.


From per.liden at oracle.com  Sat Oct 26 08:36:44 2019
From: per.liden at oracle.com (Per Liden)
Date: Sat, 26 Oct 2019 10:36:44 +0200
Subject: RFR(XL): 8220310: Implementation: NUMA-Aware Memory Allocation
 for G1, Mutator (1/3)
In-Reply-To: <1615ad5b-6be7-7e7d-6815-68cfc338fd6f@oracle.com>
References: <e4c60a5c-cb08-004a-ce77-e4d20d4d6891@oracle.com>
 <06ACBF87-ADBE-499F-B668-0274E4925B26@oracle.com>
 <2b37edd6-3e0f-013d-1616-9d003f8ac1ed@oracle.com>
 <74ACAF31-8233-482A-892E-0D2E7CA72F4F@oracle.com>
 <4afe9f43-4cfa-9384-f45f-f985399629dd@oracle.com>
 <CD6BDB8C-7777-4872-BCC0-CDFB4978F876@oracle.com>
 <d8758f38-4818-e6b7-c158-118974f0ff1c@oracle.com>
 <CB368332-D86B-40D2-B152-43B726938DD2@oracle.com>
 <77f6c57a-65a6-2727-cbe9-fbc1ed52a015@oracle.com>
 <b3b70e9d-5be9-b069-b631-5733f157c9eb@oracle.com>
 <7C1985BF-A769-49FB-A658-E1B1060B5897@oracle.com>
 <3F549477-A2DF-42CF-A0E5-586F78BBCC47@oracle.com>
 <f348053c-ef9b-df03-ae17-393ace99182b@oracle.com>
 <AB6E68AA-F9AC-4E62-9CF5-1886C469A702@oracle.com>
 <9219a118-0c1d-2cee-10e5-f9bb87c72eb9@oracle.com>
 <f6c2bffa-b3e6-dda1-e453-5b01a7214c4d@oracle.com>
 <521b3b8a-70e6-6fef-cb67-b6327fa08c03@oracle.com>
 <0A9D98F3-479D-421D-A5E0-0AB8BB203717@oracle.com>
 <1615ad5b-6be7-7e7d-6815-68cfc338fd6f@oracle.com>
Message-ID: <9d9494cd-82cd-6cf6-94e6-432a6ae187fb@oracle.com>

On 10/25/19 11:56 PM, sangheon.kim at oracle.com wrote:
> Hi Kim,
> 
> On 10/24/19 4:05 PM, Kim Barrett wrote:
>>> On Oct 23, 2019, at 12:20 PM,sangheon.kim at oracle.com  wrote:
>>>
>>> Hi Per,
>>>
>>> Thanks for taking a look at this.
>>>
>>> I agree all your comments and here's the webrev.
>>> - All comments from Per.
>>> - Move G1PageBasedVirtualSpace::page_size() near to page_start() from Kim.
>>>
>>> Webrev:
>>> http://cr.openjdk.java.net/~sangheki/8220310/webrev.6
>>> http://cr.openjdk.java.net/~sangheki/8220310/webrev.6.inc
>>> Testing: build test for linux, solaris, windows and mac.
>>>
>>> FYI, as I think existing numa related API names and -1 stuff seem not good, I planned to refine those later after pushing. But as you said following existing rule and then refine all together later seems better.
>> The type of the argument for numa_get_group_id(void* address) should
>> be "const void*".  Sorry I didn't notice that earlier.  Of course,
>> this will require a const_cast to remove the const qualifier when
>> calling get_mempolicy, but it is better to isolate the workaround for
>> that missing qualifier to that one place.
>>
>> I'm not sure I like the overload for os::numa_get_group_id.  While
>> both are getting the numa id associated with something, the associations
>> involved seem pretty different to me.
>>
>> Spelling them out, they could be
>>
>> numa_get_group_id_for_current_thread()
>> numa_get_group_id_for_address(const void* address)
>>
>> Those seem semantically unrelated to me, so violate the usual guidance
>> of only overloading operations that are roughly equivalent (*).  Or put
>> another way, one should not need to determine which overload is selected
>> to understand a call site.
>>
>> Of course, "roughly equivalent" is in the eye of the beholder.
>>
>> (*) Operator overloading sometimes violates this on the basis that the
>> syntactic concision of using operators is more important, and there
>> are a limited set of operators.  Such violations are often used as an
>> argument against using operator overloading at all.
> I think the overload looks okay to me.
> But as you are not sure about it, I renamed the newly added one.
> 
> - static int numa_get_group_id(void* address);
> + static int numa_get_group_id_for_address(const void* address);
> 

Works for me.

/Per

> 
> webrev:
> http://cr.openjdk.java.net/~sangheki/8220310/webrev.7
> http://cr.openjdk.java.net/~sangheki/8220310/webrev.7.inc
> 
> Testing: hs-tier1
> 
> Thanks,
> Sangheon
> 
> 
> 


From kim.barrett at oracle.com  Sun Oct 27 22:02:44 2019
From: kim.barrett at oracle.com (Kim Barrett)
Date: Sun, 27 Oct 2019 18:02:44 -0400
Subject: RFR(XL): 8220310: Implementation: NUMA-Aware Memory Allocation
 for G1, Mutator (1/3)
In-Reply-To: <1615ad5b-6be7-7e7d-6815-68cfc338fd6f@oracle.com>
References: <e4c60a5c-cb08-004a-ce77-e4d20d4d6891@oracle.com>
 <06ACBF87-ADBE-499F-B668-0274E4925B26@oracle.com>
 <2b37edd6-3e0f-013d-1616-9d003f8ac1ed@oracle.com>
 <74ACAF31-8233-482A-892E-0D2E7CA72F4F@oracle.com>
 <4afe9f43-4cfa-9384-f45f-f985399629dd@oracle.com>
 <CD6BDB8C-7777-4872-BCC0-CDFB4978F876@oracle.com>
 <d8758f38-4818-e6b7-c158-118974f0ff1c@oracle.com>
 <CB368332-D86B-40D2-B152-43B726938DD2@oracle.com>
 <77f6c57a-65a6-2727-cbe9-fbc1ed52a015@oracle.com>
 <b3b70e9d-5be9-b069-b631-5733f157c9eb@oracle.com>
 <7C1985BF-A769-49FB-A658-E1B1060B5897@oracle.com>
 <3F549477-A2DF-42CF-A0E5-586F78BBCC47@oracle.com>
 <f348053c-ef9b-df03-ae17-393ace99182b@oracle.com>
 <AB6E68AA-F9AC-4E62-9CF5-1886C469A702@oracle.com>
 <9219a118-0c1d-2cee-10e5-f9bb87c72eb9@oracle.com>
 <f6c2bffa-b3e6-dda1-e453-5b01a7214c4d@oracle.com>
 <521b3b8a-70e6-6fef-cb67-b6327fa08c03@oracle.com>
 <0A9D98F3-479D-421D-A5E0-0AB8BB203717@oracle.com>
 <1615ad5b-6be7-7e7d-6815-68cfc338fd6f@oracle.com>
Message-ID: <BF628A14-BA07-4E7B-9942-FBC59587A13B@oracle.com>

> On Oct 25, 2019, at 5:56 PM, sangheon.kim at oracle.com wrote:
> 
> -  static int    numa_get_group_id(void* address);
> +  static int    numa_get_group_id_for_address(const void* address);
> 
> webrev:
> http://cr.openjdk.java.net/~sangheki/8220310/webrev.7
> http://cr.openjdk.java.net/~sangheki/8220310/webrev.7.inc
> 
> Testing: hs-tier1
> 
> Thanks,
> Sangheon

Looks good.


From sakamoto.osamu at nttcom.co.jp  Mon Oct 28 02:40:00 2019
From: sakamoto.osamu at nttcom.co.jp (Osamu Sakamoto)
Date: Mon, 28 Oct 2019 11:40:00 +0900
Subject: Segmentation Fault occurs when ClassLoader and Metaspace is
 released in JDK 8
In-Reply-To: <ea8fc9d7-bab4-a812-2ed7-19a722e298b1@oss.nttdata.com>
References: <fb308571-cbdf-1f4b-177f-aa6bac986a5f@nttcom.co.jp_1>
 <422c9ca2-5053-c761-cb61-f075877bb666@oss.nttdata.com>
 <314f9ad2-17df-1082-8816-7af73a96e9fb@nttcom.co.jp_1>
 <1ccb4f35-7f21-4aa2-4cbb-b75244b6d12d@oss.nttdata.com>
 <af915a45-9d20-c637-9ee8-ca76e3006967@nttcom.co.jp_1>
 <ea8fc9d7-bab4-a812-2ed7-19a722e298b1@oss.nttdata.com>
Message-ID: <121df7b1-b423-7790-5453-c14b545fa40b@nttcom.co.jp_1>

Hi Yasumasa,

 > It seems a bug.
 > Anyone have any suggestions about this?
I think so, too.
Does anyone know about this?


 > (CLD::_class_loader is not static member, so this command would be failed.)
I checked CLD::_class_loader after moving to frame 9
(ClassLoaderData::~ClassLoaderData), and it succeeded.
```
(gdb) f 9
#9  0x00007f60808bff5e in ClassLoaderData::~ClassLoaderData (this=0x7f5ff800ac20, __in_chrg=<optimized out>)
    at /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/share/vm/classfile/classLoaderData.cpp:383
383         delete m;
(gdb) p ClassLoaderData::_class_loader
$34 = (oop) 0xa3afc1f0
```

I rechecked _class_loader in purge_me, and the value is the same.
```
(gdb) f 10
#10 0x00007f60808c0f0f in ClassLoaderDataGraph::purge ()
    at /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/share/vm/classfile/classLoaderData.cpp:818
818         delete purge_me;
(gdb) p purge_me
$35 = (ClassLoaderData *) 0x7f5ff800ac20
(gdb) p purge_me->_class_loader
$36 = (oop) 0xa3afc1f0
```


 > I believe CLD::_class_loader should be the OOP for class loader.
 > I guess memory corruption was occurred in some reason - Is it a bug in HotSpot?
I checked ClassLoaderDataGraph::_head->_class_loader, too.
Its oop refers to a sun.reflect.DelegatingClassLoader instance.
So it seems that _class_loader in purge_me holds an invalid oop value.

* GDB
```
(gdb) p ClassLoaderDataGraph::_head->_class_loader
$37 = (oop) 0xb10f3aa8
```

* CLHSDB
```
hsdb> inspect 0xb10f3aa8
instance of Oop for sun/reflect/DelegatingClassLoader @ 0x00000000b10f3aa8 @ 0x00000000b10f3aa8 (size = 72)
_mark: 184567784057
_metadata._compressed_klass: InstanceKlass for sun/reflect/DelegatingClassLoader
parent: Oop for org/apache/catalina/loader/ParallelWebappClassLoader @ 0x0000000099c00000 Oop for org/apache/catalina/loader/ParallelWebappClassLoader @ 0x0000000099c00000
    :
```
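
On the manual _next traversal quoted below (about 500 steps without reaching NULL or a
repeated address): the usual way to decide this mechanically is a bounded walk with cycle
detection. The following is a minimal sketch using Floyd's tortoise-and-hare on a
hypothetical Node type; it is not HotSpot's ClassLoaderData, and against a core dump the
same logic would have to be expressed in a debugger script instead.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical node type linked through a "next" pointer.
struct Node {
  Node* next;
};

enum class WalkResult { Terminated, Cycle, LimitReached };

// Walks the list with a slow and a fast pointer. Reports whether the list
// ends in nullptr, loops back on itself, or exceeds the step limit.
// *steps receives the number of slow-pointer steps taken.
static WalkResult walk_list(const Node* head, std::size_t limit, std::size_t* steps) {
  const Node* slow = head;
  const Node* fast = head;
  *steps = 0;
  while (fast != nullptr && fast->next != nullptr) {
    if (*steps >= limit) {
      return WalkResult::LimitReached;
    }
    slow = slow->next;
    fast = fast->next->next;
    ++*steps;
    if (slow == fast) {
      return WalkResult::Cycle;
    }
  }
  return WalkResult::Terminated;
}
```

The step limit matters for a corrupted list, where the chain may be neither terminated nor
cyclic within any reasonable length.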


Thanks,
Osamu

On 10/25/19 21:20, Yasumasa Suenaga wrote:
> It seems a bug.
> Anyone have any suggestions about this?
>
>> (gdb) p ClassLoaderData::_class_loader
>> $21 = (oop) 0xa3afc1f0
>
> (CLD::_class_loader is not static member, so this command would be 
> failed.)
>
>> hsdb> inspect 0xa3afc1f0
>> instance of [C @ 0x00000000a3afc1f0 @ 0x00000000a3afc1f0 (size = 72)
>> _mark: 1
>> _metadata._compressed_klass: TypeArrayKlass for [C
>> 0: 'c'
>
> I believe CLD::_class_loader should be the OOP for class loader.
> I guess memory corruption was occurred in some reason - Is it a bug in 
> HotSpot?
>
> I checked 8u222 on Fedora 30, my guess seems correct.
>
> * GDB
> ```
> (gdb) p ClassLoaderDataGraph::_head->_class_loader
> $2 = (oop) 0xd67d0900
> ```
>
> * CLHSDB
> ```
> hsdb> inspect 0xd67d0900
> instance of Oop for sun/misc/Launcher$AppClassLoader @ 
> 0x00000000d67d0900 @ 0x00000000d67d0900 (size = 96)
> _mark: 436443282689
> _metadata._compressed_klass: InstanceKlass for 
> sun/misc/Launcher$AppClassLoader
> parent: Oop for sun/misc/Launcher$ExtClassLoader @ 0x00000000d67bb348 
> Oop for sun/misc/Launcher$ExtClassLoader @ 0x00000000d67bb348
>         :
> ```
>
>
> Yasumasa
>
>
> On 2019/10/25 17:53, Osamu Sakamoto wrote:
>> Hi Yasumasa,
>>
>>
>>  > I guess this is a bug in combination of Metaspace and CMS.
>>  > However current jdk/jdk has different implementation, so it might
>> not be occur in modern JDK.
>>  > I want to hear the comments from others.
>> Thank you for your comment.
>> I want to hear from others, too
>>
>>
>>  > AFAICS you cannot find head of _unloading at this point.
>>  > However you can traverse CLD list with purge_me->_next .
>> Thank you for telling me how to traverse the CLD list.
>> I could start to traverse the CLD list, but this list is too long to
>> traverse manually.
>> I repeatedly checked _next -> _next -> _next ... about 500 times with the
>> GDB print command, but no NULL termination or address loop has been found yet.
>> I'll try to find a good way to traverse the CLD list to the end.
>>
>>
>>  > BTW, CLD has OOP for class loader in ClassLoaderData::_class_loader .
>>  > If you check it on (CL)HSDB, you might get any hints from it.
>>  > For example, use system class loader instead of custom class
>> loader from framework.
>> I checked the CLD oop, but I don't understand what type of ClassLoader it is.
>> The result is below.
>> It looks like this ClassLoaderData::_class_loader oop refers to a
>> character array.
>> Is that normal?
>> If so, what is this class loader? (Bootstrap ClassLoader?)
>>
>> ---------------------------------------------------
>> (gdb) p ClassLoaderData::_class_loader
>> $21 = (oop) 0xa3afc1f0
>>
>> hsdb> inspect 0xa3afc1f0
>> instance of [C @ 0x00000000a3afc1f0 @ 0x00000000a3afc1f0 (size = 72)
>> _mark: 1
>> _metadata._compressed_klass: TypeArrayKlass for [C
>> 0: 'c'
>> 1: 'o'
>> 2: 'l'
>> 3: 'u'
>> 4: 'm'
>> 5: 'n'
>> 6: '1'
>> 7: '5'
>> 8: '6'
>> 9: '5'
>> 10: '7'
>> 11: '5'
>> 12: '5'
>> 13: '9'
>> 14: '8'
>> 15: '6'
>> 16: '3'
>> 17: '3'
>> 18: '1'
>> 19: '_'
>> 20: '8'
>> 21: '0'
>> 22: '0'
>> 23: '3'
>> ---------------------------------------------------
>>
>>
>> Thanks,
>>
>> Osamu
>>
>>
>> On 10/24/19 09:49, Yasumasa Suenaga wrote:
>>> Hi Osamu,
>>>
>>> I guess this is a bug in combination of Metaspace and CMS.
>>> However current jdk/jdk has different implementation, so it might 
>>> not be occur in modern JDK.
>>> I want to hear the comments from others.
>>>
>>> My comments is below:
>>>
>>> On 2019/10/23 18:57, Osamu Sakamoto wrote:
>>>> Hi Yasumasa,
>>>>
>>>> Thank you for answering.
>>>>
>>>>  > What JVM options did you pass?
>>>> The following is the JVM options I passed.
>>>> -----------------------------------------------------------------
>>>> -Xmx2048m
>>>> -Xms2048m
>>>> -XX:NewSize=412m
>>>> -XX:MaxNewSize=412m
>>>> -XX:SurvivorRatio=8
>>>> -XX:MaxTenuringThreshold=15
>>>> -XX:+UseConcMarkSweepGC
>>>> -XX:+UseCMSInitiatingOccupancyOnly
>>>> -XX:CMSInitiatingOccupancyFraction=80
>>>> -XX:+CMSClassUnloadingEnabled
>>>> -XX:CompressedClassSpaceSize=64m
>>>> -XX:+PrintGCDetails
>>>> -XX:+PrintGCDateStamps
>>>> -XX:+UseGCLogFileRotation
>>>> -XX:GCLogFileSize=0
>>>> -Xloggc:/var/log/tomcatm0/gc-%p.log
>>>> -XX:+HeapDumpOnOutOfMemoryError
>>>> -XX:+AlwaysLockClassLoader
>>>> -----------------------------------------------------------------
>>>>
>>>>
>>>>  > I guess you used CMS because this problem seems to occur on CMS
>>>> only [1] [2].
>>>> Yes, I used CMS.
>>>>
>>>>  > So it might be work around not to use CMS.
>>>> Thank you for telling me the workaround.
>>>> But it is difficult to change the GC method, so we would like to
>>>> solve this issue with CMS GC if possible.
>>>>
>>>>
>>>>  > I'm not sure root cause of this issue, but it seems to break
>>>> ClassLoaderDataGraph::_unloading.
>>>>  > (like double free (delete) of CLD)
>>>> I checked whether ClassLoaderDataGraph::_unloading is broken or
>>>> not, but I couldn't tell because the value had been cleared to
>>>> NULL or optimized out.
>>>>
>>>> Referring to classLoaderData.cpp [1], it looks like the
>>>> _unloading value is saved to ClassLoaderDataGraph::_saved_unloading.
>>>> But _saved_unloading had been cleared to NULL, too.
>>>>
>>>> Is there any other way to check it?
>>>>
>>>> [1]http://hg.openjdk.java.net/jdk8u/jdk8u-dev/hotspot/file/eed8e846c982/src/share/vm/classfile/classLoaderData.cpp#l753 
>>>>
>>>>
>>>> -----------------------------------------------------------------
>>>> (gdb) f 10
>>>> #10 0x00007f60808c0f0f in ClassLoaderDataGraph::purge () at 
>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/share/vm/classfile/classLoaderData.cpp:818 
>>>>
>>>> 818         delete purge_me;
>>>> (gdb) list ClassLoaderDataGraph::purge
>>>> 810     void ClassLoaderDataGraph::purge() {
>>>> 811       assert(SafepointSynchronize::is_at_safepoint(), "must be at safepoint!");
>>>> 812       ClassLoaderData* list = _unloading;
>>>> 813       _unloading = NULL;
>>>> 814       ClassLoaderData* next = list;
>>>> 815       while (next != NULL) {
>>>> 816         ClassLoaderData* purge_me = next;
>>>> 817         next = purge_me->next();
>>>> 818         delete purge_me;
>>>> 819       }
>>>> 820       Metaspace::purge();
>>>> 821     }
>>>> (gdb) p _unloading
>>>> $29 = (ClassLoaderData *) 0x0
>>>> (gdb) p list
>>>> $30 = <optimized out>
>>>> (gdb) p next
>>>> $31 = <optimized out>
>>>> (gdb) p ClassLoaderDataGraph::_saved_unloading
>>>> $32 = (ClassLoaderData *) 0x0
>>>> -----------------------------------------------------------------
>>>
>>> AFAICS you cannot find head of _unloading at this point.
>>> However you can traverse CLD list with purge_me->_next .
>>>
>>>
>>> BTW, CLD has OOP for class loader in ClassLoaderData::_class_loader .
>>> If you check it on (CL)HSDB, you might get any hints from it.
>>> For example, use system class loader instead of custom class loader 
>>> from framework.
>>>
>>>
>>> Thanks,
>>>
>>> Yasumasa
>>>
>>>
>>>> Thanks,
>>>> Osamu
>>>>
>>>> On 10/21/19 22:29, Yasumasa Suenaga wrote:
>>>>> Hi Osamu,
>>>>>
>>>>> What JVM options did you pass?
>>>>>
>>>>> I guess you used CMS because this problem seems to occur on CMS 
>>>>> only [1] [2].
>>>>> So it might be work around not to use CMS.
>>>>>
>>>>> I'm not sure root cause of this issue, but it seems to break 
>>>>> ClassLoaderDataGraph::_unloading.
>>>>> (like double free (delete) of CLD)
>>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Yasumasa
>>>>>
>>>>>
>>>>> [1] 
>>>>> http://hg.openjdk.java.net/jdk8u/jdk8u-dev/hotspot/file/eed8e846c982/src/share/vm/classfile/classLoaderData.hpp#l100
>>>>> [2] 
>>>>> http://hg.openjdk.java.net/jdk8u/jdk8u-dev/hotspot/file/eed8e846c982/src/share/vm/gc_implementation/concurrentMarkSweep/concurrentMarkSweepGeneration.cpp#l6384
>>>>>
>>>>>
>>>>> On 2019/10/21 17:50, Osamu Sakamoto wrote:
>>>>>> Hi all,
>>>>>>
>>>>>> I have a problem about Segmentation Fault(SEGV) in GC and I can't 
>>>>>> make the cause clear.
>>>>>> Could you help me solve the problem?
>>>>>>
>>>>>> Our System uses OpenJDK 1.8.0.181, and crashed by SEGV when 
>>>>>> purging ClassLoader at safepoint.
>>>>>> This problem can't be reproduced, but this has happened 4 times 
>>>>>> in a few months.
>>>>>>
>>>>>> The following is the summary of my investigation.
>>>>>>
>>>>>> ============================================================================= 
>>>>>>
>>>>>>
>>>>>> First I checked hs_err, and that shows that the SEGV occurred.
>>>>>> VM_Operation is GenCollectForAllocation at safepoint.
>>>>>>
>>>>>> ----------------------------------------------------------------------------- 
>>>>>>
>>>>>> #
>>>>>> # A fatal error has been detected by the Java Runtime Environment:
>>>>>> #
>>>>>> #? SIGSEGV (0xb) at pc=0x00007f6080c97f88, pid=23931, 
>>>>>> tid=0x00007f607c3ed700
>>>>>> #
>>>>>> # JRE version: OpenJDK Runtime Environment (8.0_181-b13) (build 
>>>>>> 1.8.0_181-b13)
>>>>>> # Java VM: OpenJDK 64-Bit Server VM (25.181-b13 mixed mode 
>>>>>> linux-amd64 compressed oops)
>>>>>> # Problematic frame:
>>>>>> # V? [libjvm.so+0x84bf88]
>>>>>> #
>>>>>> # Core dump written. Default location: /opt/tomcate0/core or 
>>>>>> core.23931
>>>>>> #
>>>>>> # If you would like to submit a bug report, please visit:
>>>>>> #?? http://bugreport.java.com/bugreport/crash.jsp
>>>>>> #
>>>>>>
>>>>>> ---------------  T H R E A D  ---------------
>>>>>>
>>>>>> Current thread (0x00007f6078c00000):  VMThread [stack: 
>>>>>> 0x00007f607c2ed000,0x00007f607c3ee000] [id=23939]
>>>>>>
>>>>>> siginfo: si_signo: 11 (SIGSEGV), si_code: 1 (SEGV_MAPERR), 
>>>>>> si_addr: 0x0000000000000018
>>>>>>
>>>>>> Registers:
>>>>>> RAX=0x0000000000000010, RBX=0x00007f5ff800ad30, 
>>>>>> RCX=0x0000000000000010, RDX=0x0000000000000000
>>>>>> RSP=0x00007f607c3ecb50, RBP=0x00007f607c3ecb80, 
>>>>>> RSI=0x0000000000000002, RDI=0x0000000001cfe570
>>>>>> R8 =0x00007f5ff80ae320, R9 =0x00007f5ff8052480, 
>>>>>> R10=0x0000000000000000, R11=0x0000000000000400
>>>>>> R12=0x0000000001cfe570, R13=0x00007f6081419470, 
>>>>>> R14=0x0000000000000002, R15=0x00007f6081418640
>>>>>> RIP=0x00007f6080c97f88, EFLAGS=0x0000000000010202, 
>>>>>> CSGSFS=0x0000000000000033, ERR=0x0000000000000004
>>>>>>    TRAPNO=0x000000000000000e
>>>>>>
>>>>>> Top of Stack: (sp=0x00007f607c3ecb50)
>>>>>> 0x00007f607c3ecb50:   00007f607c3ecba0 00007f5ff800ad30
>>>>>> 0x00007f607c3ecb60:   00007f5ff800ad00 0000000000000000
>>>>>> 0x00007f607c3ecb70:   0000000000000000 0000000000000001
>>>>>> 0x00007f607c3ecb80:   00007f607c3ecba0 00007f6080c995fa
>>>>>> 0x00007f607c3ecb90:   00007f5ff800ad00 00007f5ff800ac20
>>>>>> 0x00007f607c3ecba0:   00007f607c3ecbc0 00007f60808bff5e
>>>>>> 0x00007f607c3ecbb0:   00007f5ff800ac20 00007f5ff8052870
>>>>>> 0x00007f607c3ecbc0:   00007f607c3ecbe0 00007f60808c0f0f
>>>>>> 0x00007f607c3ecbd0:   00007f607c3ecbf0 00007f608140f308
>>>>>> 0x00007f607c3ecbe0:   00007f607c3ecc30 00007f6080daa0b7
>>>>>> 0x00007f607c3ecbf0:   00007f6069000100 0000000000000000
>>>>>> 0x00007f607c3ecc00:   00007f607c3ecc20 00007f6080ed0800
>>>>>> 0x00007f607c3ecc10:   00000000000000f9 88e95c3ba257ab00
>>>>>> 0x00007f607c3ecc20:   431bde82d7b634db 00007f607800aa00
>>>>>> 0x00007f607c3ecc30:   00007f607c3eccc0 00007f6080daa9d5
>>>>>> 0x00007f607c3ecc40:   0000000000000000 00007f607803bf20
>>>>>> 0x00007f607c3ecc50:   00007f607803be20 00000000000003e8
>>>>>> 0x00007f607c3ecc60:   0000000000000001 00007f6078c00000
>>>>>> 0x00007f607c3ecc70:   00007f607c3eccc0 0000000000000000
>>>>>> 0x00007f607c3ecc80:   00000004000000f9 00007f60813e2b99
>>>>>> 0x00007f607c3ecc90:   00007f607803bfa0 00007f6078c00000
>>>>>> 0x00007f607c3ecca0:   0000000000000000 0000000000000000
>>>>>> 0x00007f607c3eccb0:   00007f6081418bd0 00007f607803bf20
>>>>>> 0x00007f607c3eccc0:   00007f607c3ece60 00007f6080f2048a
>>>>>> 0x00007f607c3eccd0:   00007f607c3ecd20 00007f607c3ecce0
>>>>>> 0x00007f607c3ecce0:   00007f6078c00000 00007f6078c00980
>>>>>> 0x00007f607c3eccf0:   00007f6078c009c0 00007f6078c009d0
>>>>>> 0x00007f607c3ecd00:   00007f6078c00aa8 00000000000000d8
>>>>>> 0x00007f607c3ecd10:   00007f6078c00be0 0000000000000000
>>>>>> 0x00007f607c3ecd20:   00007f607c3ecd28 6e69747563657845
>>>>>> 0x00007f607c3ecd30:   65706f204d562067 203a6e6f69746172
>>>>>> 0x00007f607c3ecd40:   656c6c6f436e6547 6c6c41726f467463
>>>>>>
>>>>>> Instructions: (pc=0x00007f6080c97f88)
>>>>>> 0x00007f6080c97f68:   b6 12 80 fa 00 74 01 f0 48 0f c1 01 31 c9 31 f6
>>>>>> 0x00007f6080c97f78:   48 8b 44 0b 10 31 d2 48 85 c0 74 11 0f 1f 40 00
>>>>>> 0x00007f6080c97f88:   48 8b 40 08 48 83 c2 01 48 85 c0 75 f3 48 83 c1
>>>>>> 0x00007f6080c97f98:   08 48 01 d6 48 83 f9 20 75 d6 8b 7b 08 48 8b 05
>>>>>>
>>>>>> Register to memory mapping:
>>>>>>
>>>>>> RAX=0x0000000000000010 is an unknown value
>>>>>> RBX=0x00007f5ff800ad30 is an unknown value
>>>>>> RCX=0x0000000000000010 is an unknown value
>>>>>> RDX=0x0000000000000000 is an unknown value
>>>>>> RSP=0x00007f607c3ecb50 is an unknown value
>>>>>> RBP=0x00007f607c3ecb80 is an unknown value
>>>>>> RSI=0x0000000000000002 is an unknown value
>>>>>> RDI=0x0000000001cfe570 is an unknown value
>>>>>> R8 =0x00007f5ff80ae320 is an unknown value
>>>>>> R9 =0x00007f5ff8052480 is an unknown value
>>>>>> R10=0x0000000000000000 is an unknown value
>>>>>> R11=0x0000000000000400 is an unknown value
>>>>>> R12=0x0000000001cfe570 is an unknown value
>>>>>> R13=0x00007f6081419470: <offset 0xfcd470> in 
>>>>>> /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/jre/lib/amd64/server/libjvm.so 
>>>>>> at 0x00007f608044c000
>>>>>> R14=0x0000000000000002 is an unknown value
>>>>>> R15=0x00007f6081418640: <offset 0xfcc640> in 
>>>>>> /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/jre/lib/amd64/server/libjvm.so 
>>>>>> at 0x00007f608044c000
>>>>>>
>>>>>>
>>>>>> Stack: [0x00007f607c2ed000,0x00007f607c3ee000], 
>>>>>> sp=0x00007f607c3ecb50, free space=1022k
>>>>>> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, 
>>>>>> C=native code)
>>>>>> V  [libjvm.so+0x84bf88]
>>>>>> V  [libjvm.so+0x84d5fa]
>>>>>> V  [libjvm.so+0x473f5e]
>>>>>> V  [libjvm.so+0x474f0f]
>>>>>> V  [libjvm.so+0x95e0b7]
>>>>>> V  [libjvm.so+0x95e9d5]
>>>>>> V  [libjvm.so+0xad448a]
>>>>>> V  [libjvm.so+0xad48f1]
>>>>>> V  [libjvm.so+0x8beb82]
>>>>>>
>>>>>> VM_Operation (0x00007f5fd69e6120): GenCollectForAllocation, mode: 
>>>>>> safepoint, requested by thread 0x00007f6079013800
>>>>>>
>>>>>> ...
>>>>>> ----------------------------------------------------------------------------- 
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> Next, I used GDB to get the backtrace of the crashing thread from 
>>>>>> the core dump.
>>>>>> The backtrace follows.
>>>>>> The SEGV occurred while a ClassLoader was being purged and its 
>>>>>> Metaspace was being destroyed.
>>>>>> Frame #7 shows that the SEGV signal handler was invoked while 
>>>>>> SpaceManager::~SpaceManager() was executing.
>>>>>>
>>>>>> ----------------------------------------------------------------------------- 
>>>>>>
>>>>>> (gdb) bt
>>>>>> #0  0x00007f608146f1f7 in __GI_raise (sig=sig@entry=6) at 
>>>>>> ../nptl/sysdeps/unix/sysv/linux/raise.c:56
>>>>>> #1  0x00007f60814708e8 in __GI_abort () at abort.c:90
>>>>>> #2  0x00007f6080d0bc39 in os::abort (dump_core=<optimized out>) 
>>>>>> at 
>>>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/os/linux/vm/os_linux.cpp:1519
>>>>>> #3  0x00007f6080f1b816 in VMError::report_and_die 
>>>>>> (this=this@entry=0x7f607c3ebd10) at 
>>>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/share/vm/utilities/vmError.cpp:1060
>>>>>> #4  0x00007f6080d15927 in JVM_handle_linux_signal (sig=11, 
>>>>>> info=0x7f607c3ebfb0, ucVoid=0x7f607c3ebe80, 
>>>>>> abort_if_unrecognized=<optimized out>)
>>>>>>      at 
>>>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/os_cpu/linux_x86/vm/os_linux_x86.cpp:541
>>>>>> #5  0x00007f6080d09038 in signalHandler (sig=11, 
>>>>>> info=0x7f607c3ebfb0, uc=0x7f607c3ebe80) at 
>>>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/os/linux/vm/os_linux.cpp:4446
>>>>>> #6  <signal handler called>
>>>>>> #7  SpaceManager::~SpaceManager (this=0x7f5ff800ad30, 
>>>>>> __in_chrg=<optimized out>) at 
>>>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/share/vm/memory/metaspace.cpp:2028
>>>>>> #8  0x00007f6080c995fa in Metaspace::~Metaspace 
>>>>>> (this=0x7f5ff800ad00, __in_chrg=<optimized out>)
>>>>>>      at 
>>>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/share/vm/memory/metaspace.cpp:2971
>>>>>> #9  0x00007f60808bff5e in ClassLoaderData::~ClassLoaderData 
>>>>>> (this=0x7f5ff800ac20, __in_chrg=<optimized out>)
>>>>>>      at 
>>>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/share/vm/classfile/classLoaderData.cpp:383
>>>>>> #10 0x00007f60808c0f0f in ClassLoaderDataGraph::purge () at 
>>>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/share/vm/classfile/classLoaderData.cpp:818
>>>>>> #11 0x00007f6080daa0b7 in ClassLoaderDataGraph::purge_if_needed 
>>>>>> () at 
>>>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/share/vm/classfile/classLoaderData.hpp:104
>>>>>> #12 SafepointSynchronize::do_cleanup_tasks () at 
>>>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/share/vm/runtime/safepoint.cpp:551
>>>>>> #13 0x00007f6080daa9d5 in SafepointSynchronize::begin () at 
>>>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/share/vm/runtime/safepoint.cpp:402
>>>>>> #14 0x00007f6080f2048a in VMThread::loop 
>>>>>> (this=this@entry=0x7f6078c00000) at 
>>>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/share/vm/runtime/vmThread.cpp:501
>>>>>> #15 0x00007f6080f208f1 in VMThread::run (this=0x7f6078c00000) at 
>>>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/share/vm/runtime/vmThread.cpp:276
>>>>>> #16 0x00007f6080d0ab82 in java_start (thread=0x7f6078c00000) at 
>>>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/os/linux/vm/os_linux.cpp:796
>>>>>> #17 0x00007f6081e2de25 in start_thread (arg=0x7f607c3ed700) at 
>>>>>> pthread_create.c:308
>>>>>> #18 0x00007f608153234d in clone () at 
>>>>>> ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
>>>>>> ----------------------------------------------------------------------------- 
>>>>>>
>>>>>>
>>>>>>
>>>>>> In frame #7, line 2028 (chunk = chunk->next()) is the crash point.
>>>>>> The variable "chunk" is initialized at line 2025 (Metachunk* chunk = 
>>>>>> chunks_in_use(i);).
>>>>>> "chunks_in_use(i)" is defined at line 648 (Metachunk* 
>>>>>> chunks_in_use(ChunkIndex index) const { return 
>>>>>> _chunks_in_use[index]; }).
>>>>>> So I checked the values in "_chunks_in_use" and found that 
>>>>>> "_chunks_in_use[2]" holds the invalid address "0x10".
>>>>>> Therefore, I think the SEGV occurred because "chunk = chunk->next()" 
>>>>>> dereferenced the invalid address "0x10".
>>>>>>
>>>>>> ----------------------------------------------------------------------------- 
>>>>>>
>>>>>> (gdb) f 7
>>>>>> #7  SpaceManager::~SpaceManager (this=0x7f5ff800ad30, 
>>>>>> __in_chrg=<optimized out>) at 
>>>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/share/vm/memory/metaspace.cpp:2028
>>>>>> 2028        chunk = chunk->next();
>>>>>> (gdb) list
>>>>>> 2023    size_t SpaceManager::sum_count_in_chunks_in_use(ChunkIndex i) {
>>>>>> 2024      size_t count = 0;
>>>>>> 2025      Metachunk* chunk = chunks_in_use(i);
>>>>>> 2026      while (chunk != NULL) {
>>>>>> 2027        count++;
>>>>>> 2028        chunk = chunk->next();
>>>>>> 2029      }
>>>>>> 2030      return count;
>>>>>> 2031    }
>>>>>> 2032
>>>>>> (gdb) list SpaceManager::chunks_in_use
>>>>>> 647       // Accessors
>>>>>> 648       Metachunk* chunks_in_use(ChunkIndex index) const { return _chunks_in_use[index]; }
>>>>>> ...
>>>>>> (gdb) p _chunks_in_use
>>>>>> $11 = {0x7f5fcd41c400, 0x7f5fcd41a000, 0x10, 0x0}
>>>>>> ----------------------------------------------------------------------------- 
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> The following is the disassembly of 
>>>>>> "SpaceManager::~SpaceManager()".
>>>>>> %rax holds 0x10 at "0x00007f6080c97f88 <+200>", but I don't 
>>>>>> understand how this "0x10" ended up in %rax.
>>>>>>
>>>>>> ----------------------------------------------------------------------------- 
>>>>>>
>>>>>> (gdb) disas
>>>>>> Dump of assembler code for function SpaceManager::~SpaceManager():
>>>>>>    0x00007f6080c97ec0 <+0>:    push   %rbp
>>>>>>    0x00007f6080c97ec1 <+1>:    mov    %rsp,%rbp
>>>>>>    0x00007f6080c97ec4 <+4>:    push   %r15
>>>>>>    0x00007f6080c97ec6 <+6>:    push   %r14
>>>>>>    0x00007f6080c97ec8 <+8>:    push   %r13
>>>>>>    0x00007f6080c97eca <+10>:    push   %r12
>>>>>>    0x00007f6080c97ecc <+12>:    push   %rbx
>>>>>>    0x00007f6080c97ecd <+13>:    mov    %rdi,%rbx
>>>>>>    0x00007f6080c97ed0 <+16>:    sub    $0x8,%rsp
>>>>>>    0x00007f6080c97ed4 <+20>:    mov    0x780785(%rip),%r12        # 0x7f6081418660 <_ZN12SpaceManager12_expand_lockE>
>>>>>>    0x00007f6080c97edb <+27>:    test   %r12,%r12
>>>>>>    0x00007f6080c97ede <+30>:    je     0x7f6080c97ee8 <SpaceManager::~SpaceManager()+40>
>>>>>>    0x00007f6080c97ee0 <+32>:    mov    %r12,%rdi
>>>>>>    0x00007f6080c97ee3 <+35>:    callq  0x7f6080cce2f0 <Monitor::lock_without_safepoint_check()>
>>>>>>    0x00007f6080c97ee8 <+40>:    movslq 0x8(%rbx),%rcx
>>>>>>    0x00007f6080c97eec <+44>:    lea    0x78075d(%rip),%rdx        # 0x7f6081418650 <_ZN12MetaspaceAux15_capacity_wordsE>
>>>>>>    0x00007f6080c97ef3 <+51>:    lea    0x781576(%rip),%r13        # 0x7f6081419470 <_ZN2os16_processor_countE>
>>>>>>    0x00007f6080c97efa <+58>:    lea    0x78073f(%rip),%r15        # 0x7f6081418640 <_ZN12MetaspaceAux11_used_wordsE>
>>>>>>    0x00007f6080c97f01 <+65>:    mov    (%rdx,%rcx,8),%rax
>>>>>>    0x00007f6080c97f05 <+69>:    sub    0x40(%rbx),%rax
>>>>>>    0x00007f6080c97f09 <+73>:    mov    %rax,(%rdx,%rcx,8)
>>>>>>    0x00007f6080c97f0d <+77>:    mov    0x38(%rbx),%rax
>>>>>>    0x00007f6080c97f11 <+81>:    movslq 0x8(%rbx),%rdx
>>>>>>    0x00007f6080c97f15 <+85>:    neg    %rax
>>>>>>    0x00007f6080c97f18 <+88>:    cmpl   $0x1,0x0(%r13)
>>>>>>    0x00007f6080c97f1d <+93>:    lea    (%r15,%rdx,8),%rcx
>>>>>>    0x00007f6080c97f21 <+97>:    mov    $0x1,%edx
>>>>>>    0x00007f6080c97f26 <+102>:    jne    0x7f6080c97f32 <SpaceManager::~SpaceManager()+114>
>>>>>>    0x00007f6080c97f28 <+104>:    lea    0x74acb4(%rip),%rdx        # 0x7f60813e2be3 <AssumeMP>
>>>>>>    0x00007f6080c97f2f <+111>:    movzbl (%rdx),%edx
>>>>>>    0x00007f6080c97f32 <+114>:    cmp    $0x0,%dl
>>>>>>    0x00007f6080c97f35 <+117>:    je     0x7f6080c97f38 <SpaceManager::~SpaceManager()+120>
>>>>>>    0x00007f6080c97f37 <+119>:    lock xadd %rax,(%rcx)
>>>>>>    0x00007f6080c97f3c <+124>:    mov    0x48(%rbx),%r14
>>>>>>    0x00007f6080c97f40 <+128>:    callq  0x7f6080c951a0 <Metachunk::overhead()>
>>>>>>    0x00007f6080c97f45 <+133>:    movslq 0x8(%rbx),%rdx
>>>>>>    0x00007f6080c97f49 <+137>:    imul   %r14,%rax
>>>>>>    0x00007f6080c97f4d <+141>:    lea    (%r15,%rdx,8),%rcx
>>>>>>    0x00007f6080c97f51 <+145>:    mov    $0x1,%edx
>>>>>>    0x00007f6080c97f56 <+150>:    neg    %rax
>>>>>>    0x00007f6080c97f59 <+153>:    cmpl   $0x1,0x0(%r13)
>>>>>>    0x00007f6080c97f5e <+158>:    jne    0x7f6080c97f6a <SpaceManager::~SpaceManager()+170>
>>>>>>    0x00007f6080c97f60 <+160>:    lea    0x74ac7c(%rip),%rdx        # 0x7f60813e2be3 <AssumeMP>
>>>>>>    0x00007f6080c97f67 <+167>:    movzbl (%rdx),%edx
>>>>>>    0x00007f6080c97f6a <+170>:    cmp    $0x0,%dl
>>>>>>    0x00007f6080c97f6d <+173>:    je     0x7f6080c97f70 <SpaceManager::~SpaceManager()+176>
>>>>>>    0x00007f6080c97f6f <+175>:    lock xadd %rax,(%rcx)
>>>>>>    0x00007f6080c97f74 <+180>:    xor    %ecx,%ecx
>>>>>>    0x00007f6080c97f76 <+182>:    xor    %esi,%esi
>>>>>>    0x00007f6080c97f78 <+184>:    mov    0x10(%rbx,%rcx,1),%rax
>>>>>>    0x00007f6080c97f7d <+189>:    xor    %edx,%edx
>>>>>>    0x00007f6080c97f7f <+191>:    test   %rax,%rax
>>>>>>    0x00007f6080c97f82 <+194>:    je     0x7f6080c97f95 <SpaceManager::~SpaceManager()+213>
>>>>>>    0x00007f6080c97f84 <+196>:    nopl   0x0(%rax)
>>>>>> => 0x00007f6080c97f88 <+200>:    mov    0x8(%rax),%rax
>>>>>>    0x00007f6080c97f8c <+204>:    add    $0x1,%rdx
>>>>>>    0x00007f6080c97f90 <+208>:    test   %rax,%rax
>>>>>> ...
>>>>>> (gdb) info registers
>>>>>> rax            0x10    16
>>>>>> rbx            0x7f5ff800ad30    140050159414576
>>>>>> rcx            0x10    16
>>>>>> rdx            0x0    0
>>>>>> rsi            0x2    2
>>>>>> rdi            0x1cfe570    30401904
>>>>>> rbp            0x7f607c3ecb80    0x7f607c3ecb80
>>>>>> rsp            0x7f607c3ecb50    0x7f607c3ecb50
>>>>>> r8             0x7f5ff80ae320    140050160083744
>>>>>> r9             0x7f5ff8052480    140050159707264
>>>>>> r10            0x0    0
>>>>>> r11            0x400    1024
>>>>>> r12            0x1cfe570    30401904
>>>>>> r13            0x7f6081419470    140052462146672
>>>>>> r14            0x2    2
>>>>>> r15            0x7f6081418640    140052462143040
>>>>>> rip            0x7f6080c97f88    0x7f6080c97f88 
>>>>>> <SpaceManager::~SpaceManager()+200>
>>>>>> eflags         0x206    [ PF IF ]
>>>>>> cs             0x33    51
>>>>>> ss             0x2b    43
>>>>>> ds             0x0    0
>>>>>> es             0x0    0
>>>>>> fs             0x0    0
>>>>>> gs             0x0    0
>>>>>> k0             <unavailable>
>>>>>> k1             <unavailable>
>>>>>> k2             <unavailable>
>>>>>> k3             <unavailable>
>>>>>> k4             <unavailable>
>>>>>> k5             <unavailable>
>>>>>> k6             <unavailable>
>>>>>> k7             <unavailable>
>>>>>> ----------------------------------------------------------------------------- 
>>>>>>
>>>>>>
>>>>>> ============================================================================= 
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> Does anyone know about this case?
>>>>>>
>>>>>> Thanks, Osamu
>>>>>>
>>>>>>
>>>>
>>
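
The fault address in the report above can be cross-checked against the disassembly with a little arithmetic. The following is a minimal sketch, not HotSpot code: SketchChunk, next_field_address and index_from_byte_offset are hypothetical names, and the 8-byte field in front of the link is an assumption made only so that the link lands at offset 8, which is where `mov 0x8(%rax),%rax` reads it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a metaspace chunk. Only the position of the
// link field matters: the faulting instruction reads it at offset 8.
struct SketchChunk {
  std::uint64_t word_size;  // assumed 8-byte field preceding the link
  SketchChunk*  next;       // link read by `mov 0x8(%rax),%rax`
};

static_assert(offsetof(SketchChunk, next) == 8, "link must sit at offset 8");

// Address touched by `chunk->next` for a given (possibly bogus) chunk pointer.
inline std::uint64_t next_field_address(std::uint64_t chunk_pointer) {
  return chunk_pointer + offsetof(SketchChunk, next);
}

// Array index selected by a byte offset into an array of 8-byte pointers,
// as in `mov 0x10(%rbx,%rcx,1),%rax`, where %rcx holds the byte offset.
inline std::size_t index_from_byte_offset(std::uint64_t byte_offset) {
  return static_cast<std::size_t>(byte_offset / 8);
}
```

With the values from the dump, next_field_address(0x10) gives 0x18, which matches si_addr, and index_from_byte_offset(0x10) gives 2, which matches the slot of _chunks_in_use that holds 0x10. This only confirms that the reported numbers are consistent with each other; it says nothing about how 0x10 was written there.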


From stefan.johansson at oracle.com  Mon Oct 28 08:35:53 2019
From: stefan.johansson at oracle.com (Stefan Johansson)
Date: Mon, 28 Oct 2019 09:35:53 +0100
Subject: RFR (XS): 8232951: TestG1ParallelPhases.java fails with phase
 NonYoungFreeCSet not found
In-Reply-To: <951EB603-F273-4787-9D8D-32D8194ECFAA@oracle.com>
References: <27fa21ab-d8ac-95ea-3485-7a72116c22f2@oracle.com>
 <951EB603-F273-4787-9D8D-32D8194ECFAA@oracle.com>
Message-ID: <7734C751-1D9B-45F5-86FB-D51D2BE8985F@oracle.com>

Hi Thomas,

> 26 okt. 2019 kl. 03:51 skrev Kim Barrett <kim.barrett at oracle.com>:
> 
>> On Oct 24, 2019, at 7:50 AM, Thomas Schatzl <thomas.schatzl at oracle.com> wrote:
>> [...]
>> CR:
>> https://bugs.openjdk.java.net/browse/JDK-8232951
>> Webrev:
>> http://cr.openjdk.java.net/~tschatzl/8232951/webrev/
>> Testing:
>> 400 runs of the changed test without issues
>> 
>> Thanks,
>> Thomas
> 
> I'd not previously noticed the AlwaysTenure and NeverTenure options.
> So many options...
> 
> Those options are documented as being ParallelGC only.  But it looks
> like setting either of them forces a value for MaxTenuringThreshold,
> so it seems okay to change the test to use AlwaysTenure.  The
> documentation for the options should be updated though.  (That can be
> a separate RFE.)
> 
> Please put the new -Xlog option on a separate line.  I know we don't
> have an official line length limit, but 152 chars seems excessive to
> me, and forced me to scroll to see some of it.
> 
> Other than that, looks good.  I don't need a new webrev.
> 

Sounds like a good fix and it looks good,
Stefan

From erik.osterlund at oracle.com  Mon Oct 28 09:16:58 2019
From: erik.osterlund at oracle.com (=?UTF-8?Q?Erik_=c3=96sterlund?=)
Date: Mon, 28 Oct 2019 10:16:58 +0100
Subject: RFR: 8232604: ZGC: Make ZVerifyViews mapping and unmapping precise
In-Reply-To: <813a25f8-4540-859d-502d-8ffdfc503f72@oracle.com>
References: <813a25f8-4540-859d-502d-8ffdfc503f72@oracle.com>
Message-ID: <cc27a324-5697-8391-a3f8-934cda80af6c@oracle.com>

Hi Stefan,

Looks good.

Thanks,
/Erik

On 2019-10-24 18:36, Stefan Karlsson wrote:
> Hi all,
>
> Please review this patch to make the ZVerifyViews mapping and 
> unmapping precise.
>
> https://cr.openjdk.java.net/~stefank/8232604/webrev.01/
> https://bugs.openjdk.java.net/browse/JDK-8232604
>
> Today, when the ZVerifyViews flag is turned on, we unmap all bad 
> views. The intention is to catch stray-pointer bugs.
>
> The current implementation takes a shortcut and unmaps all memory en 
> masse. This works for Linux, but not on Windows, where we must be 
> precise in what we unmap.
>
> There are three places where allocated pages are registered today:
> - In the page table - actively used
> - In the page cache - free pages waiting to be used
> - In-flight from the alloc queue
>
> The proposed patch registers all satisfied alloc requests, lets the 
> requesting threads deregister the satisfied request when the page is 
> received, and makes sure that the GC visits all in-flight satisfied 
> alloc requests when it performs the ZVerifyViews flip.
>
> Thanks,
> StefanK
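
The bookkeeping described in the quoted proposal can be pictured roughly as follows. This is a minimal single-threaded sketch under stated assumptions, not the ZGC implementation: InFlightPages, register_satisfied, deregister and pages_to_visit are hypothetical names, pages are modelled as plain addresses, and all locking is omitted.

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <vector>

using PageAddr = std::uintptr_t;  // hypothetical: a page is just its address

// Tracks pages that have been handed to a requester but not yet received.
class InFlightPages {
 public:
  // Called when an alloc request has been satisfied with `page`.
  void register_satisfied(PageAddr page) { _pages.insert(page); }
  // Called by the requesting thread once it has received `page`.
  void deregister(PageAddr page) { _pages.erase(page); }
  const std::set<PageAddr>& pages() const { return _pages; }

 private:
  std::set<PageAddr> _pages;
};

// Every page the view flip has to visit: the three places named in the mail.
inline std::set<PageAddr> pages_to_visit(const std::vector<PageAddr>& page_table,
                                         const std::vector<PageAddr>& page_cache,
                                         const InFlightPages& in_flight) {
  std::set<PageAddr> all(page_table.begin(), page_table.end());
  all.insert(page_cache.begin(), page_cache.end());
  all.insert(in_flight.pages().begin(), in_flight.pages().end());
  return all;
}
```

The point of the third set is that a page in transit is in neither the page table nor the page cache, so a precise unmap would miss it unless it is tracked separately.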


From stefan.karlsson at oracle.com  Mon Oct 28 10:04:45 2019
From: stefan.karlsson at oracle.com (Stefan Karlsson)
Date: Mon, 28 Oct 2019 11:04:45 +0100
Subject: RFR: 8232604: ZGC: Make ZVerifyViews mapping and unmapping precise
In-Reply-To: <cc27a324-5697-8391-a3f8-934cda80af6c@oracle.com>
References: <813a25f8-4540-859d-502d-8ffdfc503f72@oracle.com>
 <cc27a324-5697-8391-a3f8-934cda80af6c@oracle.com>
Message-ID: <9678016a-ee4b-84e5-67f1-89e5beb78231@oracle.com>

Thanks, Erik.

StefanK

On 2019-10-28 10:16, Erik Österlund wrote:
> Hi Stefan,
> 
> Looks good.
> 
> Thanks,
> /Erik
> 
> On 2019-10-24 18:36, Stefan Karlsson wrote:
>> Hi all,
>>
>> Please review this patch to make the ZVerifyViews mapping and 
>> unmapping precise.
>>
>> https://cr.openjdk.java.net/~stefank/8232604/webrev.01/
>> https://bugs.openjdk.java.net/browse/JDK-8232604
>>
>> Today, when the ZVerifyViews flag is turned on, we unmap all bad 
>> views. The intention is to catch stray-pointer bugs.
>>
>> The current implementation takes a shortcut and unmaps all memory en 
>> masse. This works for Linux, but not on Windows, where we must be 
>> precise in what we unmap.
>>
>> There are three places where allocated pages are registered today:
>> - In the page table - actively used
>> - In the page cache - free pages waiting to be used
>> - In-flight from the alloc queue
>>
>> The proposed patch registers all satisfied alloc requests, lets the 
>> requesting threads deregister the satisfied request when the page is 
>> received, and makes sure that the GC visits all in-flight satisfied 
>> alloc requests when it performs the ZVerifyViews flip.
>>
>> Thanks,
>> StefanK
> 


From leo.korinth at oracle.com  Mon Oct 28 10:40:59 2019
From: leo.korinth at oracle.com (Leo Korinth)
Date: Mon, 28 Oct 2019 11:40:59 +0100
Subject: RFR (XS): 8232951: TestG1ParallelPhases.java fails with phase
 NonYoungFreeCSet not found
In-Reply-To: <7734C751-1D9B-45F5-86FB-D51D2BE8985F@oracle.com>
References: <27fa21ab-d8ac-95ea-3485-7a72116c22f2@oracle.com>
 <951EB603-F273-4787-9D8D-32D8194ECFAA@oracle.com>
 <7734C751-1D9B-45F5-86FB-D51D2BE8985F@oracle.com>
Message-ID: <1fff7f47-aebc-bbcd-bda6-bb6185c11c3a@oracle.com>

Hi.

Just want to add some information, because I think it will fail again.

The buggy test case was written by me, and the part that provokes a mixed 
GC is copied mostly from either TestOldGenCollectionUsage or TestLogging 
(it is hard to share this code because of JTREG). However, when I "copied" 
the code I also tried to improve it, and that could be the reason for 
this failure. I made at least two "improvements".

First, I removed the magic constants used when allocating the 20k arrays 
and instead calculated how many I would need; this made the algorithm 
allocate ~2M instead of ~3M, which could be a problem, although to my 
understanding it should not be.

Second, I no longer provoke a GC by allocating until out-of-memory. The 
original code seems to try to provoke a GC by starting concurrent marks 
and young GCs, but it has a kind of fail-safe in the code after the 
comment "// allocate more objects to provoke GC". Keeping that code would, 
I guess, fix the problem with the test case, but then we would not know 
why the youngGC() after the concurrent mark does not provoke a mixed GC 
(I guess it should, but correct me if this is false).

I have talked to Thomas off-list, and I think AlwaysTenure is not the 
solution to the problem we have. I think adding the debug options is 
great and should be done, and AlwaysTenure seems better than 
MaxTenuringThreshold=1 but we should expect the test case to continue to 
fail in the future.

If you go with AlwaysTenure instead of MaxTenuringThreshold=1, 
please also remove one getWhiteBox().youngGC() in allocateOldObjects so 
that we do not leave "magic" lines in the test case. Also update the 
comment to "// Do *one* young collection...", 
and note that there is another "-XX:MaxTenuringThreshold=1" that needs to be 
updated. I need no webrev for these changes.

I am sorry that my "improvements" probably caused this failure, though 
having heaps of code without understanding why it is there is probably 
worse in the long run; at least that is my thinking.

Thanks,
Leo


On 28/10/2019 09:35, Stefan Johansson wrote:
> Hi Thomas,
> 
>> 26 okt. 2019 kl. 03:51 skrev Kim Barrett <kim.barrett at oracle.com>:
>>
>>> On Oct 24, 2019, at 7:50 AM, Thomas Schatzl <thomas.schatzl at oracle.com> wrote:
>>> [...]
>>> CR:
>>> https://bugs.openjdk.java.net/browse/JDK-8232951
>>> Webrev:
>>> http://cr.openjdk.java.net/~tschatzl/8232951/webrev/
>>> Testing:
>>> 400 runs of the changed test without issues
>>>
>>> Thanks,
>>> Thomas
>>
>> I'd not previously noticed the AlwaysTenure and NeverTenure options.
>> So many options...
>>
>> Those options are documented as being ParallelGC only.  But it looks
>> like setting either of them forces a value for MaxTenuringThreshold,
>> so it seems okay to change the test to use AlwaysTenure.  The
>> documentation for the options should be updated though.  (That can be
>> a separate RFE.)
>>
>> Please put the new -Xlog option on a separate line.  I know we don't
>> have an official line length limit, but 152 chars seems excessive to
>> me, and forced me to scroll to see some of it.
>>
>> Other than that, looks good.  I don't need a new webrev.
>>
> 
> Sounds like a good fix and it looks good,
> Stefan
> 


From stefan.johansson at oracle.com  Mon Oct 28 12:41:38 2019
From: stefan.johansson at oracle.com (Stefan Johansson)
Date: Mon, 28 Oct 2019 13:41:38 +0100
Subject: RFR: 8233065: PSParallelCompact::move_and_update is unused and should
 be removed
Message-ID: <7C04DE96-9918-4F5B-81C5-0ABA5AB6DEAB@oracle.com>

Hi,

Please review this small fix that removes an unused function.

JBS: https://bugs.openjdk.java.net/browse/JDK-8233065
Webrev: http://cr.openjdk.java.net/~sjohanss/8233065/00/

Summary:
The function move_and_update was not removed when its last use was removed during the removal of PermGen. 

Testing:
Built and tested through mach5 (tier1)

Thanks,
Stefan

From leo.korinth at oracle.com  Mon Oct 28 13:06:43 2019
From: leo.korinth at oracle.com (Leo Korinth)
Date: Mon, 28 Oct 2019 14:06:43 +0100
Subject: RFR: 8233065: PSParallelCompact::move_and_update is unused and
 should be removed
In-Reply-To: <7C04DE96-9918-4F5B-81C5-0ABA5AB6DEAB@oracle.com>
References: <7C04DE96-9918-4F5B-81C5-0ABA5AB6DEAB@oracle.com>
Message-ID: <9592e6db-bfb7-894d-fa24-f5982e63b8fd@oracle.com>

Looks good.

Thanks for cleaning up!
/Leo

On 28/10/2019 13:41, Stefan Johansson wrote:
> Hi,
> 
> Please review this small fix that removes an unused function.
> 
> JBS: https://bugs.openjdk.java.net/browse/JDK-8233065
> Webrev: http://cr.openjdk.java.net/~sjohanss/8233065/00/
> 
> Summary:
> The function move_and_update was not removed when its last use was removed during the removal of PermGen.
> 
> Testing:
> Built and tested through mach5 (tier1)
> 
> Thanks,
> Stefan
> 


From thomas.schatzl at oracle.com  Mon Oct 28 13:42:10 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Mon, 28 Oct 2019 14:42:10 +0100
Subject: RFR: 8233065: PSParallelCompact::move_and_update is unused and
 should be removed
In-Reply-To: <7C04DE96-9918-4F5B-81C5-0ABA5AB6DEAB@oracle.com>
References: <7C04DE96-9918-4F5B-81C5-0ABA5AB6DEAB@oracle.com>
Message-ID: <9f2a0003-c33f-1d31-f4bb-8d491817a4be@oracle.com>

Hi,

On 28.10.19 13:41, Stefan Johansson wrote:
> Hi,
> 
> Please review this small fix that removes an unused function.
> 
> JBS: https://bugs.openjdk.java.net/browse/JDK-8233065
> Webrev: http://cr.openjdk.java.net/~sjohanss/8233065/00/
> 
> Summary:
> The function move_and_update was not removed when its last use was removed during the removal of PermGen.
> 
> Testing:
> Built and tested through mach5 (tier1)
> 

   looks good.

Thomas


From shade at redhat.com  Mon Oct 28 14:49:23 2019
From: shade at redhat.com (Aleksey Shipilev)
Date: Mon, 28 Oct 2019 15:49:23 +0100
Subject: RFR 8232992: Shenandoah: Implement self-fixing interpreter LRB
In-Reply-To: <942c5c5d-fa2b-e14b-3319-0092d782da24@redhat.com>
References: <1648aef7-6df9-6f54-6601-fde9d7251187@redhat.com>
 <6ba6da68-c48f-24b1-7ca3-d2bd8a46c4b8@redhat.com>
 <942c5c5d-fa2b-e14b-3319-0092d782da24@redhat.com>
Message-ID: <15f36ca6-aa08-3ca6-dad8-314baa77fc75@redhat.com>

On 10/26/19 2:34 AM, Zhengyu Gu wrote:
> We only need to use rscratch1 when dst == r1, and there is a possibility that dst arrives in
> rscratch1 (see the SBSA::load_at() method), so I think the current assertion (dst != rscratch2) is sufficient.
> 
> However, we do need to ensure that the scratch registers are not used by load_addr, so I added:
> 
>   assert_different_registers(load_addr.base(), load_addr.index(), rscratch1);
>   assert_different_registers(load_addr.base(), load_addr.index(), rscratch2);

Why not just:
  assert_different_registers(load_addr.base(), load_addr.index(), rscratch1, rscratch2);

> Updated: http://cr.openjdk.java.net/~zgu/JDK-8232992/webrev.01/

Looks fine to me otherwise.

-- 
Thanks,
-Aleksey
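
The suggestion above relies on one pairwise-distinctness check over four registers covering everything the two three-register checks cover. A minimal sketch of that reasoning follows; it is not HotSpot's assert_different_registers. RegId and all_different are hypothetical, registers are modelled as small integers, and the function returns a bool where the real helper asserts.

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <vector>

using RegId = int;  // hypothetical register encoding

// True when no two registers in the list are the same (pairwise comparison).
inline bool all_different(std::initializer_list<RegId> regs) {
  const std::vector<RegId> v(regs);
  for (std::size_t i = 0; i < v.size(); ++i) {
    for (std::size_t j = i + 1; j < v.size(); ++j) {
      if (v[i] == v[j]) {
        return false;  // found an aliasing pair
      }
    }
  }
  return true;
}
```

For example, with hypothetical encodings base=1, index=2, rscratch1=8 and rscratch2=9, all_different({1, 2, 8, 9}) holds, while an index that aliases rscratch2, as in all_different({1, 9, 8, 9}), fails just as the second of the two separate checks would. The only extra pair the combined form compares is rscratch1 against rscratch2, which are distinct by definition.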


From erik.osterlund at oracle.com  Mon Oct 28 15:29:35 2019
From: erik.osterlund at oracle.com (=?UTF-8?Q?Erik_=c3=96sterlund?=)
Date: Mon, 28 Oct 2019 16:29:35 +0100
Subject: RFR: 8233073: Make BitMap accessors more memory ordering friendly
Message-ID: <1b19522f-534d-ca6b-4e97-f837e4ab7212@oracle.com>

Hi,

I need the accessors on BitMap to be more explicitly aware of memory 
ordering, to fix a bug in ZGC for AArch64 
(https://bugs.openjdk.java.net/browse/JDK-8233061).
In particular, I need failed bit sets to still have acquire semantics, 
and I need a getter with acquire semantics.

My intention is to solve the problem by making the relevant BitMap 
accessors accept explicitly passed in memory ordering parameters, and 
utilize them. I draw the line of conservativeness at supporting 
IRIW-consistent loads. Having spent a great deal of time finding a 
single algorithm that breaks due to IRIW-consistency violations, and 
knowing the complexity of algorithms that actually can break due to 
that, I would be *very* surprised if we got anywhere close to that. 
Therefore, acquiring loads are the most conservative loads I support. 
This is explicitly stated in the API, so that anyone that actually 
relies on IRIW consistency in the future can reconsider that, and add a 
mode that fences before loads on nMCA machines.

The main points of controversy with this patch, where I expect people to 
have wildly different opinions and hopefully get at least a little bit 
upset are the following:

1) For the same reason that our implementation of Atomic::cmpxchg does 
not take one ordering for a successful CAS and another for a failed CAS, 
unlike its C++11 atomic counterpart, I do not do so in the par_set_bit 
API either. In the Atomic API, this was very much intentional, because it is 
tricky to reason about the subtle effects of having a relaxed failed CAS 
and a conservative success. In fact, it's a bug of precisely that nature I 
am chasing. Therefore, I wish to transfer that same reasoning to the 
par_set_bit API, and not allow passing in a weaker failing memory 
ordering. A consequence of this is that I have made the uses of this API 
more conservative for failed bit flips than they were in the past. However, 
this new API allows relaxing the real pain point of the API: the success 
case (with its bi-directional full fencing semantics). So I expect it 
can be applied to make RMO architectures happier where it really matters 
in the end. However, I will not attempt to prove that relaxing these 
calls is okay in various places with this patch: that is outside of my 
scope, I'm merely adding API hooks for allowing that.

2) The default strength on the getter is memory_order_relaxed and not 
memory_order_conservative. After looking at uses, I realize it's used 
mostly in single threaded contexts by compiler code, and there is 
seemingly only a single use in the VM that cares about having acquire 
(ZGC, and it's broken today). While letting the frequency of uses decide 
what is the default rather than what is safest is not something I would 
normally do, it does feel like since the norm is so vastly in favour of 
the relaxed variant, I don't want to let the one ZGC use case clutter 
half the VM with explicitly relaxing the load. I am okay with reverting 
that decision if people want me to.
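
For readers who want the shape of the two points above in code, here is a
minimal standalone sketch. It is not the HotSpot code and not the webrev:
it uses std::atomic and std::memory_order in place of HotSpot's Atomic and
atomic_memory_order (with seq_cst standing in for
memory_order_conservative), and the class name SketchBitMap, the helper
load_order() and the storage layout are assumptions made for illustration.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for BitMap, for illustration only.
class SketchBitMap {
  std::vector<std::atomic<uint64_t>> _words;

  static uint64_t bit_mask(size_t index) { return uint64_t(1) << (index % 64); }

  // Ordering to use for a pure load when the caller asked for 'order'.
  // Release-style orderings are not valid on loads, so they are weakened.
  static std::memory_order load_order(std::memory_order order) {
    if (order == std::memory_order_release) return std::memory_order_relaxed;
    if (order == std::memory_order_acq_rel) return std::memory_order_acquire;
    return order;
  }

public:
  explicit SketchBitMap(size_t bits) : _words((bits + 63) / 64) {
    for (auto& word : _words) word.store(0, std::memory_order_relaxed);
  }

  // Getter: relaxed by default (point 2). The proposal documents acquire as
  // the strongest supported ordering; this sketch does not enforce that.
  bool at(size_t index,
          std::memory_order order = std::memory_order_relaxed) const {
    assert(index / 64 < _words.size());
    return (_words[index / 64].load(load_order(order)) & bit_mask(index)) != 0;
  }

  // Setter: one ordering covers both outcomes (point 1). The same 'order'
  // governs the path where the bit is already set and the successful CAS.
  // Returns true if this call changed the bit.
  bool par_set_bit(size_t index,
                   std::memory_order order = std::memory_order_seq_cst) {
    assert(index / 64 < _words.size());
    std::atomic<uint64_t>& word = _words[index / 64];
    const uint64_t mask = bit_mask(index);
    uint64_t old_val = word.load(load_order(order));
    while ((old_val & mask) == 0) {
      // The single-order overload derives the failure ordering from 'order'.
      if (word.compare_exchange_weak(old_val, old_val | mask, order)) {
        return true;
      }
      // A failed CAS refreshed old_val; loop and re-check the bit.
    }
    return false;
  }
};
```

A caller that only needs acquire on the "already set" path would write
bm.par_set_bit(i, std::memory_order_acquire); whether any existing call
site can actually be relaxed that way is not something this sketch argues.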

CR:
http://cr.openjdk.java.net/~eosterlund/8233073/webrev.00/

Bug:
https://bugs.openjdk.java.net/browse/JDK-8233073

Thanks,
/Erik


From zgu at redhat.com  Mon Oct 28 15:35:49 2019
From: zgu at redhat.com (Zhengyu Gu)
Date: Mon, 28 Oct 2019 11:35:49 -0400
Subject: RFR 8232992: Shenandoah: Implement self-fixing interpreter LRB
In-Reply-To: <15f36ca6-aa08-3ca6-dad8-314baa77fc75@redhat.com>
References: <1648aef7-6df9-6f54-6601-fde9d7251187@redhat.com>
 <6ba6da68-c48f-24b1-7ca3-d2bd8a46c4b8@redhat.com>
 <942c5c5d-fa2b-e14b-3319-0092d782da24@redhat.com>
 <15f36ca6-aa08-3ca6-dad8-314baa77fc75@redhat.com>
Message-ID: <48c982fe-9dce-b122-c1fd-6e716778d4f2@redhat.com>


On 10/28/19 10:49 AM, Aleksey Shipilev wrote:
> On 10/26/19 2:34 AM, Zhengyu Gu wrote:
>> We only need to use rscratch1 when dst == r1, and there is a possibility that dst comes in in
>> rscratch1 (see SBSA::load_at() method), so I think the current assertion (dst != rscratch2) is sufficient.
>>
>> However, we do need to ensure scratch registers are not used by load_addr, so added:
>>
>>    assert_different_registers(load_addr.base(), load_addr.index(), rscratch1);
>>    assert_different_registers(load_addr.base(), load_addr.index(), rscratch2);
> 
> Why not just:
>    assert_different_registers(load_addr.base(), load_addr.index(), rscratch1, rscratch2);

Yep, fixed and pushed.

Thanks,

-Zhengyu

> 
>> Updated: http://cr.openjdk.java.net/~zgu/JDK-8232992/webrev.01/
> 
> Looks fine to me otherwise.
> 


From erik.osterlund at oracle.com  Mon Oct 28 15:53:05 2019
From: erik.osterlund at oracle.com (=?UTF-8?Q?Erik_=c3=96sterlund?=)
Date: Mon, 28 Oct 2019 16:53:05 +0100
Subject: RFR: 8233061: ZGC: Enforce memory ordering in segmented bit maps
Message-ID: <311b863b-e2dc-c56c-7115-d13afb7c4f4b@oracle.com>

Hi,

In ZGC, bitmaps are lazily cleared in a segmented fashion. In this 
scheme, liveness is determined by looking at a counter, a segment bit 
map and finally the flat bit map structure. The accesses for the various 
stages need to be ordered properly. This patch sprinkles some 
OrderAccess calls to enforce this ordering.
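
To illustrate what "ordered properly" means for such a three-stage lookup,
here is a hedged sketch. It is not ZGC's live map code: the class
SegmentedLiveMap, its sizes and its single-writer assumption are invented
for this example, and it uses std::atomic orderings rather than the
OrderAccess calls the patch adds.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical lazily cleared bitmap: a counter, one "segment live" word,
// and 64 segments of one 64-bit word each (indices 0..4095).
class SegmentedLiveMap {
  static constexpr size_t kSegments = 64;
  std::atomic<uint32_t> _seqnum{0};
  std::atomic<uint64_t> _segment_live{0};
  std::atomic<uint64_t> _bits[kSegments];

public:
  SegmentedLiveMap() {
    // Start with stale garbage on purpose: nothing is cleared eagerly.
    for (auto& word : _bits) word.store(~uint64_t(0), std::memory_order_relaxed);
  }

  // Reader: counter first, then segment bit, then the flat bitmap. The two
  // acquire loads keep the later loads from being satisfied with values
  // older than the clearing they are meant to observe.
  bool is_marked(uint32_t current_seq, size_t index) const {
    const size_t segment = index / 64;
    const uint64_t bit = uint64_t(1) << (index % 64);
    if (_seqnum.load(std::memory_order_acquire) != current_seq) return false;
    const uint64_t live = _segment_live.load(std::memory_order_acquire);
    if ((live & (uint64_t(1) << segment)) == 0) return false;
    return (_bits[segment].load(std::memory_order_relaxed) & bit) != 0;
  }

  // Writer: assumes a single marking thread (an assumption of this sketch).
  // Each stage is published with release after its clearing is done.
  void mark(uint32_t current_seq, size_t index) {
    const size_t segment = index / 64;
    const uint64_t seg_bit = uint64_t(1) << segment;
    if (_seqnum.load(std::memory_order_relaxed) != current_seq) {
      _segment_live.store(0, std::memory_order_relaxed);
      _seqnum.store(current_seq, std::memory_order_release);
    }
    if ((_segment_live.load(std::memory_order_relaxed) & seg_bit) == 0) {
      _bits[segment].store(0, std::memory_order_relaxed);  // lazy clear
      _segment_live.fetch_or(seg_bit, std::memory_order_release);
    }
    _bits[segment].fetch_or(uint64_t(1) << (index % 64),
                            std::memory_order_relaxed);
  }
};
```

Without the acquire/release pairs, a reader on a weakly ordered machine
could see the new counter value but a stale segment word, or a set segment
bit but not-yet-cleared bitmap contents.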

Out of curiosity, I disassembled libjvm.so with and without this patch 
to see if the reordering has bitten us in practice on x86_64. 
Fortunately, according to my analysis, it has not; we seem to have been 
lucky. But there is a lot of machine code, so I could have missed 
something. However, given that we now have an AArch64 port which is 
definitely affected by this problem, and compilers really are free to do 
whatever they want to in the future, it seems in order to enforce this 
explicitly.

This patch depends on https://bugs.openjdk.java.net/browse/JDK-8233073 
which exposes some memory ordering aware getters on BitMap. I did not 
want to just wrap the existing API in ZGC, so I split that out to a 
separate RFE.

CR:
http://cr.openjdk.java.net/~eosterlund/8233061/webrev.00/

Bug:
https://bugs.openjdk.java.net/browse/JDK-8233061

Thanks,
/Erik


From erik.osterlund at oracle.com  Mon Oct 28 16:11:42 2019
From: erik.osterlund at oracle.com (=?UTF-8?Q?Erik_=c3=96sterlund?=)
Date: Mon, 28 Oct 2019 17:11:42 +0100
Subject: RFR: 8224817: Implementation of JEP 364: ZGC on macOS
In-Reply-To: <adc7cecf-80ae-01e6-fcef-f8588fcb940a@oracle.com>
References: <adc7cecf-80ae-01e6-fcef-f8588fcb940a@oracle.com>
Message-ID: <837c9e23-6068-53b5-9c50-7880a0f375c7@oracle.com>

Hi,

After some internal discussions with Per and Stefan, some refactorings 
have been made:

1) Use mmap wherever possible, instead of mach_vm_map, for consistency. 
And only use mach_vm_remap from a wrapper function to map in views.
2) Move the pmem segments up one level so that producer and consumer of 
the segments are on the same level, and let the virtual "file" know only 
about offsets.
3) Minor polishing.

Incremental:
http://cr.openjdk.java.net/~eosterlund/8224817/webrev.00..01/

Full:
http://cr.openjdk.java.net/~eosterlund/8224817/webrev.01/

Thanks,
/Erik

On 2019-10-24 12:38, erik.osterlund at oracle.com wrote:
> Hi,
>
> Now that some curling has been performed, paving way for this patch:
>
>     8229027: Improve how JNIHandleBlock::oops_do distinguishes oops 
> from non-oops
>     8229278: Improve hs_err location printing to assume less about GC 
> internals
>     8229189: Improve JFR leak profiler tracing to deal with 
> discontiguous heaps
>     8224815: Remove non-GC uses of CollectedHeap::is_in_reserved()
>     8224820: ZGC: Support discontiguous heap reservations
>
> ...the remaining thing to do is plugging in a few platform specific 
> ZGC files. This patch does that.
> Decided to go with mach_vm_map/mach_vm_remap to implement 
> multi-mapping. Previously I didn't want to do that as I couldn't 
> figure out how to mach_vm_remap memory on top of reserved VA (acquired 
> using mmap). But apparently VM_FLAGS_OVERWRITE was the missing 
> ingredient there. With that in place, dodging the terrible ftruncate 
> implementation on macOS seemed like a good idea. That also implies 
> this port supports large pages (unlike other GCs on macOS today). Yay!
>
> CR:
> http://cr.openjdk.java.net/~eosterlund/8224817/webrev.00/
>
> Bug:
> https://bugs.openjdk.java.net/browse/JDK-8229358
>
> Thanks,
> /Erik


From erik.osterlund at oracle.com  Mon Oct 28 16:38:02 2019
From: erik.osterlund at oracle.com (=?UTF-8?Q?Erik_=c3=96sterlund?=)
Date: Mon, 28 Oct 2019 17:38:02 +0100
Subject: RFR: 8230661: ZGC: Stop reloading oops in load barriers
Message-ID: <954589ac-147d-9419-f5eb-2a4fceb61cf5@oracle.com>

Hi,

In ZGC, an oop is first loaded somewhere, by e.g. JIT compiled code. 
Then it passes a load barrier that typically does not take a slow path. 
But when it does take a slow path, the oop is sometimes reloaded, at 
historically three different places, and now two places.

1) We used to do that as part of the mechanism that transferred 
execution to the slow path because it was easier to write that stub code 
if the original oop died. Since then, the compiler slow paths have been 
rewritten to not reload the oop.

2) Once in the slow path, we sometimes reload weak oops during the 
resurrection block window, because there used to be a race when it 
closed. After concurrent class unloading was integrated, there is a 
thread-local handshake before closing the resurrection block window. 
Therefore, that race no longer exists (when class unloading is used).

3) Once the final oop of a slow path has been determined, self-healing 
kicks in. The self-healing CAS may fail. When it does, the oop is 
reloaded. But this is completely unnecessary.

With obstacle 1 gone, and 2 and 3 having no reason to be in the code any 
more, I propose to get rid of all reloading of the oops in the slow 
paths, so that it becomes easier to reason about the code. The object 
captured by the original load is then always the same object as the 
object found after the load barrier completes, although possibly with a 
new bit representation.
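
A hedged sketch of what "no reloading" looks like for case 3 (the failed
self-healing CAS). This is not the ZGC barrier code: the colored-pointer
encoding (kGoodBit), slow_path() and load_barrier() are hypothetical
stand-ins chosen only to show the control flow.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical encoding: the low bit marks a pointer as "good".
constexpr uintptr_t kGoodBit = 1;

inline bool is_good(uintptr_t oop) { return (oop & kGoodBit) != 0; }

// Stand-in for the real slow path (mark/relocate/remap): it yields the same
// object with a new bit representation.
inline uintptr_t slow_path(uintptr_t oop) { return oop | kGoodBit; }

// 'loaded' is the value the caller already read from 'field'.
inline uintptr_t load_barrier(std::atomic<uintptr_t>* field, uintptr_t loaded) {
  if (is_good(loaded)) {
    return loaded;  // fast path, nothing to heal
  }
  const uintptr_t healed = slow_path(loaded);
  uintptr_t expected = loaded;
  // Self-healing: try to store the good pointer back into the field.
  // If the CAS fails, the field was changed by someone else. The field is
  // NOT reloaded; the result is still derived from the oop originally loaded.
  field->compare_exchange_strong(expected, healed);
  return healed;
}
```

The property this preserves is the one stated above: the value returned by
the barrier always denotes the object captured by the original load, even
when the field has since been overwritten.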

Bug:
https://bugs.openjdk.java.net/browse/JDK-8230661

CR:
https://bugs.openjdk.java.net/browse/JDK-8230661

Thanks,
/Erik


From erik.osterlund at oracle.com  Mon Oct 28 16:44:14 2019
From: erik.osterlund at oracle.com (=?UTF-8?Q?Erik_=c3=96sterlund?=)
Date: Mon, 28 Oct 2019 17:44:14 +0100
Subject: RFR: 8230661: ZGC: Stop reloading oops in load barriers
In-Reply-To: <954589ac-147d-9419-f5eb-2a4fceb61cf5@oracle.com>
References: <954589ac-147d-9419-f5eb-2a4fceb61cf5@oracle.com>
Message-ID: <d4d1e7fa-a344-4457-ecea-6573a27cec8f@oracle.com>

Oops. CR link was a bug link. For anyone that couldn't figure out what 
the CR link could possibly be, here it is:
http://cr.openjdk.java.net/~eosterlund/8230661/webrev.00/

/Erik

On 2019-10-28 17:38, Erik Österlund wrote:
> Hi,
>
> In ZGC, an oop is first loaded somewhere, by e.g. JIT compiled code. 
> Then it passes a load barrier that typically does not take a slow 
> path. But when it does take a slow path, the oop is sometimes 
> reloaded, at historically three different places, and now two places.
>
> 1) We used to do that as part of the mechanism that transferred 
> execution to the slow path because it was easier to write that stub 
> code if the original oop died. Since then, the compiler slow paths 
> have been rewritten to not reload the oop.
>
> 2) Once in the slow path, we sometimes reload weak oops during the 
> resurrection block window, because there used to be a race when it 
> closed. After concurrent class unloading was integrated, there is a 
> thread-local handshake before closing the resurrection block window. 
> Therefore, that race no longer exists (when class unloading is used).
>
> 3) Once the final oop of a slow path has been determined, self-healing 
> kicks in. The self-healing CAS may fail. When it does, the oop is 
> reloaded. But this is completely unnecessary.
>
> With obstacle 1 gone, and 2 and 3 having no reason to be in the code 
> any more, I propose to get rid of all reloading of the oops in the 
> slow paths, so that it becomes easier to reason about the code. The 
> object captured by the original load is then always the same object 
> as the object found after the load barrier completes, although 
> possibly with a new bit representation.
>
> Bug:
> https://bugs.openjdk.java.net/browse/JDK-8230661
>
> CR:
> https://bugs.openjdk.java.net/browse/JDK-8230661
>
> Thanks,
> /Erik


From stefan.johansson at oracle.com  Mon Oct 28 19:03:36 2019
From: stefan.johansson at oracle.com (Stefan Johansson)
Date: Mon, 28 Oct 2019 20:03:36 +0100
Subject: RFR: 8220465: Use shadow regions for faster ParallelGC full GCs 
In-Reply-To: <a1eeabca-f70e-f01a-9459-12bf913688d4@oracle.com>
References: <CAKSDcxsPcvigLaDTRyJALk9O0r-JsYV1CcwFzb3KaM98+LVLcg@mail.gmail.com>
 <CAKSDcxvTzfq1eR0DXi8iLQ4bNP8LcjsV5Px59-EMOgR0zaHvtQ@mail.gmail.com>
 <4F02DD53-EA98-4A1A-B871-C6E9D9610B2C@oracle.com>
 <9B69AFD1-7AE2-4B50-BFCF-C9C6E2594240@oracle.com>
 <CAKSDcxs131XdbHHErgbH1UDYX6_+=CSa-4dt4n5LgbyPVgjO_w@mail.gmail.com>
 <E4274448-26B2-46B8-883C-70FC7AFFB23B@oracle.com>
 <CAKSDcxvHwOWtAEm8TPPvO=C8q9LRwJRUQX6nbh-HSZXBKKwvYg@mail.gmail.com>
 <fb385d33-c420-4c81-320c-9aa1dad64a44@oracle.com>
 <CAKSDcxv5eccaV54NThgviLK+84U-Z62U0CYr3s+8ncNR1cna5w@mail.gmail.com>
 <8ef5b52e-d6fc-3073-5ca7-44c87c1eb981@oracle.com>
 <CAKSDcxsQWJ1tpnsc8UnN3E=XJfVwHEdE9WSd_=6KR_tLSHW6rQ@mail.gmail.com>
 <92277aab-0578-9e2c-3f4f-55ae1e8c94a9@oracle.com>
 <CAKSDcxu_t+Ka0LUU3WSzdR52-_+rsvKo_Vxn8av=duFmS4EPyA@mail.gmail.com>
 <400df998-171a-5bbe-9f3e-01af1781afb4@oracle.com>
 <e72f06af-8847-844b-107c-afd15e01f71b@oracle.com>
 <CAKSDcxsm3-6u0arR4KCRGF=R-1sD9XJAS3Fb98NxzcPASEpGwg@mail.gmail.com>
 <a1eeabca-f70e-f01a-9459-12bf913688d4@oracle.com>
Message-ID: <955C4FA4-FC18-446C-851A-0D04A916D88D@oracle.com>

Hi Haoyu, 

I've looked through the patch in detail now and created a new webrev at:
http://cr.openjdk.java.net/~sjohanss/8220465/01/

I took the liberty of removing the removal of move_and_update from your patch since I'm addressing that separately in JDK-8233065. The webrev above is still based on that removal, but I expect that to be pushed tomorrow or Wednesday so that should be fine. 

I also changed the subject to make it more clear that this is now a review of:
https://bugs.openjdk.java.net/browse/JDK-8220465

Regarding the current patch, I think that it looks good in general, but I thought a bit more around how to share stuff between the closures and I agree that adding those extra virtual functions doesn't really feel worth it. I'm wondering about a solution where we revert back to letting destination be the "real destination" (not ever pointing to the shadow region) and add a copy_destination which is destination + offset. To make this work the normal MoveAndUpdateClosure would also have an offset, but it would always be 0. If do_addr() is then updated to use the copy_destination() in some places we might end up with something pretty nice, but maybe I'm missing something.

I also realized that the current patch will trigger an assert because destination is expected not to be the shadow address:
#  Internal Error (open/src/hotspot/share/gc/parallel/psParallelCompact.cpp:3045), pid=12649, tid=12728
#  assert(src_cp->destination() == destination) failed: first live obj in the space must match the destination

So this also suggests that we should keep destination() returning the real destination.
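
A rough sketch of the destination/copy_destination split suggested above,
under stated assumptions: the class names MoveClosureSketch and
ShadowClosureSketch, the advance() helper and the use of char* instead of
HeapWord* are all hypothetical, and nothing here is taken from the webrev.

```cpp
#include <cassert>
#include <cstddef>

typedef char* Address;  // stand-in for HeapWord*

// Base closure: destination() is always the real destination. The copy
// target is destination + offset, and the offset is 0 for a normal move.
class MoveClosureSketch {
  Address   _destination;
  ptrdiff_t _offset;

public:
  explicit MoveClosureSketch(Address destination, ptrdiff_t offset = 0)
    : _destination(destination), _offset(offset) {}

  Address destination() const      { return _destination; }
  Address copy_destination() const { return _destination + _offset; }

  // Called after an object of 'bytes' bytes has been processed.
  void advance(size_t bytes) { _destination += bytes; }
};

// Shadow variant: same bookkeeping, but copies land in the shadow region.
// Both addresses are assumed to lie within the same heap reservation, so
// that subtracting them is well defined.
class ShadowClosureSketch : public MoveClosureSketch {
public:
  ShadowClosureSketch(Address destination, Address shadow)
    : MoveClosureSketch(destination, shadow - destination) {}
};
```

With this split, code that checks or records the final location (such as
the assert quoted above) keeps using destination(), and only the actual
copy uses copy_destination().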

Some other comments:
src/hotspot/share/gc/parallel/psParallelCompact.cpp
...
3383 void ShadowClosure::complete_region(ParCompactionManager *cm, HeapWord *dest_addr,
3384                                     PSParallelCompact::RegionData *region_ptr) {
3385   assert(region_ptr->shadow_state() == ParallelCompactData::RegionData::FINISH, "Region should be finished");

This assertion will also trigger when running with a debug build; at this point the shadow state should be SHADOW, not FINISH.
...

src/hotspot/share/gc/parallel/psParallelCompact.hpp
...
 632 inline bool ParallelCompactData::RegionData::mark_filled() {
 633   return Atomic::cmpxchg(FILLED, &_shadow_state, SHADOW) == SHADOW;
 634 }

Since we never check the return value here we should make it void and maybe instead add an assert that the return value is SHADOW.
...
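
A minimal sketch of the void-plus-assert shape suggested here. It uses
std::atomic rather than HotSpot's Atomic::cmpxchg, and the class
RegionDataSketch, the enum values and the accessor are assumptions for
illustration; only the state names SHADOW, FILLED and FINISH come from the
patch under review.

```cpp
#include <atomic>
#include <cassert>

class RegionDataSketch {
public:
  // Numeric values are arbitrary in this sketch.
  enum ShadowState { SHADOW, FILLED, FINISH };

  explicit RegionDataSketch(ShadowState initial) : _shadow_state(initial) {}

  // No return value: callers never looked at it. The expectation that the
  // region was in the SHADOW state is checked with an assert instead.
  void mark_filled() {
    int expected = SHADOW;
    const bool swapped = _shadow_state.compare_exchange_strong(expected, FILLED);
    assert(swapped && "region must be in SHADOW state when marked filled");
    (void)swapped;  // keep release builds free of unused-variable warnings
  }

  int shadow_state() const { return _shadow_state.load(); }

private:
  std::atomic<int> _shadow_state;
};
```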

When you have addressed these comments, would it be possible to include both the full patch and the incremental changes from the current version? That makes it easier for the reviewers to see what changed between versions of the patch.

Thanks,
Stefan

> 24 okt. 2019 kl. 14:16 skrev Stefan Johansson <stefan.johansson at oracle.com>:
> 
> Hi Haoyu,
> 
> On 2019-10-23 17:15, Haoyu Li wrote:
>> Hi Stefan,
>> Thanks for your constructive feedback. I've addressed all the issues you mentioned, and the updated patch is attached in this email.
> Nice, I will look at the patch next week, but I'll shortly answer your questions right away.
> 
>> While refining the patch, I came across a couple of questions:
>> 1) Now the MoveAndUpdateClosure and ShadowClosure assume the destination address is the very beginning of a region, instead of an arbitrary address as it used to be. However, there is an unused function named PSParallelCompact::move_and_update() that uses the MoveAndUpdateClosure to process a region from its middle, which conflicts with the assumption. I notice that you removed this function in your patch, and so did I in the updated patch. Does it matter?
> Yes, I found this function during my code review and it should be removed, but I think that should be handled as a separate issue. We can do this removal before this patch goes in.
> 
>> 2) Using the same do_addr() in MoveAndUpdateClosure and ShadowClosure is doable, but it does not reuse all the code neatly, because storing the address of the shadow region in _destination requires extra virtual functions to handle allocating blocks in the start_array and setting addresses of deferred objects. In particular, allocate_blocks() and set_deferred_object_for() are added in both closures. Is it worth avoiding the use of _offset to calculate the shadow_destination?
> Ok, sounds like it might be better to have specific do_addr() functions then. I'll think some more around this when reviewing the new patch in depth.
> 
>> If there are any problems with this patch, please contact me anytime. I'm more than happy to keep improving the code. Thanks again for reviewing.
>> 
> Sounds good, thanks,
> Stefan


From kim.barrett at oracle.com  Tue Oct 29 01:54:16 2019
From: kim.barrett at oracle.com (Kim Barrett)
Date: Mon, 28 Oct 2019 21:54:16 -0400
Subject: RFR: 8233073: Make BitMap accessors more memory ordering friendly
In-Reply-To: <1b19522f-534d-ca6b-4e97-f837e4ab7212@oracle.com>
References: <1b19522f-534d-ca6b-4e97-f837e4ab7212@oracle.com>
Message-ID: <104E3C62-435C-4F31-87EB-D3EBC34634EE@oracle.com>

Should this review be happening on hotspot-dev rather than
hotspot-gc-dev?  GC is not the only BitMap client; compiler uses them
too (and generally rather differently).

> On Oct 28, 2019, at 11:29 AM, Erik Österlund <erik.osterlund at oracle.com> wrote:
> 
> Hi,
> 
> I have need for accessors on BitMap being more explicitly memory ordering aware, to fix a bug in ZGC for AArch64 (https://bugs.openjdk.java.net/browse/JDK-8233061).
> In particular, I need failed bit sets to still have acquire semantics, and I need a getter with acquire semantics.
> 
> My intention is to solve the problem by making the relevant BitMap accessors accept explicitly passed in memory ordering parameters, and utilize them. I draw the line of conservativeness at supporting IRIW-consistent loads. Having spent a great deal of time finding a single algorithm that breaks due to IRIW-consistency violations, and knowing the complexity of algorithms that actually can break due to that, I would be *very* surprised if we got anywhere close to that. Therefore, acquiring loads are the most conservative loads I support. This is explicitly stated in the API, so that anyone that actually relies on IRIW consistency in the future can reconsider that, and add a mode that fences before loads on nMCA machines.
> 
> The main points of controversy with this patch, where I expect people to have wildly different opinions and hopefully get at least a little bit upset are the following:
> 
> 1) For the same reason that our implementation of Atomic::cmpxchg does not supply both one ordering for success and one for failed CAS, unlike the C++11 atomic counterpart, I do not do so either in the par_set_bit API. In the Atomic API, this was very much intentional, because it is tricky to reason about the subtle effects of having relaxed failed CAS and conservative success. In fact, it's a bug of precisely that nature I am chasing. Therefore, I wish to transfer that same reasoning to the par_set_bit API, and not allow passing in a weaker failing memory ordering. A consequence of this is that I have made the uses of this API more conservative for failed bit flips than it was in the past. However, this new API allows relaxing the real pain point of the API: the success case (with its bi-directional full fencing semantics). So I expect it can be applied to make RMO architectures happier where it really matters in the end. However, I will not attempt to prove that relaxing these calls is okay in various places with this patch: that is outside of my scope, I'm merely adding API hooks for allowing that.
> 
> 2) The default strength on the getter is memory_order_relaxed and not memory_order_conservative. After looking at uses, I realize it's used mostly in single threaded contexts by compiler code, and there is seemingly only a single use in the VM that cares about having acquire (ZGC, and it's broken today). While letting the frequency of uses decide what is the default rather than what is safest is not something I would normally do, it does feel like since the norm is so vastly in favour of the relaxed variant, I don't want to let the one ZGC use case clutter half the VM with explicitly relaxing the load. I am okay with reverting that decision if people want me to.
> 
> CR:
> http://cr.openjdk.java.net/~eosterlund/8233073/webrev.00/
> 
> Bug:
> https://bugs.openjdk.java.net/browse/JDK-8233073
> 
> Thanks,
> /Erik

------------------------------------------------------------------------------
src/hotspot/share/utilities/bitMap.hpp 
207   bool at(idx_t index, atomic_memory_order memory_order = memory_order_relaxed) const;

My initial reaction here is that I'd prefer adding par_at() rather
than giving at() a memory order argument.  This would also address the
question of what the default should be.  For at(), it's nonatomic.
For par_at() it's acquire.

That would also avoid imposing volatile ordering on at().

As you said, existing uses of at() are relaxed/nonatomic.  The code
rearrangement for MarkBitMap::is_marked() makes me wonder if any of
the calls should be acquire ordered, but obviously none are now...

------------------------------------------------------------------------------
src/hotspot/share/utilities/bitMap.hpp 
205   // The memory ordering goes up to memory_order_acquire, but not higher. It is
206   // assumed that users of the BitMap API will never rely on IRIW consistency.

I think what this means is that memory_order_seq_cst
(memory_order_conservative in HotSpot) isn't supported? So just as we
only have Atomic::load (relaxed) and Atomic::load_acquire (acquire).
That seems okay. But if we aren't going to support the stronger
semantics, I don't think this should permit the corresponding memory
order value.

C++11 specifies that atomic load operations cannot have a memory order
of memory_order_release or memory_order_acq_rel. (Similarly, store
operations cannot have a memory order of memory_order_consume,
memory_order_acquire, or memory_order_acq_rel. That isn't relevant for
this change, as all the modifying operations here are RMW.)

So I think we should just be explicit that only relaxed and acquire
are supported here.  (And actually make that true; see next comment.)

------------------------------------------------------------------------------
src/hotspot/share/utilities/bitMap.inline.hpp 
 55 inline bool BitMap::at(idx_t index, atomic_memory_order memory_order) const {
...
 58   return (load_word_ordered(addr, memory_order) & bit_mask(index)) != 0;

This is using the load_word_ordered helper, but the behavior of that
function is designed to support the RMW operations, and I think isn't
really right for at() (see previous comment). The simplest solution to
get what I'm suggesting would be to add an assert here that the
memory_order is either relaxed or acquire.
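
The two suggestions so far (a separate par_at(), and an assert that only
relaxed and acquire are accepted) could be combined roughly as follows.
This is a sketch only: ParBitMapSketch and set_bit() are hypothetical, and
std::atomic is used in place of HotSpot's primitives. Standard C++ cannot
mix plain and atomic accesses to the same word, so the plain at() is
approximated here with a relaxed load.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

class ParBitMapSketch {
  std::vector<std::atomic<uint64_t>> _words;

  static uint64_t bit_mask(size_t index) { return uint64_t(1) << (index % 64); }

public:
  explicit ParBitMapSketch(size_t bits) : _words((bits + 63) / 64) {
    for (auto& word : _words) word.store(0, std::memory_order_relaxed);
  }

  // Hypothetical setter so the example has something to read back.
  void set_bit(size_t index) {
    _words[index / 64].fetch_or(bit_mask(index), std::memory_order_release);
  }

  // at(): no ordering argument at all. The relaxed load stands in for the
  // plain, unordered read used by single-threaded clients.
  bool at(size_t index) const {
    return (_words[index / 64].load(std::memory_order_relaxed) &
            bit_mask(index)) != 0;
  }

  // par_at(): acquire by default; anything other than relaxed or acquire
  // is rejected rather than silently accepted.
  bool par_at(size_t index,
              std::memory_order order = std::memory_order_acquire) const {
    assert(order == std::memory_order_relaxed ||
           order == std::memory_order_acquire);
    return (_words[index / 64].load(order) & bit_mask(index)) != 0;
  }
};
```

With this split, the question of the right default for at() goes away:
single-threaded callers keep calling at() unchanged, and the one caller
that needs ordering opts in by name.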

------------------------------------------------------------------------------
src/hotspot/share/gc/shared/markBitMap.inline.hpp
71   return _bm.at(addr_to_offset(addr), memory_order_relaxed);

The memory order argument isn't needed with the current default, and
wouldn't even be permitted with the above suggestion to add par_at.

------------------------------------------------------------------------------

I'm not understanding part of the problem description though.  You say

  ... have made the uses of this API more conservative for failed bit
  flips than it was in the past. 

But the pre-existing unordered cases in the setting functions (e.g.
don't go through cmpxchg) are those where the bit is already set to
the desired value, so there's no failure to change the bit involved.
It seems reasonable to me that an acquire (at least) is usually
desirable on that path, for reasons similar to why one wants an
acquire on the outside-the-lock test when using the Double Checked
Locking pattern. But that's not what was said, so I'm not sure I'm
understanding the point.
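
For reference, the Double Checked Locking analogy looks like this in
portable C++. The sketch is generic and not HotSpot code; the Config and
LazyConfig names and the value 42 are made up for the example.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>

struct Config { int value; };

class LazyConfig {
  std::atomic<Config*> _instance{nullptr};
  std::mutex _lock;

public:
  Config* get() {
    // Outside-the-lock test. The acquire pairs with the release store below,
    // so a thread that sees a non-null pointer also sees the initialized
    // Config behind it.
    Config* p = _instance.load(std::memory_order_acquire);
    if (p == nullptr) {
      std::lock_guard<std::mutex> guard(_lock);
      // The lock already orders this second check against other initializers.
      p = _instance.load(std::memory_order_relaxed);
      if (p == nullptr) {
        p = new Config{42};
        _instance.store(p, std::memory_order_release);
      }
    }
    return p;
  }

  ~LazyConfig() { delete _instance.load(std::memory_order_relaxed); }
};
```

The "bit is already set" path of a parallel setter plays the role of the
outside-the-lock test: the caller skips the cmpxchg, so whatever ordering
it needs has to come from the load itself.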

------------------------------------------------------------------------------

The par_xxx_range operations are not being directly modified by this
change. When only 1 bit is actually involved, they'll delegate to the
conservative single-bit operations, so are changed to pick up the
acquire on the already set to the desired value path. Otherwise, they
always go through conservative cmpxchg as before.  That all seems fine.

------------------------------------------------------------------------------


From ioi.lam at oracle.com  Tue Oct 29 04:19:40 2019
From: ioi.lam at oracle.com (Ioi Lam)
Date: Mon, 28 Oct 2019 21:19:40 -0700
Subject: RFR: 8233073: Make BitMap accessors more memory ordering friendly
In-Reply-To: <104E3C62-435C-4F31-87EB-D3EBC34634EE@oracle.com>
References: <1b19522f-534d-ca6b-4e97-f837e4ab7212@oracle.com>
 <104E3C62-435C-4F31-87EB-D3EBC34634EE@oracle.com>
Message-ID: <97d0b6b0-74bf-8a97-e399-f75582294cc3@oracle.com>

CDS also uses BitMap -- always in single threaded code. So as long as 
single-threaded code doesn't get slowed down by this patch, I have no 
objection.

Thanks
- Ioi

On 10/28/19 6:54 PM, Kim Barrett wrote:
> Should this review be happening on hotspot-dev rather than
> hotspot-gc-dev?  GC is not the only BitMap client; compiler uses them
> too (and generally rather differently).
>
>> On Oct 28, 2019, at 11:29 AM, Erik Österlund <erik.osterlund at oracle.com> wrote:
>>
>> Hi,
>>
>> I have need for accessors on BitMap being more explicitly memory ordering aware, to fix a bug in ZGC for AArch64 (https://bugs.openjdk.java.net/browse/JDK-8233061).
>> In particular, I need failed bit sets to still have acquire semantics, and I need a getter with acquire semantics.
>>
>> My intention is to solve the problem by making the relevant BitMap accessors accept explicitly passed in memory ordering parameters, and utilize them. I draw the line of conservativeness at supporting IRIW-consistent loads. Having spent a great deal of time finding a single algorithm that breaks due to IRIW-consistency violations, and knowing the complexity of algorithms that actually can break due to that, I would be *very* surprised if we got anywhere close to that. Therefore, acquiring loads are the most conservative loads I support. This is explicitly stated in the API, so that anyone that actually relies on IRIW consistency in the future can reconsider that, and add a mode that fences before loads on nMCA machines.
>>
>> The main points of controversy with this patch, where I expect people to have wildly different opinions and hopefully get at least a little bit upset are the following:
>>
>> 1) For the same reason that our implementation of Atomic::cmpxchg does not supply both one ordering for success and one for failed CAS, unlike the C++11 atomic counterpart, I do not do so either in the par_set_bit API. In the Atomic API, this was very much intentional, because it is tricky to reason about the subtle effects of having relaxed failed CAS and conservative success. In fact, it's a bug of precisely that nature I am chasing. Therefore, I wish to transfer that same reasoning to the par_set_bit API, and not allow passing in a weaker failing memory ordering. A consequence of this is that I have made the uses of this API more conservative for failed bit flips than it was in the past. However, this new API allows relaxing the real pain point of the API: the success case (with its bi-directional full fencing semantics). So I expect it can be applied to make RMO architectures happier where it really matters in the end. However, I will not attempt to prove that relaxing these calls is okay in various places with this patch: that is outside of my scope, I'm merely adding API hooks for allowing that.
>>
>> 2) The default strength on the getter is memory_order_relaxed and not memory_order_conservative. After looking at uses, I realize it's used mostly in single threaded contexts by compiler code, and there is seemingly only a single use in the VM that cares about having acquire (ZGC, and it's broken today). While letting the frequency of uses decide what is the default rather than what is safest is not something I would normally do, it does feel like since the norm is so vastly in favour of the relaxed variant, I don't want to let the one ZGC use case clutter half the VM with explicitly relaxing the load. I am okay with reverting that decision if people want me to.
>>
>> CR:
>> http://cr.openjdk.java.net/~eosterlund/8233073/webrev.00/
>>
>> Bug:
>> https://bugs.openjdk.java.net/browse/JDK-8233073
>>
>> Thanks,
>> /Erik
> ------------------------------------------------------------------------------
> src/hotspot/share/utilities/bitMap.hpp
> 207   bool at(idx_t index, atomic_memory_order memory_order = memory_order_relaxed) const;
>
> My initial reaction here is that I'd prefer adding par_at() rather
> than giving at() a memory order argument.  This would also address the
> question of what the default should be.  For at(), it's nonatomic.
> For par_at() it's acquire.
>
> That would also avoid imposing volatile ordering on at().
>
> As you said, existing uses of at() are relaxed/nonatomic.  The code
> rearrangement for MarkBitMap::is_marked() makes me wonder if any of
> the calls should be acquire ordered, but obviously none are now...
>
> ------------------------------------------------------------------------------
> src/hotspot/share/utilities/bitMap.hpp
> 205   // The memory ordering goes up to memory_order_acquire, but not higher. It is
> 206   // assumed that users of the BitMap API will never rely on IRIW consistency.
>
> I think what this means is that memory_order_seq_cst
> (memory_order_conservative in HotSpot) isn't supported? So just as we
> only have Atomic::load (relaxed) and Atomic::load_acquire (acquire).
> That seems okay. But if we aren't going to support the stronger
> semantics, I don't think this should permit the corresponding memory
> order value.
>
> C++11 specifies that atomic load operations cannot have a memory order
> of memory_order_release or memory_order_acq_rel. (Similarly, store
> operations cannot have a memory order of memory_order_consume,
> memory_order_acquire, or memory_order_acq_rel. That isn't relevant for
> this change, as all the modifying operations here are RMW.)
>
> So I think we should just be explicit that only relaxed and acquire
> are supported here.  (And actually make that true; see next comment.)
>
> ------------------------------------------------------------------------------
> src/hotspot/share/utilities/bitMap.inline.hpp
>   55 inline bool BitMap::at(idx_t index, atomic_memory_order memory_order) const {
> ...
>   58   return (load_word_ordered(addr, memory_order) & bit_mask(index)) != 0;
>
> This is using the load_word_ordered helper, but the behavior of that
> function is designed to support the RMW operations, and I think isn't
> really right for at() (see previous comment). The simplest solution to
> get what I'm suggesting would be to add an assert here that the
> memory_order is either relaxed or acquire.
>
> ------------------------------------------------------------------------------
> src/hotspot/share/gc/shared/markBitMap.inline.hpp
> 71   return _bm.at(addr_to_offset(addr), memory_order_relaxed);
>
> The memory order argument isn't needed with the current default, and
> wouldn't even be permitted with the above suggestion to add par_at.
>
> ------------------------------------------------------------------------------
>
> I'm not understanding part of the problem description though.  You say
>
>    ... have made the uses of this API more conservative for failed bit
>    flips than it was in the past.
>
> But the pre-existing unordered cases in the setting functions (e.g.
> don't go through cmpxchg) are those where the bit is already set to
> the desired value, so there's no failure to change the bit involved.
> It seems reasonable to me that an acquire (at least) is usually
> desirable on that path, for reasons similar to why one wants an
> acquire on the outside-the-lock test when using the Double Checked
> Locking pattern. But that's not what was said, so I'm not sure I'm
> understanding the point.
>
> ------------------------------------------------------------------------------
>
> The par_xxx_range operations are not being directly modified by this
> change. When only 1 bit is actually involved, they'll delegate to the
> conservative single-bit operations, so are changed to pick up the
> acquire on the already set to the desired value path. Otherwise, they
> always go through conservative cmpxchg as before.  That all seems fine.
>
> ------------------------------------------------------------------------------
>


From thomas.schatzl at oracle.com  Tue Oct 29 08:42:27 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Tue, 29 Oct 2019 09:42:27 +0100
Subject: RFR(M): 8220311: Implementation: NUMA-Aware Memory Allocation for
 G1, Survivor (2/3)
In-Reply-To: <196b55d5-01f4-0202-effb-4495ae409df0@oracle.com>
References: <d153b49b-fbb2-0d73-37e0-ff1534a83086@oracle.com>
 <de0f8e9a-27d8-eaf3-99a7-7b57e1b419c1@oracle.com>
 <9a78e353-7908-b546-8f6a-7acd92eb40ac@oracle.com>
 <D4FAC1BE-C707-48DA-979F-03502E2651F5@oracle.com>
 <846eb849-8a49-5872-73d7-6bbc8f98369c@oracle.com>
 <56788E04-DC92-461F-B3A7-DEEBC524DB5B@oracle.com>
 <3fe39096-43cb-4828-c042-0fc976a0307a@oracle.com>
 <01a9ebcf-34ed-06b2-2da8-18d84feae858@oracle.com>
 <196b55d5-01f4-0202-effb-4495ae409df0@oracle.com>
Message-ID: <8eefcbf1-9c75-1fe1-14b8-df0f23a53518@oracle.com>

Hi,

On 25.10.19 16:02, sangheon.kim at oracle.com wrote:
[...]
> 
> In addition, Stefan, Thomas and I had some discussion about making 
> PLAB-NUMA aware (only for survivor).
> Stefan provided a patch with it and it is simple enough to include under 
> this CR.
> 
> Webrev:
> http://cr.openjdk.java.net/~sangheki/8220311/webrev.4
> http://cr.openjdk.java.net/~sangheki/8220311/webrev.4.inc
> 
> Testing: hs-tier 1 ~ 3, with/without UseNUMA
> 
> Thanks,
> Sangheon
> 
> 

- G1Allocator::nodes() -> G1Allocator::num_nodes()

- g1Allocator.hpp:167: s/depend/depending

- please file an RFE investigating adding the node index to the region 
attributes

Looks good otherwise. I do not need a re-review for these changes.

Thomas


From per.liden at oracle.com  Tue Oct 29 11:29:23 2019
From: per.liden at oracle.com (Per Liden)
Date: Tue, 29 Oct 2019 12:29:23 +0100
Subject: RFR: 8224817: Implementation of JEP 364: ZGC on macOS
In-Reply-To: <837c9e23-6068-53b5-9c50-7880a0f375c7@oracle.com>
References: <adc7cecf-80ae-01e6-fcef-f8588fcb940a@oracle.com>
 <837c9e23-6068-53b5-9c50-7880a0f375c7@oracle.com>
Message-ID: <806e1ac7-fa7a-7058-c4cc-bdd90a9f2b86@oracle.com>

Some suggested adjustments, already discussed with Erik off-line:

http://cr.openjdk.java.net/~pliden/8224817/webrev.review.0

/Per

On 10/28/19 5:11 PM, Erik Österlund wrote:
> Hi,
> 
> After some internal discussions with Per and Stefan, some refactorings 
> have been made:
> 
> 1) Use mmap consistently wherever possible, instead of mach_vm_map, for 
> consistency. And only use mach_vm_remap from a wrapper function to map 
> in views.
> 2) Move the pmem segments up one level so that producer and consumer of 
> the segments is on the same level, and let the virtual "file" know only 
> about offsets.
> 3) Minor polishing.
> 
> Incremental:
> http://cr.openjdk.java.net/~eosterlund/8224817/webrev.00..01/
> 
> Full:
> http://cr.openjdk.java.net/~eosterlund/8224817/webrev.01/
> 
> Thanks,
> /Erik
> 
> On 2019-10-24 12:38, erik.osterlund at oracle.com wrote:
>> Hi,
>>
>> Now that some curling has been performed, paving way for this patch:
>>
>>     8229027: Improve how JNIHandleBlock::oops_do distinguishes oops 
>> from non-oops
>>     8229278: Improve hs_err location printing to assume less about GC 
>> internals
>>     8229189: Improve JFR leak profiler tracing to deal with 
>> discontiguous heaps
>>     8224815: Remove non-GC uses of CollectedHeap::is_in_reserved()
>>     8224820: ZGC: Support discontiguous heap reservations
>>
>> ...the remaining thing to do is plugging in a few platform specific 
>> ZGC files. This patch does that.
>> Decided to go with mach_vm_map/mach_vm_remap to implement 
>> multi-mapping. Previously I didn't want to do that as I couldn't 
>> figure out how to mach_vm_remap memory on top of reserved VA (acquired 
>> using mmap). But apparently VM_FLAGS_OVERWRITE was the missing 
>> ingredient there. With that in place, dodging the terrible ftruncate 
>> implementation on macOS seemed like a good idea. That also implies 
>> this port supports large pages (unlike other GCs on macOS today). Yay!
>>
>> CR:
>> http://cr.openjdk.java.net/~eosterlund/8224817/webrev.00/
>>
>> Bug:
>> https://bugs.openjdk.java.net/browse/JDK-8229358
>>
>> Thanks,
>> /Erik
> 


From erik.osterlund at oracle.com  Tue Oct 29 11:31:24 2019
From: erik.osterlund at oracle.com (erik.osterlund at oracle.com)
Date: Tue, 29 Oct 2019 12:31:24 +0100
Subject: RFR: 8224817: Implementation of JEP 364: ZGC on macOS
In-Reply-To: <806e1ac7-fa7a-7058-c4cc-bdd90a9f2b86@oracle.com>
References: <adc7cecf-80ae-01e6-fcef-f8588fcb940a@oracle.com>
 <837c9e23-6068-53b5-9c50-7880a0f375c7@oracle.com>
 <806e1ac7-fa7a-7058-c4cc-bdd90a9f2b86@oracle.com>
Message-ID: <3ac7acf6-7154-1627-38b3-c186e9a18267@oracle.com>

Seems reasonable. Thanks.

/Erik

On 10/29/19 12:29 PM, Per Liden wrote:
> Some suggested adjustments, already discussed with Erik off-line:
>
> http://cr.openjdk.java.net/~pliden/8224817/webrev.review.0
>
> /Per
>
> On 10/28/19 5:11 PM, Erik Österlund wrote:
>> Hi,
>>
>> After some internal discussions with Per and Stefan, some 
>> refactorings have been made:
>>
>> 1) Use mmap consistently wherever possible, instead of mach_vm_map, 
>> for consistency. And only use mach_vm_remap from a wrapper function 
>> to map in views.
>> 2) Move the pmem segments up one level so that producer and consumer 
>> of the segments is on the same level, and let the virtual "file" know 
>> only about offsets.
>> 3) Minor polishing.
>>
>> Incremental:
>> http://cr.openjdk.java.net/~eosterlund/8224817/webrev.00..01/
>>
>> Full:
>> http://cr.openjdk.java.net/~eosterlund/8224817/webrev.01/
>>
>> Thanks,
>> /Erik
>>
>> On 2019-10-24 12:38, erik.osterlund at oracle.com wrote:
>>> Hi,
>>>
>>> Now that some curling has been performed, paving way for this patch:
>>>
>>>     8229027: Improve how JNIHandleBlock::oops_do distinguishes oops 
>>> from non-oops
>>>     8229278: Improve hs_err location printing to assume less about 
>>> GC internals
>>>     8229189: Improve JFR leak profiler tracing to deal with 
>>> discontiguous heaps
>>>     8224815: Remove non-GC uses of CollectedHeap::is_in_reserved()
>>>     8224820: ZGC: Support discontiguous heap reservations
>>>
>>> ...the remaining thing to do is plugging in a few platform specific 
>>> ZGC files. This patch does that.
>>> Decided to go with mach_vm_map/mach_vm_remap to implement 
>>> multi-mapping. Previously I didn't want to do that as I couldn't 
>>> figure out how to mach_vm_remap memory on top of reserved VA 
>>> (acquired using mmap). But apparently VM_FLAGS_OVERWRITE was the 
>>> missing ingredient there. With that in place, dodging the terrible 
>>> ftruncate implementation on macOS seemed like a good idea. That also 
>>> implies this port supports large pages (unlike other GCs on macOS 
>>> today). Yay!
>>>
>>> CR:
>>> http://cr.openjdk.java.net/~eosterlund/8224817/webrev.00/
>>>
>>> Bug:
>>> https://bugs.openjdk.java.net/browse/JDK-8229358
>>>
>>> Thanks,
>>> /Erik
>>


From leihouyju at gmail.com  Tue Oct 29 12:52:10 2019
From: leihouyju at gmail.com (Haoyu Li)
Date: Tue, 29 Oct 2019 20:52:10 +0800
Subject: RFR: 8220465: Use shadow regions for faster ParallelGC full GCs
In-Reply-To: <955C4FA4-FC18-446C-851A-0D04A916D88D@oracle.com>
References: <CAKSDcxsPcvigLaDTRyJALk9O0r-JsYV1CcwFzb3KaM98+LVLcg@mail.gmail.com>
 <CAKSDcxvTzfq1eR0DXi8iLQ4bNP8LcjsV5Px59-EMOgR0zaHvtQ@mail.gmail.com>
 <4F02DD53-EA98-4A1A-B871-C6E9D9610B2C@oracle.com>
 <9B69AFD1-7AE2-4B50-BFCF-C9C6E2594240@oracle.com>
 <CAKSDcxs131XdbHHErgbH1UDYX6_+=CSa-4dt4n5LgbyPVgjO_w@mail.gmail.com>
 <E4274448-26B2-46B8-883C-70FC7AFFB23B@oracle.com>
 <CAKSDcxvHwOWtAEm8TPPvO=C8q9LRwJRUQX6nbh-HSZXBKKwvYg@mail.gmail.com>
 <fb385d33-c420-4c81-320c-9aa1dad64a44@oracle.com>
 <CAKSDcxv5eccaV54NThgviLK+84U-Z62U0CYr3s+8ncNR1cna5w@mail.gmail.com>
 <8ef5b52e-d6fc-3073-5ca7-44c87c1eb981@oracle.com>
 <CAKSDcxsQWJ1tpnsc8UnN3E=XJfVwHEdE9WSd_=6KR_tLSHW6rQ@mail.gmail.com>
 <92277aab-0578-9e2c-3f4f-55ae1e8c94a9@oracle.com>
 <CAKSDcxu_t+Ka0LUU3WSzdR52-_+rsvKo_Vxn8av=duFmS4EPyA@mail.gmail.com>
 <400df998-171a-5bbe-9f3e-01af1781afb4@oracle.com>
 <e72f06af-8847-844b-107c-afd15e01f71b@oracle.com>
 <CAKSDcxsm3-6u0arR4KCRGF=R-1sD9XJAS3Fb98NxzcPASEpGwg@mail.gmail.com>
 <a1eeabca-f70e-f01a-9459-12bf913688d4@oracle.com>
 <955C4FA4-FC18-446C-851A-0D04A916D88D@oracle.com>
Message-ID: <CAKSDcxvDszpWRooMyncdo251s_HCxD=U6va4SzQhX6B0FHiWFg@mail.gmail.com>

Hi Stefan,

Thanks for your constructive comments. I will address these issues in the
next few days and provide both a full patch as well as the incremental
changes.

Best Regards,
Haoyu Li,
Institute of Parallel and Distributed Systems(IPADS),
School of Software,
Shanghai Jiao Tong University


Stefan Johansson <stefan.johansson at oracle.com> wrote on Tue, Oct 29, 2019 at 3:03 PM:

> Hi Haoyu,
>
> I've looked through the patch in detail now and created a new webrev at:
> http://cr.openjdk.java.net/~sjohanss/8220465/01/
>
> I took the liberty of removing the removal of move_and_update from your
> patch since I'm addressing that separately in JDK-8233065. The webrev above
> is still based on that removal, but I expect that to be pushed tomorrow or
> Wednesday so that should be fine.
>
> I also changed the subject to make it more clear that this is now a review
> of:
> https://bugs.openjdk.java.net/browse/JDK-8220465
>
> Regarding the current patch, I think that it looks good in general, but I
> thought a bit more about how to share stuff between the closures and I
> agree that adding those extra virtual functions doesn't really feel worth
> it. I'm wondering about a solution where we revert back to letting destination
> be the "real destination" (never pointing to the shadow region) and add
> a copy_destination which is destination + offset. To make this work the
> normal MoveAndUpdateClosure would also have an offset, but it would always
> be 0. If do_addr() is then updated to use the copy_destination() in some
> places we might end up with something pretty nice, but maybe I'm missing
> something.
>
> I also realized that the current patch will trigger an assert because
> destination is expected not to be the shadow address:
> #  Internal Error
> (open/src/hotspot/share/gc/parallel/psParallelCompact.cpp:3045), pid=12649,
> tid=12728
> #  assert(src_cp->destination() == destination) failed: first live obj in
> the space must match the destination
>
> So this also suggests that we should keep destination() returning the real
> destination.
>
> Some other comments:
> src/hotspot/share/gc/parallel/psParallelCompact.cpp
> ...
> 3383 void ShadowClosure::complete_region(ParCompactionManager *cm, HeapWord *dest_addr,
> 3384                                     PSParallelCompact::RegionData *region_ptr) {
> 3385   assert(region_ptr->shadow_state() == ParallelCompactData::RegionData::FINISH, "Region should be finished");
>
> This assertion will also trigger when running with a debug build; at
> this point the shadow state should be SHADOW, not FINISH.
> ...
>
> src/hotspot/share/gc/parallel/psParallelCompact.hpp
> ...
>  632 inline bool ParallelCompactData::RegionData::mark_filled() {
>  633   return Atomic::cmpxchg(FILLED, &_shadow_state, SHADOW) == SHADOW;
>  634 }
>
> Since we never check the return value here we should make it void and
> maybe instead add an assert that the return value is SHADOW.
> ...
>
> When you have addressed these comments, would it be possible to include both
> the full patch and the incremental changes from the current version?
> That makes it easier for the reviewers to see what changed between versions
> of the patch.
>
> Thanks,
> Stefan
>
> > 24 okt. 2019 kl. 14:16 skrev Stefan Johansson <
> stefan.johansson at oracle.com>:
> >
> > Hi Haoyu,
> >
> > On 2019-10-23 17:15, Haoyu Li wrote:
> >> Hi Stefan,
> >> Thanks for your constructive feedback. I've addressed all the issues
> you mentioned, and the updated patch is attached in this email.
> > Nice, I will look at the patch next week, but I'll shortly answer your
> questions right away.
> >
> >> During refining the patch, I have a couple of questions:
> >> 1) Now the MoveAndUpdateClosure and ShadowClosure assume the
> destination address is the very beginning of a region, instead of an
> arbitrary address like what it used to be. However, there is an unused
> function named PSParallelCompact::move_and_update() uses the
> MoveAndUpdateClosure to process a region from its middle, which conflicts
> with the assumption. I notice that you removed this function in your patch,
> and so did I in the updated patch. Does it matter?
> > Yes, I found this function during my code review and it should be
> removed, but I think that should be handled as a separate issue. We can do
> this removal before this patch goes in.
> >
> >> 2) Using the same do_addr() in MoveAndUpdateClosure and ShadowClosure
> is doable, but it does not reuse all the code neatly. Because storing the
> address of the shadow region in _destination requires extra virtual
> functions to handle allocating blocks in the start_array and setting
> addresses of deferred objects. In particular, allocate_blocks() and
> set_deferred_object_for() in both closures are added. Is it worth avoiding
> to use _offset to calculate the shadow_destination?
> > Ok, sounds like it might be better to have specific do_addr() functions
> then. I'll think some more around this when reviewing the new patch in
> depth.
> >
> >> If there are any problems with this patch, please contact me anytime.
> I'm more than happy to keep improving the code. Thanks again for reviewing.
> >>
> > Sounds good, thanks,
> > Stefan
>
>


From erik.osterlund at oracle.com  Tue Oct 29 13:40:20 2019
From: erik.osterlund at oracle.com (erik.osterlund at oracle.com)
Date: Tue, 29 Oct 2019 14:40:20 +0100
Subject: RFR: 8233073: Make BitMap accessors more memory ordering friendly
In-Reply-To: <104E3C62-435C-4F31-87EB-D3EBC34634EE@oracle.com>
References: <1b19522f-534d-ca6b-4e97-f837e4ab7212@oracle.com>
 <104E3C62-435C-4F31-87EB-D3EBC34634EE@oracle.com>
Message-ID: <4786ce40-4889-2862-2e2c-f11bd661076a@oracle.com>

Hi Kim,

On 10/29/19 2:54 AM, Kim Barrett wrote:
> Should this review be happening on hotspot-dev rather than
> hotspot-gc-dev?  GC is not the only BitMap client; compiler uses them
> too (and generally rather differently).

Perhaps. I presumed that the way it is being changed is only interesting 
for GC folks... but I guess that depends on the direction this is going. 
Let's see...

> ------------------------------------------------------------------------------
> src/hotspot/share/utilities/bitMap.hpp
> 207   bool at(idx_t index, atomic_memory_order memory_order = memory_order_relaxed) const;
>
> My initial reaction here is that I'd prefer adding par_at() rather
> than giving at() a memory order argument.  This would also address the
> question of what the default should be.  For at(), it's nonatomic.
> For par_at() it's acquire.
>
> That would also avoid imposing volatile ordering on at().

I like that idea. Let's give that a shot.

> As you said, existing uses of at() are relaxed/nonatomic.  The code
> rearrangement for MarkBitMap::is_marked() makes me wonder if any of
> the calls should be acquire ordered, but obviously none are now...

Yeah, I also wonder about that...

> ------------------------------------------------------------------------------
> src/hotspot/share/utilities/bitMap.hpp
> 205   // The memory ordering goes up to memory_order_acquire, but not higher. It is
> 206   // assumed that users of the BitMap API will never rely on IRIW consistency.
>
> I think what this means is that memory_order_seq_cst
> (memory_order_conservative in HotSpot) isn't supported? So just as we
> only have Atomic::load (relaxed) and Atomic::load_acquire (acquire).
> That seems okay. But if we aren't going to support the stronger
> semantics, I don't think this should permit the corresponding memory
> order value.
>
> C++11 specifies that atomic load operations cannot have a memory order
> of memory_order_release or memory_order_acq_rel. (Similarly, store
> operations cannot have a memory order of memory_order_consume,
> memory_order_acquire, or memory_order_acq_rel. That isn't relevant for
> this change, as all the modifying operations here are RMW.)
>
> So I think we should just be explicit that only relaxed and acquire
> are supported here.  (And actually make that true; see next comment.)
>
> ------------------------------------------------------------------------------
> src/hotspot/share/utilities/bitMap.inline.hpp
>   55 inline bool BitMap::at(idx_t index, atomic_memory_order memory_order) const {
> ...
>   58   return (load_word_ordered(addr, memory_order) & bit_mask(index)) != 0;
>
> This is using the load_word_ordered helper, but the behavior of that
> function is designed to support the RMW operations, and I think isn't
> really right for at() (see previous comment). The simplest solution to
> get what I'm suggesting would be to add an assert here that the
> memory_order is either relaxed or acquire.

That makes sense. I added the assert.

> ------------------------------------------------------------------------------
> src/hotspot/share/gc/shared/markBitMap.inline.hpp
> 71   return _bm.at(addr_to_offset(addr), memory_order_relaxed);
>
> The memory order argument isn't needed with the current default, and
> wouldn't even be permitted with the above suggestion to add par_at.

Indeed. Reverted in favor of par_at.

> ------------------------------------------------------------------------------
>
> I'm not understanding part of the problem description though.  You say
>
>    ... have made the uses of this API more conservative for failed bit
>    flips than it was in the past.
>
> But the pre-existing unordered cases in the setting functions (e.g.
> don't go through cmpxchg) are those where the bit is already set to
> the desired value, so there's no failure to change the bit involved.
> It seems reasonable to me that an acquire (at least) is usually
> desirable on that path, for reasons similar to why one wants an
> acquire on the outside-the-lock test when using the Double Checked
> Locking pattern. But that's not what was said, so I'm not sure I'm
> understanding the point.

By failed bit flips, I specifically meant that the bit flipping function 
(par_set_at) returns false. This happens for two reasons: 1) the bit was 
already set in the original (relaxed) load, or 2) a concurrent thread 
beat us to it in the subsequent CAS. So if the function returns false, 
previously you couldn't know if the load that made the function return 
false had acquire semantics or not. Now with this patch it will have 
acquire semantics (unless the whole operation is specified to have 
relaxed or release semantics), even when the original load already had 
the bit set. That is what I meant made the API more conservative 
than before. And as you say, I think that is a good thing. Hope this 
explains our misunderstanding.
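
As a rough illustration of the two false-returning paths, here is a 
minimal standalone sketch. It assumes std::atomic in place of HotSpot's 
Atomic, and WordBitMap is a made-up single-word stand-in, not the BitMap 
class from the patch. The acq_rel ordering on the CAS is likewise only an 
approximation of what HotSpot calls conservative.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical single-word stand-in for a bitmap; not the HotSpot BitMap.
struct WordBitMap {
  std::atomic<std::uint64_t> word{0};

  // Returns true only if this call changed the bit from 0 to 1.
  bool par_set_at(unsigned bit) {
    const std::uint64_t mask = std::uint64_t(1) << bit;
    // Acquire on the initial load: if the bit is already set we return
    // false, and the caller still gets acquire semantics on that path.
    std::uint64_t old_val = word.load(std::memory_order_acquire);
    for (;;) {
      if ((old_val & mask) != 0) {
        // Case 1: bit was set in the initial load (first iteration).
        // Case 2: another thread set it and our CAS lost (later iteration).
        return false;
      }
      // On failure, old_val is refreshed with an acquire load.
      if (word.compare_exchange_weak(old_val, old_val | mask,
                                     std::memory_order_acq_rel,
                                     std::memory_order_acquire)) {
        return true;
      }
    }
  }
};
```

With this shape, a caller that sees false can rely on the acquire 
regardless of which of the two cases produced it.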

> ------------------------------------------------------------------------------
>
> The par_xxx_range operations are not being directly modified by this
> change. When only 1 bit is actually involved, they'll delegate to the
> conservative single-bit operations, so are changed to pick up the
> acquire on the already set to the desired value path. Otherwise, they
> always go through conservative cmpxchg as before.  That all seems fine.

Yeah.

I made a new patch that reverts at() to do what it used to (in the .hpp 
file), and added a new par_at() accessor instead with explicit memory 
ordering (asserted to be acquire or relaxed), defaulting to acquire, as 
suggested. I left some innocent cleanups of missing includes in the area 
that I would like to keep anyway.
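
Roughly, the split looks like the sketch below. MiniBitMap, set_bit and 
the fixed four-word storage are made up for illustration, and std::atomic 
stands in for HotSpot's Atomic. In particular the real at() is a plain 
nonatomic read, which is approximated here by a relaxed load to keep the 
example well-defined in standard C++.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical miniature bitmap; names echo the thread but this is not HotSpot code.
class MiniBitMap {
  static constexpr std::size_t kBitsPerWord = 64;
  std::atomic<std::uint64_t> _words[4];

  static std::uint64_t bit_mask(std::size_t index) {
    return std::uint64_t(1) << (index % kBitsPerWord);
  }

 public:
  MiniBitMap() {
    for (auto& w : _words) w.store(0, std::memory_order_relaxed);
  }

  // Illustrative setter so the accessors have something to observe.
  void set_bit(std::size_t index) {
    _words[index / kBitsPerWord].fetch_or(bit_mask(index),
                                          std::memory_order_release);
  }

  // Plain accessor: takes no memory order and promises no ordering.
  bool at(std::size_t index) const {
    return (_words[index / kBitsPerWord].load(std::memory_order_relaxed) &
            bit_mask(index)) != 0;
  }

  // Parallel accessor: defaults to acquire; only relaxed or acquire allowed.
  bool par_at(std::size_t index,
              std::memory_order order = std::memory_order_acquire) const {
    assert(order == std::memory_order_relaxed ||
           order == std::memory_order_acquire);
    return (_words[index / kBitsPerWord].load(order) & bit_mask(index)) != 0;
  }
};
```

Callers that do not care about ordering keep using at() unchanged, while 
par_at() makes the ordering choice visible at the call site.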

New webrev:
http://cr.openjdk.java.net/~eosterlund/8233073/webrev.01/

Incremental:
http://cr.openjdk.java.net/~eosterlund/8233073/webrev.00_01/

Thanks,
/Erik


From per.liden at oracle.com  Tue Oct 29 14:59:40 2019
From: per.liden at oracle.com (Per Liden)
Date: Tue, 29 Oct 2019 15:59:40 +0100
Subject: RFR: 8233073: Make BitMap accessors more memory ordering friendly
In-Reply-To: <4786ce40-4889-2862-2e2c-f11bd661076a@oracle.com>
References: <1b19522f-534d-ca6b-4e97-f837e4ab7212@oracle.com>
 <104E3C62-435C-4F31-87EB-D3EBC34634EE@oracle.com>
 <4786ce40-4889-2862-2e2c-f11bd661076a@oracle.com>
Message-ID: <dda27b60-e163-0446-bfb1-185ec26e927c@oracle.com>

Hi,

On 10/29/19 2:40 PM, erik.osterlund at oracle.com wrote:
> Hi Kim,
> 
> On 10/29/19 2:54 AM, Kim Barrett wrote:
>> Should this review be happening on hotspot-dev rather than
>> hotspot-gc-dev?  GC is not the only BitMap client; compiler uses them
>> too (and generally rather differently).
> 
> Perhaps. I presumed that the way it is being changed is only interesting 
> for GC folks... but I guess that depends on the direction this is going. 
> Let's see...
> 
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/share/utilities/bitMap.hpp
>> 207   bool at(idx_t index, atomic_memory_order memory_order = memory_order_relaxed) const;
>>
>> My initial reaction here is that I'd prefer adding par_at() rather
>> than giving at() a memory order argument.  This would also address the
>> question of what the default should be.  For at(), it's nonatomic.
>> For par_at() it's acquire.
>>
>> That would also avoid imposing volatile ordering on at().
> 
> I like that idea. Let's give that a shot.
> 
>> As you said, existing uses of at() are relaxed/nonatomic.  The code
>> rearrangement for MarkBitMap::is_marked() makes me wonder if any of
>> the calls should be acquire ordered, but obviously none are now...
> 
> Yeah, I also wonder about that...
> 
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/share/utilities/bitMap.hpp
>> 205   // The memory ordering goes up to memory_order_acquire, but not higher. It is
>> 206   // assumed that users of the BitMap API will never rely on IRIW consistency.
>>
>> I think what this means is that memory_order_seq_cst
>> (memory_order_conservative in HotSpot) isn't supported? So just as we
>> only have Atomic::load (relaxed) and Atomic::load_acquire (acquire).
>> That seems okay. But if we aren't going to support the stronger
>> semantics, I don't think this should permit the corresponding memory
>> order value.
>>
>> C++11 specifies that atomic load operations cannot have a memory order
>> of memory_order_release or memory_order_acq_rel. (Similarly, store
>> operations cannot have a memory order of memory_order_consume,
>> memory_order_acquire, or memory_order_acq_rel. That isn't relevant for
>> this change, as all the modifying operations here are RMW.)
>>
>> So I think we should just be explicit that only relaxed and acquire
>> are supported here.  (And actually make that true; see next comment.)
>>
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/share/utilities/bitMap.inline.hpp
>>   55 inline bool BitMap::at(idx_t index, atomic_memory_order memory_order) const {
>> ...
>>   58   return (load_word_ordered(addr, memory_order) & bit_mask(index)) != 0;
>>
>> This is using the load_word_ordered helper, but the behavior of that
>> function is designed to support the RMW operations, and I think isn't
>> really right for at() (see previous comment). The simplest solution to
>> get what I'm suggesting would be to add an assert here that the
>> memory_order is either relaxed or acquire.
> 
> That makes sense. I added the assert.
> 
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/share/gc/shared/markBitMap.inline.hpp
>> 71   return _bm.at(addr_to_offset(addr), memory_order_relaxed);
>>
>> The memory order argument isn't needed with the current default, and
>> wouldn't even be permitted with the above suggestion to add par_at.
> 
> Indeed. Reverted in favor of par_at.
> 
>> ------------------------------------------------------------------------------ 
>>
>>
>> I'm not understanding part of the problem description though.  You say
>>
>>    ... have made the uses of this API more conservative for failed bit
>>    flips than it was in the past.
>>
>> But the pre-existing unordered cases in the setting functions (e.g.
>> don't go through cmpxchg) are those where the bit is already set to
>> the desired value, so there's no failure to change the bit involved.
>> It seems reasonable to me that an acquire (at least) is usually
>> desirable on that path, for reasons similar to why one wants an
>> acquire on the outside-the-lock test when using the Double Checked
>> Locking pattern. But that's not what was said, so I'm not sure I'm
>> understanding the point.
> 
> By failed bit flips, I specifically meant that the bit flipping function 
> (par_set_at) returns false. This happens for two reasons: 1) the bit was 
> already set in the original (relaxed) load, or 2) a concurrent thread 
> beat us to it in the subsequent CAS. So if the function returns false, 
> previously you couldn't know if the load that made the function return 
> false had acquire semantics or not. Now with this patch it will have 
> acquire semantics (unless the whole operation is specified to have 
> relaxed or release semantics), even when the original load already had 
> the bit set. That is what I meant made the API more conservative 
> than before. And as you say, I think that is a good thing. Hope this 
> explains our misunderstanding.
> 
>> ------------------------------------------------------------------------------ 
>>
>>
>> The par_xxx_range operations are not being directly modified by this
>> change. When only 1 bit is actually involved, they'll delegate to the
>> conservative single-bit operations, so are changed to pick up the
>> acquire on the already set to the desired value path. Otherwise, they
>> always go through conservative cmpxchg as before.  That all seems fine.
> 
> Yeah.
> 
> I made a new patch that reverts at() to do what it used to (in the .hpp 
> file), and added a new par_at() accessor instead with explicit memory 
> ordering (asserted to be acquire or relaxed), defaulting to acquire, as 
> suggested. I left some innocent cleanups of missing includes in the area 
> that I would like to keep anyway.
> 
> New webrev:
> http://cr.openjdk.java.net/~eosterlund/8233073/webrev.01/

Looks good to me. I like the par_at() approach.

/Per

> 
> Incremental:
> http://cr.openjdk.java.net/~eosterlund/8233073/webrev.00_01/
> 
> Thanks,
> /Erik


From erik.osterlund at oracle.com  Tue Oct 29 15:22:10 2019
From: erik.osterlund at oracle.com (erik.osterlund at oracle.com)
Date: Tue, 29 Oct 2019 16:22:10 +0100
Subject: RFR: 8233073: Make BitMap accessors more memory ordering friendly
In-Reply-To: <dda27b60-e163-0446-bfb1-185ec26e927c@oracle.com>
References: <1b19522f-534d-ca6b-4e97-f837e4ab7212@oracle.com>
 <104E3C62-435C-4F31-87EB-D3EBC34634EE@oracle.com>
 <4786ce40-4889-2862-2e2c-f11bd661076a@oracle.com>
 <dda27b60-e163-0446-bfb1-185ec26e927c@oracle.com>
Message-ID: <c8fe8efb-2ac6-4793-7ad5-86caaf532f06@oracle.com>

Hi Per,

Thanks for the review.

/Erik

On 10/29/19 3:59 PM, Per Liden wrote:
> Hi,
>
> On 10/29/19 2:40 PM, erik.osterlund at oracle.com wrote:
>> Hi Kim,
>>
>> On 10/29/19 2:54 AM, Kim Barrett wrote:
>>> Should this review be happening on hotspot-dev rather than
>>> hotspot-gc-dev?  GC is not the only BitMap client; compiler uses them
>>> too (and generally rather differently).
>>
>> Perhaps. I presumed that the way it is being changed is only 
>> interesting for GC folks... but I guess that depends on the direction 
>> this is going. Let's see...
>>
>>> ------------------------------------------------------------------------------ 
>>>
>>> src/hotspot/share/utilities/bitMap.hpp
>>> 207   bool at(idx_t index, atomic_memory_order memory_order = memory_order_relaxed) const;
>>>
>>> My initial reaction here is that I'd prefer adding par_at() rather
>>> than giving at() a memory order argument.  This would also address the
>>> question of what the default should be.  For at(), it's nonatomic.
>>> For par_at() it's acquire.
>>>
>>> That would also avoid imposing volatile ordering on at().
>>
>> I like that idea. Let's give that a shot.
>>
>>> As you said, existing uses of at() are relaxed/nonatomic.  The code
>>> rearrangement for MarkBitMap::is_marked() makes me wonder if any of
>>> the calls should be acquire ordered, but obviously none are now...
>>
>> Yeah, I also wonder about that...
>>
>>> ------------------------------------------------------------------------------ 
>>>
>>> src/hotspot/share/utilities/bitMap.hpp
>>> 205   // The memory ordering goes up to memory_order_acquire, but not higher. It is
>>> 206   // assumed that users of the BitMap API will never rely on IRIW consistency.
>>>
>>> I think what this means is that memory_order_seq_cst
>>> (memory_order_conservative in HotSpot) isn't supported? So just as we
>>> only have Atomic::load (relaxed) and Atomic::load_acquire (acquire).
>>> That seems okay. But if we aren't going to support the stronger
>>> semantics, I don't think this should permit the corresponding memory
>>> order value.
>>>
>>> C++11 specifies that atomic load operations cannot have a memory order
>>> of memory_order_release or memory_order_acq_rel. (Similarly, store
>>> operations cannot have a memory order of memory_order_consume,
>>> memory_order_acquire, or memory_order_acq_rel. That isn't relevant for
>>> this change, as all the modifying operations here are RMW.)
>>>
>>> So I think we should just be explicit that only relaxed and acquire
>>> are supported here.  (And actually make that true; see next comment.)
>>>
>>> ------------------------------------------------------------------------------ 
>>>
>>> src/hotspot/share/utilities/bitMap.inline.hpp
>>>   55 inline bool BitMap::at(idx_t index, atomic_memory_order memory_order) const {
>>> ...
>>>   58   return (load_word_ordered(addr, memory_order) & bit_mask(index)) != 0;
>>>
>>> This is using the load_word_ordered helper, but the behavior of that
>>> function is designed to support the RMW operations, and I think isn't
>>> really right for at() (see previous comment). The simplest solution to
>>> get what I'm suggesting would be to add an assert here that the
>>> memory_order is either relaxed or acquire.
>>
>> That makes sense. I added the assert.
>>
>>> ------------------------------------------------------------------------------ 
>>>
>>> src/hotspot/share/gc/shared/markBitMap.inline.hpp
>>> 71   return _bm.at(addr_to_offset(addr), memory_order_relaxed);
>>>
>>> The memory order argument isn't needed with the current default, and
>>> wouldn't even be permitted with the above suggestion to add par_at.
>>
>> Indeed. Reverted in favor of par_at.
>>
>>> ------------------------------------------------------------------------------ 
>>>
>>>
>>> I'm not understanding part of the problem description though. You say
>>>
>>>    ... have made the uses of this API more conservative for failed bit
>>>    flips than it was in the past.
>>>
>>> But the pre-existing unordered cases in the setting functions (e.g.
>>> don't go through cmpxchg) are those where the bit is already set to
>>> the desired value, so there's no failure to change the bit involved.
>>> It seems reasonable to me that an acquire (at least) is usually
>>> desirable on that path, for reasons similar to why one wants an
>>> acquire on the outside-the-lock test when using the Double Checked
>>> Locking pattern. But that's not what was said, so I'm not sure I'm
>>> understanding the point.
>>
>> By failed bit flips, I specifically meant that the bit flipping 
>> function (par_set_at) returns false. This happens for two reasons: 1) 
>> the bit was already set in the original (relaxed) load, or 2) a 
>> concurrent thread beat us to it in the subsequent CAS. So if the 
>> function returns false, previously you couldn't know if the load that 
>> made the function return false had acquire semantics or not. Now with 
>> this patch it will have acquire semantics (unless the whole operation 
>> is specified to have relaxed or release semantics), even when the 
>> original load already had the bit set. That is what I meant by the API
>> being more conservative than before. And as you say, I think
>> that is a good thing. Hope this explains our misunderstanding.
>>
>>> ------------------------------------------------------------------------------ 
>>>
>>>
>>> The par_xxx_range operations are not being directly modified by this
>>> change. When only 1 bit is actually involved, they'll delegate to the
>>> conservative single-bit operations, so are changed to pick up the
>>> acquire on the already set to the desired value path. Otherwise, they
>>> always go through conservative cmpxchg as before. That all seems fine.
>>
>> Yeah.
>>
>> I made a new patch that reverts at() to do what it used to (in the 
>> .hpp file), and added a new par_at() accessor instead with explicit 
>> memory ordering (asserted to be acquire or relaxed), defaulting to 
>> acquire, as suggested. I left some innocent cleanups of missing 
>> includes in the area that I would like to keep anyway.
>>
>> New webrev:
>> http://cr.openjdk.java.net/~eosterlund/8233073/webrev.01/
>
> Looks good to me. I like the par_at() approach.
>
> /Per
>
>>
>> Incremental:
>> http://cr.openjdk.java.net/~eosterlund/8233073/webrev.00_01/
>>
>> Thanks,
>> /Erik
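
The ordering behaviour discussed above can be sketched roughly as follows.
This is only an illustration under stated assumptions: TinyBitMap is a
hypothetical single-word bitmap built on std::atomic, not the HotSpot BitMap
class or its Atomic API, and indices must be below 64. It shows a plain at(),
a par_at() that asserts relaxed or acquire ordering (defaulting to acquire),
and a par_set_at() whose "false" return is always backed by an acquiring
load, whether the bit was already set or another thread won the race.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical single-word bitmap; not the HotSpot BitMap class.
class TinyBitMap {
  std::atomic<uint64_t> _word{0};

  static uint64_t bit_mask(size_t index) { return uint64_t(1) << index; }

public:
  // Plain read for callers that do not race with concurrent writers.
  bool at(size_t index) const {
    return (_word.load(std::memory_order_relaxed) & bit_mask(index)) != 0;
  }

  // Concurrent read. Only relaxed or acquire make sense for a pure load.
  bool par_at(size_t index,
              std::memory_order order = std::memory_order_acquire) const {
    assert(order == std::memory_order_relaxed ||
           order == std::memory_order_acquire);
    return (_word.load(order) & bit_mask(index)) != 0;
  }

  // Returns true only if this call flipped the bit from 0 to 1.
  // A false return is always preceded by an acquiring load of the word.
  bool par_set_at(size_t index) {
    const uint64_t mask = bit_mask(index);
    uint64_t old_val = _word.load(std::memory_order_acquire);
    while ((old_val & mask) == 0) {
      if (_word.compare_exchange_weak(old_val, old_val | mask,
                                      std::memory_order_acq_rel,
                                      std::memory_order_acquire)) {
        return true;
      }
      // On failure old_val holds the freshly observed word; re-check the bit.
    }
    return false;
  }
};
```

A caller that sees par_set_at() return false can then rely on whatever the
winning thread published before setting the bit, which is the same reasoning
as the acquire on the outside-the-lock test in double-checked locking.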


From stefan.johansson at oracle.com  Tue Oct 29 17:59:13 2019
From: stefan.johansson at oracle.com (Stefan Johansson)
Date: Tue, 29 Oct 2019 18:59:13 +0100
Subject: RFR: 8233065: PSParallelCompact::move_and_update is unused and
 should be removed
In-Reply-To: <9f2a0003-c33f-1d31-f4bb-8d491817a4be@oracle.com>
References: <7C04DE96-9918-4F5B-81C5-0ABA5AB6DEAB@oracle.com>
 <9f2a0003-c33f-1d31-f4bb-8d491817a4be@oracle.com>
Message-ID: <645B4E39-E795-47C6-AF84-29506715D0A6@oracle.com>

Thanks for the reviews Thomas and Leo,
Stefan

> 28 okt. 2019 kl. 14:42 skrev Thomas Schatzl <thomas.schatzl at oracle.com>:
> 
> Hi,
> 
> On 28.10.19 13:41, Stefan Johansson wrote:
>> Hi,
>> Please review this small fix that removes an unused function.
>> JBS: https://bugs.openjdk.java.net/browse/JDK-8233065
>> Webrev: http://cr.openjdk.java.net/~sjohanss/8233065/00/
>> Summary:
>> The function move_and_update was not removed when its last use was removed during the removal of PermGen.
>> Testing:
>> Build and tested through mach5 (tier1)
> 
>  looks good.
> 
> Thomas


From sangheon.kim at oracle.com  Tue Oct 29 20:06:45 2019
From: sangheon.kim at oracle.com (sangheon.kim at oracle.com)
Date: Tue, 29 Oct 2019 13:06:45 -0700
Subject: RFR(M): 8220311: Implementation: NUMA-Aware Memory Allocation for
 G1, Survivor (2/3)
In-Reply-To: <8eefcbf1-9c75-1fe1-14b8-df0f23a53518@oracle.com>
References: <d153b49b-fbb2-0d73-37e0-ff1534a83086@oracle.com>
 <de0f8e9a-27d8-eaf3-99a7-7b57e1b419c1@oracle.com>
 <9a78e353-7908-b546-8f6a-7acd92eb40ac@oracle.com>
 <D4FAC1BE-C707-48DA-979F-03502E2651F5@oracle.com>
 <846eb849-8a49-5872-73d7-6bbc8f98369c@oracle.com>
 <56788E04-DC92-461F-B3A7-DEEBC524DB5B@oracle.com>
 <3fe39096-43cb-4828-c042-0fc976a0307a@oracle.com>
 <01a9ebcf-34ed-06b2-2da8-18d84feae858@oracle.com>
 <196b55d5-01f4-0202-effb-4495ae409df0@oracle.com>
 <8eefcbf1-9c75-1fe1-14b8-df0f23a53518@oracle.com>
Message-ID: <4e872474-df42-069e-84dc-cfd5e8700914@oracle.com>

Hi Thomas,

On 10/29/19 1:42 AM, Thomas Schatzl wrote:
> Hi,
>
> On 25.10.19 16:02, sangheon.kim at oracle.com wrote:
> [...]
>>
>> In addition, Stefan, Thomas and I had some discussion about making 
>> PLAB-NUMA aware (only for survivor).
>> Stefan provided a patch with it and it is simple enough to include 
>> under this CR.
>>
>> Webrev:
>> http://cr.openjdk.java.net/~sangheki/8220311/webrev.4
>> http://cr.openjdk.java.net/~sangheki/8220311/webrev.4.inc
>>
>> Testing: hs-tier 1 ~ 3, with/without UseNUMA
>>
>> Thanks,
>> Sangheon
>>
>>
>
> - G1Allocator::nodes() -> G1Allocator::num_nodes()
Done.

>
> - g1Allocator.hpp:167: s/depend/depending
Done.

>
> - please file an RFE investigating adding the node index to the region 
> attributes
Filed https://bugs.openjdk.java.net/browse/JDK-8233149: Investigate 
adding node index at G1HeapRegionAttr.

>
> Looks good otherwise. I do not need a re-review for these changes.
Thanks for your review.

Sangheon


>
> Thomas


From stefan.johansson at oracle.com  Tue Oct 29 20:13:45 2019
From: stefan.johansson at oracle.com (Stefan Johansson)
Date: Tue, 29 Oct 2019 21:13:45 +0100
Subject: RFR(M): 8220311: Implementation: NUMA-Aware Memory Allocation for
 G1, Survivor (2/3)
In-Reply-To: <196b55d5-01f4-0202-effb-4495ae409df0@oracle.com>
References: <d153b49b-fbb2-0d73-37e0-ff1534a83086@oracle.com>
 <de0f8e9a-27d8-eaf3-99a7-7b57e1b419c1@oracle.com>
 <9a78e353-7908-b546-8f6a-7acd92eb40ac@oracle.com>
 <D4FAC1BE-C707-48DA-979F-03502E2651F5@oracle.com>
 <846eb849-8a49-5872-73d7-6bbc8f98369c@oracle.com>
 <56788E04-DC92-461F-B3A7-DEEBC524DB5B@oracle.com>
 <3fe39096-43cb-4828-c042-0fc976a0307a@oracle.com>
 <01a9ebcf-34ed-06b2-2da8-18d84feae858@oracle.com>
 <196b55d5-01f4-0202-effb-4495ae409df0@oracle.com>
Message-ID: <5A6C0668-86F6-4A3F-AC4D-75097D40A1C4@oracle.com>

Hi Sangheon,

> 25 okt. 2019 kl. 16:02 skrev sangheon.kim at oracle.com:
> 
> Hi Stefan,
> 
> On 10/23/19 1:47 AM, Stefan Johansson wrote:
>> Hi Sangheon,
>> 
>> On 2019-10-22 18:47, sangheon.kim at oracle.com wrote:
>>> Hi Kim,
>>> 
>>> On 10/22/19 12:19 AM, Kim Barrett wrote:
>>>>> On Oct 22, 2019, at 1:52 AM, sangheon.kim at oracle.com wrote:
>>>>> What do you think about below comment?
>>>>> 
>>>>>    // Tries to allocate word_sz in the PLAB of the next "generation" after trying to
>>>>>    // allocate into dest. Previous_plab_refill_failed indicates whether previous
>>>>>    // PLAB refill for the original (source) object was failed.
>>>> Drop "was".  Otherwise looks good.
>>> Done.
>>> 
>>> Webrev:
>>> http://cr.openjdk.java.net/~sangheki/8220311/webrev.3
>>> http://cr.openjdk.java.net/~sangheki/8220311/webrev.3.inc
>> Looks good in general, just one minor thing, no need for a new webrev though:
>> src/hotspot/share/gc/g1/g1Allocator.cpp
>> ---
>> 144   for (uint nodex_index = 0; nodex_index < _num_alloc_regions; nodex_index++) {
>> 
>> The name nodex_index has one too many x:es =) I would prefer node_index.
> Ouch!
> Fixed..
> 
> In addition, Stefan, Thomas and I had some discussion about making PLAB-NUMA aware (only for survivor).
> Stefan provided a patch with it and it is simple enough to include under this CR.
> 
> Webrev:
> http://cr.openjdk.java.net/~sangheki/8220311/webrev.4
> http://cr.openjdk.java.net/~sangheki/8220311/webrev.4.inc

Looks good in general, just one comment.

src/hotspot/share/gc/g1/g1Allocator.inline.hpp
---
  78   assert(_alloc_buffers[dest.type()] != NULL,
  79          "Allocation buffer is NULL: %s", dest.get_type_str());

  80   G1HeapRegionAttr::region_type_t type = dest.type();
  81   return alloc_buffer(type, node_index);

As I mentioned to you offline, I think it is a bit unfortunate that we can't index our way to the correct PLAB in G1PLABAllocator::alloc_buffer(...) without the if-statement, but I agree that having multiple array slots pointing to the same PLAB isn't optimal either. So I think this approach is good for now, but I have a very minor comment on the code snippet above. I would prefer if line 80 was skipped and the call on 81 just did return alloc_buffer(dest.type(), node_index).
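
To illustrate the trade-off, here is a minimal sketch of the "if-statement"
variant, assuming only the survivor destination is NUMA-aware and the old
destination keeps a single PLAB. All names here (PlabAllocatorSketch, Plab,
RegionType) are hypothetical and are not the G1PLABAllocator code from the
webrev.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical destination types; not G1HeapRegionAttr::region_type_t.
enum RegionType { Survivor = 0, Old = 1, NumTypes = 2 };

// Hypothetical stand-in for a PLAB; the id only makes buffers distinguishable.
struct Plab { int id; };

class PlabAllocatorSketch {
  // One buffer per node for Survivor, exactly one buffer for Old.
  std::vector<Plab> _alloc_buffers[NumTypes];

public:
  explicit PlabAllocatorSketch(size_t num_nodes) {
    assert(num_nodes > 0);
    int next_id = 0;
    for (size_t i = 0; i < num_nodes; i++) {
      _alloc_buffers[Survivor].push_back(Plab{next_id++});
    }
    _alloc_buffers[Old].push_back(Plab{next_id});
  }

  // The if-statement replaces a uniform [type][node_index] lookup, which
  // would need several Old slots all pointing at the same PLAB.
  Plab* alloc_buffer(RegionType type, size_t node_index) {
    assert(type < NumTypes);
    if (type == Survivor) {
      assert(node_index < _alloc_buffers[Survivor].size());
      return &_alloc_buffers[Survivor][node_index];
    }
    return &_alloc_buffers[Old][0];
  }
};
```

With this shape a caller can pass dest.type() straight through, without a
temporary, which is the simplification suggested for lines 80 and 81.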

I don't need a new webrev for this.

Thanks,
Stefan


> 
> Testing: hs-tier 1 ~ 3, with/without UseNUMA
> 
> Thanks,
> Sangheon
> 
> 
>> ---
>> 
>> Thanks,
>> Stefan
>> 
>>> 
>>> Thanks,
>>> Sangheon
>>> 
>>> 
>>>> 
>>>>>    // Returns a non-NULL pointer if successful, and updates dest if required.
>>>>>    // Also determines whether we should continue to try to allocate into the various
>>>>>    // generations or just end trying to allocate.
>>>>>    HeapWord* allocate_in_next_plab(G1HeapRegionAttr* dest,
>>>>> ...
>>>>> 
>>>>> Let me post the webrev when we decide. :)
>>>>> 
>>>>> Thanks,
>>>>> Sangheon
>>>>> 
>>>>> 
>>>>>> ------------------------------------------------------------------------------ 
>>>>>> 
>>>>>> Looks good, other than that one comment issue.


From sangheon.kim at oracle.com  Tue Oct 29 20:39:00 2019
From: sangheon.kim at oracle.com (sangheon.kim at oracle.com)
Date: Tue, 29 Oct 2019 13:39:00 -0700
Subject: RFR(XL): 8220310: Implementation: NUMA-Aware Memory Allocation
 for G1, Mutator (1/3)
In-Reply-To: <9d9494cd-82cd-6cf6-94e6-432a6ae187fb@oracle.com>
References: <e4c60a5c-cb08-004a-ce77-e4d20d4d6891@oracle.com>
 <2b37edd6-3e0f-013d-1616-9d003f8ac1ed@oracle.com>
 <74ACAF31-8233-482A-892E-0D2E7CA72F4F@oracle.com>
 <4afe9f43-4cfa-9384-f45f-f985399629dd@oracle.com>
 <CD6BDB8C-7777-4872-BCC0-CDFB4978F876@oracle.com>
 <d8758f38-4818-e6b7-c158-118974f0ff1c@oracle.com>
 <CB368332-D86B-40D2-B152-43B726938DD2@oracle.com>
 <77f6c57a-65a6-2727-cbe9-fbc1ed52a015@oracle.com>
 <b3b70e9d-5be9-b069-b631-5733f157c9eb@oracle.com>
 <7C1985BF-A769-49FB-A658-E1B1060B5897@oracle.com>
 <3F549477-A2DF-42CF-A0E5-586F78BBCC47@oracle.com>
 <f348053c-ef9b-df03-ae17-393ace99182b@oracle.com>
 <AB6E68AA-F9AC-4E62-9CF5-1886C469A702@oracle.com>
 <9219a118-0c1d-2cee-10e5-f9bb87c72eb9@oracle.com>
 <f6c2bffa-b3e6-dda1-e453-5b01a7214c4d@oracle.com>
 <521b3b8a-70e6-6fef-cb67-b6327fa08c03@oracle.com>
 <0A9D98F3-479D-421D-A5E0-0AB8BB203717@oracle.com>
 <1615ad5b-6be7-7e7d-6815-68cfc338fd6f@oracle.com>
 <9d9494cd-82cd-6cf6-94e6-432a6ae187fb@oracle.com>
Message-ID: <8430eee6-8990-6367-8ede-0741de8fc836@oracle.com>

Hi Kim and Per,

Thanks for your reviews.

-----------
To all reviewers,

Stefan suggested a safer handling of node index so here's another webrev.
Basically, when we enable AlwaysPreTouch, we expect to get the actual node id
of the address.
However, in theory we may still get an unknown id, so the change below is
added to handle the node index more safely.

uint G1NUMA::index_for_region(HeapRegion* hr) const {
  if (!is_enabled()) {
    return 0;
  }

  if (AlwaysPreTouch) {
    // If we already pretouched, we can check actual node index here.
-   return index_of_address(hr->bottom());
+   // However, if node index is still unknown, use preferred node index.
+   uint node_index = index_of_address(hr->bottom());
+   if (node_index != UnknownNodeIndex) {
+     return node_index;
+   }
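
For readers without the webrev at hand, the fallback can be sketched in a
self-contained form as below. This is an illustration only: NumaSketch,
Region, the value chosen for UnknownNodeIndex, and the round-robin
"preferred" policy are all assumptions made for the example, not the G1NUMA
code itself.

```cpp
#include <cassert>
#include <cstdint>

// Sentinel for "the OS could not report a node"; the value is an assumption.
static const uint32_t UnknownNodeIndex = UINT32_MAX;

// Hypothetical stand-in for a heap region; holds only what the sketch needs.
struct Region {
  uint32_t region_index;  // position of the region in the heap
  uint32_t queried_node;  // what an OS query would report for bottom()
};

struct NumaSketch {
  bool enabled;
  bool always_pre_touch;
  uint32_t num_nodes;

  // Stand-in for the OS lookup of the node backing an address.
  uint32_t index_of_address(const Region& r) const { return r.queried_node; }

  // Assumed policy: spread regions over the nodes round-robin.
  uint32_t preferred_index_for_index(uint32_t region_index) const {
    assert(num_nodes > 0);
    return region_index % num_nodes;
  }

  uint32_t index_for_region(const Region& r) const {
    if (!enabled) {
      return 0;
    }
    if (always_pre_touch) {
      // Pages were touched, so the OS answer is normally trustworthy.
      uint32_t node_index = index_of_address(r);
      if (node_index != UnknownNodeIndex) {
        return node_index;
      }
      // Otherwise fall through to the preferred index.
    }
    return preferred_index_for_index(r.region_index);
  }
};
```

The point of the change is the fall-through: an unknown answer from the OS
never escapes to callers that use the result as an array index.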


Webrev:
http://cr.openjdk.java.net/~sangheki/8220310/webrev.8
http://cr.openjdk.java.net/~sangheki/8220310/webrev.8.inc
Testing: local build

Thanks,
Sangheon


On 10/26/19 1:36 AM, Per Liden wrote:
> On 10/25/19 11:56 PM, sangheon.kim at oracle.com wrote:
>> Hi Kim,
>>
>> On 10/24/19 4:05 PM, Kim Barrett wrote:
>>>> On Oct 23, 2019, at 12:20 PM, sangheon.kim at oracle.com wrote:
>>>>
>>>> Hi Per,
>>>>
>>>> Thanks for taking a look at this.
>>>>
>>>> I agree with all your comments and here's the webrev.
>>>> - All comments from Per.
>>>> - Move G1PageBasedVirtualSpace::page_size() near to page_start() 
>>>> from Kim.
>>>>
>>>> Webrev:
>>>> http://cr.openjdk.java.net/~sangheki/8220310/webrev.6
>>>> http://cr.openjdk.java.net/~sangheki/8220310/webrev.6.inc
>>>> Testing: build test for linux, solaris, windows and mac.
>>>>
>>>> FYI, as I think existing numa related API names and -1 stuff seem 
>>>> not good, I planned to refine those later after pushing. But as you 
>>>> said following existing rule and then refine all together later 
>>>> seems better.
>>> The type of the argument for numa_get_group_id(void* address) should
>>> be "const void*".  Sorry I didn't notice that earlier.  Of course,
>>> this will require a const_cast to remove the const qualifier when
>>> calling get_mempolicy, but it is better to isolate the workaround for
>>> that missing qualifier to that one place.
>>>
>>> I'm not sure I like the overload for os::numa_get_group_id. While
>>> both are getting the numa id associated with something, the 
>>> associations
>>> involved seem pretty different to me.
>>>
>>> Spelling them out, they could be
>>>
>>> numa_get_group_id_for_current_thread()
>>> numa_get_group_id_for_address(const void* address)
>>>
>>> Those seem semantically unrelated to me, so violate the usual guidance
>>> of only overloading operations that are roughly equivalent (*).  Or put
>>> another way, one should not need to determine which overload is 
>>> selected
>>> to understand a call site.
>>>
>>> Of course, "roughly equivalent" is in the eye of the beholder.
>>>
>>> (*) Operator overloading sometimes violates this on the basis that the
>>> syntactic concision of using operators is more important, and there
>>> are a limited set of operators.  Such violations are often used as an
>>> argument against using operator overloading at all.
>> I think the overload looks okay to me.
>> But as you are not sure about it, I renamed the newly added one.
>>
>> - static int numa_get_group_id(void* address);
>> + static int numa_get_group_id_for_address(const void* address);
>>
>
> Works for me.
>
> /Per
>
>>
>> webrev:
>> http://cr.openjdk.java.net/~sangheki/8220310/webrev.7
>> http://cr.openjdk.java.net/~sangheki/8220310/webrev.7.inc
>>
>> Testing: hs-tier1
>>
>> Thanks,
>> Sangheon
>>
>>
>>
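
The two review points quoted above (distinct names instead of an overload,
and a const-qualified parameter with the cast isolated in one place) can be
sketched like this. legacy_node_of is a hypothetical stand-in for a C API
whose pointer parameter lacks const, as the thread says of get_mempolicy;
its body is made up so the example runs without any NUMA support.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical legacy C-style API with a non-const pointer parameter.
// The "node" is derived from the address bits purely for illustration.
inline int legacy_node_of(void* addr) {
  return static_cast<int>((reinterpret_cast<uintptr_t>(addr) >> 12) & 1);
}

// Distinct names: a call site tells the reader which association is meant
// without resolving an overload.
inline int numa_get_group_id_for_current_thread() {
  return 0;  // placeholder value for the sketch
}

inline int numa_get_group_id_for_address(const void* address) {
  // The only place where const is cast away, to satisfy the legacy signature.
  return legacy_node_of(const_cast<void*>(address));
}
```

Callers can then pass pointers to const data directly, and the missing
qualifier in the underlying API stays an implementation detail.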


From sangheon.kim at oracle.com  Tue Oct 29 20:44:40 2019
From: sangheon.kim at oracle.com (sangheon.kim at oracle.com)
Date: Tue, 29 Oct 2019 13:44:40 -0700
Subject: RFR(M): 8220311: Implementation: NUMA-Aware Memory Allocation for
 G1, Survivor (2/3)
In-Reply-To: <5A6C0668-86F6-4A3F-AC4D-75097D40A1C4@oracle.com>
References: <d153b49b-fbb2-0d73-37e0-ff1534a83086@oracle.com>
 <de0f8e9a-27d8-eaf3-99a7-7b57e1b419c1@oracle.com>
 <9a78e353-7908-b546-8f6a-7acd92eb40ac@oracle.com>
 <D4FAC1BE-C707-48DA-979F-03502E2651F5@oracle.com>
 <846eb849-8a49-5872-73d7-6bbc8f98369c@oracle.com>
 <56788E04-DC92-461F-B3A7-DEEBC524DB5B@oracle.com>
 <3fe39096-43cb-4828-c042-0fc976a0307a@oracle.com>
 <01a9ebcf-34ed-06b2-2da8-18d84feae858@oracle.com>
 <196b55d5-01f4-0202-effb-4495ae409df0@oracle.com>
 <5A6C0668-86F6-4A3F-AC4D-75097D40A1C4@oracle.com>
Message-ID: <85b282d1-0837-af5c-745f-efd0000d0ae1@oracle.com>

Hi Stefan,

Thanks for reviewing this.

On 10/29/19 1:13 PM, Stefan Johansson wrote:
> Hi Sangheon,
>
>> 25 okt. 2019 kl. 16:02 skrev sangheon.kim at oracle.com:
>>
>> Hi Stefan,
>>
>> On 10/23/19 1:47 AM, Stefan Johansson wrote:
>>> Hi Sangheon,
>>>
>>> On 2019-10-22 18:47, sangheon.kim at oracle.com wrote:
>>>> Hi Kim,
>>>>
>>>> On 10/22/19 12:19 AM, Kim Barrett wrote:
>>>>>> On Oct 22, 2019, at 1:52 AM, sangheon.kim at oracle.com wrote:
>>>>>> What do you think about below comment?
>>>>>>
>>>>>>     // Tries to allocate word_sz in the PLAB of the next "generation" after trying to
>>>>>>     // allocate into dest. Previous_plab_refill_failed indicates whether previous
>>>>>>     // PLAB refill for the original (source) object was failed.
>>>>> Drop "was".  Otherwise looks good.
>>>> Done.
>>>>
>>>> Webrev:
>>>> http://cr.openjdk.java.net/~sangheki/8220311/webrev.3
>>>> http://cr.openjdk.java.net/~sangheki/8220311/webrev.3.inc
>>> Looks good in general, just one minor thing, no need for a new webrev though:
>>> src/hotspot/share/gc/g1/g1Allocator.cpp
>>> ---
>>> 144   for (uint nodex_index = 0; nodex_index < _num_alloc_regions; nodex_index++) {
>>>
>>> The name nodex_index has one too many x:es =) I would prefer node_index.
>> Ouch!
>> Fixed..
>>
>> In addition, Stefan, Thomas and I had some discussion about making PLAB-NUMA aware (only for survivor).
>> Stefan provided a patch with it and it is simple enough to include under this CR.
>>
>> Webrev:
>> http://cr.openjdk.java.net/~sangheki/8220311/webrev.4
>> http://cr.openjdk.java.net/~sangheki/8220311/webrev.4.inc
> Looks good in general, just one comment.
>
> src/hotspot/share/gc/g1/g1Allocator.inline.hpp
> ---
>    78   assert(_alloc_buffers[dest.type()] != NULL,
>    79          "Allocation buffer is NULL: %s", dest.get_type_str());
>
>    80   G1HeapRegionAttr::region_type_t type = dest.type();
>    81   return alloc_buffer(type, node_index);
>
> As I mentioned to you offline, I think it is a bit unfortunate that we can't index our way to the correct PLAB in G1PLABAllocator::alloc_buffer(...) without the if-statement, but I agree that having multiple array slots pointing to the same PLAB isn't optimal either. So I think this approach is good for now, but I have a very minor comment on the code snippet above. I would prefer if line 80 was skipped and the call on 81 just did return alloc_buffer(dest.type(), node_index).
Done.
It is leftover from testing code.

You and Thomas didn't ask for webrev, but here's the next one for the 
record. :)

Webrev:
http://cr.openjdk.java.net/~sangheki/8220311/webrev.5
http://cr.openjdk.java.net/~sangheki/8220311/webrev.5.inc

Testing: local build

Thanks,
Sangheon


>
> I don't need a new webrev for this.
>
> Thanks,
> Stefan
>
>
>> Testing: hs-tier 1 ~ 3, with/without UseNUMA
>>
>> Thanks,
>> Sangheon
>>
>>
>>> ---
>>>
>>> Thanks,
>>> Stefan
>>>
>>>> Thanks,
>>>> Sangheon
>>>>
>>>>
>>>>>>     // Returns a non-NULL pointer if successful, and updates dest if required.
>>>>>>     // Also determines whether we should continue to try to allocate into the various
>>>>>>     // generations or just end trying to allocate.
>>>>>>     HeapWord* allocate_in_next_plab(G1HeapRegionAttr* dest,
>>>>>> ...
>>>>>>
>>>>>> Let me post the webrev when we decide. :)
>>>>>>
>>>>>> Thanks,
>>>>>> Sangheon
>>>>>>
>>>>>>
>>>>>>> ------------------------------------------------------------------------------
>>>>>>>
>>>>>>> Looks good, other than that one comment issue.


From stefan.johansson at oracle.com  Wed Oct 30 07:25:18 2019
From: stefan.johansson at oracle.com (Stefan Johansson)
Date: Wed, 30 Oct 2019 08:25:18 +0100
Subject: RFR(XL): 8220310: Implementation: NUMA-Aware Memory Allocation
 for G1, Mutator (1/3)
In-Reply-To: <8430eee6-8990-6367-8ede-0741de8fc836@oracle.com>
References: <e4c60a5c-cb08-004a-ce77-e4d20d4d6891@oracle.com>
 <2b37edd6-3e0f-013d-1616-9d003f8ac1ed@oracle.com>
 <74ACAF31-8233-482A-892E-0D2E7CA72F4F@oracle.com>
 <4afe9f43-4cfa-9384-f45f-f985399629dd@oracle.com>
 <CD6BDB8C-7777-4872-BCC0-CDFB4978F876@oracle.com>
 <d8758f38-4818-e6b7-c158-118974f0ff1c@oracle.com>
 <CB368332-D86B-40D2-B152-43B726938DD2@oracle.com>
 <77f6c57a-65a6-2727-cbe9-fbc1ed52a015@oracle.com>
 <b3b70e9d-5be9-b069-b631-5733f157c9eb@oracle.com>
 <7C1985BF-A769-49FB-A658-E1B1060B5897@oracle.com>
 <3F549477-A2DF-42CF-A0E5-586F78BBCC47@oracle.com>
 <f348053c-ef9b-df03-ae17-393ace99182b@oracle.com>
 <AB6E68AA-F9AC-4E62-9CF5-1886C469A702@oracle.com>
 <9219a118-0c1d-2cee-10e5-f9bb87c72eb9@oracle.com>
 <f6c2bffa-b3e6-dda1-e453-5b01a7214c4d@oracle.com>
 <521b3b8a-70e6-6fef-cb67-b6327fa08c03@oracle.com>
 <0A9D98F3-479D-421D-A5E0-0AB8BB203717@oracle.com>
 <1615ad5b-6be7-7e7d-6815-68cfc338fd6f@oracle.com>
 <9d9494cd-82cd-6cf6-94e6-432a6ae187fb@oracle.com>
 <8430eee6-8990-6367-8ede-0741de8fc836@oracle.com>
Message-ID: <7268E931-1F9D-47CF-86CE-F7AA29D4D10D@oracle.com>


> 29 okt. 2019 kl. 21:39 skrev sangheon.kim at oracle.com:
> 
> Hi Kim and Per,
> 
> Thanks for your reviews.
> 
> -----------
> To all reviewers,
> 
> Stefan suggested a safer handling of node index so here's another webrev.
> Basically when we enable AlwaysPreTouch, we expect to get actual node id of the address.
> However, in theory we may still get an unknown id, so the change below is added to handle the node index more safely.
> 
> uint G1NUMA::index_for_region(HeapRegion* hr) const {
>   if (!is_enabled()) {
>     return 0;
>   }
> 
> 
>    if (AlwaysPreTouch) {
>      // If we already pretouched, we can check actual node index here.
> -  return index_of_address(hr->bottom());
> 
> +    // However, if node index is still unknown, use preferred node index.
> +    uint node_index = index_of_address(hr->bottom());
> +    if (node_index != UnknownNodeIndex) {
> +      return node_index;
> +    }
> 
> Webrev:
> http://cr.openjdk.java.net/~sangheki/8220310/webrev.8
> http://cr.openjdk.java.net/~sangheki/8220310/webrev.8.inc
Looks good,
Stefan
> Testing: local build
> 
> Thanks,
> Sangheon
> 
> 
> On 10/26/19 1:36 AM, Per Liden wrote:
>> On 10/25/19 11:56 PM, sangheon.kim at oracle.com wrote: 
>>> Hi Kim, 
>>> 
>>> On 10/24/19 4:05 PM, Kim Barrett wrote: 
>>>>> On Oct 23, 2019, at 12:20 PM,sangheon.kim at oracle.com  wrote: 
>>>>> 
>>>>> Hi Per, 
>>>>> 
>>>>> Thanks for taking a look at this. 
>>>>> 
>>>>> I agree all your comments and here's the webrev. 
>>>>> - All comments from Per. 
>>>>> - Move G1PageBasedVirtualSpace::page_size() near to page_start() from Kim. 
>>>>> 
>>>>> Webrev: 
>>>>> http://cr.openjdk.java.net/~sangheki/8220310/webrev.6 
>>>>> http://cr.openjdk.java.net/~sangheki/8220310/webrev.6.inc 
>>>>> Testing: build test for linux, solaris, windows and mac. 
>>>>> 
>>>>> FYI, as I think existing numa related API names and -1 stuff seem not good, I planned to refine those later after pushing. But as you said following existing rule and then refine all together later seems better. 
>>>> The type of the argument for numa_get_group_id(void* address) should 
>>>> be "const void*".  Sorry I didn't notice that earlier.  Of course, 
>>>> this will require a const_cast to remove the const qualifier when 
>>>> calling get_mempolicy, but it is better to isolate the workaround for 
>>>> that missing qualifier to that one place. 
>>>> 
>>>> I'm not sure I like the overload for os::numa_get_group_id.  While 
>>>> both are getting the numa id associated with something, the associations 
>>>> involved seem pretty different to me. 
>>>> 
>>>> Spelling them out, they could be 
>>>> 
>>>> numa_get_group_id_for_current_thread() 
>>>> numa_get_group_id_for_address(const void* address) 
>>>> 
>>>> Those seem semantically unrelated to me, so violate the usual guidance 
>>>> of only overloading operations that are roughly equivalent (*).  Or put 
>>>> another way, one should not need to determine which overload is selected 
>>>> to understand a call site. 
>>>> 
>>>> Of course, "roughly equivalent" is in the eye of the beholder. 
>>>> 
>>>> (*) Operator overloading sometimes violates this on the basis that the 
>>>> syntactic concision of using operators is more important, and there 
>>>> are a limited set of operators.  Such violations are often used as an 
>>>> argument against using operator overloading at all. 
>>> I think the overload looks okay to me. 
>>> But as you are not sure about it, I renamed the newly added one. 
>>> 
>>> - static int numa_get_group_id(void* address); 
>>> + static int numa_get_group_id_for_address(const void* address); 
>>> 
>> 
>> Works for me. 
>> 
>> /Per 
>> 
>>> 
>>> webrev: 
>>> http://cr.openjdk.java.net/~sangheki/8220310/webrev.7 
>>> http://cr.openjdk.java.net/~sangheki/8220310/webrev.7.inc 
>>> 
>>> Testing: hs-tier1 
>>> 
>>> Thanks, 
>>> Sangheon 
>>> 
>>> 
>>> 
> 


From stefan.johansson at oracle.com  Wed Oct 30 07:27:20 2019
From: stefan.johansson at oracle.com (Stefan Johansson)
Date: Wed, 30 Oct 2019 08:27:20 +0100
Subject: RFR(M): 8220311: Implementation: NUMA-Aware Memory Allocation for
 G1, Survivor (2/3)
In-Reply-To: <85b282d1-0837-af5c-745f-efd0000d0ae1@oracle.com>
References: <d153b49b-fbb2-0d73-37e0-ff1534a83086@oracle.com>
 <de0f8e9a-27d8-eaf3-99a7-7b57e1b419c1@oracle.com>
 <9a78e353-7908-b546-8f6a-7acd92eb40ac@oracle.com>
 <D4FAC1BE-C707-48DA-979F-03502E2651F5@oracle.com>
 <846eb849-8a49-5872-73d7-6bbc8f98369c@oracle.com>
 <56788E04-DC92-461F-B3A7-DEEBC524DB5B@oracle.com>
 <3fe39096-43cb-4828-c042-0fc976a0307a@oracle.com>
 <01a9ebcf-34ed-06b2-2da8-18d84feae858@oracle.com>
 <196b55d5-01f4-0202-effb-4495ae409df0@oracle.com>
 <5A6C0668-86F6-4A3F-AC4D-75097D40A1C4@oracle.com>
 <85b282d1-0837-af5c-745f-efd0000d0ae1@oracle.com>
Message-ID: <967B0943-BD9F-451B-9A00-5FE7A200C620@oracle.com>

Looks good,
Stefan

> 29 okt. 2019 kl. 21:44 skrev sangheon.kim at oracle.com:
> 
> Hi Stefan,
> 
> Thanks for reviewing this.
> 
> On 10/29/19 1:13 PM, Stefan Johansson wrote:
>> Hi Sangheon,
>> 
>>> 25 okt. 2019 kl. 16:02 skrev sangheon.kim at oracle.com:
>>> 
>>> Hi Stefan,
>>> 
>>> On 10/23/19 1:47 AM, Stefan Johansson wrote:
>>>> Hi Sangheon,
>>>> 
>>>> On 2019-10-22 18:47, sangheon.kim at oracle.com wrote:
>>>>> Hi Kim,
>>>>> 
>>>>> On 10/22/19 12:19 AM, Kim Barrett wrote:
>>>>>>> On Oct 22, 2019, at 1:52 AM, sangheon.kim at oracle.com wrote:
>>>>>>> What do you think about below comment?
>>>>>>> 
>>>>>>>    // Tries to allocate word_sz in the PLAB of the next "generation" after trying to
>>>>>>>    // allocate into dest. Previous_plab_refill_failed indicates whether previous
>>>>>>>    // PLAB refill for the original (source) object was failed.
>>>>>> Drop "was".  Otherwise looks good.
>>>>> Done.
>>>>> 
>>>>> Webrev:
>>>>> http://cr.openjdk.java.net/~sangheki/8220311/webrev.3
>>>>> http://cr.openjdk.java.net/~sangheki/8220311/webrev.3.inc
>>>> Looks good in general, just one minor thing, no need for a new webrev though:
>>>> src/hotspot/share/gc/g1/g1Allocator.cpp
>>>> ---
>>>> 144   for (uint nodex_index = 0; nodex_index < _num_alloc_regions; nodex_index++) {
>>>> 
>>>> The name nodex_index has one too many x:es =) I would prefer node_index.
>>> Ouch!
>>> Fixed..
>>> 
>>> In addition, Stefan, Thomas and I had some discussion about making PLAB-NUMA aware (only for survivor).
>>> Stefan provided a patch with it and it is simple enough to include under this CR.
>>> 
>>> Webrev:
>>> http://cr.openjdk.java.net/~sangheki/8220311/webrev.4
>>> http://cr.openjdk.java.net/~sangheki/8220311/webrev.4.inc
>> Looks good in general, just one comment.
>> 
>> src/hotspot/share/gc/g1/g1Allocator.inline.hpp
>> ---
>>   78   assert(_alloc_buffers[dest.type()] != NULL,
>>   79          "Allocation buffer is NULL: %s", dest.get_type_str());
>> 
>>   80   G1HeapRegionAttr::region_type_t type = dest.type();
>>   81   return alloc_buffer(type, node_index);
>> 
>> As I mentioned to you offline, I think it is a bit unfortunate that we can't index our way to the correct PLAB in G1PLABAllocator::alloc_buffer(...) without the if-statement, but I agree that having multiple array slots pointing to the same PLAB isn't optimal either. So I think this approach is good for now, but I have a very minor comment on the code snippet above. I would prefer if line 80 was skipped and the call on 81 just did return alloc_buffer(dest.type(), node_index).
> Done.
> It is leftover from testing code.
> 
> You and Thomas didn't ask for webrev, but here's the next one for the record. :)
> 
> Webrev:
> http://cr.openjdk.java.net/~sangheki/8220311/webrev.5
> http://cr.openjdk.java.net/~sangheki/8220311/webrev.5.inc
> 
> Testing: local build
> 
> Thanks,
> Sangheon
> 
> 
>> 
>> I don't need a new webrev for this.
>> 
>> Thanks,
>> Stefan
>> 
>> 
>>> Testing: hs-tier 1 ~ 3, with/without UseNUMA
>>> 
>>> Thanks,
>>> Sangheon
>>> 
>>> 
>>>> ---
>>>> 
>>>> Thanks,
>>>> Stefan
>>>> 
>>>>> Thanks,
>>>>> Sangheon
>>>>> 
>>>>> 
>>>>>>>    // Returns a non-NULL pointer if successful, and updates dest if required.
>>>>>>>    // Also determines whether we should continue to try to allocate into the various
>>>>>>>    // generations or just end trying to allocate.
>>>>>>>    HeapWord* allocate_in_next_plab(G1HeapRegionAttr* dest,
>>>>>>> ...
>>>>>>> 
>>>>>>> Let me post the webrev when we decide. :)
>>>>>>> 
>>>>>>> Thanks,
>>>>>>> Sangheon
>>>>>>> 
>>>>>>> 
>>>>>>>> ------------------------------------------------------------------------------
>>>>>>>> 
>>>>>>>> Looks good, other than that one comment issue.


From thomas.schatzl at oracle.com  Wed Oct 30 08:02:07 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Wed, 30 Oct 2019 09:02:07 +0100
Subject: RFR(XL): 8220310: Implementation: NUMA-Aware Memory Allocation
 for G1, Mutator (1/3)
In-Reply-To: <7268E931-1F9D-47CF-86CE-F7AA29D4D10D@oracle.com>
References: <e4c60a5c-cb08-004a-ce77-e4d20d4d6891@oracle.com>
 <4afe9f43-4cfa-9384-f45f-f985399629dd@oracle.com>
 <CD6BDB8C-7777-4872-BCC0-CDFB4978F876@oracle.com>
 <d8758f38-4818-e6b7-c158-118974f0ff1c@oracle.com>
 <CB368332-D86B-40D2-B152-43B726938DD2@oracle.com>
 <77f6c57a-65a6-2727-cbe9-fbc1ed52a015@oracle.com>
 <b3b70e9d-5be9-b069-b631-5733f157c9eb@oracle.com>
 <7C1985BF-A769-49FB-A658-E1B1060B5897@oracle.com>
 <3F549477-A2DF-42CF-A0E5-586F78BBCC47@oracle.com>
 <f348053c-ef9b-df03-ae17-393ace99182b@oracle.com>
 <AB6E68AA-F9AC-4E62-9CF5-1886C469A702@oracle.com>
 <9219a118-0c1d-2cee-10e5-f9bb87c72eb9@oracle.com>
 <f6c2bffa-b3e6-dda1-e453-5b01a7214c4d@oracle.com>
 <521b3b8a-70e6-6fef-cb67-b6327fa08c03@oracle.com>
 <0A9D98F3-479D-421D-A5E0-0AB8BB203717@oracle.com>
 <1615ad5b-6be7-7e7d-6815-68cfc338fd6f@oracle.com>
 <9d9494cd-82cd-6cf6-94e6-432a6ae187fb@oracle.com>
 <8430eee6-8990-6367-8ede-0741de8fc836@oracle.com>
 <7268E931-1F9D-47CF-86CE-F7AA29D4D10D@oracle.com>
Message-ID: <2055ec92-5116-1a86-4002-5c304e63c29d@oracle.com>

Hi,

On 30.10.19 08:25, Stefan Johansson wrote:
> 
> 
>> 29 okt. 2019 kl. 21:39 skrev sangheon.kim at oracle.com:
>>
>> Hi Kim and Per,
>>
>> Thanks for your reviews.
>>
>> -----------
>> To all reviewers,
>>
>> Stefan suggested a safer handling of node index so here's another webrev.
>> Basically when we enable AlwaysPreTouch, we expect to get actual node id of the address.
>> However, in theory we may still get an unknown id, so the change below is added to handle the node index more safely.
>>
>> uint G1NUMA::index_for_region(HeapRegion* hr) const {
>>    if (!is_enabled()) {
>>      return 0;
>>    }
>>
>>
>>     if (AlwaysPreTouch) {
>>       // If we already pretouched, we can check actual node index here.
>> -  return index_of_address(hr->bottom());
>>
>> +    // However, if node index is still unknown, use preferred node index.
>> +    uint node_index = index_of_address(hr->bottom());
>> +    if (node_index != UnknownNodeIndex) {
>> +      return node_index;
>> +    }
>>
>> Webrev:
>> http://cr.openjdk.java.net/~sangheki/8220310/webrev.8
>> http://cr.openjdk.java.net/~sangheki/8220310/webrev.8.inc
> Looks good,
> Stefan

   +1

Thomas


From thomas.schatzl at oracle.com  Wed Oct 30 08:00:48 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Wed, 30 Oct 2019 09:00:48 +0100
Subject: RFR(M): 8220311: Implementation: NUMA-Aware Memory Allocation for
 G1, Survivor (2/3)
In-Reply-To: <967B0943-BD9F-451B-9A00-5FE7A200C620@oracle.com>
References: <d153b49b-fbb2-0d73-37e0-ff1534a83086@oracle.com>
 <de0f8e9a-27d8-eaf3-99a7-7b57e1b419c1@oracle.com>
 <9a78e353-7908-b546-8f6a-7acd92eb40ac@oracle.com>
 <D4FAC1BE-C707-48DA-979F-03502E2651F5@oracle.com>
 <846eb849-8a49-5872-73d7-6bbc8f98369c@oracle.com>
 <56788E04-DC92-461F-B3A7-DEEBC524DB5B@oracle.com>
 <3fe39096-43cb-4828-c042-0fc976a0307a@oracle.com>
 <01a9ebcf-34ed-06b2-2da8-18d84feae858@oracle.com>
 <196b55d5-01f4-0202-effb-4495ae409df0@oracle.com>
 <5A6C0668-86F6-4A3F-AC4D-75097D40A1C4@oracle.com>
 <85b282d1-0837-af5c-745f-efd0000d0ae1@oracle.com>
 <967B0943-BD9F-451B-9A00-5FE7A200C620@oracle.com>
Message-ID: <f3d0a1f3-6fd6-838f-63cd-9d8f0ede8c3a@oracle.com>

Hi,

   webrev.5 looks good.

Thomas

On 30.10.19 08:27, Stefan Johansson wrote:
> Looks good,
> Stefan
> 


From stefan.karlsson at oracle.com  Wed Oct 30 10:50:05 2019
From: stefan.karlsson at oracle.com (Stefan Karlsson)
Date: Wed, 30 Oct 2019 11:50:05 +0100
Subject: RFR: 8224817: Implementation of JEP 364: ZGC on macOS
In-Reply-To: <3ac7acf6-7154-1627-38b3-c186e9a18267@oracle.com>
References: <adc7cecf-80ae-01e6-fcef-f8588fcb940a@oracle.com>
 <837c9e23-6068-53b5-9c50-7880a0f375c7@oracle.com>
 <806e1ac7-fa7a-7058-c4cc-bdd90a9f2b86@oracle.com>
 <3ac7acf6-7154-1627-38b3-c186e9a18267@oracle.com>
Message-ID: <446b360d-498d-18ed-7d6a-ce7fd2dba14f@oracle.com>

Hi Erik,

Reviewed:
https://cr.openjdk.java.net/~eosterlund/8224817/webrev.02

Looks good.

Thanks,
StefanK


On 2019-10-29 12:31, erik.osterlund at oracle.com wrote:
> Seems reasonable. Thanks.
> 
> /Erik
> 
> On 10/29/19 12:29 PM, Per Liden wrote:
>> Some suggested adjustments, already discussed with Erik off-line:
>>
>> http://cr.openjdk.java.net/~pliden/8224817/webrev.review.0
>>
>> /Per
>>
>> On 10/28/19 5:11 PM, Erik Österlund wrote:
>>> Hi,
>>>
>>> After some internal discussions with Per and Stefan, some 
>>> refactorings have been made:
>>>
>>> 1) Use mmap consistently wherever possible, instead of mach_vm_map, 
>>> for consistency. And only use mach_vm_remap from a wrapper function 
>>> to map in views.
>>> 2) Move the pmem segments up one level so that producer and consumer 
>>> of the segments is on the same level, and let the virtual "file" know 
>>> only about offsets.
>>> 3) Minor polishing.
>>>
>>> Incremental:
>>> http://cr.openjdk.java.net/~eosterlund/8224817/webrev.00..01/
>>>
>>> Full:
>>> http://cr.openjdk.java.net/~eosterlund/8224817/webrev.01/
>>>
>>> Thanks,
>>> /Erik
>>>
>>> On 2019-10-24 12:38, erik.osterlund at oracle.com wrote:
>>>> Hi,
>>>>
>>>> Now that some curling has been performed, paving way for this patch:
>>>>
>>>>     8229027: Improve how JNIHandleBlock::oops_do distinguishes oops 
>>>> from non-oops
>>>>     8229278: Improve hs_err location printing to assume less about 
>>>> GC internals
>>>>     8229189: Improve JFR leak profiler tracing to deal with 
>>>> discontiguous heaps
>>>>     8224815: Remove non-GC uses of CollectedHeap::is_in_reserved()
>>>>     8224820: ZGC: Support discontiguous heap reservations
>>>>
>>>> ...the remaining thing to do is plugging in a few platform specific 
>>>> ZGC files. This patch does that.
>>>> Decided to go with mach_vm_map/mach_vm_remap to implement 
>>>> multi-mapping. Previously I didn't want to do that as I couldn't 
>>>> figure out how to mach_vm_remap memory on top of reserved VA 
>>>> (acquired using mmap). But apparently VM_FLAGS_OVERWRITE was the 
>>>> missing ingredient there. With that in place, dodging the terrible 
>>>> ftruncate implementation on macOS seemed like a good idea. That also 
>>>> implies this port supports large pages (unlike other GCs on macOS 
>>>> today). Yay!
>>>>
>>>> CR:
>>>> http://cr.openjdk.java.net/~eosterlund/8224817/webrev.00/
>>>>
>>>> Bug:
>>>> https://bugs.openjdk.java.net/browse/JDK-8229358
>>>>
>>>> Thanks,
>>>> /Erik
>>>
> 


From zgu at redhat.com  Wed Oct 30 12:56:33 2019
From: zgu at redhat.com (Zhengyu Gu)
Date: Wed, 30 Oct 2019 08:56:33 -0400
Subject: RFR 8233165: Shenandoah:SBSA::gen_load_reference_barrier_stub()
 should use pointer register for address on aarch64
Message-ID: <82ee1d40-a7a8-1b68-1aec-5b15597d7cbb@redhat.com>

The load address can come in in a single-size or double-size register; 
as_pointer_register() can deal with both cases.

Bug: https://bugs.openjdk.java.net/browse/JDK-8233165
Webrev: http://cr.openjdk.java.net/~zgu/JDK-8233165/webrev.00/

Test:
   hotspot_gc_shenandoah (fastdebug and release)
   jcstress quick tests (fastdebug and release)
   on aarch64 Linux

Thanks,

-Zhengyu


From rkennke at redhat.com  Wed Oct 30 13:40:04 2019
From: rkennke at redhat.com (Roman Kennke)
Date: Wed, 30 Oct 2019 14:40:04 +0100
Subject: RFR 8233165: Shenandoah:SBSA::gen_load_reference_barrier_stub()
 should use pointer register for address on aarch64
In-Reply-To: <82ee1d40-a7a8-1b68-1aec-5b15597d7cbb@redhat.com>
References: <82ee1d40-a7a8-1b68-1aec-5b15597d7cbb@redhat.com>
Message-ID: <bca2c6cd-25e4-c83b-b0ee-4bba2836d56e@redhat.com>

Nice.

Please push the fix.

Thanks,
Roman

> The load address can come in in a single-size or double-size register;
> as_pointer_register() can deal with both cases.
> 
> Bug: https://bugs.openjdk.java.net/browse/JDK-8233165
> Webrev: http://cr.openjdk.java.net/~zgu/JDK-8233165/webrev.00/
> 
> Test:
>   hotspot_gc_shenandoah (fastdebug and release)
>   jcstress quick tests (fastdebug and release)
>   on aarch64 Linux
> 
> Thanks,
> 
> -Zhengyu
> 


From zgu at redhat.com  Wed Oct 30 13:43:51 2019
From: zgu at redhat.com (Zhengyu Gu)
Date: Wed, 30 Oct 2019 09:43:51 -0400
Subject: RFR 8233165: Shenandoah:SBSA::gen_load_reference_barrier_stub()
 should use pointer register for address on aarch64
In-Reply-To: <bca2c6cd-25e4-c83b-b0ee-4bba2836d56e@redhat.com>
References: <82ee1d40-a7a8-1b68-1aec-5b15597d7cbb@redhat.com>
 <bca2c6cd-25e4-c83b-b0ee-4bba2836d56e@redhat.com>
Message-ID: <dad1014a-c341-db1f-6154-eee96357658b@redhat.com>

Thanks for the review, and pushed.

-Zhengyu

On 10/30/19 9:40 AM, Roman Kennke wrote:
> Nice.
> 
> Please push the fix.
> 
> Thanks,
> Roman
> 
>> The load address can come in in a single-size or double-size register;
>> as_pointer_register() can deal with both cases.
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8233165
>> Webrev: http://cr.openjdk.java.net/~zgu/JDK-8233165/webrev.00/
>>
>> Test:
>>    hotspot_gc_shenandoah (fastdebug and release)
>>    jcstress quick tests (fastdebug and release)
>>    on aarch64 Linux
>>
>> Thanks,
>>
>> -Zhengyu
>>
> 


From kim.barrett at oracle.com  Wed Oct 30 14:18:30 2019
From: kim.barrett at oracle.com (Kim Barrett)
Date: Wed, 30 Oct 2019 10:18:30 -0400
Subject: RFR(XL): 8220310: Implementation: NUMA-Aware Memory Allocation
 for G1, Mutator (1/3)
In-Reply-To: <8430eee6-8990-6367-8ede-0741de8fc836@oracle.com>
References: <e4c60a5c-cb08-004a-ce77-e4d20d4d6891@oracle.com>
 <2b37edd6-3e0f-013d-1616-9d003f8ac1ed@oracle.com>
 <74ACAF31-8233-482A-892E-0D2E7CA72F4F@oracle.com>
 <4afe9f43-4cfa-9384-f45f-f985399629dd@oracle.com>
 <CD6BDB8C-7777-4872-BCC0-CDFB4978F876@oracle.com>
 <d8758f38-4818-e6b7-c158-118974f0ff1c@oracle.com>
 <CB368332-D86B-40D2-B152-43B726938DD2@oracle.com>
 <77f6c57a-65a6-2727-cbe9-fbc1ed52a015@oracle.com>
 <b3b70e9d-5be9-b069-b631-5733f157c9eb@oracle.com>
 <7C1985BF-A769-49FB-A658-E1B1060B5897@oracle.com>
 <3F549477-A2DF-42CF-A0E5-586F78BBCC47@oracle.com>
 <f348053c-ef9b-df03-ae17-393ace99182b@oracle.com>
 <AB6E68AA-F9AC-4E62-9CF5-1886C469A702@oracle.com>
 <9219a118-0c1d-2cee-10e5-f9bb87c72eb9@oracle.com>
 <f6c2bffa-b3e6-dda1-e453-5b01a7214c4d@oracle.com>
 <521b3b8a-70e6-6fef-cb67-b6327fa08c03@oracle.com>
 <0A9D98F3-479D-421D-A5E0-0AB8BB203717@oracle.com>
 <1615ad5b-6be7-7e7d-6815-68cfc338fd6f@oracle.com>
 <9d9494cd-82cd-6cf6-94e6-432a6ae187fb@oracle.com>
 <8430eee6-8990-6367-8ede-0741de8fc836@oracle.com>
Message-ID: <455CC0A4-D794-4BB9-9408-D1314E8CD008@oracle.com>

> On Oct 29, 2019, at 4:39 PM, sangheon.kim at oracle.com wrote:
> 
> Hi Kim and Per,
> 
> Thanks for your reviews.
> 
> -----------
> To all reviewers,
> 
> Stefan suggested a safer handling of node index so here's another webrev.
> Basically, when we enable AlwaysPreTouch, we expect to get the actual node id of the address.
> However, in theory we may still get an unknown id, so the change below adds safer handling of the node index.
> 
> uint G1NUMA::index_for_region(HeapRegion* hr) const {
>   if (!is_enabled()) {
>     return 0;
>   }
> 
> 
>    if (AlwaysPreTouch) {
>      // If we already pretouched, we can check actual node index here.
> -  return index_of_address(hr->bottom());
> 
> +    // However, if node index is still unknown, use preferred node index.
> +    uint node_index = index_of_address(hr->bottom());
> +    if (node_index != UnknownNodeIndex) {
> +      return node_index;
> +    }
> 
> Webrev:
> http://cr.openjdk.java.net/~sangheki/8220310/webrev.8
> http://cr.openjdk.java.net/~sangheki/8220310/webrev.8.inc
> Testing: local build

Looks good.
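
For readers following the thread, the fallback in the quoted diff can be
sketched as a small standalone model. This is only an illustration, not
the HotSpot code: NumaModel, preferred_index() and the modulo placement
policy are hypothetical names and assumptions made for this sketch; only
the "use the actual node if known, else the preferred node" shape comes
from the diff above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the sentinel used in the quoted diff.
constexpr uint32_t UnknownNodeIndex = UINT32_MAX;

// Hypothetical model of the state the real function consults.
struct NumaModel {
  bool enabled;               // NUMA support active
  bool always_pre_touch;      // models the AlwaysPreTouch flag
  uint32_t num_active_nodes;  // number of NUMA nodes in use
  uint32_t node_of_bottom;    // what a lookup of the region's bottom address reports
};

// Assumed placement policy: spread regions over the nodes by region index.
inline uint32_t preferred_index(const NumaModel& m, uint32_t region_index) {
  assert(m.num_active_nodes > 0);
  return region_index % m.num_active_nodes;
}

inline uint32_t index_for_region(const NumaModel& m, uint32_t region_index) {
  if (!m.enabled) {
    return 0;
  }
  // After pretouching, the actual node is normally known; trust it only
  // when the lookup did not come back as unknown.
  if (m.always_pre_touch && m.node_of_bottom != UnknownNodeIndex) {
    return m.node_of_bottom;
  }
  // Otherwise fall back to the preferred node for this region.
  return preferred_index(m, region_index);
}
```

With four nodes, a pretouched region whose lookup reports node 2 yields 2,
while the same region with an unknown lookup falls back to the preferred
index (7 % 4 == 3 for region 7 under the assumed policy).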


From erik.osterlund at oracle.com  Wed Oct 30 14:47:04 2019
From: erik.osterlund at oracle.com (Erik Österlund)
Date: Wed, 30 Oct 2019 15:47:04 +0100
Subject: RFR: 8224817: Implementation of JEP 364: ZGC on macOS
In-Reply-To: <446b360d-498d-18ed-7d6a-ce7fd2dba14f@oracle.com>
References: <446b360d-498d-18ed-7d6a-ce7fd2dba14f@oracle.com>
Message-ID: <459ABC22-5A33-4A10-ADFD-61B9CE776B69@oracle.com>

Hi Stefan,

Thank you for the review.

/Erik

> On 30 Oct 2019, at 11:50, Stefan Karlsson <stefan.karlsson at oracle.com> wrote:
> 
> Hi Erik,
> 
> Reviewed:
> https://cr.openjdk.java.net/~eosterlund/8224817/webrev.02
> 
> Looks good.
> 
> Thanks,
> StefanK
> 
> 
>> On 2019-10-29 12:31, erik.osterlund at oracle.com wrote:
>> Seems reasonable. Thanks.
>> /Erik
>>> On 10/29/19 12:29 PM, Per Liden wrote:
>>> Some suggested adjustments, already discussed with Erik off-line:
>>> 
>>> http://cr.openjdk.java.net/~pliden/8224817/webrev.review.0
>>> 
>>> /Per
>>> 
>>> On 10/28/19 5:11 PM, Erik Österlund wrote:
>>>> Hi,
>>>> 
>>>> After some internal discussions with Per and Stefan, some refactorings have been made:
>>>> 
>>>> 1) Use mmap consistently wherever possible, instead of mach_vm_map, for consistency. And only use mach_vm_remap from a wrapper function to map in views.
>>>> 2) Move the pmem segments up one level so that producer and consumer of the segments is on the same level, and let the virtual "file" know only about offsets.
>>>> 3) Minor polishing.
>>>> 
>>>> Incremental:
>>>> http://cr.openjdk.java.net/~eosterlund/8224817/webrev.00..01/
>>>> 
>>>> Full:
>>>> http://cr.openjdk.java.net/~eosterlund/8224817/webrev.01/
>>>> 
>>>> Thanks,
>>>> /Erik
>>>> 
>>>> On 2019-10-24 12:38, erik.osterlund at oracle.com wrote:
>>>>> Hi,
>>>>> 
>>>>> Now that some curling has been performed, paving way for this patch:
>>>>> 
>>>>>     8229027: Improve how JNIHandleBlock::oops_do distinguishes oops from non-oops
>>>>>     8229278: Improve hs_err location printing to assume less about GC internals
>>>>>     8229189: Improve JFR leak profiler tracing to deal with discontiguous heaps
>>>>>     8224815: Remove non-GC uses of CollectedHeap::is_in_reserved()
>>>>>     8224820: ZGC: Support discontiguous heap reservations
>>>>> 
>>>>> ...the remaining thing to do is plugging in a few platform specific ZGC files. This patch does that.
>>>>> Decided to go with mach_vm_map/mach_vm_remap to implement multi-mapping. Previously I didn't want to do that as I couldn't figure out how to mach_vm_remap memory on top of reserved VA (acquired using mmap). But apparently VM_FLAGS_OVERWRITE was the missing ingredient there. With that in place, dodging the terrible ftruncate implementation on macOS seemed like a good idea. That also implies this port supports large pages (unlike other GCs on macOS today). Yay!
>>>>> 
>>>>> CR:
>>>>> http://cr.openjdk.java.net/~eosterlund/8224817/webrev.00/
>>>>> 
>>>>> Bug:
>>>>> https://bugs.openjdk.java.net/browse/JDK-8229358
>>>>> 
>>>>> Thanks,
>>>>> /Erik
>>>> 


From aph at redhat.com  Wed Oct 30 16:50:36 2019
From: aph at redhat.com (Andrew Haley)
Date: Wed, 30 Oct 2019 16:50:36 +0000
Subject: RFR 8228532: Shenandoah: Implement
 SBSA::try_resolve_jobject_in_native()
In-Reply-To: <2b7ec01b-085b-6e33-2946-2ad570d89ec6@redhat.com>
References: <f82e0e75-124c-5135-23eb-3f6075615c25@redhat.com>
 <5a2ee72a-ea25-669e-226f-7eb62084068a@redhat.com>
 <2b7ec01b-085b-6e33-2946-2ad570d89ec6@redhat.com>
Message-ID: <c0dcb5f4-926a-058b-abb6-6e6463ccd95b@redhat.com>

On 7/26/19 2:18 AM, Zhengyu Gu wrote:
> Updated Webrev: http://cr.openjdk.java.net/~zgu/JDK-8228532/webrev.01/
> 
> On X86 platforms, r15 does not have valid thread value, instead, it 
> should be derived from jni_env argument.
> 
> Test:
>    hotspot_gc_shenandoah (fastdebug and release) on
>    Linux x86_64, x86_32
>    Windows x86_64.

FYI:

I found a bug in AArch64. When we are resolving an object in native,
rthread does not contain a valid thread value. Instead it should be
derived from the jni_env argument.

I believe this is true for all platforms: none will have a valid
rthread when called from native code.

Has this bug been backported? How should we handle it?

Suggested patch:

diff -r 6a05019acb67 src/hotspot/cpu/aarch64/gc/shenandoah/shenandoahBarrierSetAssembler_aarch64.cpp
--- a/src/hotspot/cpu/aarch64/gc/shenandoah/shenandoahBarrierSetAssembler_aarch64.cpp	Tue Sep 17 14:00:36 2019 -0400
+++ b/src/hotspot/cpu/aarch64/gc/shenandoah/shenandoahBarrierSetAssembler_aarch64.cpp	Wed Oct 30 12:44:23 2019 -0400
@@ -424,9 +448,12 @@
   // Check for null.
   __ cbz(obj, done);

   assert(obj != rscratch2, "need rscratch2");
-  Address gc_state(rthread, in_bytes(ShenandoahThreadLocalData::gc_state_offset()));
-  __ ldrb(rscratch2, gc_state);
+  Address gc_state(jni_env, ShenandoahThreadLocalData::gc_state_offset() - JavaThread::jni_environment_offset());
+  __ lea(rscratch2, gc_state);
+  __ ldrb(rscratch2, Address(rscratch2));

   // Check for heap in evacuation phase
   __ tbnz(rscratch2, ShenandoahHeap::EVACUATION_BITPOS, slowpath);

-- 
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


From shade at redhat.com  Wed Oct 30 17:02:15 2019
From: shade at redhat.com (Aleksey Shipilev)
Date: Wed, 30 Oct 2019 18:02:15 +0100
Subject: RFR 8228532: Shenandoah: Implement
 SBSA::try_resolve_jobject_in_native()
In-Reply-To: <c0dcb5f4-926a-058b-abb6-6e6463ccd95b@redhat.com>
References: <f82e0e75-124c-5135-23eb-3f6075615c25@redhat.com>
 <5a2ee72a-ea25-669e-226f-7eb62084068a@redhat.com>
 <2b7ec01b-085b-6e33-2946-2ad570d89ec6@redhat.com>
 <c0dcb5f4-926a-058b-abb6-6e6463ccd95b@redhat.com>
Message-ID: <abe0f9e0-4f0e-cd74-cd95-8062bf41515f@redhat.com>

On 10/30/19 5:50 PM, Andrew Haley wrote:
> Has this bug been backported? How should we handle it?

JDK-8228532 is only in 14; it has not been backported.

> Suggested patch:
> 
> diff -r 6a05019acb67 src/hotspot/cpu/aarch64/gc/shenandoah/shenandoahBarrierSetAssembler_aarch64.cpp
> --- a/src/hotspot/cpu/aarch64/gc/shenandoah/shenandoahBarrierSetAssembler_aarch64.cpp	Tue Sep 17 14:00:36 2019 -0400
> +++ b/src/hotspot/cpu/aarch64/gc/shenandoah/shenandoahBarrierSetAssembler_aarch64.cpp	Wed Oct 30 12:44:23 2019 -0400
> @@ -424,9 +448,12 @@
>    // Check for null.
>    __ cbz(obj, done);
> 
>    assert(obj != rscratch2, "need rscratch2");
> -  Address gc_state(rthread, in_bytes(ShenandoahThreadLocalData::gc_state_offset()));
> -  __ ldrb(rscratch2, gc_state);
> +  Address gc_state(jni_env, ShenandoahThreadLocalData::gc_state_offset() - JavaThread::jni_environment_offset());
> +  __ lea(rscratch2, gc_state);
> +  __ ldrb(rscratch2, Address(rscratch2));
> 
>    // Check for heap in evacuation phase
>    __ tbnz(rscratch2, ShenandoahHeap::EVACUATION_BITPOS, slowpath);

Yes, RFR that under new bug and link it to 8228532 :)

I think x86 does it correctly already:
  https://hg.openjdk.java.net/jdk/jdk/rev/db740ced41c4

-- 
Thanks,
-Aleksey


From aph at redhat.com  Wed Oct 30 17:07:24 2019
From: aph at redhat.com (Andrew Haley)
Date: Wed, 30 Oct 2019 17:07:24 +0000
Subject: RFR 8233165: Shenandoah:SBSA::gen_load_reference_barrier_stub()
 should use pointer register for address on aarch64
In-Reply-To: <82ee1d40-a7a8-1b68-1aec-5b15597d7cbb@redhat.com>
References: <82ee1d40-a7a8-1b68-1aec-5b15597d7cbb@redhat.com>
Message-ID: <c19d0343-efab-1ac0-4051-e3dcb3bf077e@redhat.com>

On 10/30/19 12:56 PM, Zhengyu Gu wrote:
> Bug: https://bugs.openjdk.java.net/browse/JDK-8233165
> Webrev: http://cr.openjdk.java.net/~zgu/JDK-8233165/webrev.00/
> 
> Test:
>    hotspot_gc_shenandoah (fastdebug and release)
>    jcstress quick tests (fastdebug and release)
>    on aarch64 Linux

That looks right.

-- 
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


From aph at redhat.com  Wed Oct 30 17:38:05 2019
From: aph at redhat.com (Andrew Haley)
Date: Wed, 30 Oct 2019 17:38:05 +0000
Subject: RFR: 8233232: AArch64: jni_fast_GetLongField is broken
In-Reply-To: <2b7ec01b-085b-6e33-2946-2ad570d89ec6@redhat.com>
References: <f82e0e75-124c-5135-23eb-3f6075615c25@redhat.com>
 <5a2ee72a-ea25-669e-226f-7eb62084068a@redhat.com>
 <2b7ec01b-085b-6e33-2946-2ad570d89ec6@redhat.com>
Message-ID: <cffc12ea-241c-708c-3d46-be1c97fc1c5c@redhat.com>

I found a bug in AArch64. When we are resolving an object in native,
rthread does not contain a valid thread value. Instead it should be
derived from the jni_env argument. x86 does not use rthread, and is
OK.

I believe this is true for all platforms: none will have a valid
rthread when called from native code.

Fixed thusly, the same as x86. OK?

diff -r 6a05019acb67 src/hotspot/cpu/aarch64/gc/shenandoah/shenandoahBarrierSetAssembler_aarch64.cpp
--- a/src/hotspot/cpu/aarch64/gc/shenandoah/shenandoahBarrierSetAssembler_aarch64.cpp	Tue Sep 17 14:00:36 2019 -0400
+++ b/src/hotspot/cpu/aarch64/gc/shenandoah/shenandoahBarrierSetAssembler_aarch64.cpp	Wed Oct 30 12:44:23 2019 -0400
@@ -424,9 +448,12 @@
   // Check for null.
   __ cbz(obj, done);

   assert(obj != rscratch2, "need rscratch2");
-  Address gc_state(rthread, in_bytes(ShenandoahThreadLocalData::gc_state_offset()));
-  __ ldrb(rscratch2, gc_state);
+  Address gc_state(jni_env, ShenandoahThreadLocalData::gc_state_offset() - JavaThread::jni_environment_offset());
+  __ lea(rscratch2, gc_state);
+  __ ldrb(rscratch2, Address(rscratch2));

   // Check for heap in evacuation phase
   __ tbnz(rscratch2, ShenandoahHeap::EVACUATION_BITPOS, slowpath);
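
The address arithmetic in the patch can be illustrated outside the
assembler. The sketch below is a toy model under stated assumptions:
ThreadModel and JNIEnvModel are hypothetical structs, not the JavaThread
layout. It only shows why the displacement from the env pointer is
(gc_state offset) - (jni_environment offset), and why it is negative when
the field sits before the embedded environment.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical miniature types; the real JavaThread layout differs.
struct JNIEnvModel {
  void* functions;
};

struct ThreadModel {
  uint8_t gc_state;             // field we want to read
  char padding[1024];           // stands in for everything in between
  JNIEnvModel jni_environment;  // native code only receives a pointer to this
};

// Native code has the env pointer but no thread register, so the field is
// reached relative to env. This is the usual container_of-style arithmetic.
inline const uint8_t* gc_state_from_env(const JNIEnvModel* env) {
  const std::ptrdiff_t delta =
      static_cast<std::ptrdiff_t>(offsetof(ThreadModel, gc_state)) -
      static_cast<std::ptrdiff_t>(offsetof(ThreadModel, jni_environment));
  return reinterpret_cast<const uint8_t*>(env) + delta;
}

// Usage example: the env-relative address is the field's own address.
inline void check_examples() {
  ThreadModel t{};
  t.gc_state = 5;
  assert(gc_state_from_env(&t.jni_environment) == &t.gc_state);
  assert(*gc_state_from_env(&t.jni_environment) == 5);
}
```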

-- 
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


From rkennke at redhat.com  Wed Oct 30 17:45:09 2019
From: rkennke at redhat.com (Roman Kennke)
Date: Wed, 30 Oct 2019 18:45:09 +0100
Subject: RFR: 8233232: AArch64: jni_fast_GetLongField is broken
In-Reply-To: <cffc12ea-241c-708c-3d46-be1c97fc1c5c@redhat.com>
References: <f82e0e75-124c-5135-23eb-3f6075615c25@redhat.com>
 <5a2ee72a-ea25-669e-226f-7eb62084068a@redhat.com>
 <2b7ec01b-085b-6e33-2946-2ad570d89ec6@redhat.com>
 <cffc12ea-241c-708c-3d46-be1c97fc1c5c@redhat.com>
Message-ID: <a358de9a-6238-38b8-a1b0-20559052c03d@redhat.com>

Is it not possible to use the gc_state Address directly in ldrb?

Roman

> I found a bug in AArch64. When we are resolving an object in native,
> rthread does not contain a valid thread value. Instead it should be
> derived from the jni_env argument. x86 does not use rthread, and is
> OK.
> 
> I believe this is true for all platforms: none will have a valid
> rthread when called from native code.
> 
> Fixed thusly, the same as x86. OK?
> 
> diff -r 6a05019acb67 src/hotspot/cpu/aarch64/gc/shenandoah/shenandoahBarrierSetAssembler_aarch64.cpp
> --- a/src/hotspot/cpu/aarch64/gc/shenandoah/shenandoahBarrierSetAssembler_aarch64.cpp	Tue Sep 17 14:00:36 2019 -0400
> +++ b/src/hotspot/cpu/aarch64/gc/shenandoah/shenandoahBarrierSetAssembler_aarch64.cpp	Wed Oct 30 12:44:23 2019 -0400
> @@ -424,9 +448,12 @@
>    // Check for null.
>    __ cbz(obj, done);
> 
>    assert(obj != rscratch2, "need rscratch2");
> -  Address gc_state(rthread, in_bytes(ShenandoahThreadLocalData::gc_state_offset()));
> -  __ ldrb(rscratch2, gc_state);
> +  Address gc_state(jni_env, ShenandoahThreadLocalData::gc_state_offset() - JavaThread::jni_environment_offset());
> +  __ lea(rscratch2, gc_state);
> +  __ ldrb(rscratch2, Address(rscratch2));
> 
>    // Check for heap in evacuation phase
>    __ tbnz(rscratch2, ShenandoahHeap::EVACUATION_BITPOS, slowpath);
> 


From zgu at redhat.com  Wed Oct 30 17:48:23 2019
From: zgu at redhat.com (Zhengyu Gu)
Date: Wed, 30 Oct 2019 13:48:23 -0400
Subject: RFR 8233165: Shenandoah:SBSA::gen_load_reference_barrier_stub()
 should use pointer register for address on aarch64
In-Reply-To: <c19d0343-efab-1ac0-4051-e3dcb3bf077e@redhat.com>
References: <82ee1d40-a7a8-1b68-1aec-5b15597d7cbb@redhat.com>
 <c19d0343-efab-1ac0-4051-e3dcb3bf077e@redhat.com>
Message-ID: <6e455f3c-b306-31b4-5aa6-f065a3bdcf59@redhat.com>

Thanks for the review, Andrew.

It is already pushed, so I cannot add you as a reviewer.

-Zhengyu

On 10/30/19 1:07 PM, Andrew Haley wrote:
> On 10/30/19 12:56 PM, Zhengyu Gu wrote:
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8233165
>> Webrev: http://cr.openjdk.java.net/~zgu/JDK-8233165/webrev.00/
>>
>> Test:
>>     hotspot_gc_shenandoah (fastdebug and release)
>>     jcstress quick tests (fastdebug and release)
>>     on aarch64 Linux
> 
> That looks right.
> 


From zgu at redhat.com  Wed Oct 30 17:59:16 2019
From: zgu at redhat.com (Zhengyu Gu)
Date: Wed, 30 Oct 2019 13:59:16 -0400
Subject: RFR: 8233232: AArch64: jni_fast_GetLongField is broken
In-Reply-To: <cffc12ea-241c-708c-3d46-be1c97fc1c5c@redhat.com>
References: <f82e0e75-124c-5135-23eb-3f6075615c25@redhat.com>
 <5a2ee72a-ea25-669e-226f-7eb62084068a@redhat.com>
 <2b7ec01b-085b-6e33-2946-2ad570d89ec6@redhat.com>
 <cffc12ea-241c-708c-3d46-be1c97fc1c5c@redhat.com>
Message-ID: <5ab024a1-0f34-ccf2-d77f-4d2ded3af38d@redhat.com>

Hi Andrew,

Fix looks good.

Sorry for neglecting aarch64 during past barrier works, I will double 
check them.

-Zhengyu

On 10/30/19 1:38 PM, Andrew Haley wrote:
> I found a bug in AArch64. When we are resolving an object in native,
> rthread does not contain a valid thread value. Instead it should be
> derived from the jni_env argument. x86 does not use rthread, and is
> OK.
> 
> I believe this is true for all platforms: none will have a valid
> rthread when called from native code.
> 
> Fixed thusly, the same as x86. OK?
> 
> diff -r 6a05019acb67 src/hotspot/cpu/aarch64/gc/shenandoah/shenandoahBarrierSetAssembler_aarch64.cpp
> --- a/src/hotspot/cpu/aarch64/gc/shenandoah/shenandoahBarrierSetAssembler_aarch64.cpp	Tue Sep 17 14:00:36 2019 -0400
> +++ b/src/hotspot/cpu/aarch64/gc/shenandoah/shenandoahBarrierSetAssembler_aarch64.cpp	Wed Oct 30 12:44:23 2019 -0400
> @@ -424,9 +448,12 @@
>     // Check for null.
>     __ cbz(obj, done);
> 
>     assert(obj != rscratch2, "need rscratch2");
> -  Address gc_state(rthread, in_bytes(ShenandoahThreadLocalData::gc_state_offset()));
> -  __ ldrb(rscratch2, gc_state);
> +  Address gc_state(jni_env, ShenandoahThreadLocalData::gc_state_offset() - JavaThread::jni_environment_offset());
> +  __ lea(rscratch2, gc_state);
> +  __ ldrb(rscratch2, Address(rscratch2));
> 
>     // Check for heap in evacuation phase
>     __ tbnz(rscratch2, ShenandoahHeap::EVACUATION_BITPOS, slowpath);
> 


From kim.barrett at oracle.com  Wed Oct 30 19:53:19 2019
From: kim.barrett at oracle.com (Kim Barrett)
Date: Wed, 30 Oct 2019 15:53:19 -0400
Subject: RFR(M): 8220311: Implementation: NUMA-Aware Memory Allocation for
 G1, Survivor (2/3)
In-Reply-To: <5A6C0668-86F6-4A3F-AC4D-75097D40A1C4@oracle.com>
References: <d153b49b-fbb2-0d73-37e0-ff1534a83086@oracle.com>
 <de0f8e9a-27d8-eaf3-99a7-7b57e1b419c1@oracle.com>
 <9a78e353-7908-b546-8f6a-7acd92eb40ac@oracle.com>
 <D4FAC1BE-C707-48DA-979F-03502E2651F5@oracle.com>
 <846eb849-8a49-5872-73d7-6bbc8f98369c@oracle.com>
 <56788E04-DC92-461F-B3A7-DEEBC524DB5B@oracle.com>
 <3fe39096-43cb-4828-c042-0fc976a0307a@oracle.com>
 <01a9ebcf-34ed-06b2-2da8-18d84feae858@oracle.com>
 <196b55d5-01f4-0202-effb-4495ae409df0@oracle.com>
 <5A6C0668-86F6-4A3F-AC4D-75097D40A1C4@oracle.com>
Message-ID: <479069C4-526E-47CB-A86D-3ADE04076A07@oracle.com>

> On Oct 29, 2019, at 4:13 PM, Stefan Johansson <stefan.johansson at oracle.com> wrote:
>> 
>> Webrev:
>> http://cr.openjdk.java.net/~sangheki/8220311/webrev.4
>> http://cr.openjdk.java.net/~sangheki/8220311/webrev.4.inc
> 
> Looks good in general, just one comment.
> 
> src/hotspot/share/gc/g1/g1Allocator.inline.hpp
> ---
>  78   assert(_alloc_buffers[dest.type()] != NULL,
>  79          "Allocation buffer is NULL: %s", dest.get_type_str());
> 
>  80   G1HeapRegionAttr::region_type_t type = dest.type();
>  81   return alloc_buffer(type, node_index);
> 
> As I mentioned to you offline, I think it is a bit unfortunate that we can't index our way to the correct PLAB in G1PLABAllocator::alloc_buffer(...) without the if-statement, but I agree that having multiple array slots pointing to the same PLAB isn't optimal either. So I think this approach is good for now,

I wondered about that too.  Multiple array slots pointing to the same PLAB doesn't
seem bad to me, though it makes the PLAB management a little more complicated.
I agree this is good for now though, and can be investigated further in a followup.

> but I have a very minor comment on the code snippet above. I would prefer if line 80 was skipped and the call on 81 just did return alloc_buffer(dest.type(), node_index).

+1


From kim.barrett at oracle.com  Wed Oct 30 19:55:42 2019
From: kim.barrett at oracle.com (Kim Barrett)
Date: Wed, 30 Oct 2019 15:55:42 -0400
Subject: RFR(M): 8220311: Implementation: NUMA-Aware Memory Allocation for
 G1, Survivor (2/3)
In-Reply-To: <85b282d1-0837-af5c-745f-efd0000d0ae1@oracle.com>
References: <d153b49b-fbb2-0d73-37e0-ff1534a83086@oracle.com>
 <de0f8e9a-27d8-eaf3-99a7-7b57e1b419c1@oracle.com>
 <9a78e353-7908-b546-8f6a-7acd92eb40ac@oracle.com>
 <D4FAC1BE-C707-48DA-979F-03502E2651F5@oracle.com>
 <846eb849-8a49-5872-73d7-6bbc8f98369c@oracle.com>
 <56788E04-DC92-461F-B3A7-DEEBC524DB5B@oracle.com>
 <3fe39096-43cb-4828-c042-0fc976a0307a@oracle.com>
 <01a9ebcf-34ed-06b2-2da8-18d84feae858@oracle.com>
 <196b55d5-01f4-0202-effb-4495ae409df0@oracle.com>
 <5A6C0668-86F6-4A3F-AC4D-75097D40A1C4@oracle.com>
 <85b282d1-0837-af5c-745f-efd0000d0ae1@oracle.com>
Message-ID: <E5ADD37C-FF31-4B8F-A803-4B432B70300E@oracle.com>

> On Oct 29, 2019, at 4:44 PM, sangheon.kim at oracle.com wrote:
> 
> Webrev:
> http://cr.openjdk.java.net/~sangheki/8220311/webrev.5
> http://cr.openjdk.java.net/~sangheki/8220311/webrev.5.inc

Looks good.


From aph at redhat.com  Wed Oct 30 20:26:52 2019
From: aph at redhat.com (Andrew Haley)
Date: Wed, 30 Oct 2019 20:26:52 +0000
Subject: RFR: 8233232: AArch64: jni_fast_GetLongField is broken
In-Reply-To: <a358de9a-6238-38b8-a1b0-20559052c03d@redhat.com>
References: <f82e0e75-124c-5135-23eb-3f6075615c25@redhat.com>
 <5a2ee72a-ea25-669e-226f-7eb62084068a@redhat.com>
 <2b7ec01b-085b-6e33-2946-2ad570d89ec6@redhat.com>
 <cffc12ea-241c-708c-3d46-be1c97fc1c5c@redhat.com>
 <a358de9a-6238-38b8-a1b0-20559052c03d@redhat.com>
Message-ID: <cb6cded5-63c7-ce67-c0a8-182cedcb05c6@redhat.com>

On 10/30/19 5:45 PM, Roman Kennke wrote:
> Is it not possible to use the gc_state Address directly in ldrb?

No, because it's a large negative offset.
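
To make that concrete, here is a rough sketch of the encoding limits as I
understand them from the A64 instruction set: a byte load with a base
register and an immediate accepts either an unsigned 12-bit offset (LDRB)
or a signed 9-bit unscaled offset (LDURB). The function names are made up
for this example, and the real assembler has its own checks; this only
shows why a displacement of about minus one kilobyte needs the address
materialized first.

```cpp
#include <cassert>
#include <cstdint>

// True if a base+immediate byte load can encode this displacement directly.
constexpr bool fits_byte_load_immediate(int64_t offset) {
  const bool unsigned_imm12 = offset >= 0 && offset <= 4095;   // LDRB form
  const bool signed_imm9 = offset >= -256 && offset <= 255;    // LDURB form
  return unsigned_imm12 || signed_imm9;
}

// Usage example: small displacements fit, a large negative one does not.
inline void check_examples() {
  assert(fits_byte_load_immediate(0));
  assert(fits_byte_load_immediate(4095));
  assert(fits_byte_load_immediate(-256));
  assert(!fits_byte_load_immediate(-257));
  assert(!fits_byte_load_immediate(-1032));
}
```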

-- 
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


From rkennke at redhat.com  Wed Oct 30 20:35:40 2019
From: rkennke at redhat.com (Roman Kennke)
Date: Wed, 30 Oct 2019 21:35:40 +0100
Subject: RFR: 8233232: AArch64: jni_fast_GetLongField is broken
In-Reply-To: <cb6cded5-63c7-ce67-c0a8-182cedcb05c6@redhat.com>
References: <f82e0e75-124c-5135-23eb-3f6075615c25@redhat.com>
 <5a2ee72a-ea25-669e-226f-7eb62084068a@redhat.com>
 <2b7ec01b-085b-6e33-2946-2ad570d89ec6@redhat.com>
 <cffc12ea-241c-708c-3d46-be1c97fc1c5c@redhat.com>
 <a358de9a-6238-38b8-a1b0-20559052c03d@redhat.com>
 <cb6cded5-63c7-ce67-c0a8-182cedcb05c6@redhat.com>
Message-ID: <574a0d77-4c32-2d54-cf63-1a6c0fceed7e@redhat.com>

> On 10/30/19 5:45 PM, Roman Kennke wrote:
>> Is it not possible to use the gc_state Address directly in ldrb?
> 
> No, because it's a large negative offset.

Ah ok. Then it looks good.

Thanks,
Roman


From shade at redhat.com  Wed Oct 30 20:40:26 2019
From: shade at redhat.com (Aleksey Shipilev)
Date: Wed, 30 Oct 2019 21:40:26 +0100
Subject: RFR: 8233232: AArch64: jni_fast_GetLongField is broken
In-Reply-To: <cffc12ea-241c-708c-3d46-be1c97fc1c5c@redhat.com>
References: <f82e0e75-124c-5135-23eb-3f6075615c25@redhat.com>
 <5a2ee72a-ea25-669e-226f-7eb62084068a@redhat.com>
 <2b7ec01b-085b-6e33-2946-2ad570d89ec6@redhat.com>
 <cffc12ea-241c-708c-3d46-be1c97fc1c5c@redhat.com>
Message-ID: <76364785-81b1-f1df-4c87-419bfcad9dc3@redhat.com>

On 10/30/19 6:38 PM, Andrew Haley wrote:
> I found a bug in AArch64. When we are resolving an object in native,
> rthread does not contain a valid thread value. Instead it should be
> derived from the jni_env argument. x86 does not use rthread, and is
> OK.
> 
> I believe this is true for all platforms: none will have a valid
> rthread when called from native code.
> 
> Fixed thusly, the same as x86. OK?
> 
> diff -r 6a05019acb67 src/hotspot/cpu/aarch64/gc/shenandoah/shenandoahBarrierSetAssembler_aarch64.cpp
> --- a/src/hotspot/cpu/aarch64/gc/shenandoah/shenandoahBarrierSetAssembler_aarch64.cpp	Tue Sep 17 14:00:36 2019 -0400
> +++ b/src/hotspot/cpu/aarch64/gc/shenandoah/shenandoahBarrierSetAssembler_aarch64.cpp	Wed Oct 30 12:44:23 2019 -0400
> @@ -424,9 +448,12 @@
>    // Check for null.
>    __ cbz(obj, done);
> 
>    assert(obj != rscratch2, "need rscratch2");
> -  Address gc_state(rthread, in_bytes(ShenandoahThreadLocalData::gc_state_offset()));
> -  __ ldrb(rscratch2, gc_state);
> +  Address gc_state(jni_env, ShenandoahThreadLocalData::gc_state_offset() - JavaThread::jni_environment_offset());
> +  __ lea(rscratch2, gc_state);
> +  __ ldrb(rscratch2, Address(rscratch2));
> 
>    // Check for heap in evacuation phase
>    __ tbnz(rscratch2, ShenandoahHeap::EVACUATION_BITPOS, slowpath);

Looks good.

-- 
Thanks,
-Aleksey


From mark.reinhold at oracle.com  Wed Oct 30 21:45:22 2019
From: mark.reinhold at oracle.com (mark.reinhold at oracle.com)
Date: Wed, 30 Oct 2019 14:45:22 -0700 (PDT)
Subject: New candidate JEP: 365: ZGC on Windows
Message-ID: <20191030214522.6102930C239@eggemoggin.niobe.net>

https://openjdk.java.net/jeps/365

- Mark


From mark.reinhold at oracle.com  Wed Oct 30 22:05:19 2019
From: mark.reinhold at oracle.com (mark.reinhold at oracle.com)
Date: Wed, 30 Oct 2019 15:05:19 -0700 (PDT)
Subject: New candidate JEP: 366: Deprecate the ParallelScavenge + SerialOld GC
 Combination
Message-ID: <20191030220519.8FE0830C244@eggemoggin.niobe.net>

https://openjdk.java.net/jeps/366

- Mark


From christoph.goettschkes at microdoc.com  Thu Oct 31 09:12:05 2019
From: christoph.goettschkes at microdoc.com (christoph.goettschkes at microdoc.com)
Date: Thu, 31 Oct 2019 10:12:05 +0100
Subject: RFR: 8231955: ARM32: Address displacement is 0 for volatile field
 access because of Unsafe field access.
In-Reply-To: <587f6363-bbdc-da12-9e50-82acc5bc5853@oracle.com>
References: <20191010143426.BA4B6319F46@aojmv0009>
 <20191015073212.7FCCA319074@aojmv0009>
 <f40cbf84-aef3-f235-4861-403ce30dc03d@oracle.com>
 <587f6363-bbdc-da12-9e50-82acc5bc5853@oracle.com>
Message-ID: <mailman.0.1666111720.128766.hotspot-gc-dev@openjdk.org>

> I see now that BarrierSetC1::resolve_address() is calling 
> generate_address(), at least when access isn't patched.  So now I'm 
> thinking that the address passed to 
> volatile_field_load/volatile_field_store should be correct, and the call 
> to add_large_constant() isn't necessary.

Yes, this is correct. The LIR_Address is created by
LIRGenerator::generate_address and has a displacement of 0.
I attached a backtrace of the failing assert at the end of this mail.

Do you think the patch makes sense and can be pushed?
The HotSpot tier1 JTreg tests are passing with this and other patches I am 
working on applied with a debug VM.

-- Christoph

#0  0x7636b860 in LIRGenerator::add_large_constant
    (this=0x641ae2f0, src=0xe500b, c=0, dest=0xe900b)
    at src/hotspot/cpu/arm/c1_LIRGenerator_arm.cpp:166
#1  0x7636f266 in LIRGenerator::volatile_field_load
    (this=0x641ae2f0, address=0x6429c970, result=0xdd093, info=0x0) 
    at src/hotspot/cpu/arm/c1_LIRGenerator_arm.cpp:1326
#2  0x762d9806 in BarrierSetC1::load_at_resolved
    (this=0x7602b1f0, access=..., result=0xdd093)
    at src/hotspot/share/gc/shared/c1/barrierSetC1.cpp:183
#3  0x762d929a in BarrierSetC1::load_at 
    (this=0x7602b1f0, access=..., result=0xdd093)
    at src/hotspot/share/gc/shared/c1/barrierSetC1.cpp:94
#4  0x7635f6cc in LIRGenerator::access_load_at
    (this=0x641ae2f0, decorators=9127331840, type=T_LONG, base=...,
     offset=0xd900b, result=0xdd093, patch_info=0x0, load_emit_info=0x0)
    at src/hotspot/share/c1/c1_LIRGenerator.cpp:1618
#5  0x7636133e in LIRGenerator::do_UnsafeGetObject
    (this=0x641ae2f0, x=0x6429a0d0)
    at src/hotspot/share/c1/c1_LIRGenerator.cpp:2173
#6  0x76328bdc in UnsafeGetObject::visit
    (this=0x6429a0d0, v=0x641ae2f0)
    at src/hotspot/share/c1/c1_Instruction.hpp:2407
#7  0x7635b2d2 in LIRGenerator::do_root
    (this=0x641ae2f0, instr=0x6429a0d0)
    at src/hotspot/share/c1/c1_LIRGenerator.cpp:373
#8  0x7635b1f2 in LIRGenerator::block_do
    (this=0x641ae2f0, block=0x64299788)
    at src/hotspot/share/c1/c1_LIRGenerator.cpp:354
#9  0x76337d5a in BlockList::iterate_forward
    (this=0x6429bf00, closure=0x641ae2f4)
    at src/hotspot/share/c1/c1_Instruction.cpp:921
#10 0x76332936 in IR::iterate_linear_scan_order
    (this=0x642994d0, closure=0x641ae2f4) 
    at src/hotspot/share/c1/c1_IR.cpp:1221
#11 0x7630ed10 in Compilation::emit_lir
    (this=0x641ae5c0)
    at src/hotspot/share/c1/c1_Compilation.cpp:259
#12 0x7630f2be in Compilation::compile_java_method
    (this=0x641ae5c0) 
    at src/hotspot/share/c1/c1_Compilation.cpp:398
#13 0x7630f566 in Compilation::compile_method
    (this=0x641ae5c0)
    at src/hotspot/share/c1/c1_Compilation.cpp:460
#14 0x7630fabc in Compilation::Compilation
    (this=0x641ae5c0, compiler=0x760eb610, env=0x641ae848,
     method=0x63d2edc8, osr_bci=-1, buffer_blob=0x73eb7448,
     directive=0x760cf858)
    at src/hotspot/share/c1/c1_Compilation.cpp:583
#15 0x76312d6e in Compiler::compile_method
    (this=0x760eb610, env=0x641ae848, method=0x63d2edc8, entry_bci=-1,
     directive=0x760cf858)
    at src/hotspot/share/c1/c1_Compiler.cpp:247
#16 0x76453704 in CompileBroker::invoke_compiler_on_method
    (task=0x642cfa50)
    at src/hotspot/share/compiler/compileBroker.cpp:2115
#17 0x764529ba in CompileBroker::compiler_thread_loop
    ()
    at src/hotspot/share/compiler/compileBroker.cpp:1800
#18 0x7693548c in compiler_thread_entry
    (thread=0x6423b400, __the_thread__=0x6423b400)
    at src/hotspot/share/runtime/thread.cpp:3401
#19 0x769315d4 in JavaThread::thread_main_inner
    (this=0x6423b400)
    at src/hotspot/share/runtime/thread.cpp:1917
#20 0x769314ac in JavaThread::run
    (this=0x6423b400)
    at src/hotspot/share/runtime/thread.cpp:1900
#21 0x7692e884 in Thread::call_run
    (this=0x6423b400)
    at src/hotspot/share/runtime/thread.cpp:398
#22 0x768285ce in thread_native_entry
    (thread=0x6423b400)
    at src/hotspot/os/linux/os_linux.cpp:790
#23 0x76f84568 in start_thread() from target:/usr/lib/libpthread.so.0
#24 0x76ef8ac8 in ?? () from target:/usr/lib/libc.so.6


From shade at redhat.com  Thu Oct 31 09:15:51 2019
From: shade at redhat.com (Aleksey Shipilev)
Date: Thu, 31 Oct 2019 10:15:51 +0100
Subject: RFR (XS) 8233303: Shenandoah: verifier assert erroneously uses
 byte_size_in_exact_unit
Message-ID: <217faac8-b8bc-f499-8ea2-5b52768da5df@redhat.com>

Bug:
  https://bugs.openjdk.java.net/browse/JDK-8233303

Typo in JDK-8232102 found by sh/jdk8 backports, where byte_size_in_exact_unit is not defined. Should
actually be "proper_unit".

Fix:

diff -r b026a43e1809 src/hotspot/share/gc/shenandoah/shenandoahVerifier.cpp
--- a/src/hotspot/share/gc/shenandoah/shenandoahVerifier.cpp    Tue Oct 29 09:34:23 2019 +0800
+++ b/src/hotspot/share/gc/shenandoah/shenandoahVerifier.cpp    Thu Oct 31 10:08:22 2019 +0100
@@ -693,12 +693,12 @@

     size_t heap_committed = _heap->committed();
     guarantee(cl.committed() == heap_committed,
               "%s: heap committed size must be consistent: heap-committed = " SIZE_FORMAT "%s,
regions-committed = " SIZE_FORMAT "%s",
               label,
-              byte_size_in_exact_unit(heap_committed), proper_unit_for_byte_size(heap_committed),
-              byte_size_in_exact_unit(cl.committed()), proper_unit_for_byte_size(cl.committed()));
+              byte_size_in_proper_unit(heap_committed), proper_unit_for_byte_size(heap_committed),
+              byte_size_in_proper_unit(cl.committed()), proper_unit_for_byte_size(cl.committed()));
   }

   // Internal heap region checks
   if (ShenandoahVerifyLevel >= 1) {
     ShenandoahVerifyHeapRegionClosure cl(label, regions);

Testing: x86_64 build

-- 
Thanks,
-Aleksey


From rkennke at redhat.com  Thu Oct 31 09:17:25 2019
From: rkennke at redhat.com (Roman Kennke)
Date: Thu, 31 Oct 2019 10:17:25 +0100
Subject: RFR (XS) 8233303: Shenandoah: verifier assert erroneously uses
 byte_size_in_exact_unit
In-Reply-To: <217faac8-b8bc-f499-8ea2-5b52768da5df@redhat.com>
References: <217faac8-b8bc-f499-8ea2-5b52768da5df@redhat.com>
Message-ID: <28db9580-76b0-4eea-b6c9-252e69f67f51@redhat.com>

Ok.

Thanks,
Roman

> Bug:
>   https://bugs.openjdk.java.net/browse/JDK-8233303
> 
> Typo in JDK-8232102 found by sh/jdk8 backports, where byte_size_in_exact_unit is not defined. Should
> actually be "proper_unit".
> 
> Fix:
> 
> diff -r b026a43e1809 src/hotspot/share/gc/shenandoah/shenandoahVerifier.cpp
> --- a/src/hotspot/share/gc/shenandoah/shenandoahVerifier.cpp    Tue Oct 29 09:34:23 2019 +0800
> +++ b/src/hotspot/share/gc/shenandoah/shenandoahVerifier.cpp    Thu Oct 31 10:08:22 2019 +0100
> @@ -693,12 +693,12 @@
> 
>      size_t heap_committed = _heap->committed();
>      guarantee(cl.committed() == heap_committed,
>                "%s: heap committed size must be consistent: heap-committed = " SIZE_FORMAT "%s,
> regions-committed = " SIZE_FORMAT "%s",
>                label,
> -              byte_size_in_exact_unit(heap_committed), proper_unit_for_byte_size(heap_committed),
> -              byte_size_in_exact_unit(cl.committed()), proper_unit_for_byte_size(cl.committed()));
> +              byte_size_in_proper_unit(heap_committed), proper_unit_for_byte_size(heap_committed),
> +              byte_size_in_proper_unit(cl.committed()), proper_unit_for_byte_size(cl.committed()));
>    }
> 
>    // Internal heap region checks
>    if (ShenandoahVerifyLevel >= 1) {
>      ShenandoahVerifyHeapRegionClosure cl(label, regions);
> 
> Testing: x86_64 build
> 


From thomas.schatzl at oracle.com  Thu Oct 31 09:51:58 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Thu, 31 Oct 2019 10:51:58 +0100
Subject: RFR (XS): 8232951: TestG1ParallelPhases.java fails with phase
 NonYoungFreeCSet not found
In-Reply-To: <1fff7f47-aebc-bbcd-bda6-bb6185c11c3a@oracle.com>
References: <27fa21ab-d8ac-95ea-3485-7a72116c22f2@oracle.com>
 <951EB603-F273-4787-9D8D-32D8194ECFAA@oracle.com>
 <7734C751-1D9B-45F5-86FB-D51D2BE8985F@oracle.com>
 <1fff7f47-aebc-bbcd-bda6-bb6185c11c3a@oracle.com>
Message-ID: <f230fb99-5e88-4932-aa8a-ca3292a47e76@oracle.com>

Hi,

On 28.10.19 11:40, Leo Korinth wrote:
> Hi.
> 
> Just want to add some information, because I think it will fail again.
> 
> The buggy test case is written by me and the provoke mixed gc part is 
> copied mostly either from TestOldGenCollectionUsage or TestLogging (as 
> it is hard to share this code due to JTREG). However when I did "copy" 
> the code I also did try to improve the code, this could be the reason 
> for this failure. I did at least two "improvements" in that I removed 
> magic constants when allocating the 20k arrays and instead calculated 
> how many I would need; this made the algorithm allocate ~2M instead of 
> ~3M which could be a problem although to my understanding it should not 
> be. Another change I made is that I will not provoke a gc by allocating 
> until out-of-memory. The original code seems to try to provoke a gc by 
> starting concurrent marks and young gc, but kind of fail-safes with the 
> code after the comment // allocate more objects to provoke GC. Having 
> this code I guess would fix the problem with the test case, but on the 
> other hand, we would not know why the youngGC() after concurrent mark 
> does not provoke a mixed gc (I guess it should, but correct me if this 
> is false).

I do not think either change makes a difference.

> 
> I have talked to Thomas off-list, and I think AlwaysTenure is not the 
> solution to the problem we have. I think adding the debug options is 
> great and should be done, and AlwaysTenure seems better than 
> MaxTenuringThreshold=1 but we should expect the test case to continue to 
> fail in the future.
> 
> If you go by adding AlwaysTenure instead of MaxTenuringThreshold=1, 
> please also remove one getWhiteBox().youngGC() in allocateOldObjects so 
> that we do not leave "magic" lines in the test case. Also update the 
> comment to // Do *one* young collections...
> and there is another "-XX:MaxTenuringThreshold=1" that needs to be 
> updated. I need no webrev for these changes.

Updated in place; also fixed Kim's comment about line length.

http://cr.openjdk.java.net/~tschatzl/8232951/webrev/

> 
> I am sorry that my "improvements" probably caused this failure, though 
> just having heaps of code and not understanding why, is probably worse 
> in the long run --- at least that is my thinking.

The question I have is whether I can push these changes under this CR 
(and if it occurs again we at least have a log to look at) or use 
another CR for it?

Thanks,
   Thomas


From leo.korinth at oracle.com  Thu Oct 31 10:06:59 2019
From: leo.korinth at oracle.com (Leo Korinth)
Date: Thu, 31 Oct 2019 11:06:59 +0100
Subject: RFR (XS): 8232951: TestG1ParallelPhases.java fails with phase
 NonYoungFreeCSet not found
In-Reply-To: <f230fb99-5e88-4932-aa8a-ca3292a47e76@oracle.com>
References: <27fa21ab-d8ac-95ea-3485-7a72116c22f2@oracle.com>
 <951EB603-F273-4787-9D8D-32D8194ECFAA@oracle.com>
 <7734C751-1D9B-45F5-86FB-D51D2BE8985F@oracle.com>
 <1fff7f47-aebc-bbcd-bda6-bb6185c11c3a@oracle.com>
 <f230fb99-5e88-4932-aa8a-ca3292a47e76@oracle.com>
Message-ID: <816cd016-b10c-dcfd-292a-99ab36685c04@oracle.com>

On 31/10/2019 10:51, Thomas Schatzl wrote:
> Hi,
> 
> On 28.10.19 11:40, Leo Korinth wrote:
>> Hi.
>>
>> Just want to add some information, because I think it will fail again.
>>
>> The buggy test case is written by me and the provoke mixed gc part is 
>> copied mostly either from TestOldGenCollectionUsage or TestLogging (as 
>> it is hard to share this code due to JTREG). However when I did "copy" 
>> the code I also did try to improve the code, this could be the reason 
>> for this failure. I did at least two "improvements" in that I removed 
>> magic constants when allocating the 20k arrays and instead calculated 
>> how many I would need; this made the algorithm allocate ~2M instead of 
>> ~3M which could be a problem although to my understanding it should 
>> not be. Another change I made is that I will not provoke a gc by 
>> allocating until out-of-memory. The original code seems to try to 
>> provoke a gc by starting concurrent marks and young gc, but kind of 
>> fail-safes with the code after the comment // allocate more objects to 
>> provoke GC. Having this code I guess would fix the problem with the 
>> test case, but on the other hand, we would not know why the youngGC() 
>> after concurrent mark does not provoke a mixed gc (I guess it should, 
>> but correct me if this is false).
> 
> I do not think either change makes a difference.
> 
>>
>> I have talked to Thomas off-list, and I think AlwaysTenure is not the 
>> solution to the problem we have. I think adding the debug options is 
>> great and should be done, and AlwaysTenure seems better than 
>> MaxTenuringThreshold=1 but we should expect the test case to continue 
>> to fail in the future.
>>
>> If you go by adding AlwaysTenure instead of MaxTenuringThreshold=1, 
>> please also remove one getWhiteBox().youngGC() in allocateOldObjects 
>> so that we do not leave "magic" lines in the test case. Also update 
>> the comment to // Do *one* young collections...
>> and there is another "-XX:MaxTenuringThreshold=1" that needs to be 
>> updated. I need no webrev for these changes.
> 
> Updated in place; also fixed Kim's comment about line length.
> 
> http://cr.openjdk.java.net/~tschatzl/8232951/webrev/
> 
>>
>> I am sorry that my "improvements" probably caused this failure, though 
>> just having heaps of code and not understanding why, is probably worse 
>> in the long run --- at least that is my thinking.
> 
> The question I have is whether I can push these changes under this CR 
> (and if it occurs again we at least have a log to look at) or use 
> another CR for it?

I am fine with you pushing under the current CR.

Thanks,
Leo
> 
> Thanks,
>    Thomas


From stefan.karlsson at oracle.com  Thu Oct 31 10:18:20 2019
From: stefan.karlsson at oracle.com (Stefan Karlsson)
Date: Thu, 31 Oct 2019 11:18:20 +0100
Subject: RFR: 8233299: Implementation: JEP 365: ZGC on Windows
Message-ID: <8fbffe58-7045-52e4-687c-35cb8c146365@oracle.com>

Hi all,

Please review this patch to add ZGC support on Windows.

https://cr.openjdk.java.net/~stefank/8233299/webrev.01/
https://bugs.openjdk.java.net/browse/JDK-8233299

As mentioned in the JEP (https://openjdk.java.net/jeps/365), there were 
some preparation patches that needed to go in to pave the way for this 
patch:

     8232601: ZGC: Parameterize the ZGranuleMap table size
     8232602: ZGC: Make ZGranuleMap ZAddress agnostic
     8232604: ZGC: Make ZVerifyViews mapping and unmapping precise
     8232648: ZGC: Move ATTRIBUTE_ALIGNED to the front of declarations
     8232649: ZGC: Add callbacks to ZMemoryManager
     8232650: ZGC: Add initialization hooks for OS specific code
     8232651: Add implementation of os::processor_id() for Windows

... they have all been pushed now.

One important key-point to this implementation is to use the new Windows 
APIs that support reservation and mapping of memory through 
"placeholders":  VirtualAlloc2, VirtualFreeEx, MapViewOfFile3, and 
UnmapViewOfFile2. These functions are available starting from version 
1803 of Windows 10 and Windows Server. ZGC will look up these symbols to 
determine if the Windows version supports these functions.


Correlating the text in the JEP with the code:

* '"Support for multi-mapping memory". ZGC's use of colored pointers 
requires support for heap multi-mapping, so that the same physical 
memory can be accessed from multiple different locations in the process 
address space. On Windows, paging-file backed memory provides physical 
memory with an identity (a handle), which is unrelated to the virtual 
address where it is mapped. Using this identity allows ZGC to map the 
same physical memory into multiple locations.'

We commit memory via paging file mappings and map views into that memory.

The function ZMapper::create_and_commit_paging_file_mapping uses 
CreateFileMappingW with SEC_RESERVE to create this mapping, 
MapViewOfFile3 to map a temporary view into the mapping, VirtualAlloc2 
to commit the memory, and then UnmapViewOfFile2 to unmap the view.

The reason to use SEC_RESERVE and the extra VirtualAlloc2, instead of 
SEC_COMMIT, is to ensure that the later multi-mappings of committed file 
mappings don't fail under low-memory situations. Earlier prototypes used 
SEC_COMMIT and saw these kinds of OOME errors when mapping new views to 
already committed memory. The current platform-independent ZGC code 
isn't prepared to handle OOME errors when mapping views, so we chose 
this solution.

MapViewOfFile3 is then used to multi-map into the committed memory.

* '"Support for mapping paging-file backed memory into a reserved 
address space". The Windows memory management API is not as flexible as 
POSIX's mmap/munmap, especially when it comes to mapping file backed 
memory into a previously reserved address space region. To do this, ZGC 
will use the Windows concept of address space placeholders. The 
placeholder concept was introduced in version 1803 of Windows 10 and 
Windows Server. ZGC support for older versions of Windows will not be 
implemented.'

Before the placeholder APIs there was no way to first reserve a specific 
virtual memory range, and then map a view of a committed paging file 
over that range. The VirtualAlloc function could be used to first reserve 
and then commit anonymous memory, but nothing similar existed for mapped 
views. Now with placeholders, we can create a placeholder reservation of 
memory with VirtualAlloc2, and then replace that reservation with 
MapViewOfFile3. When memory is unmapped, we can use UnmapViewOfFile2 to 
"preserve" the placeholder memory reservation.


* '"Support for mapping and unmapping arbitrary parts of the heap". 
ZGC's heap layout in combination with its dynamic sizing (and re-sizing) 
of heap pages requires support for mapping and unmapping arbitrary heap 
granules. This requirement in combination with Windows address space 
placeholders requires special attention, since placeholders must be 
explicitly split/coalesced by the program, as opposed to being 
automatically split/coalesced by the operating system (as on Linux).'

Half of the preparation patches were put in place to support this. When 
replacing a placeholder with a view of the backing file, we need to 
exactly match the address and size of a placeholder. Also, when 
unmapping a view, we need to exactly match the address and size of the 
view, and replace it with a placeholder.

To make it easier to map and unmap arbitrary parts of the heap, we split 
reserved memory into ZGranuleSize-sized placeholders. So, whenever we 
perform any of these operations, we know that any given memory range 
could be dealt with as a number of granules.

When memory is reserved, but not mapped, it is registered in the 
ZVirtualMemoryManager. It splits memory into granule-sized placeholders 
when reserved memory is fetched, and coalesces placeholders when 
reserved memory is handed back.
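
To make the granule bookkeeping above a bit more concrete, here is a minimal, portable sketch of the address arithmetic only. It does not call any Windows API and is not the code in the webrev; kGranuleSize (assumed to be 2M here), Range, split_into_granules and coalesce are hypothetical names used for illustration, and the ranges are assumed to be granule-aligned.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical granule size, assumed to be 2M for this illustration.
static const size_t kGranuleSize = 2 * 1024 * 1024;

// A reserved address range, modelled as start address plus size in bytes.
struct Range {
  uintptr_t start;
  size_t    size;
};

// Split a granule-aligned range into granule-sized pieces, mirroring how
// a reservation would be cut into granule-sized placeholders.
std::vector<Range> split_into_granules(Range r) {
  assert(r.start % kGranuleSize == 0 && r.size % kGranuleSize == 0);
  std::vector<Range> out;
  for (size_t off = 0; off < r.size; off += kGranuleSize) {
    out.push_back(Range{r.start + off, kGranuleSize});
  }
  return out;
}

// Coalesce adjacent pieces back into one range, as would happen when
// reserved memory is handed back.
Range coalesce(const std::vector<Range>& pieces) {
  assert(!pieces.empty());
  Range r = pieces.front();
  for (size_t i = 1; i < pieces.size(); i++) {
    assert(pieces[i].start == r.start + r.size);  // pieces must be contiguous
    r.size += pieces[i].size;
  }
  return r;
}
```

The point of the fixed piece size is that any later map or unmap request can be expressed as a whole number of pieces, so the exact-match requirement on placeholder address and size is easy to satisfy.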


* '"Support for committing and uncommitting arbitrary parts of the 
heap". ZGC can commit and uncommit physical memory dynamically while the 
Java program is running. To support these operations the physical memory 
will be divided into, and backed by, multiple paging-file segments. Each 
paging-file segment corresponds to a ZGC heap granule, and can be 
committed and uncommitted independently of other segments.'

Just like we can map and unmap in granules, we want to be able to commit 
and uncommit memory in granules. You can see how memory is committed and 
uncommitted in granules in ZBackingFile::commit_from_paging_file and 
ZBackingFile::uncommit_from_paging_file. Each committed granule is 
associated with one registered handle. When memory for a granule is 
uncommitted, the handle is closed. At this point, no views exist to the 
mapping and the memory is handed back to the OS.


Final point about ZPhysicalMemoryBacking. We've tried to make this file 
similar on all OSes, with the hope to be able to combine them when both 
the Windows and macOS ports have been merged.

Thanks,
StefanK


From thomas.schatzl at oracle.com  Thu Oct 31 13:07:25 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Thu, 31 Oct 2019 14:07:25 +0100
Subject: RFR (S): 8233301: Implementation of JEP 366: Deprecate the
 ParallelScavenge + SerialOld GC Combination
Message-ID: <727338ce-845c-067f-e56e-f066d2a602dd@oracle.com>

Hi all,

   can I have reviews for this small change that implements deprecation 
as outlined in JEP 366: Deprecate the ParallelScavenge + SerialOld GC 
Combination?

CR:
https://bugs.openjdk.java.net/browse/JDK-8233301
Webrev:
http://cr.openjdk.java.net/~tschatzl/8233301/webrev/
Testing:
hs-tier1-5

Thanks,
   Thomas


From per.liden at oracle.com  Thu Oct 31 13:31:00 2019
From: per.liden at oracle.com (Per Liden)
Date: Thu, 31 Oct 2019 14:31:00 +0100
Subject: RFR (S): 8233301: Implementation of JEP 366: Deprecate the
 ParallelScavenge + SerialOld GC Combination
In-Reply-To: <727338ce-845c-067f-e56e-f066d2a602dd@oracle.com>
References: <727338ce-845c-067f-e56e-f066d2a602dd@oracle.com>
Message-ID: <d0d1a8e2-5b46-9678-fed8-e4d3e5de66ae@oracle.com>

Looks good!

/Per

On 10/31/19 2:07 PM, Thomas Schatzl wrote:
> Hi all,
> 
>    can I have reviews for this small change that implements deprecation 
> as outlined in JEP 366: Deprecate the ParallelScavenge + SerialOld GC 
> Combination?
> 
> CR:
> https://bugs.openjdk.java.net/browse/JDK-8233301
> Webrev:
> http://cr.openjdk.java.net/~tschatzl/8233301/webrev/
> Testing:
> hs-tier1-5
> 
> Thanks,
>    Thomas


From thomas.schatzl at oracle.com  Thu Oct 31 13:43:17 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Thu, 31 Oct 2019 14:43:17 +0100
Subject: RFR (M): 8189737: Make HeapRegion not derive from Space
Message-ID: <4fdeb066-e9eb-c0e6-5fd2-5ec9a368bc23@oracle.com>

Hi all,

   can I get reviews for this refactoring that removes the inheritance 
of HeapRegion from Space?

Since JDK10 we did not use much of the shared code in G1, so apart from 
inheriting a few trivial members (bottom, top, compaction_top) there is 
not much gain in inheriting from (Contiguous-)Space, except adding quite 
a few unused members and lots of legacy code.

In JDK10 we already considered removing this inheritance, but never got 
around until now :)

There will be a follow-up JDK-8233306 that cleans up the code a bit 
(sorting members and methods), but to keep this a bit more easily 
reviewable, the change is as it is.

The change is smaller than webrev indicates, for some reason the 
single-line include change in test_g1HeapVerifier.cpp caused it to be 
included as a "new" file. There is also a lot of one-line 
#include-wrangling.

CR:
https://bugs.openjdk.java.net/browse/JDK-8189737
Webrev:
http://cr.openjdk.java.net/~tschatzl/8189737/webrev/
Testing:
hs-tier1-5

Thanks,
   Thomas


From thomas.schatzl at oracle.com  Thu Oct 31 13:47:04 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Thu, 31 Oct 2019 14:47:04 +0100
Subject: RFR (M): 8233306: Sort members in G1's HeapRegion after removal of
 Space dependency
Message-ID: <568f7bca-3c39-f554-b557-953e5f7f157c@oracle.com>

Hi all,

  after the change to HeapRegion in JDK-8189737 the declaration of the 
HeapRegion class is a bit messed up (merging G1ContiguousSpace, adding a 
few members needed from ContiguousSpace).

This change tries to fix this as much as possible by shuffling around 
stuff (i.e. grouping allocation related methods, evacuation related 
methods, some helper pointers in HeapRegion, etc).

Depends on JDK-8189737 also out for review.

CR:
https://bugs.openjdk.java.net/browse/JDK-8233306
Webrev:
http://cr.openjdk.java.net/~tschatzl/8233306/webrev/
Testing:
hs-tier1-5

Thanks,
   Thomas


From stefan.johansson at oracle.com  Thu Oct 31 15:31:39 2019
From: stefan.johansson at oracle.com (Stefan Johansson)
Date: Thu, 31 Oct 2019 16:31:39 +0100
Subject: RFR(L): 8220312: Implementation: NUMA-Aware Memory Allocation for
 G1, Logging (3/3)
In-Reply-To: <c55f0f8f-af07-eb42-202d-760f11170aa7@oracle.com>
References: <e7c52f60-a5c7-072a-4e3b-65c608907679@oracle.com>
 <e903223b-90a5-9d01-5421-a47011bd5985@oracle.com>
 <ba8c3fa4-9ee1-6a98-d13f-ffaacc59025c@oracle.com>
 <b5f39fc2-3319-a81c-25b4-f979282aef9f@oracle.com>
 <743d16cc-499d-784b-79fc-c006643f9ec5@oracle.com>
 <c55f0f8f-af07-eb42-202d-760f11170aa7@oracle.com>
Message-ID: <6449da1d-6dcc-50ba-8ae8-7615e7ad35f9@oracle.com>

Hi Sangheon,

On 2019-10-23 08:39, sangheon.kim at oracle.com wrote:
> Hi Thomas,
> 
> I am posting the next webrev as Kim is waiting for it.
> 
> Webrev:
> http://cr.openjdk.java.net/~sangheki/8220312/webrev.3
> http://cr.openjdk.java.net/~sangheki/8220312/webrev.3.inc

Here are my comments:
src/hotspot/share/gc/g1/g1CollectedHeap.hpp
---
2397     st->print("  remaining free region(s) from each node id: ");

What do you think about changing this to "... region(s) on each NUMA 
node: "? I think we should be clear about the logging being for NUMA.
---

src/hotspot/share/gc/g1/g1EdenRegions.hpp
---
33 class G1EdenRegions : public G1RegionCounts {

I don't think G1EdenRegions is a G1RegionCounts but rather it should 
have one. So instead of using inheritance here I think G1EdenRegions 
should have a G1RegionCounts. Instead of overloading length I would then 
suggest adding a region_count(uint node_index) to get the count.

Same goes for G1SurvivorRegions.
---

src/hotspot/share/gc/g1/g1NUMA.cpp
---
  279 bool NodeIndexCheckClosure::do_heap_region(HeapRegion* hr) {
  280   uint preferred_node_index = _numa->preferred_node_index_for_index(hr->hrm_index());
  281   uint active_node_index = _numa->index_of_address(hr->bottom());
  282
  283   if (preferred_node_index == active_node_index) {
  284     _matched[preferred_node_index]++;
  285   } else if (active_node_index == G1NUMA::UnknownNodeIndex) {
  286     _unknown++;
  287   }
  288   _total++;
  289
  290   return false;
  291 }

As we discussed offline, I would like to know the mismatches as well, I 
think the easiest approach would be to make the total count per node as 
well and that way we can see if there were any regions that didn't 
match. What do you think about printing the info like this:
[3,009s][trace][gc,heap,numa ] GC(6) NUMA region verification 
(actual/expected): 0: 1024/1024, 1: 270/1024, Unknown: 0
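
A minimal sketch of the per-node accounting I have in mind, assuming a sentinel value for the unknown node. NodeVerifyCounts, record and to_string are made-up names for illustration and not part of the webrev; the real closure would feed in the preferred and active node index for each region.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Assumed sentinel for "node could not be determined".
static const unsigned UnknownNodeIndex = ~0u;

// Hypothetical per-node counters: how many regions were expected on each
// node (total) and how many of those were actually found there (matched).
struct NodeVerifyCounts {
  std::vector<unsigned> matched;
  std::vector<unsigned> total;
  unsigned unknown;

  explicit NodeVerifyCounts(unsigned num_nodes)
    : matched(num_nodes, 0), total(num_nodes, 0), unknown(0) {}

  // Record one region with its preferred (expected) and active (actual) node.
  void record(unsigned preferred, unsigned actual) {
    assert(preferred < total.size());
    total[preferred]++;
    if (actual == preferred) {
      matched[preferred]++;
    } else if (actual == UnknownNodeIndex) {
      unknown++;
    }
  }

  // Format as "0: actual/expected, 1: actual/expected, Unknown: n".
  std::string to_string() const {
    std::string s = "NUMA region verification (actual/expected): ";
    for (size_t i = 0; i < total.size(); i++) {
      s += std::to_string(i) + ": " + std::to_string(matched[i]) + "/" +
           std::to_string(total[i]) + ", ";
    }
    s += "Unknown: " + std::to_string(unknown);
    return s;
  }
};
```

With the total kept per node, a mismatch shows up directly as actual being smaller than expected for that node.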

When testing this I also realized this output is problematic in the case 
where we have committed regions that have not yet been used. The manual 
for get_mempolicy (the way we get the NUMA id for the address) says:
"If no page has yet been allocated for the specified address, 
get_mempolicy() will allocate a page as if the thread had performed a 
read (load) access to that address, and return the ID of the node where 
that page was allocated."

Doing a read access seems to always get a page on NUMA node 0, so the 
accounting will not be correct in this case.

One way to fix this would be to only do accounting for regions currently 
used (!hr->is_free()) but I'm not sure that is exactly what we want, at 
least not if we only do this after the GC, then only the survivors and 
old will be checked. We could solve this by also doing verification before 
the GC. I think this might be the way to go, what do you think? If my 
proposal was hard to follow, here's a patch:
http://cr.openjdk.java.net/~sjohanss/numa/verify-alternative/

The output from this patch would be:
[9,233s][trace][gc,heap,numa   ] GC(18) GC Start: NUMA region 
verification (actual/expected): 0: 358/358, 1: 361/361, Unknown: 0
[9,306s][trace][gc,heap,numa   ] GC(18) GC End: NUMA region verification 
(actual/expected): 0: 348/348, 1: 347/347, Unknown: 0

One can also see that this verification takes some time, so maybe it 
would make sense to have this logging under gc+numa+verify.
---

  234   uint converted_req_index = requested_node_index;
  235   if(converted_req_index == AnyNodeIndex) {
  236     converted_req_index = _num_active_node_ids;
  237   }
  238   if (converted_req_index <= _num_active_node_ids) {
  239     _times->update(phase, converted_req_index, allocated_node_index);
  240   }

I had to read this more than once to understand what it really did and I 
think we can simplify it a bit, by just doing an if-else that checks for 
AnyNodeIndex and if so passes in _num_active_node_ids to update(). This 
should be ok since requested_node_index never can be larger than 
_num_active_node_ids.
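
Roughly what I mean, as a standalone sketch. accounting_row is a hypothetical helper and the value used for AnyNodeIndex is an assumption; in the patch the result would simply be passed on to update().

```cpp
#include <cassert>
#include <climits>

// Stand-in for the constant in the patch; the concrete value is an assumption.
static const unsigned AnyNodeIndex = UINT_MAX;

// Returns the index used for accounting: the requested node itself, or the
// extra "any node" slot at index num_active_node_ids when no specific node
// was requested.
unsigned accounting_row(unsigned requested_node_index,
                        unsigned num_active_node_ids) {
  if (requested_node_index == AnyNodeIndex) {
    return num_active_node_ids;
  } else {
    // A concrete request is never larger than the number of active nodes.
    assert(requested_node_index <= num_active_node_ids);
    return requested_node_index;
  }
}
```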
---

src/hotspot/share/gc/g1/g1ParScanThreadState.cpp
---
I would prefer if we hide all the accounting in helper functions, but it 
might be good to declare them to be inlined.

   85   if (_numa->is_enabled()) {
   86     LogTarget(Info, gc, heap, numa) lt;
   87
   88     if (lt.is_enabled()) {
   89       uint num_nodes = _numa->num_active_nodes();
   90       // Record only if there are multiple active nodes.
   91       _obj_alloc_stat = NEW_C_HEAP_ARRAY(size_t, num_nodes, mtGC);
   92       memset((void*)_obj_alloc_stat, 0, sizeof(size_t) * num_nodes);
   93     }
   94   }

Move to something like initialize_numa_stats().

  108   if (_obj_alloc_stat != NULL) {
  109     uint node_index = _numa->index_of_current_thread();
  110     _numa->copy_statistics(G1NodeTimes::LocalObjProcessAtCopyToSurv, node_index, _obj_alloc_stat);
  111   }

This could be called flush_numa_stats().

  268     if (_obj_alloc_stat != NULL) {
  269       _obj_alloc_stat[node_index]++;
  270     }

And this something like update_numa_stats(uint).
---

heapRegionSet.hpp
---
159   inline void update_length(HeapRegion* hr, bool increase);
254   inline void update_length(HeapRegion* hr, bool increase);

Is there any reason for having update_length that takes a bool rather 
than having one function for increments and one for decrements? To me it 
looks like all uses are pretty well defined and it would make the code 
easier to read. I also think we could pass in the node index rather than 
the HeapRegion since the getter length() does this.
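
Something along these lines, as a sketch only. NodeLengths is a made-up class and not the HeapRegionSet API; it just shows two separate functions that take the node index directly.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-node length bookkeeping, for illustration only.
class NodeLengths {
  std::vector<unsigned> _length_per_node;

public:
  explicit NodeLengths(size_t num_nodes) : _length_per_node(num_nodes, 0) {}

  // Called when a region on the given node is added to the set.
  void increase_length(unsigned node_index) {
    assert(node_index < _length_per_node.size());
    _length_per_node[node_index]++;
  }

  // Called when a region on the given node is removed from the set.
  void decrease_length(unsigned node_index) {
    assert(node_index < _length_per_node.size());
    assert(_length_per_node[node_index] > 0);
    _length_per_node[node_index]--;
  }

  // Getter takes the node index, matching the two functions above.
  unsigned length(unsigned node_index) const {
    assert(node_index < _length_per_node.size());
    return _length_per_node[node_index];
  }
};
```

The call sites then say what they do (increase or decrease) without a bool argument that has to be looked up.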
---

src/hotspot/share/gc/g1/g1NodeTimes.cpp
---
First, a question about the names, G1NodeTimes signals that it has to do 
with timing, but currently we don't really record any timings. Same 
thing with NodeStatPhases, not really the same type of phases that we 
have for the rest of the GC logging. What do you think about renaming 
the class to G1NUMAStats and the enum to NodeDataItems?

  166 void G1NodeTimes::print_phase_info(G1NodeTimes::NodeStatPhases phase) {
  167   LogTarget(Info, gc, heap, numa) lt;

I think this should be on debug level, but if you don't agree leave it 
as is.
---

  191 void G1NodeTimes::print_mutator_alloc_stat_debug() {
  192   LogTarget(Debug, gc, heap, numa) lt;

And if you agree on moving the above to debug I think this should be on 
trace level.
---

This is it for now. Thanks,
Stefan


> Testing: hs-tier 1 ~ 4 with/without UseNUMA. hs-tier5 is almost finished 
> without new failures.
> 
> Thanks,
> Sangheon
> 
> 


From aph at redhat.com  Thu Oct 31 16:45:55 2019
From: aph at redhat.com (Andrew Haley)
Date: Thu, 31 Oct 2019 16:45:55 +0000
Subject: RFR 8232992: Shenandoah: Implement self-fixing interpreter LRB
In-Reply-To: <1648aef7-6df9-6f54-6601-fde9d7251187@redhat.com>
References: <1648aef7-6df9-6f54-6601-fde9d7251187@redhat.com>
Message-ID: <15f764f4-8cbe-5a9c-88f9-1843b1e97d0c@redhat.com>

On 10/25/19 3:29 PM, Zhengyu Gu wrote:
> Test:
>    hotspot_gc_shenandoah (fastdebug and release)
>    x86_64 and x86_32 on Linux
>    aarch64 Linux
>    Windows x86_64

I didn't see this because I don't read all the Shenandoah and GC
messages.

The AArch64 code is unidiomatic and cumbersome in places, not to
mention extremely confusing, and I can help with that.

 236 void ShenandoahBarrierSetAssembler::load_reference_barrier_not_null(MacroAssembler* masm, Register dst, Address load_addr) {
 237   assert(ShenandoahLoadRefBarrier, "Should be enabled");
 238   assert(dst != rscratch2, "need rscratch2");
 239   assert_different_registers(load_addr.base(), load_addr.index(), rscratch1);
 240   assert_different_registers(load_addr.base(), load_addr.index(), rscratch2);
 241
 242   Label done;
 243   __ enter();
 244   Address gc_state(rthread, in_bytes(ShenandoahThreadLocalData::gc_state_offset()));
 245   __ ldrb(rscratch2, gc_state);
 246
 247   // Check for heap stability
 248   __ tbz(rscratch2, ShenandoahHeap::HAS_FORWARDED_BITPOS, done);
 249
 250   // use r1 for load address
 251   Register result_dst = dst;
 252   if (dst == r1) {
 253     __ mov(rscratch1, dst);

This is pointless. On AArch64 mov(Rn, Rm) generates no code if Rn == Rm.

 254     dst = rscratch1;
 255   }
 256
 257   RegSet to_save_r1 = RegSet::of(r1);
 258   // If outgoing register is r1, we can clobber it
 259   if (result_dst != r1) {
 260     __ push(to_save_r1, sp);
 261   }

On AArch64 registers are always saved in pairs, so it makes no sense to push
individual registers. You might as well push both if either is to be saved.

 262   __ lea(r1, load_addr);
 263
 264   RegSet to_save_r0 = RegSet::of(r0);
 265   if (dst != r0) {
 266     __ push(to_save_r0, sp);
 267     __ mov(r0, dst);
 268   }
 269
 270   __ far_call(RuntimeAddress(CAST_FROM_FN_PTR(address, ShenandoahBarrierSetAssembler::shenandoah_lrb())));
 271
 272   if (result_dst != r0) {
 273     __ mov(result_dst, r0);
 274   }
 275
 276   if (dst != r0) {
 277     __ pop(to_save_r0, sp);
 278   }
 279
 280   if (result_dst != r1) {
 281     __ pop(to_save_r1, sp);
 282   }
 283
 284   __ bind(done);
 285   __ leave();
 286 }

So, you want to save r1 and r0, but if either of those is the destination you
don't want to save it. The code at ShenandoahBarrierSetAssembler::shenandoah_lrb()
preserves everything but r1 and r0.

I believe this is what you want:

void ShenandoahBarrierSetAssembler::load_reference_barrier_not_null(MacroAssembler* masm, Register dst, Address load_addr) {
  assert(ShenandoahLoadRefBarrier, "Should be enabled");
  assert(dst != rscratch2, "need rscratch2");
  assert_different_registers(load_addr.base(), load_addr.index(), rscratch1, rscratch2);

  Label done;
  __ enter();
  Address gc_state(rthread, in_bytes(ShenandoahThreadLocalData::gc_state_offset()));
  __ ldrb(rscratch2, gc_state);

  // Check for heap stability
  __ tbz(rscratch2, ShenandoahHeap::HAS_FORWARDED_BITPOS, done);

  // use r1 for load address
  Register result_dst = dst;
  if (dst == r1) {
    __ mov(rscratch1, dst);
    dst = rscratch1;
  }

  RegSet to_save = RegSet::of(r0, r1) - result_dst;
  __ push(to_save, sp);
  __ lea(r1, load_addr);
  __ mov(r0, dst);

  __ far_call(RuntimeAddress(CAST_FROM_FN_PTR(address, ShenandoahBarrierSetAssembler::shenandoah_lrb())));

  __ mov(result_dst, r0);
  __ pop(to_save, sp);

  __ bind(done);
  __ leave();
}
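
To illustrate the set arithmetic behind `RegSet::of(r0, r1) - result_dst`, here
is a rough, standalone sketch. ToyRegSet and stack_bytes_for_push are made-up
names for illustration only, not the HotSpot RegSet or MacroAssembler API, and
the sizing rule simply assumes 8-byte slots stored in pairs so that sp stays
16-byte aligned.

```cpp
#include <cassert>
#include <cstdint>

// Toy stand-in for a register set: one bit per general-purpose register.
// ToyRegSet is a hypothetical type, not HotSpot's RegSet.
struct ToyRegSet {
  uint32_t bits;

  static ToyRegSet of(int ra, int rb) {
    return ToyRegSet{(1u << ra) | (1u << rb)};
  }
  // Remove one register from the set; a no-op if it is not a member.
  ToyRegSet operator-(int reg) const {
    return ToyRegSet{bits & ~(1u << reg)};
  }
  // Number of registers in the set (clears the lowest set bit each round).
  int count() const {
    int n = 0;
    for (uint32_t b = bits; b != 0; b &= b - 1) n++;
    return n;
  }
  bool contains(int reg) const { return ((bits >> reg) & 1u) != 0; }
};

// Stack bytes consumed by a push, assuming 8-byte slots stored in pairs:
// an odd register count is rounded up to the next even count.
int stack_bytes_for_push(ToyRegSet set) {
  int slots = (set.count() + 1) & ~1;
  return slots * 8;
}
```

Under these assumptions, pushing {r0, r1} and pushing just one of them both
take 16 bytes, and subtracting a register that is not in the set leaves the
set unchanged, which is why no `if` is needed around the subtraction.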


Please forward any patches which contain AArch64 assembly code to the
aarch64-port-dev at openjdk.java.net list.

I don't mean any criticism of you personally, but the AArch64 code in
the Shenandoah GC barriers is gnarly and some of the most difficult to
read in the whole port, probably because its authors, while
undoubtedly brilliant, were not experienced AArch64 programmers. Let
me help.  :-)

-- 
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


From zgu at redhat.com  Thu Oct 31 18:09:09 2019
From: zgu at redhat.com (Zhengyu Gu)
Date: Thu, 31 Oct 2019 14:09:09 -0400
Subject: RFR 8232992: Shenandoah: Implement self-fixing interpreter LRB
In-Reply-To: <8adb0165-383d-b18b-4a05-52828100d397@redhat.com>
References: <1648aef7-6df9-6f54-6601-fde9d7251187@redhat.com>
 <8adb0165-383d-b18b-4a05-52828100d397@redhat.com>
Message-ID: <540275f5-595a-faa0-2304-a95e657f92a0@redhat.com>

Hi Andrew,

Thanks for the suggestions. Filed JDK-8233337 to clean this up.

-Zhengyu

On 10/31/19 12:50 PM, Andrew Haley wrote:
> On 10/25/19 3:29 PM, Zhengyu Gu wrote:
>> Test:
>>     hotspot_gc_shenandoah (fastdebug and release)
>>     x86_64 and x86_32 on Linux
>>     aarch64 Linux
>>     Windows x86_64
> 
> I didn't see this because I don't read all the Shenandoah and GC
> messages.
> 
> The AArch64 code is unidiomatic and cumbersome in places, not to
> mention extremely confusing, and I can help with that.
> 
>   236 void ShenandoahBarrierSetAssembler::load_reference_barrier_not_null(MacroAssembler* masm, Register dst, Address load_addr) {
>   237   assert(ShenandoahLoadRefBarrier, "Should be enabled");
>   238   assert(dst != rscratch2, "need rscratch2");
>   239   assert_different_registers(load_addr.base(), load_addr.index(), rscratch1);
>   240   assert_different_registers(load_addr.base(), load_addr.index(), rscratch2);
>   241
>   242   Label done;
>   243   __ enter();
>   244   Address gc_state(rthread, in_bytes(ShenandoahThreadLocalData::gc_state_offset()));
>   245   __ ldrb(rscratch2, gc_state);
>   246
>   247   // Check for heap stability
>   248   __ tbz(rscratch2, ShenandoahHeap::HAS_FORWARDED_BITPOS, done);
>   249
>   250   // use r1 for load address
>   251   Register result_dst = dst;
>   252   if (dst == r1) {
>   253     __ mov(rscratch1, dst);
> 
> This is pointless. On AArch64 mov(Rn, Rm) generates no code if Rn == Rm.
> 
>   254     dst = rscratch1;
>   255   }
>   256
>   257   RegSet to_save_r1 = RegSet::of(r1);
>   258   // If outgoing register is r1, we can clobber it
>   259   if (result_dst != r1) {
>   260     __ push(to_save_r1, sp);
>   261   }
> 
> On AArch64 registers are always saved in pairs, so it makes no sense to push
> individual registers. You might as well push both if either is to be saved.
> 
>   262   __ lea(r1, load_addr);
>   263
>   264   RegSet to_save_r0 = RegSet::of(r0);
>   265   if (dst != r0) {
>   266     __ push(to_save_r0, sp);
>   267     __ mov(r0, dst);
>   268   }
>   269
>   270   __ far_call(RuntimeAddress(CAST_FROM_FN_PTR(address, ShenandoahBarrierSetAssembler::shenandoah_lrb())));
>   271
>   272   if (result_dst != r0) {
>   273     __ mov(result_dst, r0);
>   274   }
>   275
>   276   if (dst != r0) {
>   277     __ pop(to_save_r0, sp);
>   278   }
>   279
>   280   if (result_dst != r1) {
>   281     __ pop(to_save_r1, sp);
>   282   }
>   283
>   284   __ bind(done);
>   285   __ leave();
>   286 }
> 
> So, you want to save r1 and r0, but if either of those is the destination you
> don't want to save it. The code at ShenandoahBarrierSetAssembler::shenandoah_lrb()
> preserves everything but r1 and r0.
> 
> I believe this is what you want:
> 
> void ShenandoahBarrierSetAssembler::load_reference_barrier_not_null(MacroAssembler* masm, Register dst, Address load_addr) {
>    assert(ShenandoahLoadRefBarrier, "Should be enabled");
>    assert(dst != rscratch2, "need rscratch2");
>    assert_different_registers(load_addr.base(), load_addr.index(), rscratch1, rscratch2);
> 
>    Label done;
>    __ enter();
>    Address gc_state(rthread, in_bytes(ShenandoahThreadLocalData::gc_state_offset()));
>    __ ldrb(rscratch2, gc_state);
> 
>    // Check for heap stability
>    __ tbz(rscratch2, ShenandoahHeap::HAS_FORWARDED_BITPOS, done);
> 
>    // use r1 for load address
>    Register result_dst = dst;
>    if (dst == r1) {
>      __ mov(rscratch1, dst);
>      dst = rscratch1;
>    }
> 
>    RegSet to_save = RegSet::of(r0, r1) - result_dst;
>    __ push(to_save, sp);
>    __ lea(r1, load_addr);
>    __ mov(r0, dst);
> 
>    __ far_call(RuntimeAddress(CAST_FROM_FN_PTR(address, ShenandoahBarrierSetAssembler::shenandoah_lrb())));
> 
>    __ mov(result_dst, r0);
>    __ pop(to_save, sp);
> 
>    __ bind(done);
>    __ leave();
> }
> 
> 
> Please forward any patches which contain AArch64 assembly code to the
> aarch64-port-dev at openjdk.java.net list.
> 
> I don't mean any criticism of you personally, but the AArch64 code in
> the Shenandoah GC barriers is gnarly and some of the most difficult to
> read in the whole port, probably because its authors, while
> undoubtedly brilliant, were not experienced AArch64 programmers. Let
> me help.  :-)
> 


From zgu at redhat.com  Thu Oct 31 18:48:04 2019
From: zgu at redhat.com (Zhengyu Gu)
Date: Thu, 31 Oct 2019 14:48:04 -0400
Subject: RFR 8233339: Shenandoah: Centralize load barrier decisions into
 ShenandoahBarrierSet
Message-ID: <6ef89df6-84db-0ffe-d1fc-7ffde7e622bf@redhat.com>

Right now, the decisions about whether a load needs a load reference
barrier, which kind it needs, and whether the reference needs to be kept
alive are scattered across the interpreter, C1 and C2 load barrier code,
which makes it hard to keep them consistent.

I would like to centralize the decision making in ShenandoahBarrierSet,
so that the decisions stay consistent and are easy to maintain.
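
As a rough illustration of the idea (not the code in the webrev), the three
decisions could be answered by one helper that every code generator queries.
All names below (ToyAccess, ToyLrbKind, ToyBarrierDecisions) and the rules
themselves are hypothetical and chosen only to show the shape.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical access flags, loosely in the spirit of HotSpot decorators.
enum ToyAccess : uint32_t {
  TOY_IS_OOP       = 1u << 0,  // the load yields a reference
  TOY_IN_NATIVE    = 1u << 1,  // the load reads off-heap (native) memory
  TOY_WEAK_REF     = 1u << 2,  // the load goes through a weak reference
  TOY_NO_KEEPALIVE = 1u << 3,  // the caller opts out of keep-alive
};

enum class ToyLrbKind { None, Normal, Native };

// Single place that answers the three questions; the interpreter, C1 and C2
// would all ask here instead of re-deriving the answers. Illustrative rules.
struct ToyBarrierDecisions {
  // Does the load need a load reference barrier, and if so which kind?
  static ToyLrbKind load_reference_barrier_kind(uint32_t access) {
    if ((access & TOY_IS_OOP) == 0) return ToyLrbKind::None;
    return (access & TOY_IN_NATIVE) != 0 ? ToyLrbKind::Native
                                         : ToyLrbKind::Normal;
  }
  // Does the loaded reference need to be kept alive?
  static bool need_keep_alive(uint32_t access) {
    if ((access & TOY_IS_OOP) == 0) return false;
    return (access & TOY_WEAK_REF) != 0 && (access & TOY_NO_KEEPALIVE) == 0;
  }
};
```

With a helper like this, changing a rule means touching one function rather
than three code generators.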

Bug: https://bugs.openjdk.java.net/browse/JDK-8233339
Webrev: http://cr.openjdk.java.net/~zgu/JDK-8233339/webrev.00/index.html

Test:
   hotspot_gc_shenandoah (fastdebug and release)
   x86_64 and x86_32 on Linux
   AArch64 on Linux

Thanks,

-Zhengyu


From kim.barrett at oracle.com  Thu Oct 31 20:53:01 2019
From: kim.barrett at oracle.com (Kim Barrett)
Date: Thu, 31 Oct 2019 16:53:01 -0400
Subject: RFR: 8232588: G1 concurrent System.gc can return early or late 
Message-ID: <8518409B-159B-48B2-97E4-D5D4C3B2BC5A@oracle.com>

RFR: 8232588: G1 concurrent System.gc can return early or late
RFR: 8233279: G1: GCLocker GC with +GCLockerInvokesConcurrent spins while cycle in progress 

Please review this refactoring and fixing of the state machine used by
G1CollectedHeap::collect for handling requests for concurrent collections.

The handling of concurrent collection requests is now split out into a
helper function for that purpose.  All of the state machine logic for
checking for completion, waiting for completions, and performing retries is
now in that new helper function, rather than being distributed between
try_collect() and various parts of the VMOp.

Added a new VMOp, VM_G1TryInitiateConcMark.  This simplified both the
handling of this case and VM_G1CollectForAllocation.  The new VMOp provides
some additional information for use by the state machine.

For user-requested concurrent GC requests, the previously intended behavior
was to wait for an in-progress concurrent marking cycle (if any), then start
a new concurrent marking cycle and wait for it to complete.  However, there
were various race conditions that might result in returning either sooner or
later than intended.  This change addresses those races, so that we get
consistent behavior for such requests.
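
The intended sequence for a user request can be pictured as a small state
machine. The sketch below is only a single-threaded toy model of that
sequence; ToyHeapState, ToyRequest, next_step and the helpers are hypothetical
names and do not correspond to code in the webrev.

```cpp
#include <cassert>

// What a user-requested concurrent GC should do next (toy model).
enum class Step { WaitForPrevious, StartCycle, WaitForOwn, Done };

// Hypothetical snapshot of the collector's concurrent-cycle bookkeeping.
struct ToyHeapState {
  bool cycle_in_progress;
  unsigned cycles_started;
  unsigned cycles_completed;
};

// Per-request state: which cycle, if any, this request started.
struct ToyRequest {
  bool started_own = false;
  unsigned own_cycle = 0;
};

// Decide the next step: first wait out any cycle already running, then
// start a fresh cycle, then wait until that specific cycle has completed.
Step next_step(const ToyHeapState& heap, const ToyRequest& req) {
  if (!req.started_own) {
    return heap.cycle_in_progress ? Step::WaitForPrevious : Step::StartCycle;
  }
  return heap.cycles_completed >= req.own_cycle ? Step::Done
                                                : Step::WaitForOwn;
}

// Start a new cycle on behalf of the request and remember its number.
void start_cycle(ToyHeapState& heap, ToyRequest& req) {
  heap.cycle_in_progress = true;
  heap.cycles_started++;
  req.started_own = true;
  req.own_cycle = heap.cycles_started;
}

// Mark the currently running cycle as finished.
void complete_cycle(ToyHeapState& heap) {
  heap.cycle_in_progress = false;
  heap.cycles_completed++;
}
```

In this model a request never returns on the completion of a cycle it did not
start, which is the "neither sooner nor later" property described above.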

(WhiteBox.g1StartConcMarkCycle is the function that uses _wb_conc_mark.
With that name, it's not obvious that the full waiting behavior is intended,
but that's what it used to do, so not changing it.  Some tests follow it
with a sleep-wait for !WB.g1InConcurrentMark(), while others seem to expect
it to perform a complete collection.)

A change is that waiting by a user-requested GC for a concurrent marking
cycle to complete used to be performed with the thread transitioned to
native and without safepoint checks on the associated monitor lock and wait.
This was noted as having been cribbed from CMS.  Coleen and I looked at this
and could not come up with a reason for doing that for G1 (anymore, after
the recent spate of locking improvements), so there's a new G1-specific
monitor being used and the locking and waiting is now "normal".  (This makes
the FullGCCount_lock monitor largely CMS-specific.)

For other concurrent GC requests, the only intentional change is for
_gc_locker with GCLockerInvokesConcurrent.  Previously it would spin in
try_collect while there was a concurrent marking cycle in progress, also
blocking any callers of GCLocker::stall_until_clear() (JDK-8233279).  Now it
returns in that situation, though it's not clear that's a great idea either.
Indeed, even when that option was introduced (for CMS, as part of fixing a
bad interaction between GCLocker GCs and +ExplicitGCInvokesConcurrent) it
was not clear it was a good idea (see JDK-6919638).  Fortunately it's off by
default. JDK-8233280 has been filed to remove this option.

CR:
https://bugs.openjdk.java.net/browse/JDK-8233279
https://bugs.openjdk.java.net/browse/JDK-8232588

Webrev:
https://cr.openjdk.java.net/~kbarrett/8232588/open.00/

Testing:
mach5 tier1-6

Local (linux-x64) testing with a program that allocates some live data in
the old gen, then has several threads all repeatedly looping on System.gc().
Looked at output from new logging in try_collect_concurrently and verified
the interleavings of GC start/end and new log messages were as expected.


From kim.barrett at oracle.com  Thu Oct 31 21:08:51 2019
From: kim.barrett at oracle.com (Kim Barrett)
Date: Thu, 31 Oct 2019 17:08:51 -0400
Subject: RFR (S): 8233301: Implementation of JEP 366: Deprecate the
 ParallelScavenge + SerialOld GC Combination
In-Reply-To: <727338ce-845c-067f-e56e-f066d2a602dd@oracle.com>
References: <727338ce-845c-067f-e56e-f066d2a602dd@oracle.com>
Message-ID: <898970DE-A743-491B-9689-C1E3C2848755@oracle.com>

> On Oct 31, 2019, at 9:07 AM, Thomas Schatzl <thomas.schatzl at oracle.com> wrote:
> 
> Hi all,
> 
>  can I have reviews for this small change that implements deprecation as outlined in JEP 366: Deprecate the ParallelScavenge + SerialOld GC Combination?
> 
> CR:
> https://bugs.openjdk.java.net/browse/JDK-8233301
> Webrev:
> http://cr.openjdk.java.net/~tschatzl/8233301/webrev/
> Testing:
> hs-tier1-5
> 
> Thanks,
>  Thomas

Looks good.


From kim.barrett at oracle.com  Thu Oct 31 22:12:20 2019
From: kim.barrett at oracle.com (Kim Barrett)
Date: Thu, 31 Oct 2019 18:12:20 -0400
Subject: RFR (M): 8189737: Make HeapRegion not derive from Space
In-Reply-To: <4fdeb066-e9eb-c0e6-5fd2-5ec9a368bc23@oracle.com>
References: <4fdeb066-e9eb-c0e6-5fd2-5ec9a368bc23@oracle.com>
Message-ID: <F17C32E6-CB43-404C-BFF4-DD7E5987B4B8@oracle.com>

> On Oct 31, 2019, at 9:43 AM, Thomas Schatzl <thomas.schatzl at oracle.com> wrote:
> 
> Hi all,
> 
>  can I get reviews for this refactoring that removes the inheritance of HeapRegion from Space?
> 
> Since JDK10 we did not use much of the shared code in G1, so apart from inheriting a few trivial members (bottom, top, compaction_top) there is not much gain in inheriting from (Contiguous-)Space, except adding quite a few unused members and lots of legacy code.
> 
> In JDK10 we already considered removing this inheritance, but never got around until now :)
> 
> There will be a follow-up JDK-8233306 that cleans up the code a bit (sorting members and methods), but to keep this a bit more easily reviewable, the change is as it is.
> 
> The change is smaller than webrev indicates, for some reason the single-line include change in test_g1HeapVerifier.cpp caused it to be included as a "new" file. There is also a lot of one-line #include-wrangling.
> 
> CR:
> https://bugs.openjdk.java.net/browse/JDK-8189737
> Webrev:
> http://cr.openjdk.java.net/~tschatzl/8189737/webrev/
> Testing:
> hs-tier-1-5
> 
> Thanks,
>  Thomas

It's a little unfortunate that you needed to touch the #includes in a
couple of cms files.  Looks like it should be an easy merge for
whichever of you or Leo goes second though.

------------------------------------------------------------------------------
 102 inline HeapWord* HeapRegion::par_allocate(size_t min_word_size,
 103                                                  size_t desired_word_size,
 104                                                  size_t* actual_size) {

Parameter list indentation needs fixing.

------------------------------------------------------------------------------
src/hotspot/share/gc/g1/heapRegion.hpp

There's a big comment block about BlockOffsetTable divergence, time
stamps, &etc in front of G1ContiguousSpace that seems to have simply
disappeared. I take it this was leftover commentary that should have
been removed with JDK-8199326 and maybe others?

------------------------------------------------------------------------------

Looks good.

I don't need a new webrev for the parameter list indentation fix.


From ecki at zusammenkunft.net  Thu Oct 31 23:05:18 2019
From: ecki at zusammenkunft.net (Bernd Eckenfels)
Date: Thu, 31 Oct 2019 23:05:18 +0000
Subject: RFR (S): 8233301: Implementation of JEP 366: Deprecate the
 ParallelScavenge + SerialOld GC Combination
In-Reply-To: <727338ce-845c-067f-e56e-f066d2a602dd@oracle.com>
References: <727338ce-845c-067f-e56e-f066d2a602dd@oracle.com>
Message-ID: <DB7PR08MB3307D761105FF8DBFAEF56CFFF630@DB7PR08MB3307.eurprd08.prod.outlook.com>

The help message:

Use the Parallel Old garbage collector. Deprecated.

looks a bit misleading to me. I know it means the option is deprecated (especially the non-default negative value), but it could easily be understood as ParallelOld being deprecated.

There is no jtreg test for +UseParallelOld. Would it need to document that the deprecation warning is expected for that as well?

Regards
Bernd
--
http://bernd.eckenfels.net

________________________________
From: hotspot-gc-dev <hotspot-gc-dev-bounces at openjdk.java.net> on behalf of Thomas Schatzl <thomas.schatzl at oracle.com>
Sent: Thursday, October 31, 2019 2:07 PM
To: hotspot-gc-dev at openjdk.java.net
Subject: RFR (S): 8233301: Implementation of JEP 366: Deprecate the ParallelScavenge + SerialOld GC Combination

Hi all,

can I have reviews for this small change that implements deprecation
as outlined in JEP 366: Deprecate the ParallelScavenge + SerialOld GC
Combination?

CR:
https://bugs.openjdk.java.net/browse/JDK-8233301
Webrev:
http://cr.openjdk.java.net/~tschatzl/8233301/webrev/
Testing:
hs-tier1-5

Thanks,
Thomas


From thomas.schatzl at oracle.com  Thu Oct 31 23:20:51 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Fri, 01 Nov 2019 00:20:51 +0100
Subject: RFR (M): 8189737: Make HeapRegion not derive from Space
In-Reply-To: <F17C32E6-CB43-404C-BFF4-DD7E5987B4B8@oracle.com>
References: <4fdeb066-e9eb-c0e6-5fd2-5ec9a368bc23@oracle.com>
 <F17C32E6-CB43-404C-BFF4-DD7E5987B4B8@oracle.com>
Message-ID: <e55e020fda0d81797dd180ef84f6d9935fb67c4b.camel@oracle.com>

Hi Kim,

  thanks for your review.

On Thu, 2019-10-31 at 18:12 -0400, Kim Barrett wrote:
> > On Oct 31, 2019, at 9:43 AM, Thomas Schatzl <thomas.schatzl at oracle.com> wrote:
> > 
> > Hi all,
> > 
> >  can I get reviews for this refactoring that removes the
> > inheritance of HeapRegion from Space?
> > 
> > 
[...]
> > CR:
> > https://bugs.openjdk.java.net/browse/JDK-8189737
> > Webrev:
> > http://cr.openjdk.java.net/~tschatzl/8189737/webrev/
> > Testing:
> > hs-tier-1-5
> > 
> > Thanks,
> >  Thomas
> 
> It's a little unfortunate that you needed to touch the #includes in a
> couple of cms files.  Looks like it should be an easy merge for
> whichever of you or Leo goes second though.
> 

Yeah, np for either of us I guess.

> ------------------------------------------------------------------------------
>  102 inline HeapWord* HeapRegion::par_allocate(size_t min_word_size,
>  103                                                  size_t desired_word_size,
>  104                                                  size_t* actual_size) {
> 
> Parameter list indentation needs fixing.

Will do.

> 
> ------------------------------------------------------------------------------
> src/hotspot/share/gc/g1/heapRegion.hpp
> 
> There's a big comment block about BlockOffsetTable divergence, time
> stamps, &etc in front of G1ContiguousSpace that seems to have simply
> disappeared. I take it this was leftover commentary that should have
> been removed with JDK-8199326 and maybe others?

The first one about the divergence is obsolete because with that change
we officially and intentionally abandon any way to converge.

The other about the time stamps should have, as you correctly noticed,
been removed with JDK-8199326.

> 
> ------------------------------------------------------------------------------
> 
> Looks good.
> 
> I don't need a new webrev for the parameter list indentation fix.
> 

I will update the webrev later in place.

Thanks,
  Thomas