RFR: JDK-8260332: ParallelGC: Cooperative pretouch for oldgen expansion [v2]

Sun Mar 14 21:36:08 UTC 2021

On Sun, 14 Mar 2021 13:27:26 GMT, Amit Pawar <github.com+71302734+amitdpawar at openjdk.org> wrote:

>> In case of ParallelGC, oldgen expansion can happen during promotion. Expanding thread will touch the pages and can't request for task execution as this GC thread is already executing a task. The expanding thread holds the lock on "ExpandHeap_lock" to resize the oldgen and other threads may wait for their turn. This is a blocking call.
>> 
>> This patch changes this behavior by adding another constructor in "MutexLocker" class to enable non blocking or try_lock operation. This way one thread will acquire the lock and other threads can join pretouch work. Threads failed to acquire the lock will join pretouch only when task is marked ready by expanding thread.
>> 
>> Following minimum expansion size are seen during expansion.
>> 1. 512KB without largepages and without UseNUMA.
>> 2. 64MB without largepages and with UseNUMA,
>> 3. 2MB (on x86)  with large pages and without UseNUMA,
>> 4. 64MB without large pages and with UseNUMA.
>> 
>> When Oldgen is expanding repeatedly with smaller size then this change wont help. For such cases, resize size should adapt to application demand to make use of this change. For example if application nature triggers 100 expansion with smaller sizes in same GC then it is better to increase the expansion size during each resize to reduce the number of resizes. If this patch is accepted then will plan to fix this case in another patch.
>> 
>> Jtreg all test passed.
>> 
>> Please review this change.
>
> Amit Pawar has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Fixed build issues for some targets and updated with suggested changes.

src/hotspot/share/gc/shared/pretouchTask.cpp line 66:

> 64: 
> 65:   // Following atomic loads are required to make other processor store
> 66:   // visible to all threads from this points.

That is not what an Atomic::load does; they only protect against word-tearing (which is only a theoretical possibility on some platforms). If you want to guarantee those loads see the most recent store then the fence needs to come first.

src/hotspot/share/gc/shared/pretouchTask.cpp line 94:

> 92: 
> 93:   if (thread_num == 0) {
> 94:     Atomic::release_store(&_cur_addr, _end_addr);

The release is unnecessary given the Atomic::sub.

src/hotspot/share/gc/shared/pretouchTask.hpp line 47:

> 45: 
> 46:   void set_task_status(TaskStatus status) { Atomic::release_store(&_task_status, (size_t)status);  }
> 47:   void set_task_done()                    { Atomic::release_store(&_task_status, (size_t)Done);    }

There is a convention that when a setter embodies a release_store that release is included in the name e.g. release_set_task_done().

src/hotspot/share/gc/shared/pretouchTask.hpp line 51:

> 49:   void set_task_notready()                { set_task_status(NotReady); }
> 50:   bool is_task_ready()                    { return Atomic::load(&_task_status) == Ready; }
> 51:   bool is_task_done()                     { return Atomic::load(&_task_status) == Done;  }

Given these fields are set with a release_store I would expect to see them read with a load_acquire to match with it. And named eg. is_task_done_acquire().

-------------

PR: https://git.openjdk.java.net/jdk/pull/2976