RFR: 8027545: Improve object array chunking test in G1's copy_to_survivor_space
Albert Mingkun Yang
ayang at openjdk.java.net
Wed Sep 9 12:43:02 UTC 2020
On Wed, 9 Sep 2020 07:24:27 GMT, Kim Barrett <kbarrett at openjdk.org> wrote:
> This is rework of initial change from before the transition to git.
> The initial RFR email is attached below.
>
> The primary change is to limit the number of partial array tasks in
> the queues for any given array. The original change just split up an
> array into N tasks that were all enqueued at once. But for a large
> array this could be a lot of tasks, leading to significant and
> unnecessary queue expansion. Instead we now limit the number of tasks
> for a given array based on the number of workers, and only gradually
> add tasks up to that limit. This gives other threads an opportunity
> to steal such tasks, while still keeping queue growth under control.
>
> Most of the calculation for this is handled by a new helper class, so
> this can later be shared with ParallelGC.
>
> The dispatch on array klass type for has also been changed. It now
> affirmatively breaks Project Valhalla, rather than quietly doing
> something that probably isn't what is actually wanted. I'll discuss
> with them so there is a plan for dealing with it when they take this
> update.
>
> Ran tier1-6 in mach5 and some local stress tests.
>
> Performance testing was unchanged from previous, except I wasn't able
> to reproduce the small specjbb2015 critical-jops improvement
> previously seen on one platform. My suspicion is that improvement was
> a statistical abberation.
>
> --- Initial RFR email ---
>
> RFR: 8158045: Improve large object handling during evacuation
> RFR: 8027761: Investigate fast-path for scanning only objects with references during gc
> RFR: 8027545: Improve object array chunking test in G1's copy_to_survivor_space
>
> Please review this change to type dispatching and handling in G1's
> evacuation copying, in order to improve the hot paths and improve array
> handling. This change addresses several closely co-located enhancement
> requests; it seemed difficult to split them up in a sensible way.
>
> do_copy_to_survivor_space now gets the klass of the object being copied
> once, up front, for use in multiple places. This avoids fetching (including
> re-decoding when compressed) the klass multiple times. This addresses part
> of JDK-8027545.
>
> Moved check for and handling of string deduplication later, only applying it
> after the special array cases have been dealt with, since strings are not
> arrays. (They are header objects pointing to an array of character values.)
>
> Special case typeArray, doing nothing other than the copy, since they
> contain no oops that need to be processed. This addresses JDK-8027761.
>
> Changed handling of objArray, pushing all of the partial array tasks up
> front, rather than processing the current chunk after pushing a single task
> for the remaining work. This addresses JDK-8158045.
>
> As part of these, cached some more frequently accessed values in
> G1ParScanThreadState member variables. This addresses part of part of
> JDK-8027545.
>
> While both the old and new code will work for Project Valhalla, the handling
> of arrays should be updated for that project, which introduces new array
> types.
>
> Deleted a lingering reference to G1ParScanPartialArrayClosure that was
> deleted long ago (JDK-8035330, JDK 9).
>
> CR:
> https://bugs.openjdk.java.net/browse/JDK-8158045
> https://bugs.openjdk.java.net/browse/JDK-8027761
> https://bugs.openjdk.java.net/browse/JDK-8027545
>
> Webrev:
> https://cr.openjdk.java.net/~kbarrett/8158045/open.00/
>
> Testing:
> tier1-3
>
> performance testing - seems to be at worst performance neutral, with a
> statistically significant 3% improvement in specjbb2015 critical-jops seen
> on one platform.
src/hotspot/share/gc/g1/g1ParScanThreadState.cpp line 253:
> 251: for (uint i = 0; i < step._ncreate; ++i) {
> 252: push_on_queue(ScannerTask(PartialArrayScanTask(from_obj)));
> 253: }
I think the intention would be clearer, if `start` returns a pair `<int intial_chunk_size, bool has_left_over>` instead
of reusing `Step` struct.
src/hotspot/share/gc/g1/g1ParScanThreadState.cpp line 498:
> 496: }
> 497:
> 498: if (G1StringDedup::is_enabled() && (klass == SystemDictionary::String_klass())) {
It's unclear to me why the additional checking with `String_klass()`.
src/hotspot/share/oops/arrayOop.hpp line 107:
> 105: // Accessors for instance variable which is not a C++ declared nonstatic
> 106: // field.
> 107: int length() const { return *length_addr(); }
Personally, I find the original code easier to read, since the identical content, `*(int*)(((char*)mem) +
length_offset_in_bytes())` exists in both `length()` and `set_length(HeapWord* mem, int length)`.
-------------
PR: https://git.openjdk.java.net/jdk/pull/90
More information about the hotspot-gc-dev
mailing list