RFR: 8027545: Improve object array chunking test in G1's copy_to_survivor_space

Kim Barrett kbarrett at openjdk.java.net
Thu Sep 10 13:34:21 UTC 2020


On Wed, 9 Sep 2020 12:36:11 GMT, Albert Mingkun Yang <ayang at openjdk.org> wrote:

>> This is rework of initial change from before the transition to git.
>> The initial RFR email is attached below.
>> 
>> The primary change is to limit the number of partial array tasks in
>> the queues for any given array.  The original change just split up an
>> array into N tasks that were all enqueued at once.  But for a large
>> array this could be a lot of tasks, leading to significant and
>> unnecessary queue expansion.  Instead we now limit the number of tasks
>> for a given array based on the number of workers, and only gradually
>> add tasks up to that limit.  This gives other threads an opportunity
>> to steal such tasks, while still keeping queue growth under control.
>> 
>> Most of the calculation for this is handled by a new helper class, so
>> this can later be shared with ParallelGC.
>> 
>> The dispatch on array klass type for has also been changed.  It now
>> affirmatively breaks Project Valhalla, rather than quietly doing
>> something that probably isn't what is actually wanted.  I'll discuss
>> with them so there is a plan for dealing with it when they take this
>> update.
>> 
>> Ran tier1-6 in mach5 and some local stress tests.
>> 
>> Performance testing was unchanged from previous, except I wasn't able
>> to reproduce the small specjbb2015 critical-jops improvement
>> previously seen on one platform.  My suspicion is that improvement was
>> a statistical abberation.
>> 
>> --- Initial RFR email ---
>> 
>> RFR: 8158045: Improve large object handling during evacuation
>> RFR: 8027761: Investigate fast-path for scanning only objects with references during gc
>> RFR: 8027545: Improve object array chunking test in G1's copy_to_survivor_space
>> 
>> Please review this change to type dispatching and handling in G1's
>> evacuation copying, in order to improve the hot paths and improve array
>> handling. This change addresses several closely co-located enhancement
>> requests; it seemed difficult to split them up in a sensible way.
>> 
>> do_copy_to_survivor_space now gets the klass of the object being copied
>> once, up front, for use in multiple places. This avoids fetching (including
>> re-decoding when compressed) the klass multiple times. This addresses part
>> of JDK-8027545.
>> 
>> Moved check for and handling of string deduplication later, only applying it
>> after the special array cases have been dealt with, since strings are not
>> arrays.  (They are header objects pointing to an array of character values.)
>> 
>> Special case typeArray, doing nothing other than the copy, since they
>> contain no oops that need to be processed.  This addresses JDK-8027761.
>> 
>> Changed handling of objArray, pushing all of the partial array tasks up
>> front, rather than processing the current chunk after pushing a single task
>> for the remaining work.  This addresses JDK-8158045.
>> 
>> As part of these, cached some more frequently accessed values in
>> G1ParScanThreadState member variables. This addresses part of part of
>> JDK-8027545.
>> 
>> While both the old and new code will work for Project Valhalla, the handling
>> of arrays should be updated for that project, which introduces new array
>> types.
>> 
>> Deleted a lingering reference to G1ParScanPartialArrayClosure that was
>> deleted long ago (JDK-8035330, JDK 9).
>> 
>> CR:
>> https://bugs.openjdk.java.net/browse/JDK-8158045
>> https://bugs.openjdk.java.net/browse/JDK-8027761
>> https://bugs.openjdk.java.net/browse/JDK-8027545
>> 
>> Webrev:
>> https://cr.openjdk.java.net/~kbarrett/8158045/open.00/
>> 
>> Testing:
>> tier1-3
>> 
>> performance testing - seems to be at worst performance neutral, with a
>> statistically significant 3% improvement in specjbb2015 critical-jops seen
>> on one platform.
>
> src/hotspot/share/gc/g1/g1ParScanThreadState.cpp line 498:
> 
>> 496:     }
>> 497:
>> 498:     if (G1StringDedup::is_enabled() && (klass == SystemDictionary::String_klass())) {
> 
> It's unclear to me why the additional checking with `String_klass()`.

Faster fast-path, avoiding some of the other calculations before going into G1StringDedup::enqueue_from_evacuation and
only there discovering we're not dealing with a String at all. But this doesn't belong in this changeset.  I'll remove
it.

> src/hotspot/share/gc/g1/g1ParScanThreadState.cpp line 253:
> 
>> 251:   for (uint i = 0; i < step._ncreate; ++i) {
>> 252:     push_on_queue(ScannerTask(PartialArrayScanTask(from_obj)));
>> 253:   }
> 
> I think the intention would be clearer, if `start` returns a pair `<int intial_chunk_size, bool has_left_over>` instead
> of reusing `Step` struct.

I reused Step intentionally, to not commit at this layer to only creating one partial task. The stepper can change its
calculations and this layer won't care.

> src/hotspot/share/oops/arrayOop.hpp line 107:
> 
>> 105:   // Accessors for instance variable which is not a C++ declared nonstatic
>> 106:   // field.
>> 107:   int length() const { return *length_addr(); }
> 
> Personally, I find the original code easier to read, since the identical content, `*(int*)(((char*)mem) +
> length_offset_in_bytes())` exists in both `length()` and `set_length(HeapWord* mem, int length)`.

I should have included the 2-arg set_length in the changes to make everything consistent.  I'll revise in the next
update.

-------------

PR: https://git.openjdk.java.net/jdk/pull/90



More information about the hotspot-gc-dev mailing list