RFR: 8027545: Improve object array chunking test in G1's copy_to_survivor_space

Thu Sep 10 21:49:04 UTC 2020

On Thu, 10 Sep 2020 14:05:25 GMT, Kim Barrett <kbarrett at openjdk.org> wrote:

>> src/hotspot/share/gc/g1/g1ParScanThreadState.cpp line 68:
>> 
>>> 66:     _old_gen_is_full(false),
>>> 67:     _partial_objarray_chunk_size(ParGCArrayScanChunk),
>>> 68:     _partial_array_stepper(n_workers),
>> 
>> What do you think about saving the chunk size in the stepper instead? Then we don't need to pass it in to `start()` and
>> `next()`. To avoid needing it for the call to `oop_iterate_range()` we could instead have the `Step` include both the
>> start and end index.
>
> See above discussion about naming and factoring.  The same stepper can support multiple array types if the chunk size
> is external to the stepper.

It can, but I think I would prefer multiple "steppers" for that case. This is of course just a matter of taste and I'm
fine with leaving the chunk size external.

>> src/hotspot/share/gc/g1/g1ParScanThreadState.hpp line 84:
>> 
>>> 82:   bool _old_gen_is_full;
>>> 83:   // Size (in elements) of a partial objArray task chunk.
>>> 84:   int _partial_objarray_chunk_size;
>> 
>> I would prefer to skip the "obj"-part here to have more consistent naming or, as mentioned above, include it in the
>> stepper instead.
>
> Some of the naming and factoring I've done is forward looking.
> 
> I think we should consider splitting the copying part as well as the scanning, and chunking the copy of typeArrays. For
> example, JDK-8031565 suggests copying large typeArrays as part of termination waiting; I think splitting them into
> partial array tasks to use the normal parallelism in the framework is better than some new side channel. The chunk size
> for that should probably be substantially larger (and depend on the element size) than for objArray.  Also, Project
> Valhalla is going to add new kinds of arrays that are neither objArray nor typeArray. We'll want to split them too. The
> same splitting calculations can apply, even though the chunk size may be different (and probably depends on the element
> klass).

Ok, keeping the "obj"-part sounds reasonable.

>> src/hotspot/share/gc/g1/g1ParScanThreadState.hpp line 165:
>> 
>>> 163: private:
>>> 164:   inline void do_partial_array(PartialArrayScanTask task);
>>> 165:   inline void start_partial_objArray(G1HeapRegionAttr dest_dir, oop from, oop to);
>> 
>> Same here, drop "obj" for consistent naming and avoiding the camel-case.
>
> See response above about the name of _partial_objarray_chunk_size.  But I should probably be consistent about objarray
> vs objArray.  Since objArray is what it is over in runtime oop-land I'm going to go with that.

Again, we have different taste. We very seldom go with camel-case in members and especially in the middle it looks
strange. Looking through the GC-code I find mostly `objarray` or `obj_array`

>> src/hotspot/share/gc/shared/partialArrayTaskStepper.cpp line 34:
>> 
>>> 32:   // that maximizes available parallelism.
>>> 33:   return n_workers;
>>> 34: }
>> 
>> In preparation for a more advanced logic for the limit, or why not just use the input at the call-site?
>
> Being noncommittal about whether something more "clever" could or should be done.

👍

>> src/hotspot/share/gc/shared/partialArrayTaskStepper.hpp line 57:
>> 
>>> 55:     int _index;                 // Array index for the step.
>>> 56:     uint _ncreate;              // Number of new tasks to create.
>>> 57:   };
>> 
>> I wouldn't mind having getters for those, but it's not a hard request :)
>
> Step is intended to be a trivial data carrier.  A (HotSpot) Pair or std::pair (if we used stdlib) would do, except I
> like named data items.

I prefer them named as well 👍

>> src/hotspot/share/gc/shared/partialArrayTaskStepper.inline.hpp line 103:
>> 
>>> 101:   uint ncreate = MIN2(_task_fannout, MIN2(remaining_tasks, _task_limit + 1) - pending);
>>> 102:   Step result = { start, ncreate };
>>> 103:   return result;
>> 
>> Similar to my comment above, what do you think about trying to write some test for this to verify we never get to many
>> "tasks".
>
> The asserts in the stepper's next() after the increment of to's length verify that we haven't overrun.  I haven't
> thought of a way to verify the algorithm doesn't generate too few tasks though.  Well, other than getting crashes
> because some array elements didn't get processed.  But maybe you mean unit tests?  I will try to write some; I should
> have done so already.

Should have been a bit more clear. I'm talking about unit tests, so if you plan writing some that's great.

-------------

PR: https://git.openjdk.java.net/jdk/pull/90