RFR: 8332163: C2 SuperWord: refactor PacksetGraph and SuperWord::output into VTransformGraph [v2]

Tue Jul 2 14:48:56 UTC 2024

On Tue, 2 Jul 2024 09:46:37 GMT, Christian Hagedorn <chagedorn at openjdk.org> wrote:

>> Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   updates for Christian
>
> src/hotspot/share/opto/superword.cpp line 1883:
> 
>> 1881: // Apply the vectorization, i.e. we irreversibly edit the C2 graph. At this point, all
>> 1882: // correctness and profitability checks have passed, and the graph was successfully scheduled.
>> 1883: void VTransformGraph::apply() {
> 
> Maybe also assert here that the schedule is not empty?

Can do, good idea :)

> src/hotspot/share/opto/superword.cpp line 1896:
> 
>> 1894:   C->print_method(PHASE_AUTO_VECTORIZATION1_BEFORE_APPLY, 4, cl());
>> 1895: 
>> 1896:   apply_memops_reordering_with_schedule();
> 
> How about having a separate schedule class that creates the schedule and also offers this method? It could make the class `VTransformGraph` lighter.

I only renamed `apply_memops_reordering_with_schedule` from a pre-existing method. I could consider transforming this into a class in a future RFE.

> src/hotspot/share/opto/superword.cpp line 1899:
> 
>> 1897:   C->print_method(PHASE_AUTO_VECTORIZATION2_AFTER_REORDER, 4, cl());
>> 1898: 
>> 1899:   adjust_pre_loop_limit_to_align_main_loop_vectors();
> 
> Same thought here: Could this be a separate class? It looks like adjusting the pre loop is not really something the graph should know about. We are rather using information about the graph (i.e. the graph could be passed to the schedule class).

We could do this in a separate RFE, yes. I only moved it over from SuperWord.

> src/hotspot/share/opto/superword.cpp line 1910:
> 
>> 1908: // ordered according to the _schedule. This means that all packed memops are consecutive
>> 1909: // in the memory graph after the reordering.
>> 1910: void VTransformGraph::apply_memops_reordering_with_schedule() const {
> 
> General thought here (was like that before and could also be done separately at some point): Steps (1) - (3) could probably be moved to separate methods.

Could be done in a separate RFE. Also we can mobe the code over to `vectorization.hpp/cpp`. It would at that point even become a class. Would have to experiment to see if that really makes the code easier to read / maintain.

> src/hotspot/share/opto/superword.cpp line 2011:
> 
>> 2009:   // generated def (input) nodes when we are generating the use nodes in "apply".
>> 2010:   int length = _vtnodes.length();
>> 2011:   GrowableArray<Node*> vnode_idx_to_transformed_node(length, length, nullptr);
> 
> Also applies to parameter names of the `apply()` methods.
> Suggestion:
> 
>   GrowableArray<Node*> vtnode_idx_to_transformed_node(length, length, nullptr);

good catch

> src/hotspot/share/opto/superword.cpp line 3041:
> 
>> 3039:   VectorSet vtn_dependencies; // Shared, but cleared for every vtnode.
>> 3040:   build_edges_for_vector_vtnodes(vtn_dependencies);
>> 3041:   build_edges_for_scalar_vtnodes(vtn_dependencies);
> 
> This is a little confusing since you are not only adding edges but also new vtnodes. Maybe you find a better name. But I suggest to at least add some comments at these methods to mention that. Otherwise, it reads like: Build vector vtnodes, then scalar vtnodes, then connect them somehow (without new nodes being added).

But this is exactly the idea!

> src/hotspot/share/opto/superword.cpp line 3047:
> 
>> 3045:   for (int i = 0; i < _packset.length(); i++) {
>> 3046:     Node_List* pack = _packset.at(i);
>> 3047:     VTransformVectorNode* vtn = make_vtnode_for_pack(pack);
> 
> For consistency, you can name this function `make_vector_vtnode_for_pack()` since you are only calling it to create vector vtnodes.

Good idea.

> src/hotspot/share/opto/superword.cpp line 3049:
> 
>> 3047:     VTransformVectorNode* vtn = make_vtnode_for_pack(pack);
>> 3048:     for (uint k = 0; k < pack->size(); k++) {
>> 3049:       set_vtnode(pack->at(k), vtn);
> 
> Maybe rename this to `map_node_to_vtnode()` to make it more explicit.

Good idea.

> src/hotspot/share/opto/superword.cpp line 3143:
> 
>> 3141: 
>> 3142:   if (p0->is_Load()) {
>> 3143:     vtn = new (_graph.arena()) VTransformLoadVectorNode(_graph, pack_size);
> 
> You use `_graph.arena()` quite often. Maybe you want to create a new `_arena` field for easier access.

I'll see if I can do this directly with the `new` operator...

> src/hotspot/share/opto/superword.cpp line 3290:
> 
>> 3288: }
>> 3289: 
>> 3290: void SuperWordVTransformBuilder::add_dependencies_of_node_to_vtn(Node*n, VTransformNode* vtn, VectorSet& vtn_dependencies) {
> 
> Suggestion:
> 
> void SuperWordVTransformBuilder::add_dependencies_of_node_to_vtnode(Node*n, VTransformNode* vtn, VectorSet& vtn_dependencies) {

good idea, better for consistency

> src/hotspot/share/opto/superword.hpp line 644:
> 
>> 642: 
>> 643: public:
>> 644:   SuperWordVTransformBuilder(const PackSet& packset,
> 
> Maybe add a comment here that we pass in an empty graph and then start to build/initialize it.

Added some comments and asserts.

> src/hotspot/share/opto/superword.hpp line 651:
> 
>> 649:       _graph(graph)
>> 650:   {
>> 651:     build_vtransform();
> 
> Could also just be named `build()`

Good idea.

> src/hotspot/share/opto/superword.hpp line 688:
> 
>> 686: 
>> 687:   // Ensure that the main loop vectors are aligned by adjusting the pre loop limit.
>> 688:   void determine_mem_ref_and_aw_for_main_loop_alignment();
> 
> This and the next line can be removed since they are defined in `VTransformGraph`.

Good catch!

> src/hotspot/share/opto/vectorization.hpp line 1326:
> 
>> 1324: };
>> 1325: 
>> 1326: // VTransform
> 
> I'm wondering if all these class together with vtnodes should go to a separate vtransform file?

Good idea!

> src/hotspot/share/opto/vectorization.hpp line 1356:
> 
>> 1354: 
>> 1355: // Status output from a VTransformNode::apply
>> 1356: class VTransformApplyStatus {
> 
> "Status" sounds like an error code or something like that. But it rather looks like a node transformation result. How about naming this something like `VTransformationResult`, `VNodeTransformationResult` or `VTNodeTransformation?

Renamed it to `VTransformApplyResult`

> src/hotspot/share/opto/vectorization.hpp line 1387:
> 
>> 1385: };
>> 1386: 
>> 1387: class VTransformGraph : public StackObj {
> 
> I was looking for some summary/guidance here but only found it above on L1327 which is hard to find once this is in. How about putting the comments here instead?

I will make a summary here. But I also like having a more broad explanation at the beginning of the `vtransform.hpp` file I'll move all this code to.

> src/hotspot/share/opto/vectorization.hpp line 1457:
> 
>> 1455:   IdealLoopTree* lpt()        const { return _vloop.lpt(); }
>> 1456:   CountedLoopNode* cl()       const { return _vloop.cl(); }
>> 1457:   PhiNode* iv()               const { return _vloop.iv(); }
> 
> Unused and can be removed
> Suggestion:

Nice catch, used to be necessary somewhere at some point in my refactoring I guess 😅

> src/hotspot/share/opto/vectorization.hpp line 1461:
> 
>> 1459:   bool in_bb(const Node* n)   const { return _vloop.in_bb(n); }
>> 1460: 
>> 1461:   // VLoopVPointer accessors
> 
> Suggestion:
> 
>   // VLoopVPointers accessors

Good catch

> src/hotspot/share/opto/vectorization.hpp line 1541:
> 
>> 1539:   VTransformNode* out(int i) const { return _out.at(i); }
>> 1540: 
>> 1541:   bool has_req_or_dep() const {
> 
> To match `add_dependency`, I suggest to go with:
> Suggestion:
> 
>   bool has_req_or_dependency() const {

Good idea.

> src/hotspot/share/opto/vectorization.hpp line 1555:
> 
>> 1553:   virtual VTransformReductionVectorNode* isa_ReductionVector() { return nullptr; }
>> 1554: 
>> 1555:   virtual VTransformApplyStatus apply(const VLoopAnalyzer& vloop_analyzer,
> 
> "apply" is quite a generic name. We've discussed about this offline. What about naming it `transform()`? This would match the IGVN naming convention of transforming nodes. Moreover, you already have `find_transformed_input()` which would also support the name `transform()` over `apply()`.

Let's keep discussing this offline then ;)

> src/hotspot/share/opto/vectorization.hpp line 1568:
> 
>> 1566: };
>> 1567: 
>> 1568: class VTransformScalarNode : public VTransformNode {
> 
> Not sure if you left out comments above the vtnode classes on purpose. But maybe a short summary about them could help readers of the code (if the class itself is not already self-explanatory).

Ok, will add some for each class.

> src/hotspot/share/opto/vectorization.hpp line 1612:
> 
>> 1610: };
>> 1611: 
>> 1612: 
> 
> Suggestion:

good catch

> src/hotspot/share/opto/vectorization.hpp line 1682:
> 
>> 1680:   const VTransformBoolTest _test;
>> 1681: public:
>> 1682:   VTransformBoolVectorNode(VTransformGraph& graph, int number_of_nodes, VTransformBoolTest test) :
> 
> `number_of_nodes` should be a `uint` to match the `VTransformVectorNode` constructor which takes a `uint`. Same for other classes.

good idea

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/19719#discussion_r1662485099
PR Review Comment: https://git.openjdk.org/jdk/pull/19719#discussion_r1662487904
PR Review Comment: https://git.openjdk.org/jdk/pull/19719#discussion_r1662489348
PR Review Comment: https://git.openjdk.org/jdk/pull/19719#discussion_r1662494011
PR Review Comment: https://git.openjdk.org/jdk/pull/19719#discussion_r1662496144
PR Review Comment: https://git.openjdk.org/jdk/pull/19719#discussion_r1662497496
PR Review Comment: https://git.openjdk.org/jdk/pull/19719#discussion_r1662513123
PR Review Comment: https://git.openjdk.org/jdk/pull/19719#discussion_r1662516363
PR Review Comment: https://git.openjdk.org/jdk/pull/19719#discussion_r1662672161
PR Review Comment: https://git.openjdk.org/jdk/pull/19719#discussion_r1662559743
PR Review Comment: https://git.openjdk.org/jdk/pull/19719#discussion_r1662575581
PR Review Comment: https://git.openjdk.org/jdk/pull/19719#discussion_r1662581388
PR Review Comment: https://git.openjdk.org/jdk/pull/19719#discussion_r1662582436
PR Review Comment: https://git.openjdk.org/jdk/pull/19719#discussion_r1662603662
PR Review Comment: https://git.openjdk.org/jdk/pull/19719#discussion_r1662666543
PR Review Comment: https://git.openjdk.org/jdk/pull/19719#discussion_r1662589918
PR Review Comment: https://git.openjdk.org/jdk/pull/19719#discussion_r1662615172
PR Review Comment: https://git.openjdk.org/jdk/pull/19719#discussion_r1662616439
PR Review Comment: https://git.openjdk.org/jdk/pull/19719#discussion_r1662618059
PR Review Comment: https://git.openjdk.org/jdk/pull/19719#discussion_r1662627046
PR Review Comment: https://git.openjdk.org/jdk/pull/19719#discussion_r1662628056
PR Review Comment: https://git.openjdk.org/jdk/pull/19719#discussion_r1662647208
PR Review Comment: https://git.openjdk.org/jdk/pull/19719#discussion_r1662657844