RFR: JDK-8308994: C2: Re-implement experimental post loop vectorization
Emanuel Peter
epeter at openjdk.org
Mon Jun 26 14:59:25 UTC 2023
On Wed, 21 Jun 2023 08:24:19 GMT, Pengfei Li <pli at openjdk.org> wrote:
> ## TL;DR
>
> This patch completely re-implements C2's experimental post loop vectorization for better stability, maintainability and performance. Compared with the original implementation, this new implementation adds a standalone loop phase in C2's ideal loop phases and can vectorize more post loops. The original implementation and all code related to multi-versioned post loops are deleted in this patch. More details about this patch can be found in the document replied in this pull request.
A few more comments, did not have much time today. More tomorrow ;)
src/hotspot/share/opto/vmaskloop.cpp line 403:
> 401: int opc = node->Opcode();
> 402: BasicType bt = elem_bt(node);
> 403: int vlen = Matcher::max_vector_size(bt);
Theoretically, different `bt` can have different `Matcher::vector_width_in_bytes`. So `vlen` would not always correspond to `MaxVectorSize / element_size`. It just means that here you would end up checking for a shorter length than maybe expected? But maybe that is intended, it depends on how you generate the nodes later.
src/hotspot/share/opto/vmaskloop.cpp line 424:
> 422: int vopc = 0;
> 423: if (node->is_Mem()) {
> 424: vopc = node->is_Store() ? Op_StoreVectorMasked : Op_LoadVectorMasked;
Mabye just for good measure: add an assert that it can only be a Load or a Store.
src/hotspot/share/opto/vmaskloop.cpp line 429:
> 427: }
> 428: if (vopc == 0 ||
> 429: !Matcher::match_rule_supported_vector_masked(vopc, vlen, bt)) {
Do all nodes need to be maskable? Or is it enough if only load/store are maskable?
src/hotspot/share/opto/vmaskloop.cpp line 442:
> 440: // nodes to bail out for complex loops
> 441: bool VectorMaskedLoop::analyze_loop_body_nodes() {
> 442: VectorSet tracked(_arena);
This is probably a good case where you could use `ResourceMark rm;` and just put the `VectorSet` on the default resource arena.
src/hotspot/share/opto/vmaskloop.cpp line 465:
> 463: for (int idx = 0; idx < n_nodes; idx++) {
> 464: Node* node = _body_nodes.at(idx);
> 465: if ((node->is_Mem() && node->as_Mem()->is_Store())) {
Suggestion:
if ((node->is_Mem() && node->is_Store())) {
src/hotspot/share/opto/vmaskloop.cpp line 474:
> 472: if (!in_body(out)) {
> 473: trace_msg(node, "Node has out-of-loop user found");
> 474: return false;
Can this be handled in the future with a extract node? I guess you would have to extract it from a variable element, as the last iteration is not always the same.
-------------
Changes requested by epeter (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/14581#pullrequestreview-1497718513
PR Review Comment: https://git.openjdk.org/jdk/pull/14581#discussion_r1241683613
PR Review Comment: https://git.openjdk.org/jdk/pull/14581#discussion_r1241665519
PR Review Comment: https://git.openjdk.org/jdk/pull/14581#discussion_r1241668030
PR Review Comment: https://git.openjdk.org/jdk/pull/14581#discussion_r1242324850
PR Review Comment: https://git.openjdk.org/jdk/pull/14581#discussion_r1242330208
PR Review Comment: https://git.openjdk.org/jdk/pull/14581#discussion_r1242333503
More information about the hotspot-dev
mailing list