RFR: 8308994: C2: Re-implement experimental post loop vectorization [v2]
Emanuel Peter
epeter at openjdk.org
Mon Jul 3 14:40:18 UTC 2023
On Mon, 3 Jul 2023 07:37:22 GMT, Pengfei Li <pli at openjdk.org> wrote:
>> ## TL;DR
>>
>> This patch completely re-implements C2's experimental post loop vectorization for better stability, maintainability and performance. Compared with the original implementation, the new one adds a standalone loop phase to C2's ideal loop phases and can vectorize more post loops. This patch deletes the original implementation and all code related to multi-versioned post loops. More details can be found in the document posted as a reply in this pull request.
>
> Pengfei Li has updated the pull request incrementally with one additional commit since the last revision:
>
> Address part of comments from Emanuel
src/hotspot/share/opto/vmaskloop.cpp line 978:
> 976:
> 977: // Update loop increment/decrement to the vector mask true count
> 978: Node* true_cnt = new VectorMaskTrueCountNode(root_vmask, TypeInt::INT);
This seems expensive to compute inside the loop. Is there a way we could move it outside the loop? If we do take the backedge, then we know we have advanced by the full `stride`, so only the final iteration can have a partially set mask, right?
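
To make the question concrete, here is a minimal scalar sketch of the idea. It is not C2 code: `kStride`, `true_count`, `run_with_true_count`, `run_with_stride` and `check_equivalence` are hypothetical names, and the model assumes the mask simply covers the lanes `i .. min(i + stride, limit) - 1`. Under that assumption, stepping by the mask's true count and stepping by the constant stride execute the same iterations.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical scalar model of a masked post loop; not C2 code.
// kStride stands in for the number of vector lanes (an assumption).
constexpr int kStride = 8;

// Number of active lanes for the iteration starting at i, with i < limit.
// This models taking the true count of the loop mask.
int true_count(int i, int limit) {
  return std::min(kStride, limit - i);
}

// Variant A: advance by the true count, recomputed on every iteration.
std::vector<int> run_with_true_count(int limit) {
  std::vector<int> starts;
  for (int i = 0; i < limit; i += true_count(i, limit)) {
    starts.push_back(i);  // record where each masked iteration begins
  }
  return starts;
}

// Variant B: advance by the constant stride; nothing is recomputed.
std::vector<int> run_with_stride(int limit) {
  std::vector<int> starts;
  for (int i = 0; i < limit; i += kStride) {
    starts.push_back(i);
  }
  return starts;
}

// Checks that both variants visit the same iteration starts
// for every limit in [0, max_limit].
void check_equivalence(int max_limit) {
  for (int limit = 0; limit <= max_limit; ++limit) {
    assert(run_with_true_count(limit) == run_with_stride(limit));
  }
}
```

In this model the two variants differ only in the value of `i` after the loop exits: variant A ends with `i == limit`, while variant B may overshoot `limit`. Whether that difference matters in the real loop depends on how the induction variable is used after the exit.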
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/14581#discussion_r1250973547