RFR: 8326962: C2 SuperWord: cache VPointer
Johan Sjölen
jsjolen at openjdk.org
Tue Apr 2 14:00:11 UTC 2024
On Tue, 2 Apr 2024 09:04:45 GMT, Emanuel Peter <epeter at openjdk.org> wrote:
> This is a subtask of [JDK-8315361](https://bugs.openjdk.org/browse/JDK-8315361).
>
> Parsing `VPointer` currently happens all over SuperWord. And often in quadratic loops, where we compare all-with-all loads/stores.
>
> I propose to cache the `VPointer`s, then we can do a constant-time cache lookup rather than parsing the pointer subgraph every time.
>
> There are now only a few cases where we cannot use the cached `VPointer`:
> - `SuperWord::unrolling_analysis`: we have no `VLoopAnalyzer`, and so no submodules like `VLoopPointers`. We don't need to cache, since we only iterate over the loop body once, and create only a single `VPointer` per memop.
> - `SuperWord::output`: when we have a `Load`, and try to bypass `StoreVector` nodes. The `StoreVector` nodes are new, and so we have no cached `VPointer` for them. This could be fixed somehow, but I don't want to deal with it now. I intend to refactor `SuperWord::output` soon, and can look into options at that point (either I bypass before we insert the vector nodes, or I remember what scalar memop the vector was created from, and then get the cached pointer this way).
>
> This changeset is also a preparation step for [JDK-8325155](https://bugs.openjdk.org/browse/JDK-8325155). I will have a list of pointers, and sort them such that creating adjacent refs is much more efficient.
>
> **Benchmarking SuperWord Compile Time**
>
> I use the same benchmark from https://github.com/openjdk/jdk/pull/18532.
>
> On master:
>
> C2 Compile Time: 56.816 s
> IdealLoop: 56.604 s
> AutoVectorize: 56.192 s
>
>
> With this patch:
>
> C2 Compile Time: 49.719 s
> IdealLoop: 49.509 s
> AutoVectorize: 49.106 s
>
>
> This saves us about `7 sec`, which is significant. I will have to see what it effect it has once we also apply https://github.com/openjdk/jdk/pull/18532, but I think the combined effect will be very significant.
Hi Emanuel,
I've some general questions regarding naming and Arena usage, I hope you don't mind some runtime team input.
src/hotspot/share/opto/superword.hpp line 499:
> 497:
> 498: // VLoopDependencyGraph Accessors
> 499: const VPointer& get_pointer(const MemNode* mem) const {
This can't just be called `vpointer` or `vpointer_of`?
src/hotspot/share/opto/vectorization.hpp line 458:
> 456: class VLoopPointers : public StackObj {
> 457: private:
> 458: Arena* _arena;
Will the pointer ever change? Could potentially change this to a reference.
src/hotspot/share/opto/vectorization.hpp line 483:
> 481:
> 482: void compute_and_cache();
> 483: const VPointer& get(const MemNode* mem) const;
Questioning naming of this also :-).
-------------
PR Review: https://git.openjdk.org/jdk/pull/18577#pullrequestreview-1973893348
PR Review Comment: https://git.openjdk.org/jdk/pull/18577#discussion_r1547931080
PR Review Comment: https://git.openjdk.org/jdk/pull/18577#discussion_r1547937229
PR Review Comment: https://git.openjdk.org/jdk/pull/18577#discussion_r1547934364
More information about the hotspot-compiler-dev
mailing list