Issues with loop unrolling: better pinned node
Radosław Smogura
mail at smogura.eu
Fri Aug 6 18:33:23 UTC 2021
Hi Vladimir,
So for this case, the code checks whether the Phi saw itself and, if not, zeroes the merge width:
// This restriction is temporarily necessary to ensure termination:
if (!saw_self && adr_type() == TypePtr::BOTTOM) merge_width = 0;
Later, MemNode::optimize_memory_chain is called [1], but it does not operate on wide memory.
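To make the check above concrete, here is a simplified toy model in Java. It is only a sketch and not the HotSpot code: the classes Node, MergeMem and Phi, the helper loopPhi, and the field "wide" (standing in for adr_type() == TypePtr::BOTTOM) are hypothetical names I made up for illustration. It also assumes that "saw self" means one of the Phi's MergeMem inputs uses the Phi itself as its base memory, which should be verified against cfgnode.cpp.

```java
import java.util.ArrayList;
import java.util.List;

public class PhiSliceSketch {
    /** Hypothetical stand-in for an IR node; not HotSpot's Node class. */
    static class Node {
        final List<Node> inputs = new ArrayList<>();
    }

    /** Toy MergeMem: input 0 plays the role of base memory, the rest are slices. */
    static class MergeMem extends Node {
        MergeMem(Node baseMemory, Node... slices) {
            inputs.add(baseMemory);
            for (Node s : slices) {
                inputs.add(s);
            }
        }
        Node baseMemory() { return inputs.get(0); }
        int width() { return inputs.size(); }
    }

    /** Toy memory Phi; 'wide' models adr_type() == TypePtr::BOTTOM. */
    static class Phi extends Node {
        final boolean wide;
        Phi(boolean wide) { this.wide = wide; }
    }

    /** Computes the merge width, applying the restriction quoted above. */
    static int mergeWidth(Phi phi) {
        int mergeWidth = 0;
        boolean sawSelf = false;
        for (Node in : phi.inputs) {
            if (in instanceof MergeMem) {
                MergeMem mm = (MergeMem) in;
                mergeWidth = Math.max(mergeWidth, mm.width());
                // Assumption: "saw self" = a MergeMem whose base memory is this Phi.
                sawSelf |= (mm.baseMemory() == phi);
            }
        }
        // Mirrors: if (!saw_self && adr_type() == TypePtr::BOTTOM) merge_width = 0;
        if (!sawSelf && phi.wide) {
            mergeWidth = 0;
        }
        return mergeWidth;
    }

    /** Builds a two-input Phi: entry memory plus a MergeMem with two slices. */
    static Phi loopPhi(boolean wide, boolean baseIsSelf) {
        Node entry = new Node();
        Phi phi = new Phi(wide);
        Node base = baseIsSelf ? phi : entry;
        phi.inputs.add(entry);
        phi.inputs.add(new MergeMem(base, new Node(), new Node()));
        return phi;
    }

    public static void main(String[] args) {
        System.out.println(mergeWidth(loopPhi(true, false)));  // wide, no self reference
        System.out.println(mergeWidth(loopPhi(true, true)));   // wide, self reference
        System.out.println(mergeWidth(loopPhi(false, false))); // narrow, no self reference
    }
}
```

In this model only the wide Phi without a self reference ends up with a merge width of zero, which is the situation where the Phi is left unsplit.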
I opened a draft PR with the code I am testing (I use an if to split the polluted cases, so that Unsafe does not get both null and non-null at the same base).
And here's a fresh XML dump of the IR [2].
Kind regards,
Rado
[1] https://github.com/openjdk/panama-vector/blob/master/src/hotspot/share/opto/cfgnode.cpp#L2327
[2] https://drive.google.com/file/d/1v8Lfqjc-k22vvA0HbIYdWDbBg3G2ZPcL/view?usp=sharing
________________________________
From: Vladimir Ivanov <vladimir.x.ivanov at oracle.com>
Sent: Friday, August 6, 2021 19:41
To: Radosław Smogura <mail at smogura.eu>; panama-dev at openjdk.java.net <panama-dev at openjdk.java.net>
Subject: Re: Issues with loop unrolling: better pinned node
Can you double-check why the Phi is not split through the MemoryMerge
nodes [1]?
I suspect the self-referential nature of the IR shape (it's a memory state
inside the loop after all) poses some challenges which aren't evident in
the screenshot you made.
Best regards,
Vladimir Ivanov
[1]
https://github.com/openjdk/panama-vector/blob/master/src/hotspot/share/opto/cfgnode.cpp#L2187
On 06.08.2021 18:39, Radosław Smogura wrote:
> Hi all,
>
> I've found that even if we get rid of the barriers, the loop can't get unrolled, and unneeded code stays inside it.
>
> I've found this graph and I wonder whether it's optimal. In particular, the Load of the ByteBuffer index / hb comes from a Phi; could it be attached to the initial memory?
>
> Here's a picture https://drive.google.com/file/d/1G7ZN0xHOVIVHmZ_5TTIUdm3F30okAzvO/view?usp=sharing
>
>
> And sample code
>
> protected void copyMemory(ByteBuffer in, ByteBuffer out) {
>     var limit = SPECIES.loopBound(in.limit());
>     for (int i = 0; i < limit; i += SPECIES.vectorByteSize()) {
>         final var v = ByteVector.fromByteBuffer(SPECIES, in, i, ByteOrder.nativeOrder());
>         v.intoByteBuffer(out, i, ByteOrder.nativeOrder());
>     }
> }
>
> Kind regards,
> Rado
>