<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=us-ascii">
<style type="text/css" style="display:none;"> P {margin-top:0;margin-bottom:0;} </style>
</head>
<body dir="ltr">
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);" class="elementToProof">
<span style="font-size:12pt;margin:0px" class="ContentPasted0">Hi Pengfei,</span>
<div style="font-size:12pt;margin:0px"><br class="ContentPasted0">
</div>
<div style="font-size:12pt;margin:0px" class="ContentPasted0">Great to hear that you are spending time on SuperWord / the auto-vectorization in HotSpot. I agree with your assessment that SuperWord is currently unnecessarily convoluted and carries a good bit of
legacy code. It would be nice if we could make the code more modular and extensible for future improvements.</div>
<div style="font-size:12pt;margin:0px"><br class="ContentPasted0">
</div>
</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);" class="elementToProof">
<span style="font-size:12pt;margin:0px" class="ContentPasted0">Is there a chance that we could see the draft already?</span><br>
</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);" class="elementToProof">
<span style="font-size:12pt;margin:0px" class="ContentPasted0"><br>
</span></div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);" class="elementToProof">
<span style="font-size:12pt;margin:0px" class="ContentPasted0 ContentPasted1">I am also thinking about extending SuperWord in the future. I am currently trying to remove as much dead code and fix as many bugs as possible to clear the way. I will have to see how much time
I get to spend on extensions. Here you can find some of my ideas (towards the end of the PR description):<br>
</span></div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);" class="elementToProof">
<span style="font-size:12pt;margin:0px" class="ContentPasted0 ContentPasted1"><a href="https://github.com/openjdk/jdk/pull/14096" id="LPNoLPOWALinkPreview">https://github.com/openjdk/jdk/pull/14096</a></span></div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);" class="elementToProof">
<span style="font-size:12pt;margin:0px" class="ContentPasted0"><br>
</span></div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);" class="elementToProof">
<span style="font-size:12pt;margin:0px" class="ContentPasted0">It would be good to coordinate a bit so that we can ensure our plans fit together.</span></div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);" class="elementToProof">
<span style="font-size:12pt;margin:0px" class="ContentPasted0"><br>
</span></div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);" class="elementToProof">
Best regards,<br>
Emanuel</div>
<div id="appendonsend"></div>
<hr style="display:inline-block;width:98%" tabindex="-1">
<div id="divRplyFwdMsg" dir="ltr"><font face="Calibri, sans-serif" style="font-size:11pt" color="#000000"><b>From:</b> Pengfei Li &lt;Pengfei.Li@arm.com&gt;<br>
<b>Sent:</b> Monday, May 29, 2023 05:12<br>
<b>To:</b> hotspot-compiler-dev@openjdk.java.net &lt;hotspot-compiler-dev@openjdk.java.net&gt;<br>
<b>Cc:</b> epeter@openjdk.org &lt;epeter@openjdk.org&gt;; Bhateja, Jatin &lt;jatin.bhateja@intel.com&gt;; nd &lt;nd@arm.com&gt;<br>
<b>Subject:</b> [Heads-up] JDK-8308994: C2: Re-implement experimental post loop vectorization</font>
<div> </div>
</div>
<div class="BodyFragment"><font size="2"><span style="font-size:11pt;">
<div class="PlainText">Hi,<br>
<br>
I'm writing to let you know that I just filed "JDK-8308994: C2: Re-implement experimental post loop vectorization".<br>
<br>
[BACKGROUND]<br>
<br>
Current post loop vectorization in the C2 compiler has a long history. It was first implemented in JDK-8153998 in 2016 as an experimental feature to support x86 AVX-512 vector masks. Due to insufficient maintenance, it was broken for a very long time.
Last year, I took over JDK-8183390 to fix and re-enable this feature. Several issues were fixed and AArch64 SVE vector mask support was added in the meantime. We (Arm) proposed to make post loop vectorization non-experimental in future JDK releases, so early
this year (2023) we tested it extensively, but we found more problems.<br>
<br>
[PROBLEMS]<br>
<br>
Problems include stability, maintainability and performance.<br>
<br>
1) Stability issues<br>
Multiple C2 crashes and miscompilation issues were filed in JBS, including JDK-8301657, JDK-8301904, JDK-8301944, JDK-8304774, JDK-8308949 and perhaps more.<br>
<br>
2) Maintainability issue<br>
The original implementation was based on multi-versioned post loops, and its logic was mixed into SuperWord. But the algorithm for post loop vectorization is actually *not* SLP. As more and more new features are added to SuperWord, the legacy code for post loop vectorization
is becoming increasingly difficult to maintain.<br>
<br>
3) Performance issue<br>
Post loop vectorization was expected to improve performance for vectorizable loops with small trip counts, but JMH tests show that it does not. A main reason is that the vector masked post loop is skipped (not executed) when the loop trip count is small, due to the zero-trip
guard of the main loop. That is a major defect of the current multi-versioning framework. (See JDK-8307084 for more details.)<br>
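<br>
As a rough illustration of the effect described above, the following is a minimal Java model, not actual C2 code. It assumes a simplified layout in which the masked vector post loop sits behind the zero-trip guard of the main loop; the class name, the method name and the vector length VLEN are hypothetical and chosen only for this sketch.<br>
<pre>
```java
// Simplified model of the loop layout (an assumption for illustration, not C2 code).
class ZeroTripGuardModel {
    // Assumed vector length in elements (hypothetical value).
    static final int VLEN = 8;

    // Returns how many elements the modeled masked vector post loop handles
    // for a loop with trip count n.
    static int elementsHandledByMaskedPostLoop(int n) {
        int i = 0;
        int maskedHandled = 0;
        // Zero-trip guard of the main loop: only entered if one full vector fits.
        if (i + VLEN <= n) {
            // Main vectorized loop: processes full vectors.
            while (i + VLEN <= n) {
                i += VLEN;
            }
            // Masked vector post loop: handles the remaining tail in one masked step.
            maskedHandled = n - i;
            i = n;
        }
        // Scalar post loop: handles whatever is still left.
        while (i < n) {
            i++;
        }
        return maskedHandled;
    }

    public static void main(String[] args) {
        // Small trip count: the guard fails, so the masked post loop never runs.
        System.out.println(elementsHandledByMaskedPostLoop(5));  // prints 0
        // Larger trip count: the masked post loop handles the 5-element tail.
        System.out.println(elementsHandledByMaskedPostLoop(13)); // prints 5
    }
}
```
</pre>
In this model, any trip count below VLEN falls through to the scalar post loop, which is exactly the small-iteration case the feature was meant to speed up.<br>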
<br>
[ACTIONS]<br>
<br>
For better stability, maintainability and performance, we now propose to deprecate the current multi-versioning framework and completely re-implement the experimental post loop vectorization, for both x86 AVX-512 and AArch64 SVE. Our new proposal is to add a standalone
ideal loop phase (outside SuperWord) that performs the vector mask transformation directly on the original scalar post loop.<br>
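<br>
To give a feel for what a masked version of a scalar post loop computes, here is a minimal sketch in plain Java that emulates vector lanes and a lane mask with ordinary arrays. It is not the draft patch and does not use any C2 or Vector API calls; the class name, the method name, the element-wise add and the vector length VLEN are assumptions made only for this illustration.<br>
<pre>
```java
// Scalar emulation of a vector-masked post loop (illustrative assumption, not the patch).
class MaskedPostLoopSketch {
    // Assumed vector length in elements (hypothetical value).
    static final int VLEN = 8;

    // Computes c[i] = a[i] + b[i] for i in [start, n), one emulated vector per step.
    static void addMasked(int[] a, int[] b, int[] c, int start, int n) {
        for (int i = start; i < n; i += VLEN) {
            // Build the lane mask: a lane is active only if its index is in range.
            boolean[] mask = new boolean[VLEN];
            for (int lane = 0; lane < VLEN; lane++) {
                mask[lane] = i + lane < n;
            }
            // Masked operation: inactive lanes neither load nor store.
            for (int lane = 0; lane < VLEN; lane++) {
                if (mask[lane]) {
                    c[i + lane] = a[i + lane] + b[i + lane];
                }
            }
        }
    }

    public static void main(String[] args) {
        int[] a = {1, 2, 3, 4, 5};
        int[] b = {10, 20, 30, 40, 50};
        int[] c = new int[5];
        // Trip count 5 is smaller than VLEN, yet one masked step covers all elements.
        addMasked(a, b, c, 0, 5);
        System.out.println(java.util.Arrays.toString(c)); // prints [11, 22, 33, 44, 55]
    }
}
```
</pre>
The point of the sketch is that a single masked step can cover a trip count smaller than the vector length, without depending on a main loop having run first.<br>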
<br>
We have been working on this internally for a while and have finished a draft patch. I will post the patch for review once it passes all tests and is polished enough.<br>
<br>
--<br>
Thanks,<br>
Pengfei<br>
</div>
</span></font></div>
</body>
</html>