<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=Windows-1252">
<meta name="Generator" content="Microsoft Word 15 (filtered medium)">
<style><!--
/* Font Definitions */
@font-face
{font-family:SimSun;
panose-1:2 1 6 0 3 1 1 1 1 1;}
@font-face
{font-family:"Cambria Math";
panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
{font-family:DengXian;
panose-1:2 1 6 0 3 1 1 1 1 1;}
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
{font-family:"\@DengXian";
panose-1:2 1 6 0 3 1 1 1 1 1;}
@font-face
{font-family:"\@SimSun";
panose-1:2 1 6 0 3 1 1 1 1 1;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0cm;
font-size:10.0pt;
font-family:"Calibri",sans-serif;}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:blue;
text-decoration:underline;}
span.contentpasted0
{mso-style-name:contentpasted0;}
span.EmailStyle20
{mso-style-type:personal-reply;
font-family:"Calibri",sans-serif;
color:windowtext;}
.MsoChpDefault
{mso-style-type:export-only;
font-size:10.0pt;
mso-ligatures:none;}
@page WordSection1
{size:612.0pt 792.0pt;
margin:72.0pt 72.0pt 72.0pt 72.0pt;}
div.WordSection1
{page:WordSection1;}
--></style>
</head>
<body lang="en-CN" link="blue" vlink="purple" style="word-wrap:break-word">
<div class="WordSection1">
<p class="MsoNormal"><span lang="EN-US" style="font-size:11.0pt">Hi Emanuel,<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11.0pt"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11.0pt">Thanks for your great work on refactoring and improving SuperWord. I have seen that you have already cooperated well with Fei Gao (she sits next to me and is involved in this task as well)
on recent patches.<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11.0pt"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11.0pt">I’m currently cleaning up the code and adding the necessary comments to the draft patch. I don’t think it will take much time, so I would prefer to push the patch for review a bit later, when it
is easier to review. In general, our patch moves the post-loop-related logic out of superword.[cpp|hpp], so it should not conflict much with your ongoing SuperWord improvements.<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11.0pt"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11.0pt">We will keep you informed once the patch is ready.<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11.0pt"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11.0pt">--<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11.0pt">Thanks,<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11.0pt">Pengfei<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt"><o:p> </o:p></span></p>
<div style="border:none;border-top:solid #B5C4DF 1.0pt;padding:3.0pt 0cm 0cm 0cm">
<p class="MsoNormal" style="margin-bottom:12.0pt"><b><span style="font-size:12.0pt;color:black">From:
</span></b><span style="font-size:12.0pt;color:black">Emanuel Peter &lt;emanuel.peter@oracle.com&gt;<br>
<b>Date: </b>Tuesday, May 30, 2023 at 15:45<br>
<b>To: </b>Pengfei Li &lt;Pengfei.Li@arm.com&gt;, hotspot-compiler-dev@openjdk.java.net &lt;hotspot-compiler-dev@openjdk.java.net&gt;<br>
<b>Cc: </b>epeter@openjdk.org &lt;epeter@openjdk.org&gt;, Bhateja, Jatin &lt;jatin.bhateja@intel.com&gt;, nd &lt;nd@arm.com&gt;<br>
<b>Subject: </b>Re: [Heads-up] JDK-8308994: C2: Re-implement experimental post loop vectorization<o:p></o:p></span></p>
</div>
<div>
<p class="MsoNormal"><span class="contentpasted0"><span style="font-size:12.0pt;color:black">Hi Pengfei,</span></span><span style="font-size:12.0pt;color:black">
<o:p></o:p></span></p>
<div>
<p class="MsoNormal"><span style="font-size:12.0pt;color:black"><o:p> </o:p></span></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:12.0pt;color:black">Great to hear that you are spending time on SuperWord / auto-vectorization in HotSpot. I agree with your assessment that SuperWord is currently unnecessarily convoluted and contains a fair amount of
legacy code. It would be nice if we could make the code more modular and extensible for future improvements.<o:p></o:p></span></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:12.0pt;color:black"><o:p> </o:p></span></p>
</div>
</div>
<div>
<p class="MsoNormal"><span class="contentpasted0"><span style="font-size:12.0pt;color:black">Is there a chance that we could see the draft already?</span></span><span style="font-size:12.0pt;color:black"><o:p></o:p></span></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:12.0pt;color:black"><o:p> </o:p></span></p>
</div>
<div>
<p class="MsoNormal"><span class="contentpasted0"><span style="font-size:12.0pt;color:black">I am also thinking about extending SuperWord in the future. I am currently trying to remove as much dead code and fix as many bugs as possible to clear the way. I will have to see
how much time I get to spend on extensions. Here you can find some of my ideas (towards the end of the PR description):</span></span><span style="font-size:12.0pt;color:black"><o:p></o:p></span></p>
</div>
<div>
<p class="MsoNormal"><span class="contentpasted0"><span style="font-size:12.0pt;color:black"><a href="https://github.com/openjdk/jdk/pull/14096">https://github.com/openjdk/jdk/pull/14096</a></span></span><span style="font-size:12.0pt;color:black"><o:p></o:p></span></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:12.0pt;color:black"><o:p> </o:p></span></p>
</div>
<div>
<p class="MsoNormal"><span class="contentpasted0"><span style="font-size:12.0pt;color:black">It would be good to coordinate a bit so that we can ensure our plans fit together.</span></span><span style="font-size:12.0pt;color:black"><o:p></o:p></span></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:12.0pt;color:black"><o:p> </o:p></span></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:12.0pt;color:black">Best regards,<br>
Emanuel<o:p></o:p></span></p>
</div>
<div class="MsoNormal" align="center" style="text-align:center"><span style="font-size:11.0pt">
<hr size="0" width="100%" align="center">
</span></div>
<div id="divRplyFwdMsg">
<p class="MsoNormal"><b><span style="font-size:11.0pt;color:black">From:</span></b><span style="font-size:11.0pt;color:black"> Pengfei Li &lt;Pengfei.Li@arm.com&gt;<br>
<b>Sent:</b> Monday, May 29, 2023 05:12<br>
<b>To:</b> hotspot-compiler-dev@openjdk.java.net &lt;hotspot-compiler-dev@openjdk.java.net&gt;<br>
<b>Cc:</b> epeter@openjdk.org &lt;epeter@openjdk.org&gt;; Bhateja, Jatin &lt;jatin.bhateja@intel.com&gt;; nd &lt;nd@arm.com&gt;<br>
<b>Subject:</b> [Heads-up] JDK-8308994: C2: Re-implement experimental post loop vectorization</span><span style="font-size:11.0pt">
<o:p></o:p></span></p>
<div>
<p class="MsoNormal"><span style="font-size:11.0pt"> <o:p></o:p></span></p>
</div>
</div>
<div>
<div>
<p class="MsoNormal"><span style="font-size:11.0pt">Hi,<br>
<br>
I'm writing to let you know that I just filed "JDK-8308994: C2: Re-implement experimental post loop vectorization".<br>
<br>
[BACKGROUND]<br>
<br>
The current post loop vectorization in the C2 compiler has a long history. It was first implemented in JDK-8153998 in 2016 as an experimental feature to support x86 AVX-512 vector masks. Due to insufficient maintenance, it was broken for a very long time.
Last year, I took over JDK-8183390 to fix and re-enable this feature. Several issues were fixed and AArch64 SVE vector mask support was added in the meantime. We (Arm) proposed to make post loop vectorization non-experimental in future JDK releases. So early
this year (2023), we tested it extensively and found more problems.<br>
<br>
[PROBLEMS]<br>
<br>
Problems include stability, maintainability and performance.<br>
<br>
1) Stability issues<br>
Multiple C2 crash and miscompilation issues have been filed in JBS, including JDK-8301657, JDK-8301904, JDK-8301944, JDK-8304774, JDK-8308949 and perhaps more.<br>
<br>
2) Maintainability issue<br>
The original implementation was based on multi-versioned post loops, and its logic was mixed into SuperWord. But the algorithm for post loop vectorization is actually *not* SLP. As more and more new features have been added to SuperWord, the legacy code for post loop vectorization
has become more and more difficult to maintain.<br>
<br>
3) Performance issue<br>
Post loop vectorization was expected to improve performance for vectorizable loops with a small iteration count. But JMH tests show that it doesn't. One main reason is that the vector-masked post loop is skipped (not executed) when the loop trip count is small, because of the zero-trip
guard of the main loop. That is a major defect of the current multi-versioning framework. (See JDK-8307084 for more details.)<br>
<br>
[ACTIONS]<br>
<br>
For better stability, maintainability and performance, we now propose to deprecate the current multi-versioning framework and completely re-implement the experimental post loop vectorization, for both x86 AVX-512 and AArch64 SVE. Our new proposal is to add a standalone
ideal loop phase (outside SuperWord) that applies the vector mask transformation directly to the original scalar post loop.<br>
<br>
We have been working on this internally for a while and have now finished a draft patch. I will push the patch for review soon after it passes all tests and is polished enough.<br>
<br>
--<br>
Thanks,<br>
Pengfei<o:p></o:p></span></p>
</div>
</div>
</div>
</body>
</html>