<div dir="ltr"><div>Hi Per,</div><div><br></div><div>Thanks very much. Looks like there's lots of good info out there, but I didn't really know where to start, so I appreciate the references.</div><div><br></div><div>-Archie</div><br></div><br><div class="gmail_quote gmail_quote_container"><div dir="ltr" class="gmail_attr">On Fri, May 23, 2025 at 3:31 AM Per-Ake Minborg &lt;<a href="mailto:per-ake.minborg@oracle.com">per-ake.minborg@oracle.com</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div class="msg370227504441734040">
<div dir="ltr">
<div style="font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0)">
Hi Archie!<br>
<br>
The C2 compiler can use <i>automatic vectorization</i> [1] to speed up such operations. You can watch my explanation of the concept in some of my recent presentations, e.g., [2].<br>
<br>
Best, Per</div>
<div style="font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0)">
<br>
</div>
<div style="font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0)">
[1] <a href="https://en.wikipedia.org/wiki/Automatic_vectorization" id="m_370227504441734040LPlnk162055" target="_blank">
https://en.wikipedia.org/wiki/Automatic_vectorization</a></div>
<div style="font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0)">
[2] <a href="https://youtu.be/rXv2-lN5Xgk?t=1518" id="m_370227504441734040LPlnk773235" target="_blank">https://youtu.be/rXv2-lN5Xgk?t=1518</a></div>
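<div>As a rough illustration of the kind of loop that [1] describes, here is a minimal sketch. The class name <span style="font-family:monospace">FillSketch</span>, the helper <span style="font-family:monospace">fillLoop</span>, and the iteration count are made up for this example and do not come from the JDK or from this thread. The sketch only shows a simple counted loop next to <span style="font-family:monospace">Arrays.fill()</span> and checks that both produce the same contents. Whether C2 actually vectorizes either one depends on the JVM version, the CPU, and the flags in use, so it makes no claim about performance.</div>

```java
import java.util.Arrays;

public class FillSketch {

    // Hypothetical helper: a plain counted loop with no cross-iteration
    // dependencies, which is the general shape an auto-vectorizer looks for.
    static void fillLoop(byte[] a, byte v) {
        for (int i = 0; i < a.length; i++) {
            a[i] = v;
        }
    }

    public static void main(String[] args) {
        byte[] viaLoop = new byte[1_000_000];
        byte[] viaLibrary = new byte[1_000_000];

        // Repeat the calls so the JIT has a chance to compile the hot code.
        // The count of 200 is arbitrary and chosen only for this sketch.
        for (int iter = 0; iter < 200; iter++) {
            fillLoop(viaLoop, (byte) 0x42);
            Arrays.fill(viaLibrary, (byte) 0x42);
        }

        // Both approaches must leave identical contents in the arrays.
        System.out.println("same contents: " + Arrays.equals(viaLoop, viaLibrary));
    }
}
```

<div>To see what the JIT really emits for a loop like this, the generated code would have to be inspected on the target machine. The sketch itself cannot show that.</div>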
<div id="m_370227504441734040appendonsend"></div>
<hr style="display:inline-block;width:98%">
<div id="m_370227504441734040divRplyFwdMsg" dir="ltr"><font face="Calibri, sans-serif" style="font-size:11pt" color="#000000"><b>From:</b> core-libs-dev &lt;<a href="mailto:core-libs-dev-retn@openjdk.org" target="_blank">core-libs-dev-retn@openjdk.org</a>&gt; on behalf of Archie Cobbs &lt;<a href="mailto:archie.cobbs@gmail.com" target="_blank">archie.cobbs@gmail.com</a>&gt;<br>
<b>Sent:</b> Thursday, May 22, 2025 10:58 PM<br>
<b>To:</b> John R Rose &lt;<a href="mailto:jrose@openjdk.org" target="_blank">jrose@openjdk.org</a>&gt;<br>
<b>Cc:</b> <a href="mailto:core-libs-dev@openjdk.org" target="_blank">core-libs-dev@openjdk.org</a> &lt;<a href="mailto:core-libs-dev@openjdk.org" target="_blank">core-libs-dev@openjdk.org</a>&gt;<br>
<b>Subject:</b> Re: RFR: 8357531: The `SegmentBulkOperations::fill` method can be improved using overlaps [v5]</font>
<div> </div>
</div>
<div>
<div dir="ltr">
<div dir="ltr">On Thu, May 22, 2025 at 3:31 PM John R Rose &lt;<a href="mailto:jrose@openjdk.org" target="_blank">jrose@openjdk.org</a>&gt; wrote:</div>
<div>
<blockquote style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
> Update benchmark to reflect new fill method<br>
<br>
Related discussion at the hardware level:<br>
<br>
<a href="https://github.com/openjdk/jdk/pull/25147#issuecomment-2902463076" rel="noreferrer" target="_blank">https://github.com/openjdk/jdk/pull/25147#issuecomment-2902463076</a>
<br>
</blockquote>
</div>
<div><br clear="all">
</div>
<div>This discussion spurred me to ask a dumb question. Apologies in advance, just trying to learn here...
<div><br>
</div>
<div>If I do this:</div>
<div><br>
</div>
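<div style="margin-left:40px"><span style="font-family:monospace">import java.util.Arrays;<br>
public class ArrayFiller {<br>
&nbsp;&nbsp;&nbsp;&nbsp;public static void main(String[] args) {<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;while (true) {<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;final byte[] array = new byte[1000000];<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Arrays.fill(array, (byte)0x42);<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;}<br>
&nbsp;&nbsp;&nbsp;&nbsp;}<br>
}<br>
</span></div>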
<div><br>
</div>
<div>Will C2 compile <span style="font-family:monospace"><a href="https://github.com/openjdk/jdk/blob/139a05d05959a84541a29dfae6151f92ce579ae6/src/java.base/share/classes/java/util/Arrays.java#L3275-L3308" target="_blank">Arrays.fill()</a></span> into something that is more
efficient than a byte-at-a-time loop like what appears in the source code?</div>
<div><br>
</div>
<div>Thanks,</div>
<div>-Archie</div>
</div>
<div><br>
</div>
<span>-- </span><br>
<div dir="ltr">Archie L. Cobbs<br>
</div>
</div>
</div>
</div>
</div></blockquote></div><div><br clear="all"></div><br><span class="gmail_signature_prefix">-- </span><br><div dir="ltr" class="gmail_signature">Archie L. Cobbs<br></div>