<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=gb2312">
<style type="text/css" style="display:none;"> P {margin-top:0;margin-bottom:0;} </style>
</head>
<body dir="ltr">
<div style="font-family: "Calibri Light", "Helvetica Light", sans-serif; font-size: 12pt; color: rgb(0, 0, 0);" class="elementToProof">
Hi all,</div>
<div style="font-family: "Calibri Light", "Helvetica Light", sans-serif; font-size: 12pt; color: rgb(0, 0, 0);" class="elementToProof">
I think the key takeaway here is that we should reduce the number of intrinsics to ease maintenance. Even if the number of unsafe Java methods stays the same, reducing the number of distinct intrinsics is still worthwhile for the lower maintenance cost.</div>
<div style="font-family: "Calibri Light", "Helvetica Light", sans-serif; font-size: 12pt; color: rgb(0, 0, 0);" class="elementToProof">
For example, toBytes and getChars in StringUTF16 both have intrinsics. In essence, however, they are just two array copy functions; it would be more reasonable for HotSpot to implement a generic array copy intrinsic that StringUTF16 can use instead. Such
an intrinsic could be placed on <code>Unsafe.copyMemory</code> itself, or somewhere else.</div>
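<div style="font-family: 'Calibri Light', 'Helvetica Light', sans-serif; font-size: 12pt; color: rgb(0, 0, 0);" class="elementToProof">
To illustrate the "just two array copy functions" point, here is a rough sketch that runs outside java.base. It is not JDK code: the class and method names (BulkCopySketch, toBytesLoop, toBytesBulk, getCharsBulk) are made up for this example, and a native-order CharBuffer view stands in for <code>Unsafe.copyMemory</code>, since jdk.internal.misc.Unsafe is not accessible to ordinary code. It assumes the UTF16 byte[] layout is two bytes per char in platform byte order.</div>
<pre>
```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.CharBuffer;
import java.util.Arrays;

public class BulkCopySketch {
    // Per-char loop, similar in shape to the Java fallback of toBytes.
    static byte[] toBytesLoop(char[] value, int off, int len) {
        byte[] val = new byte[len * 2];
        boolean big = ByteOrder.nativeOrder() == ByteOrder.BIG_ENDIAN;
        for (int i = 0; i < len; i++) {
            char c = value[off + i];
            // Two bytes per char, stored in platform byte order.
            val[2 * i]     = (byte) (big ? c >> 8 : c);
            val[2 * i + 1] = (byte) (big ? c : c >> 8);
        }
        return val;
    }

    // The same result as one bulk copy (stand-in for a generic copy primitive).
    static byte[] toBytesBulk(char[] value, int off, int len) {
        byte[] val = new byte[len * 2];
        CharBuffer view = ByteBuffer.wrap(val).order(ByteOrder.nativeOrder()).asCharBuffer();
        view.put(value, off, len);
        return val;
    }

    // The reverse direction (getChars) is the same bulk copy, byte[] to char[].
    // srcBegin and srcEnd are char indices into val.
    static void getCharsBulk(byte[] val, int srcBegin, int srcEnd, char[] dst, int dstBegin) {
        CharBuffer view = ByteBuffer.wrap(val).order(ByteOrder.nativeOrder()).asCharBuffer();
        view.position(srcBegin);
        view.get(dst, dstBegin, srcEnd - srcBegin);
    }

    public static void main(String[] args) {
        char[] src = {'a', '\u4e2d', 'b', '\u6587', 'c'};
        byte[] loop = toBytesLoop(src, 1, 3);
        byte[] bulk = toBytesBulk(src, 1, 3);
        System.out.println(Arrays.equals(loop, bulk));

        char[] back = new char[3];
        getCharsBulk(bulk, 0, 3, back, 0);
        System.out.println(Arrays.equals(back, Arrays.copyOfRange(src, 1, 4)));
    }
}
```
</pre>
<div style="font-family: 'Calibri Light', 'Helvetica Light', sans-serif; font-size: 12pt; color: rgb(0, 0, 0);" class="elementToProof">
Both directions reduce to the same copy primitive, so a single well-optimized copy could in principle serve both methods. Whether that matches the dedicated intrinsics in performance is a separate question that only benchmarks can answer.</div>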
<div style="font-family: "Calibri Light", "Helvetica Light", sans-serif; font-size: 12pt; color: rgb(0, 0, 0);" class="elementToProof">
<br>
</div>
<div style="font-family: "Calibri Light", "Helvetica Light", sans-serif; font-size: 12pt; color: rgb(0, 0, 0);" class="elementToProof">
Chen</div>
<div id="appendonsend"></div>
<hr style="display:inline-block;width:98%" tabindex="-1">
<div id="divRplyFwdMsg" dir="ltr"><font face="Calibri, sans-serif" style="font-size:11pt" color="#000000"><b>From:</b> core-libs-dev <core-libs-dev-retn@openjdk.org> on behalf of wenshao <shaojin.wensj@alibaba-inc.com><br>
<b>Sent:</b> Wednesday, July 30, 2025 9:45 PM<br>
<b>To:</b> Roger Riggs <roger.riggs@oracle.com>; core-libs-dev <core-libs-dev@openjdk.org><br>
<b>Subject:</b> Re: Reuse the StringUTF16::putCharsSB method instead of the Intrinsic in the StringUTF16::toBytes</font>
<div> </div>
</div>
<div>
<div class="x___aliyun_email_body_block">
<div style="font-family:Tahoma,Arial,STHeitiSC-Light,SimSun">
<div style="clear:both; font-family:Tahoma,Arial,STHeitiSC-Light,SimSun"><span style="font-family:Tahoma,Arial,STHeitiSC-Light,SimSun"><span>Thanks to Roger Riggs for pointing out that this code should not use Unsafe.uninitializedArray.</span></span></div>
<div style="clear:both; font-family:Tahoma,Arial,STHeitiSC-Light,SimSun"><span style="font-family:Tahoma,Arial,STHeitiSC-Light,SimSun"><span><br>
</span></span></div>
<div style="clear:both; font-family:Tahoma,Arial,STHeitiSC-Light,SimSun"><span style="font-family:Tahoma,Arial,STHeitiSC-Light,SimSun"><span>After replacing it with `new byte[]` and running `StringConstructor.newStringFromCharsMixedBegin`, I verified that performance
remained consistent on x64. On aarch64, performance improved by 8% for size = 7, but decreased by 7% for size = 64.</span></span></div>
<div style="clear:both; font-family:Tahoma,Arial,STHeitiSC-Light,SimSun"><span style="font-family:Tahoma,Arial,STHeitiSC-Light,SimSun"><span><br>
</span></span></div>
<div style="clear:both; font-family:Tahoma,Arial,STHeitiSC-Light,SimSun"><span style="font-family:Tahoma,Arial,STHeitiSC-Light,SimSun"><span><span>For detailed performance data, see the Markdown data in this comment on my draft pull request: </span><span><a href="https://github.com/openjdk/jdk/pull/26553#issuecomment-3138357748" target="_blank">https://github.com/openjdk/jdk/pull/26553#issuecomment-3138357748</a></span></span></span></div>
<div style="clear:both; font-family:Tahoma,Arial,STHeitiSC-Light,SimSun"><span style="font-family:Tahoma,Arial,STHeitiSC-Light,SimSun"><span><br>
</span></span></div>
<div style="clear:both; font-family:Tahoma,Arial,STHeitiSC-Light,SimSun"><span style="font-family:Tahoma,Arial,STHeitiSC-Light,SimSun"><span>-</span></span></div>
<div style="clear:both; font-family:Tahoma,Arial,STHeitiSC-Light,SimSun"><span style="font-family:Tahoma,Arial,STHeitiSC-Light,SimSun"><span>Shaojin Wen</span></span></div>
<div style="clear:both; font-family:Tahoma,Arial,STHeitiSC-Light,SimSun"><span style="font-family:Tahoma,Arial,STHeitiSC-Light,SimSun"><br>
</span></div>
<blockquote style="margin-right:0px; margin-top:0px; margin-bottom:0px; font-family:Tahoma,Arial,STHeiti,SimSun; font-size:14px; color:rgb(0,0,0)">
<div class="x_alimail-quote">
<div style="clear:both">------------------------------------------------------------------</div>
<div style="clear:both">From: Roger Riggs <roger.riggs@oracle.com></div>
<div style="clear:both">Sent: Thursday, July 31, 2025 03:17</div>
<div style="clear:both">To: "core-libs-dev"<core-libs-dev@openjdk.org></div>
<div style="clear:both">Subject: Re: Reuse the StringUTF16::putCharsSB method instead of the Intrinsic in the StringUTF16::toBytes</div>
<div style="clear:both"><br>
</div>
Hi,<br>
<br>
Unsafe.uninitializedArray and StringConcatHelper.newArray were created for the exclusive use of StringConcatHelper and HotSpot optimizations. Both are very sensitive APIs and should NOT be used anywhere
except in StringConcatHelper and HotSpot.<br>
<br>
Regards, Roger<br>
<br>
<br>
<div class="x_moz-cite-prefix">On 7/30/25 11:40 AM, <a class="x_moz-txt-link-abbreviated" href="mailto:jaikiran.pai@oracle.com" target="_blank">
jaikiran.pai@oracle.com</a> wrote:<br>
</div>
<div style="margin:14px 0px">
<p style="margin-top:14px; margin-bottom:14px">I'll let others knowledgeable in this area comment and provide input on this proposal. I just want to say thank you for bringing this discussion to the mailing list first, providing the necessary context
and explanation, and seeking feedback before creating a JBS issue or an RFR PR.</p>
<p style="margin-top:14px; margin-bottom:14px">-Jaikiran<br>
</p>
<div class="x_moz-cite-prefix">On 30/07/25 7:48 pm, wenshao wrote:<br>
</div>
<div style="font-family:Tahoma,Arial,STHeitiSC-Light,SimSun">
<div style="clear:both"><span>In the discussion of `8355177: Speed up StringBuilder::append(char[]) via Unsafe::copyMemory` (<a class="x_moz-txt-link-freetext" href="https://github.com/openjdk/jdk/pull/24773" target="_blank">https://github.com/openjdk/jdk/pull/24773</a>),
@liach (<span>Chen Liang)</span> suggested reusing the StringUTF16::putCharsSB method introduced in PR #24773 instead of the Intrinsic implementation in the StringUTF16::toBytes method.</span>
<div style="clear:both"><br>
</div>
<div style="clear:both"><span>Original:</span>
<div style="clear:both">```java</div>
<div style="clear:both">@IntrinsicCandidate</div>
<div style="clear:both">public static byte[] toBytes(char[] value, int off, int len) {</div>
<div style="clear:both">&nbsp;&nbsp;&nbsp;&nbsp;byte[] val = newBytesFor(len);</div>
<div style="clear:both">&nbsp;&nbsp;&nbsp;&nbsp;for (int i = 0; i &lt; len; i++) {</div>
<div style="clear:both">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;putChar(val, i, value[off]);</div>
<div style="clear:both">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;off++;</div>
<div style="clear:both">&nbsp;&nbsp;&nbsp;&nbsp;}</div>
<div style="clear:both">&nbsp;&nbsp;&nbsp;&nbsp;return val;</div>
<div style="clear:both">}</div>
<div style="clear:both">```</div>
<div style="clear:both"><br>
</div>
<div style="clear:both">After:</div>
<div style="clear:both">```java</div>
<div style="clear:both">public static byte[] toBytes(char[] value, int off, int len) {</div>
<div style="clear:both">&nbsp;&nbsp;&nbsp;&nbsp;byte[] val = (byte[]) Unsafe.getUnsafe().allocateUninitializedArray(byte.class, newBytesLength(len));</div>
<div style="clear:both">&nbsp;&nbsp;&nbsp;&nbsp;putCharsSB(val, 0, value, off, off + len);</div>
<div style="clear:both">&nbsp;&nbsp;&nbsp;&nbsp;return val;</div>
<div style="clear:both">}</div>
<span>```</span></div>
<div style="clear:both"><br>
</div>
<div style="clear:both">This replacement does not degrade performance. Running StringConstructor.newStringFromCharsMixedBegin verified that performance is consistent with the original on x64 and slightly improved on aarch64.</div>
<div style="clear:both"><br>
</div>
<span>Replacing the intrinsic removes 100 lines of C++ code and leaves only Java and Unsafe code, with no intrinsic or C++ code, which makes the code more maintainable.</span></div>
<div style="clear:both"><span><br>
</span></div>
<div style="clear:both"><span><span>I've submitted a draft PR: </span><span><a class="x_moz-txt-link-freetext" href="https://github.com/openjdk/jdk/pull/26553" target="_blank">https://github.com/openjdk/jdk/pull/26553</a>. Feedback is welcome.</span></span></div>
<div style="clear:both"><span>-</span></div>
<div style="clear:both"><span>Shaojin Wen</span></div>
</div>
</div>
<br>
</div>
</blockquote>
<div style="line-height:20px; clear:both"><br>
</div>
</div>
</div>
</div>
</body>
</html>