<div class="__aliyun_email_body_block"><div style="font-family: Tahoma, Arial, STHeitiSC-Light, SimSun"><div style="clear: both; font-family: Tahoma, Arial, STHeitiSC-Light, SimSun;"><span style="font-family: Tahoma, Arial, STHeitiSC-Light, SimSun;"><br ></span></div><div style="clear: both; font-family: Tahoma, Arial, STHeitiSC-Light, SimSun;"><span style="font-family: Tahoma, Arial, STHeitiSC-Light, SimSun;"><span >The BufferedWriterBench I added in PR 26022 takes a charset parameter, so it can compare the performance of different encodings.</span></span><div style="clear: both;">Running the following commands shows the performance of each encoding on different content. The numbers show that BufferedWriter::write(String) is about 30% to 70% faster with UTF8, while the performance of non-UTF8 encodings is the same as before.</div><div style="clear: both;"><br ></div><div style="clear: both;">```sh</div><div style="clear: both;">git remote add wenshao git@github.com:wenshao/jdk.git</div><div style="clear: both;">git fetch wenshao</div><div style="clear: both;"><br ></div><div style="clear: both;"># baseline</div><div style="clear: both;">git checkout 2758d6ad7767832db004d28f10cc764f33fa438e</div><div style="clear: both;">make test TEST="micro:java.io.BufferedWriterBench" MICRO="OPTIONS=-p charset=UTF8,UTF16,GB18030"</div><div style="clear: both;"><br ></div><div style="clear: both;"># current</div><div style="clear: both;">git checkout e1e9e25b3d2f99192fc2706dd7df846016452bae</div><div style="clear: both;">make test TEST="micro:java.io.BufferedWriterBench" MICRO="OPTIONS=-p charset=UTF8,UTF16,GB18030"</div><div ><span >```</span></div><span style="font-family: Tahoma, Arial, STHeitiSC-Light, SimSun;"><br ></span></div><blockquote style="margin-right: 0px; margin-top: 0px; margin-bottom: 0px; font-family: Tahoma, Arial, STHeiti, SimSun; font-size: 14px; color: rgb(0, 0, 0);"><div class="alimail-quote"><div style="clear: 
both;">------------------------------------------------------------------</div><div style="clear: both;">From: Alan Bateman &lt;alan.bateman@oracle.com&gt;</div><div style="clear: both;">Sent: Monday, June 30, 2025, 13:38</div><div style="clear: both;">To: "温绍锦(高铁)" &lt;shaojin.wensj@alibaba-inc.com&gt;; "core-libs-dev" &lt;core-libs-dev@openjdk.org&gt;</div><div style="clear: both;">Subject: Re: Eliminate unnecessary buffering and encoding conversion in BufferedWriter</div><div style="clear: both;"><br ></div><br >
<br >
<div class="moz-cite-prefix">On 29/06/2025 17:03, wenshao wrote:<br >
</div>
<div style="margin: 14px 40px;">
<div style="font-family: Tahoma, Arial, STHeitiSC-Light, SimSun;">
<div style="clear: both;"><span >BufferedWriter ->
OutputStreamWriter -> StreamEncoder</span>
<div style="clear: both;"><br >
</div>
<div style="clear: both;">In this call chain, BufferedWriter
has a char[] buffer, and StreamEncoder has a ByteBuffer.
That makes two layers of buffering, so the BufferedWriter
layer can be removed. </div>
<div style="clear: both;"><br >
</div>
<div style="clear: both;"><span class=" __aliyun_node_has_color __aliyun_node_has_bgcolor" style="color: rgb(0, 0, 0); font-family: Tahoma, Arial, STHeitiSC-Light, SimSun; font-size: 14px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; word-spacing: 0px; white-space: normal; background-color: rgb(255, 255, 255); text-decoration-thickness: initial; text-decoration-style: initial; text-decoration-color: initial; float: none; display: inline !important;">LATIN1
(byte[]) -> UTF16 (char[]) -> UTF8 (byte[])</span></div>
<div style="clear: both;"><span class=" __aliyun_node_has_color __aliyun_node_has_bgcolor" style="color: rgb(0, 0, 0); font-family: Tahoma, Arial, STHeitiSC-Light, SimSun; font-size: 14px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; word-spacing: 0px; white-space: normal; background-color: rgb(255, 255, 255); text-decoration-thickness: initial; text-decoration-style: initial; text-decoration-color: initial; float: none; display: inline !important;"><br >
</span></div>
<div style="clear: both;">And when the charset is UTF8, if the
content of write(String) is LATIN1, a conversion from
LATIN1 to UTF16 and then to UTF8 will occur here.</div>
<span >
<div style="clear: both;"><span ><br >
</span></div>
We can improve BufferedWriter. When the Writer passed in
is an instance of OutputStreamWriter, remove the
buffer and call it directly. In addition, improve
write(String) in StreamEncoder to avoid unnecessary
encoding conversion.</span></div>
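<div style="clear: both;"><br ></div>
<div style="clear: both;">A minimal sketch of the call chain described above, assuming a UTF-8 OutputStreamWriter over a ByteArrayOutputStream. The class and method names (WriterChainSketch, viaBufferedWriter, viaOutputStreamWriter) are hypothetical and are not taken from the PR. It only shows that the path with the BufferedWriter layer and the path without it produce the same bytes; it does not measure the cost of the intermediate char[] copy.</div>
<pre>
```java
import java.io.BufferedWriter;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration class; not part of the JDK or the PR.
public class WriterChainSketch {

    // BufferedWriter -> OutputStreamWriter (-> StreamEncoder internally):
    // the String is first copied into BufferedWriter's char[] buffer,
    // then encoded into bytes further down the chain.
    static byte[] viaBufferedWriter(String s) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (BufferedWriter w = new BufferedWriter(
                new OutputStreamWriter(out, StandardCharsets.UTF_8))) {
            w.write(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    // Same chain without the BufferedWriter layer.
    static byte[] viaOutputStreamWriter(String s) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (OutputStreamWriter w =
                new OutputStreamWriter(out, StandardCharsets.UTF_8)) {
            w.write(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // Every char fits in LATIN1, so a compact String stores it as byte[].
        String latin1 = "caf\u00e9 au lait";
        byte[] expected = latin1.getBytes(StandardCharsets.UTF_8);

        System.out.println(Arrays.equals(viaBufferedWriter(latin1), expected));
        System.out.println(Arrays.equals(viaOutputStreamWriter(latin1), expected));
    }
}
```
</pre>
<div style="clear: both;">Both comparisons should print true, which is the property any change to this chain has to preserve.</div>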
<br >
</div>
</div>
I see you've already proposed a PR. Most of this code goes back to
JDK 1.4, so we need to be very careful; any changes will require a
lot of Reviewer cycles.<br >
<br >
Have you surveyed the tests to ensure that there are good tests with
different charsets and usage patterns? I think we need to be
confident in the tests before touching anything.<br >
<br >
-Alan<br >
</div></blockquote></div></div>