<div dir="auto">Maybe another option would be to implement BufferedWriter with a StringBuilder rather than a char[]. This would avoid forcing the content into UTF-16.</div><div><br><div class="gmail_quote gmail_quote_container"><div dir="ltr" class="gmail_attr">On Sun, Jun 29, 2025 at 10:36 PM Brett Okken <<a href="mailto:brett.okken.os@gmail.com">brett.okken.os@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;padding-left:1ex;border-left-color:rgb(204,204,204)"><div dir="auto">Is StreamEncoder buffering content to only write to the underlying OutputStream when some threshold is hit? While the layers of conversions are unfortunate, it seems there could be negative performance implications of having many extremely small writes (such as 1 character/byte) at a time to the underlying OutputStream.</div><div dir="auto"><br></div><div dir="auto">Presumably this is a common pattern, as it is recommended:</div><div dir="auto"><div><a href="https://github.com/openjdk/jdk/blob/4dd1b3a6100f9e379c7cee3c699d63d0d01144a7/src/java.base/share/classes/java/io/OutputStreamWriter.java#L45" target="_blank">https://github.com/openjdk/jdk/blob/4dd1b3a6100f9e379c7cee3c699d63d0d01144a7/src/java.base/share/classes/java/io/OutputStreamWriter.java#L45</a></div><br></div><div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Sun, Jun 29, 2025 at 11:04 AM wenshao <<a href="mailto:shaojin.wensj@alibaba-inc.com" target="_blank">shaojin.wensj@alibaba-inc.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;padding-left:1ex;border-left-color:rgb(204,204,204)"><div><div style="font-family:Tahoma,Arial,STHeitiSC-Light,SimSun"><div style="clear:both;font-family:Tahoma,Arial,STHeitiSC-Light,SimSun"><span style="font-family:Tahoma,Arial,STHeitiSC-Light,SimSun">BufferedWriter -> OutputStreamWriter -> StreamEncoder</span><div 
style="clear:both;font-family:Tahoma,Arial,STHeitiSC-Light,SimSun"><br></div><div style="clear:both;font-family:Tahoma,Arial,STHeitiSC-Light,SimSun">In this call chain, BufferedWriter has a char[] buffer, and StreamEncoder has a ByteBuffer. There are two layers of buffering here, so the BufferedWriter layer could be removed. </div><div style="clear:both;font-family:Tahoma,Arial,STHeitiSC-Light,SimSun"><br></div><div style="clear:both;font-family:Tahoma,Arial,STHeitiSC-Light,SimSun"><span style="font-family:Tahoma,Arial,STHeitiSC-Light,SimSun;font-size:14px;font-style:normal;font-variant-ligatures:normal;font-variant-caps:normal;font-weight:400;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;word-spacing:0px;white-space:normal;text-decoration-style:initial;text-decoration-color:initial;float:none;display:inline!important;background-color:rgb(255,255,255);color:rgb(0,0,0)">LATIN1 (byte[]) -> UTF16 (char[]) -> UTF8 (byte[])</span></div><div style="clear:both;font-family:Tahoma,Arial,STHeitiSC-Light,SimSun"><span style="font-family:Tahoma,Arial,STHeitiSC-Light,SimSun;font-size:14px;font-style:normal;font-variant-ligatures:normal;font-variant-caps:normal;font-weight:400;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;word-spacing:0px;white-space:normal;text-decoration-style:initial;text-decoration-color:initial;float:none;display:inline!important;background-color:rgb(255,255,255);color:rgb(0,0,0)"><br></span></div><div style="clear:both;font-family:Tahoma,Arial,STHeitiSC-Light,SimSun">And when the charset is UTF8, if the content passed to write(String) is LATIN1, a conversion from LATIN1 to UTF16 and then to UTF8 will occur here.</div><span style="font-family:Tahoma,Arial,STHeitiSC-Light,SimSun"><div style="clear:both;font-family:Tahoma,Arial,STHeitiSC-Light,SimSun"><span style="font-family:Tahoma,Arial,STHeitiSC-Light,SimSun"><br></span></div>We can improve BufferedWriter. 
When the Writer passed in is an instance of OutputStreamWriter, skip the char[] buffer and call it directly. In addition, improve write(String) in StreamEncoder to avoid the unnecessary encoding conversion.</span></div><div style="clear:both;font-family:Tahoma,Arial,STHeitiSC-Light,SimSun"><span style="font-family:Tahoma,Arial,STHeitiSC-Light,SimSun"><br></span></div><div style="clear:both;font-family:Tahoma,Arial,STHeitiSC-Light,SimSun"><span style="font-family:Tahoma,Arial,STHeitiSC-Light,SimSun">-</span></div><div style="clear:both;font-family:Tahoma,Arial,STHeitiSC-Light,SimSun"><span style="font-family:Tahoma,Arial,STHeitiSC-Light,SimSun">Shaojin Wen</span></div></div></div></blockquote></div></div>
</blockquote></div></div>
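<div dir="auto"><br></div>
<div dir="auto">For reference, here is a minimal sketch of the call chain discussed in this thread (BufferedWriter wrapping an OutputStreamWriter, which encodes through StreamEncoder). It only uses public java.io classes; CountingOutputStream and writeCharByChar are hypothetical names made up for the illustration, not JDK APIs. It writes the same text one char at a time, once straight into the OutputStreamWriter and once through a BufferedWriter, and counts how many write calls actually reach the underlying OutputStream. The counts depend on the JDK's internal buffer sizes, so they are something to observe on a given build rather than fixed numbers.</div>
<pre>
```java
import java.io.BufferedWriter;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

public class WriterChainSketch {

    /** Hypothetical helper: keeps the bytes and counts the write calls that reach it. */
    static final class CountingOutputStream extends OutputStream {
        final ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        int writeCalls;

        @Override
        public void write(int b) {
            writeCalls++;
            bytes.write(b);
        }

        @Override
        public void write(byte[] b, int off, int len) {
            writeCalls++;
            bytes.write(b, off, len);
        }
    }

    /** Writes the text one char at a time, optionally through a BufferedWriter. */
    static CountingOutputStream writeCharByChar(String text, boolean buffered) {
        CountingOutputStream sink = new CountingOutputStream();
        Writer encoder = new OutputStreamWriter(sink, StandardCharsets.UTF_8);
        // Closing the outermost writer flushes and closes the whole chain.
        try (Writer out = buffered ? new BufferedWriter(encoder) : encoder) {
            for (int i = 0; i != text.length(); i++) {
                out.write(text.charAt(i));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink;
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i != 1000; i++) {
            sb.append("hello, writer chain\n"); // Latin-1 only content
        }
        String text = sb.toString();

        CountingOutputStream direct = writeCharByChar(text, false);
        CountingOutputStream wrapped = writeCharByChar(text, true);

        System.out.println("chars written:          " + text.length());
        System.out.println("sink writes (direct):   " + direct.writeCalls);
        System.out.println("sink writes (buffered): " + wrapped.writeCalls);
    }
}
```
</pre>
<div dir="auto">If the direct count comes out far below the number of chars written, that would suggest the encoder layer is already batching bytes before they reach the OutputStream, which is the behaviour the question above is asking about.</div>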