<div dir="ltr"><div>Updating the benchmark to use a separate test method for each representation did remove the difference for the non-ascii String case on the jdk 21+ releases.</div><div>However, the ascii (latin) strings are still slower with String than with StringBuilder.</div><div><br></div><div>How does C2 then handle something like StringCharBuffer wrapping a CharSequence for all of its get operations:<br><a href="https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/java/nio/StringCharBuffer.java#L88-L97">https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/java/nio/StringCharBuffer.java#L88-L97</a></div><div><br></div><div>Which is then used by CharBufferSpliterator:<br><a href="https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/java/nio/CharBufferSpliterator.java">https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/java/nio/CharBufferSpliterator.java</a></div><div><br>And by many CharsetEncoder impls when either the source or the destination is not backed by an array (which would be the case if StringCharBuffer is used):<br><a href="https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/sun/nio/cs/UTF_8.java#L517">https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/sun/nio/cs/UTF_8.java#L517</a></div><div><a href="https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/sun/nio/cs/UnicodeEncoder.java#L81">https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/sun/nio/cs/UnicodeEncoder.java#L81</a></div><div><span style="font-family:monospace"><br></span></div><div><span style="font-family:monospace"><br></span></div><div><span style="font-family:monospace"><br></span></div><div><span style="font-family:monospace">jdk 17<br>Benchmark (data) Mode Cnt Score Error Units<br>CharSequenceCharAtBenchmark.testString ascii avgt 3 1429.358 ± 623.424 ns/op<br>CharSequenceCharAtBenchmark.testString non-ascii avgt 3 705.282 ± 233.453 
ns/op<br>CharSequenceCharAtBenchmark.testStringBuilder ascii avgt 3 724.138 ± 267.346 ns/op<br>CharSequenceCharAtBenchmark.testStringBuilder non-ascii avgt 3 718.357 ± 864.066 ns/op<br><br>jdk 21<br>Benchmark (data) Mode Cnt Score Error Units<br>CharSequenceCharAtBenchmark.testString ascii avgt 3 1087.024 ± 235.082 ns/op<br>CharSequenceCharAtBenchmark.testString non-ascii avgt 3 687.520 ± 747.532 ns/op<br>CharSequenceCharAtBenchmark.testStringBuilder ascii avgt 3 672.802 ± 29.740 ns/op<br>CharSequenceCharAtBenchmark.testStringBuilder non-ascii avgt 3 689.964 ± 791.175 ns/op<br><br>jdk 25<br>Benchmark (data) Mode Cnt Score Error Units<br>CharSequenceCharAtBenchmark.testString ascii avgt 3 1176.057 ± 1157.979 ns/op<br>CharSequenceCharAtBenchmark.testString non-ascii avgt 3 697.382 ± 231.144 ns/op<br>CharSequenceCharAtBenchmark.testStringBuilder ascii avgt 3 692.970 ± 105.112 ns/op<br>CharSequenceCharAtBenchmark.testStringBuilder non-ascii avgt 3 703.178 ± 446.019 ns/op<br><br>jdk 26<br>Benchmark (data) Mode Cnt Score Error Units<br>CharSequenceCharAtBenchmark.testString ascii avgt 3 1132.971 ± 350.786 ns/op<br>CharSequenceCharAtBenchmark.testString non-ascii avgt 3 688.201 ± 175.797 ns/op<br>CharSequenceCharAtBenchmark.testStringBuilder ascii avgt 3 704.380 ± 101.763 ns/op<br>CharSequenceCharAtBenchmark.testStringBuilder non-ascii avgt 3 673.622 ± 51.462 ns/op</span></div><div><br></div><div><br></div><div><span style="font-family:monospace">@Warmup(iterations = 2, time = 7, timeUnit = TimeUnit.SECONDS)<br>@BenchmarkMode(Mode.AverageTime)<br>@OutputTimeUnit(TimeUnit.NANOSECONDS)<br>@State(Scope.Benchmark)<br>@Fork(value = 1, jvmArgsPrepend = {"-Xms512M", "-Xmx512M"})<br>public class CharSequenceCharAtBenchmark {<br><br> @Param(value = {"ascii", "non-ascii"})<br> public String data;<br><br> private String string;<br> <br> private StringBuilder stringBuilder;<br><br> @Setup(Level.Trial)<br> public void setup() throws Exception {<br> StringBuilder sb = new 
StringBuilder(3152);<br> for (int i=0; i<3152; ++i) {<br> char c = (char) i;<br> if ("ascii".equals(data)) {<br> c = (char) (i & 0x7f);<br> }<br> sb.append(c);<br> }<br><br> string = sb.toString();<br> stringBuilder = sb;<br> }<br><br> @Benchmark<br> public int testString() {<br> String sequence = this.string;<br> int sum = 0;<br> for (int i=0, j=sequence.length(); i<j; ++i) {<br> sum += sequence.charAt(i);<br> }<br> return sum;<br> }<br><br> @Benchmark<br> public int testStringBuilder() {<br> StringBuilder sequence = this.stringBuilder;<br> int sum = 0;<br> for (int i=0, j=sequence.length(); i<j; ++i) {<br> sum += sequence.charAt(i);<br> }<br> return sum;<br> }<br>}</span></div></div><br><div class="gmail_quote gmail_quote_container"><div dir="ltr" class="gmail_attr">On Mon, Jul 21, 2025 at 1:12 PM Roger Riggs <<a href="mailto:roger.riggs@oracle.com">roger.riggs@oracle.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><u></u>
<div>
Hi Brett,<br>
<br>
I'd suggest separate initialization and test methods for the two
cases to get more reliable numbers.<br>
<br>
By using @Trial and using a common field for the test data, I think
you have handicapped C2.<br>
The training runs JMH does to warm up C2 are 'seeing' two different
types for the value of sequence.<br>
    Making the test runs independent will remove doubt about interactions
    due to the test setup.<br>
<br>
Roger<br>
<br>
<div>On 7/21/25 1:43 PM, Brett Okken wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">
<div>>
output labeled as StringBuffer but the jmh creates
StringBuilder.<br>
</div>
<div><br>
</div>
<div>Ugh - sorry about that. But yes - this is about
StringBuilder vs String.</div>
<div><br>
</div>
<div>> I would not be surprised that C2 has more
optimizations for String than for StringBuilder.</div>
<div><br>
</div>
<div>If that were true, it would not surprise me. However, these
tests show the opposite. String is /slower/ than
StringBuilder.</div>
</div>
<br>
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr">On Mon, Jul 21, 2025 at
12:34 PM Roger Riggs <<a href="mailto:roger.riggs@oracle.com" target="_blank">roger.riggs@oracle.com</a>>
wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div> Hi Brett,<br>
<br>
            The labeling of the output is confusing: the test output is
            labeled as StringBuffer, but the jmh benchmark creates a StringBuilder.<br>
(StringBuffer methods are all synchronized and could explain
why they are slower).<br>
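            <br>
            One way to check the synchronization point is with reflection. This is only a sketch: the class name <code>SynchronizedCharAtSketch</code> and the helper <code>isCharAtSynchronized</code> are hypothetical and not part of the benchmark in this thread.<br>

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

// Hypothetical helper class, used only to illustrate the point above.
public class SynchronizedCharAtSketch {

    // Reports whether the public charAt(int) resolved for the given type
    // carries the 'synchronized' modifier.
    public static boolean isCharAtSynchronized(Class<? extends CharSequence> type) {
        try {
            Method charAt = type.getMethod("charAt", int.class);
            return Modifier.isSynchronized(charAt.getModifiers());
        } catch (NoSuchMethodException e) {
            // Every CharSequence implementation has charAt(int), so this is unexpected.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("StringBuffer:  " + isCharAtSynchronized(StringBuffer.class));
        System.out.println("StringBuilder: " + isCharAtSynchronized(StringBuilder.class));
        System.out.println("String:        " + isCharAtSynchronized(String.class));
    }
}
```

            The modifier only says that a monitor is involved; it does not by itself say how much of the cost C2 can remove in a hot loop.<br>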
<br>
Also, I would not be surprised that C2 has more
optimizations for String than for StringBuilder.<br>
<br>
Regards, Roger<br>
<br>
<div>On 7/19/25 6:09 PM, Brett Okken wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">Making sequence a local variable does
improve things (especially for ascii), but a substantial
difference remains. It appears that the performance
difference for ascii goes all the way back to jdk 11.
The difference for non-ascii showed up in jdk 21. I
wonder if this is related to the index checks?<br>
<br>
<span style="font-family:monospace">jdk 11<br>
<br>
Benchmark (data) (source) Mode Cnt Score
Error Units<br>
test ascii String avgt 3 1137.348 ±
12.835 ns/op<br>
test ascii StringBuffer avgt 3 712.874 ±
509.320 ns/op<br>
test non-ascii String avgt 3 668.657 ±
246.550 ns/op<br>
test non-ascii StringBuffer avgt 3 897.344 ±
4353.414 ns/op<br>
<br>
<br>
jdk 17<br>
Benchmark (data) (source) Mode Cnt Score
Error Units<br>
test ascii String avgt 3 1321.497 ±
2107.466 ns/op<br>
test ascii StringBuffer avgt 3 715.936 ±
412.189 ns/op<br>
test non-ascii String avgt 3 722.986 ±
443.389 ns/op<br>
test non-ascii StringBuffer avgt 3 722.787 ±
771.816 ns/op</span><span style="font-family:monospace"><br>
<br>
<br>
jdk 21<br>
Benchmark (data) (source) Mode Cnt Score
Error Units<br>
                      test ascii String avgt 3 1150.301 ± 918.549 ns/op<br>
                      test ascii StringBuffer avgt 3 713.183 ± 543.850 ns/op<br>
                      test non-ascii String avgt 3 4642.667 ± 11481.029 ns/op<br>
                      test non-ascii StringBuffer avgt 3 728.027 ± 936.521 ns/op<br>
<br>
<br>
jdk 25<br>
Benchmark (data) (source) Mode Cnt Score
Error Units<br>
                      test ascii String avgt 3 1184.513 ± 2057.498 ns/op<br>
                      test ascii StringBuffer avgt 3 786.611 ± 411.657 ns/op<br>
                      test non-ascii String avgt 3 4197.585 ± 2761.388 ns/op<br>
                      test non-ascii StringBuffer avgt 3 716.375 ± 815.349 ns/op<br>
<br>
<br>
jdk 26<br>
Benchmark (data) (source) Mode Cnt Score
Error Units<br>
                      test ascii String avgt 3 1107.207 ± 423.072 ns/op<br>
                      test ascii StringBuffer avgt 3 742.780 ± 178.890 ns/op<br>
                      test non-ascii String avgt 3 4043.914 ± 498.439 ns/op<br>
                      test non-ascii StringBuffer avgt 3 712.535 ± 583.255 ns/op</span><br>
</div>
<br>
<br>
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr">On Sat, Jul 19, 2025
at 4:17 PM Chen Liang <<a href="mailto:liangchenblue@gmail.com" target="_blank">liangchenblue@gmail.com</a>>
wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div dir="ltr">
<div dir="ltr">Without looking at C2 IRs, I think
there are a few potential culprits we can look
into:</div>
<div>1. JDK-8351000 and JDK-8351443 updated
StringBuilder</div>
<div>2. Sequence field is read in the loop; I wonder
if making it an explicit immutable local variable
changes anything here.</div>
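                        <div>A minimal sketch of the second point, contrasting a loop that reads the field with one that copies it into a final local first. The class <code>LocalSequenceSketch</code> and its method names are hypothetical; this is not the benchmark itself and makes no claim about what C2 does with either form.</div>

```java
// Hypothetical holder that mirrors the benchmark's non-final 'sequence' field.
public class LocalSequenceSketch {

    private CharSequence sequence;

    public LocalSequenceSketch(CharSequence sequence) {
        this.sequence = sequence;
    }

    // Refers to the field inside the loop, as the original test() method does.
    public int sumViaField() {
        int sum = 0;
        for (int i = 0, j = sequence.length(); i < j; ++i) {
            sum += sequence.charAt(i);
        }
        return sum;
    }

    // Copies the field into a final local once; the loop only touches the local.
    public int sumViaLocal() {
        final CharSequence local = this.sequence;
        int sum = 0;
        for (int i = 0, j = local.length(); i < j; ++i) {
            sum += local.charAt(i);
        }
        return sum;
    }

    public static void main(String[] args) {
        LocalSequenceSketch sketch = new LocalSequenceSketch("abc");
        // Both variants compute the same sum: 'a' + 'b' + 'c' = 294.
        System.out.println(sketch.sumViaField());
        System.out.println(sketch.sumViaLocal());
    }
}
```

                        <div>Both methods return the same value, so any difference measured between them would come from how the JIT treats the field read, not from the work done.</div>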
<br>
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr">On Sat, Jul 19,
2025 at 2:34 PM Brett Okken <<a href="mailto:brett.okken.os@gmail.com" target="_blank">brett.okken.os@gmail.com</a>>
wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">I
was looking at the performance of
StringCharBuffer for various<br>
backing CharSequence types and was surprised to
see a significant<br>
performance difference between String and
StringBuffer. I wrote a<br>
small jmh which shows that the String
implementation of charAt is<br>
significantly slower than StringBuilder. Is this
expected?<br>
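                          <br>
                          For context, a minimal sketch of the StringCharBuffer access pattern: CharBuffer.wrap(CharSequence) returns a read-only buffer over the sequence, and reads on it go back to the wrapped sequence. The class and method names below are hypothetical and only illustrate the pattern.<br>

```java
import java.nio.CharBuffer;

// Hypothetical sketch of reading a CharSequence through a wrapping CharBuffer.
public class WrappedSequenceSketch {

    // Sums the chars using absolute get(int) on the wrapping buffer.
    public static int sumViaBuffer(CharSequence sequence) {
        CharBuffer buffer = CharBuffer.wrap(sequence); // read-only view over the sequence
        int sum = 0;
        for (int i = 0, j = buffer.limit(); i < j; ++i) {
            sum += buffer.get(i);
        }
        return sum;
    }

    // Sums the same chars through the IntStream returned by CharBuffer.chars().
    public static int sumViaChars(CharSequence sequence) {
        return CharBuffer.wrap(sequence).chars().sum();
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder("abc");
        String s = sb.toString();
        // 'a' + 'b' + 'c' = 294 for every combination below.
        System.out.println(sumViaBuffer(s));
        System.out.println(sumViaBuffer(sb));
        System.out.println(sumViaChars(s));
        System.out.println(sumViaChars(sb));
    }
}
```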
<br>
                          Benchmark (data) (source) Mode Cnt Score Error Units<br>
                          CharSequenceCharAtBenchmark.test ascii String avgt 3 2537.311 ± 8952.197 ns/op<br>
                          CharSequenceCharAtBenchmark.test ascii StringBuffer avgt 3 852.004 ± 2532.958 ns/op<br>
                          CharSequenceCharAtBenchmark.test non-ascii String avgt 3 5115.381 ± 13822.592 ns/op<br>
                          CharSequenceCharAtBenchmark.test non-ascii StringBuffer avgt 3 836.230 ± 1154.191 ns/op<br>
<br>
<br>
<br>
@Measurement(iterations = 3, time = 5, timeUnit
= TimeUnit.SECONDS)<br>
@Warmup(iterations = 2, time = 7, timeUnit =
TimeUnit.SECONDS)<br>
@BenchmarkMode(Mode.AverageTime)<br>
@OutputTimeUnit(TimeUnit.NANOSECONDS)<br>
@State(Scope.Benchmark)<br>
@Fork(value = 1, jvmArgsPrepend = {"-Xms512M",
"-Xmx512M"})<br>
public class CharSequenceCharAtBenchmark {<br>
<br>
@Param(value = {"ascii", "non-ascii"})<br>
public String data;<br>
<br>
@Param(value = {"String", "StringBuffer"})<br>
public String source;<br>
<br>
private CharSequence sequence;<br>
<br>
@Setup(Level.Trial)<br>
public void setup() throws Exception {<br>
StringBuilder sb = new
StringBuilder(3152);<br>
for (int i=0; i<3152; ++i) {<br>
char c = (char) i;<br>
if ("ascii".equals(data)) {<br>
c = (char) (i & 0x7f);<br>
}<br>
sb.append(c);<br>
}<br>
<br>
switch(source) {<br>
case "String":<br>
sequence = sb.toString();<br>
break;<br>
case "StringBuffer":<br>
sequence = sb;<br>
break;<br>
default:<br>
throw new
IllegalArgumentException(source);<br>
}<br>
}<br>
<br>
@Benchmark<br>
public int test() {<br>
int sum = 0;<br>
for (int i=0, j=sequence.length();
i<j; ++i) {<br>
sum += sequence.charAt(i);<br>
}<br>
return sum;<br>
}<br>
}<br>
</blockquote>
</div>
</div>
</blockquote>
</div>
</blockquote>
<br>
</div>
</blockquote>
</div>
</blockquote>
<br>
</div>
</blockquote></div>