<div dir="ltr"><div>Splitting the benchmark into separate test methods for each representation did remove the difference for the non-ASCII String case on the JDK 21+ releases.</div><div>However, ASCII (Latin-1) strings are still slower with String than with StringBuilder.</div><div><br></div><div>How, then, does C2 handle something like StringCharBuffer, which wraps a CharSequence for all of its get operations:<br><a href="https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/java/nio/StringCharBuffer.java#L88-L97">https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/java/nio/StringCharBuffer.java#L88-L97</a></div><div><br></div><div>Which is then used by CharBufferSpliterator:<br><a href="https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/java/nio/CharBufferSpliterator.java">https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/java/nio/CharBufferSpliterator.java</a></div><div><br>And by many CharsetEncoder implementations when either the source or the destination is not backed by an array (which would be the case if a StringCharBuffer is used):<br><a href="https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/sun/nio/cs/UTF_8.java#L517">https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/sun/nio/cs/UTF_8.java#L517</a></div><div><a href="https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/sun/nio/cs/UnicodeEncoder.java#L81">https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/sun/nio/cs/UnicodeEncoder.java#L81</a></div><div><span style="font-family:monospace"><br></span></div><div><span style="font-family:monospace"><br></span></div><div><span style="font-family:monospace"><br></span></div><div><span style="font-family:monospace">jdk 17<br>Benchmark                                         (data)  Mode  Cnt     Score     Error  Units<br>CharSequenceCharAtBenchmark.testString             ascii  avgt    3  1429.358 ± 623.424  
ns/op<br>CharSequenceCharAtBenchmark.testString         non-ascii  avgt    3   705.282 ± 233.453  ns/op<br>CharSequenceCharAtBenchmark.testStringBuilder      ascii  avgt    3   724.138 ± 267.346  ns/op<br>CharSequenceCharAtBenchmark.testStringBuilder  non-ascii  avgt    3   718.357 ± 864.066  ns/op<br><br>jdk 21<br>Benchmark                                         (data)  Mode  Cnt     Score     Error  Units<br>CharSequenceCharAtBenchmark.testString             ascii  avgt    3  1087.024 ± 235.082  ns/op<br>CharSequenceCharAtBenchmark.testString         non-ascii  avgt    3   687.520 ± 747.532  ns/op<br>CharSequenceCharAtBenchmark.testStringBuilder      ascii  avgt    3   672.802 ±  29.740  ns/op<br>CharSequenceCharAtBenchmark.testStringBuilder  non-ascii  avgt    3   689.964 ± 791.175  ns/op<br><br>jdk 25<br>Benchmark                                         (data)  Mode  Cnt     Score      Error  Units<br>CharSequenceCharAtBenchmark.testString             ascii  avgt    3  1176.057 ± 1157.979  ns/op<br>CharSequenceCharAtBenchmark.testString         non-ascii  avgt    3   697.382 ±  231.144  ns/op<br>CharSequenceCharAtBenchmark.testStringBuilder      ascii  avgt    3   692.970 ±  105.112  ns/op<br>CharSequenceCharAtBenchmark.testStringBuilder  non-ascii  avgt    3   703.178 ±  446.019  ns/op<br><br>jdk 26<br>Benchmark                                         (data)  Mode  Cnt     Score     Error  Units<br>CharSequenceCharAtBenchmark.testString             ascii  avgt    3  1132.971 ± 350.786  ns/op<br>CharSequenceCharAtBenchmark.testString         non-ascii  avgt    3   688.201 ± 175.797  ns/op<br>CharSequenceCharAtBenchmark.testStringBuilder      ascii  avgt    3   704.380 ± 101.763  ns/op<br>CharSequenceCharAtBenchmark.testStringBuilder  non-ascii  avgt    3   673.622 ±  51.462  ns/op</span></div><div><br></div><div><br></div><div><span style="font-family:monospace">@Warmup(iterations = 2, time = 7, timeUnit = 
TimeUnit.SECONDS)<br>@BenchmarkMode(Mode.AverageTime)<br>@OutputTimeUnit(TimeUnit.NANOSECONDS)<br>@State(Scope.Benchmark)<br>@Fork(value = 1, jvmArgsPrepend = {"-Xms512M", "-Xmx512M"})<br>public class CharSequenceCharAtBenchmark {<br><br>    @Param(value = {"ascii", "non-ascii"})<br>    public String data;<br><br>    private String string;<br>    <br>    private StringBuilder stringBuilder;<br><br>    @Setup(Level.Trial)<br>    public void setup() throws Exception {<br>        StringBuilder sb = new StringBuilder(3152);<br>        for (int i=0; i<3152; ++i) {<br>            char c = (char) i;<br>            if ("ascii".equals(data)) {<br>                c = (char) (i & 0x7f);<br>            }<br>            sb.append(c);<br>        }<br><br>        string = sb.toString();<br>        stringBuilder = sb;<br>    }<br><br>    @Benchmark<br>    public int testString() {<br>        String sequence = this.string;<br>        int sum = 0;<br>        for (int i=0, j=sequence.length(); i<j; ++i) {<br>            sum += sequence.charAt(i);<br>        }<br>        return sum;<br>    }<br><br>    @Benchmark<br>    public int testStringBuilder() {<br>        StringBuilder sequence = this.stringBuilder;<br>        int sum = 0;<br>        for (int i=0, j=sequence.length(); i<j; ++i) {<br>            sum += sequence.charAt(i);<br>        }<br>        return sum;<br>    }<br>}</span></div></div><br><div class="gmail_quote gmail_quote_container"><div dir="ltr" class="gmail_attr">On Mon, Jul 21, 2025 at 1:12 PM Roger Riggs <<a href="mailto:roger.riggs@oracle.com">roger.riggs@oracle.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><u></u>

  
  <div>
    Hi Brett,<br>
    <br>
    I'd suggest separate initialization and test methods for the two
    cases to get more reliable numbers.<br>
    <br>
    By using @Setup(Level.Trial) and a common field for the test data, I think
    you have handicapped C2.<br>
    The training runs JMH does to warm up C2 are 'seeing' two different
    types for the value of sequence.<br>
    Making the test runs independent will remove doubt about interactions
    due to the test setup.<br>
    <br>
    Roger<br>
    <br>
    <div>On 7/21/25 1:43 PM, Brett Okken wrote:<br>
    </div>
    <blockquote type="cite">
      
      <div dir="ltr">
        <div>> 
          output labeled as StringBuffer but the jmh creates
          StringBuilder.<br>
        </div>
        <div><br>
        </div>
        <div>Ugh - sorry about that. But yes - this is about
          StringBuilder vs String.</div>
        <div><br>
        </div>
        <div>> I would not be surprised that C2 has more
          optimizations for String than for StringBuilder.</div>
        <div><br>
        </div>
        <div>If that were true, it would not surprise me. However, these
          tests show the opposite. String is /slower/ than
          StringBuilder.</div>
      </div>
      <br>
      <div class="gmail_quote">
        <div dir="ltr" class="gmail_attr">On Mon, Jul 21, 2025 at
          12:34 PM Roger Riggs <<a href="mailto:roger.riggs@oracle.com" target="_blank">roger.riggs@oracle.com</a>>
          wrote:<br>
        </div>
        <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
          <div> Hi Brett,<br>
            <br>
            The labeling of the output is confusing: the test output is
            labeled as StringBuffer but the jmh creates StringBuilder.<br>
            (StringBuffer methods are all synchronized and could explain
            why they are slower).<br>
            <br>
            Also, I would not be surprised that C2 has more
            optimizations for String than for StringBuilder.<br>
            <br>
            Regards, Roger<br>
            <br>
            <div>On 7/19/25 6:09 PM, Brett Okken wrote:<br>
            </div>
            <blockquote type="cite">
              <div dir="ltr">Making sequence a local variable does
                improve things (especially for ascii), but a substantial
                difference remains. It appears that the performance
                difference for ascii goes all the way back to jdk 11.
                The difference for non-ascii showed up in jdk 21. I
                wonder if this is related to the index checks?<br>
                <br>
                <span style="font-family:monospace">jdk 11<br>
                  <br>
                  Benchmark  (data)      (source)  Mode  Cnt     Score  
                     Error  Units<br>
                  test        ascii        String  avgt    3  1137.348 ±
                    12.835  ns/op<br>
                  test        ascii  StringBuffer  avgt    3   712.874 ±
                   509.320  ns/op<br>
                  test    non-ascii        String  avgt    3   668.657 ±
                   246.550  ns/op<br>
                  test    non-ascii  StringBuffer  avgt    3   897.344 ±
                  4353.414  ns/op<br>
                  <br>
                  <br>
                  jdk 17<br>
                  Benchmark  (data)      (source)  Mode  Cnt     Score  
                     Error  Units<br>
                  test        ascii        String  avgt    3  1321.497 ±
                  2107.466  ns/op<br>
                  test        ascii  StringBuffer  avgt    3   715.936 ±
                   412.189  ns/op<br>
                  test    non-ascii        String  avgt    3   722.986 ±
                   443.389  ns/op<br>
                  test    non-ascii  StringBuffer  avgt    3   722.787 ±
                   771.816  ns/op</span><span style="font-family:monospace"><br>
                  <br>
                  <br>
                  jdk 21<br>
                  Benchmark  (data)      (source)  Mode  Cnt     Score  
                      Error  Units<br>
                  test        ascii        String  avgt    3  1150.301
                  ±   918.549  ns/op<br>
                  test        ascii  StringBuffer  avgt    3   713.183
                  ±   543.850  ns/op<br>
                  test    non-ascii        String  avgt    3  4642.667
                  ± 11481.029  ns/op<br>
                  test    non-ascii  StringBuffer  avgt    3   728.027
                  ±   936.521  ns/op<br>
                  <br>
                  <br>
                  jdk 25<br>
                  Benchmark  (data)      (source)  Mode  Cnt     Score  
                     Error  Units<br>
                  test        ascii        String  avgt    3  1184.513
                  ± 2057.498  ns/op<br>
                  test        ascii  StringBuffer  avgt    3   786.611
                  ±  411.657  ns/op<br>
                  test    non-ascii        String  avgt    3  4197.585
                  ± 2761.388  ns/op<br>
                  test    non-ascii  StringBuffer  avgt    3   716.375
                  ±  815.349  ns/op<br>
                  <br>
                  <br>
                  jdk 26<br>
                  Benchmark  (data)      (source)  Mode  Cnt     Score  
                    Error  Units<br>
                  test        ascii        String  avgt    3  1107.207
                  ± 423.072  ns/op<br>
                  test        ascii  StringBuffer  avgt    3   742.780
                  ± 178.890  ns/op<br>
                  test    non-ascii        String  avgt    3  4043.914
                  ± 498.439  ns/op<br>
                  test    non-ascii  StringBuffer  avgt    3   712.535
                  ± 583.255  ns/op</span><br>
              </div>
              <br>
              <br>
              <div class="gmail_quote">
                <div dir="ltr" class="gmail_attr">On Sat, Jul 19, 2025
                  at 4:17 PM Chen Liang <<a href="mailto:liangchenblue@gmail.com" target="_blank">liangchenblue@gmail.com</a>>
                  wrote:<br>
                </div>
                <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
                  <div dir="ltr">
                    <div dir="ltr">Without looking at C2 IRs, I think
                      there are a few potential culprits we can look
                      into:</div>
                    <div>1. JDK-8351000 and JDK-8351443 updated
                      StringBuilder</div>
                    <div>2. Sequence field is read in the loop; I wonder
                      if making it an explicit immutable local variable
                      changes anything here.</div>
                    <br>
                    <div class="gmail_quote">
                      <div dir="ltr" class="gmail_attr">On Sat, Jul 19,
                        2025 at 2:34 PM Brett Okken <<a href="mailto:brett.okken.os@gmail.com" target="_blank">brett.okken.os@gmail.com</a>>
                        wrote:<br>
                      </div>
                      <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">I
                        was looking at the performance of
                        StringCharBuffer for various<br>
                        backing CharSequence types and was surprised to
                        see a significant<br>
                        performance difference between String and
                        StringBuffer. I wrote a<br>
                        small jmh which shows that the String
                        implementation of charAt is<br>
                        significantly slower than StringBuilder. Is this
                        expected?<br>
                        <br>
                        Benchmark                             (data)      (source)  Mode  Cnt     Score       Error  Units<br>
                        CharSequenceCharAtBenchmark.test       ascii        String  avgt    3  2537.311 ±  8952.197  ns/op<br>
                        CharSequenceCharAtBenchmark.test       ascii  StringBuffer  avgt    3   852.004 ±  2532.958  ns/op<br>
                        CharSequenceCharAtBenchmark.test   non-ascii        String  avgt    3  5115.381 ± 13822.592  ns/op<br>
                        CharSequenceCharAtBenchmark.test   non-ascii  StringBuffer  avgt    3   836.230 ±  1154.191  ns/op<br>
                        <br>
                        <br>
                        <br>
                        @Measurement(iterations = 3, time = 5, timeUnit
                        = TimeUnit.SECONDS)<br>
                        @Warmup(iterations = 2, time = 7, timeUnit =
                        TimeUnit.SECONDS)<br>
                        @BenchmarkMode(Mode.AverageTime)<br>
                        @OutputTimeUnit(TimeUnit.NANOSECONDS)<br>
                        @State(Scope.Benchmark)<br>
                        @Fork(value = 1, jvmArgsPrepend = {"-Xms512M",
                        "-Xmx512M"})<br>
                        public class CharSequenceCharAtBenchmark {<br>
                        <br>
                            @Param(value = {"ascii", "non-ascii"})<br>
                            public String data;<br>
                        <br>
                            @Param(value = {"String", "StringBuffer"})<br>
                            public String source;<br>
                        <br>
                            private CharSequence sequence;<br>
                        <br>
                            @Setup(Level.Trial)<br>
                            public void setup() throws Exception {<br>
                                StringBuilder sb = new
                        StringBuilder(3152);<br>
                                for (int i=0; i<3152; ++i) {<br>
                                    char c = (char) i;<br>
                                    if ("ascii".equals(data)) {<br>
                                        c = (char) (i & 0x7f);<br>
                                    }<br>
                                    sb.append(c);<br>
                                }<br>
                        <br>
                                switch(source) {<br>
                                    case "String":<br>
                                        sequence = sb.toString();<br>
                                        break;<br>
                                    case "StringBuffer":<br>
                                        sequence = sb;<br>
                                        break;<br>
                                    default:<br>
                                        throw new
                        IllegalArgumentException(source);<br>
                                }<br>
                            }<br>
                        <br>
                            @Benchmark<br>
                            public int test() {<br>
                                int sum = 0;<br>
                                for (int i=0, j=sequence.length();
                        i<j; ++i) {<br>
                                    sum += sequence.charAt(i);<br>
                                }<br>
                                return sum;<br>
                            }<br>
                        }<br>
                      </blockquote>
                    </div>
                  </div>
                </blockquote>
              </div>
            </blockquote>
            <br>
          </div>
        </blockquote>
      </div>
    </blockquote>
    <br>
  </div>

</blockquote></div>
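For anyone who wants to poke at the StringCharBuffer question from the top of this message, here is a minimal sketch of the access pattern. It is not a JMH benchmark and says nothing about what C2 actually does. It only shows the two read paths side by side: calling charAt directly on the CharSequence, and going through the read-only buffer returned by CharBuffer.wrap(CharSequence), whose absolute get(int) reads from the wrapped sequence. The class and method names are hypothetical and do not appear in the benchmark above.

```java
import java.nio.CharBuffer;

// Hypothetical helper for illustration only; not part of the benchmark in this thread.
final class WrappedCharSequenceSum {

    // Sums all chars by calling charAt directly on the CharSequence.
    static int sumDirect(CharSequence sequence) {
        int sum = 0;
        for (int i = 0, n = sequence.length(); i < n; ++i) {
            sum += sequence.charAt(i);
        }
        return sum;
    }

    // Sums all chars through a CharBuffer view of the same sequence.
    // CharBuffer.wrap(CharSequence) returns a read-only buffer with
    // position 0 and limit equal to the sequence length.
    static int sumViaBuffer(CharSequence sequence) {
        CharBuffer buffer = CharBuffer.wrap(sequence);
        int sum = 0;
        for (int i = 0, n = buffer.limit(); i < n; ++i) {
            sum += buffer.get(i); // absolute get, bounds-checked against the limit
        }
        return sum;
    }

    public static void main(String[] args) {
        // Same ASCII data shape as the "ascii" case in the benchmark above.
        StringBuilder sb = new StringBuilder(3152);
        for (int i = 0; i < 3152; ++i) {
            sb.append((char) (i & 0x7f));
        }
        String s = sb.toString();
        System.out.println(sumDirect(s) == sumViaBuffer(s));
        System.out.println(sumDirect(sb) == sumViaBuffer(sb));
    }
}
```

To turn this into a measurement, each of the two methods would presumably go into its own @Benchmark method with its own field, along the lines of the separate test methods used above, so that String and StringBuilder inputs do not share a call site.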