<!DOCTYPE html><html><head>

<meta http-equiv="Content-Type" content="text/html; charset=utf-8">

  </head>

  <body>


    <div class="moz-cite-prefix">On 10/11/2025 14:20, Liam Miller-Cushon

      wrote:<br>

    </div>

    <blockquote type="cite" cite="mid:CAL4QsgseX-p_03RSkisrrrOH6nUZ=zpjFZmAc8F+vM48n9w_Pw@mail.gmail.com">

      

      <div dir="ltr">

        <div dir="ltr">

          <div class="gmail_quote">

            <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

              <div>I hope this comment was not in my doc?<br>

              </div>

            </blockquote>

            <div><br>

            </div>

            It's a parenthetical in the paragraph starting with

            "Finally, ultimately, the user is probably the most happy

            with an API that directly accepts the units in which they

            are already measuring their string"</div>

        </div>

      </div>

    </blockquote>

    Apologies for the confusion; that was a leftover from a previous

    version. It has been removed now.<br>

    <blockquote type="cite" cite="mid:CAL4QsgseX-p_03RSkisrrrOH6nUZ=zpjFZmAc8F+vM48n9w_Pw@mail.gmail.com">

      <div dir="ltr">

        <div dir="ltr">

          <div class="gmail_quote">

            <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

              <div>


                <p>You mean the _byte size_ of the encoded string

                  (rather than number of code units?)</p>

              </div>

            </blockquote>

            <div>Yes, exactly.</div>

            <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

              <div>

                <p>Something like this might be interesting. That

                  said... if the charset matches, then creating the

                  segment view, then obtaining its byte size is O(1)

                  (e.g. no decoding). And if the charset doesn't match,

                  you'll need to decode anyway -- at which point I'm not

                  sure the array creation is really the bottleneck?</p>

              </div>

            </blockquote>

            <div>Thanks, yes, MemorySegment.ofString seemingly solves

              the case where the charset matches, so it's more a

              question of whether there are performance gains to be had

              for the case where the charset doesn't match. The

              benchmarking I've seen suggests a carefully optimized loop

              over the string is outperforming getBytes(charset).length

              for that case. I can do some more analysis and report

              back.</div>

          </div>

        </div>

      </div>

    </blockquote>

    <p>I believe you. My hunch here would be to separate this one out,

      as it has more to do with the Charset/String API than with memory

      segments.</p>

    <p>E.g. you want an API like:</p>

    <p>String::getNumBytes(Charset)</p>
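    <p>To make the idea concrete, here is a rough sketch of what such a
      method might compute for UTF-8 only, without allocating the
      intermediate array. The names Utf8Length, utf8Length and isPair
      are made up for illustration and are not JDK API. The sketch
      assumes the same replacement behaviour as String.getBytes(Charset),
      where an unpaired surrogate is encoded as a single '?' byte. A real
      implementation would look at the string's internal representation
      rather than calling charAt, so this says nothing about
      performance.</p>

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not a JDK API: counts the bytes a String would
// occupy in UTF-8 without materializing the encoded byte[].
final class Utf8Length {

    static int utf8Length(String s) {
        int n = s.length();
        long bytes = 0; // long, since 3 bytes per char can exceed int range
        for (int i = 0; i != n; i++) {
            char c = s.charAt(i);
            if (0x80 > c) {
                bytes += 1;                  // ASCII
            } else if (0x800 > c) {
                bytes += 2;                  // U+0080..U+07FF
            } else if (!Character.isSurrogate(c)) {
                bytes += 3;                  // rest of the BMP
            } else if (isPair(s, i)) {
                bytes += 4;                  // supplementary code point
                i++;                         // skip the low surrogate
            } else {
                bytes += 1;                  // unpaired surrogate, assumed to become '?'
            }
        }
        return Math.toIntExact(bytes);       // throws if the result does not fit in an int
    }

    // True if the chars at i and i + 1 form a valid surrogate pair.
    private static boolean isPair(String s, int i) {
        if (i + 1 == s.length()) {
            return false;
        }
        return Character.isSurrogatePair(s.charAt(i), s.charAt(i + 1));
    }

    public static void main(String[] args) {
        String s = "h\u00e9llo \uD83D\uDE00";
        // Both numbers should agree; the second one allocates a byte[].
        System.out.println(utf8Length(s));
        System.out.println(s.getBytes(StandardCharsets.UTF_8).length);
    }
}
```

    <p>A String::getNumBytes(Charset) along these lines would give the
      caller the byte size up front, regardless of where the bytes end
      up being written.</p>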

    <p>Whether this API exists or not seems orthogonal to the

      improvements described in the documents I shared.</p>

    <p>Cheers<br>

      Maurizio<br>

    </p>


  </body>

</html>