<!DOCTYPE html><html><head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body>
<div class="moz-cite-prefix">On 12/11/2025 10:02, Liam Miller-Cushon
wrote:<br>
</div>
<blockquote type="cite" cite="mid:CAL4Qsgu_53bsnuG0efEQYyQKsCT0xuMdsEDq=kt0507XbkzOWA@mail.gmail.com">
<div dir="ltr">Thanks, yes, I think string concat is a good
analogy.<br>
<br>
Thinking about this more, isn't this use-case an example where
the proposed MemorySegment::ofString approach wouldn't always
offer the best possible performance? In the case where the
internal string buffer isn't compatible with the requested
charset it has to make an intermediate copy. In theory with the
alternative of a setString or copy method that took a String and
directly wrote it to the output, the intermediate copy could be
avoided.</div>
</blockquote>
<p>Let's leave MS::ofString aside for this discussion (as I agree
that wouldn't be optimal for this use case).</p>
<p>I believe what you mean here is that if I have a string and I
want to copy it to a destination segment, I could either:</p>
<p>* if the string buffer is compatible, just bulk-copy that buffer
into the target segment<br>
* if the string buffer is not compatible, encode the string
_directly_ into the target segment</p>
<p>Correct? If so, I tend to agree this would be slightly
preferable, as we'd be touching the data only once. And, I believe
this could also be done with the existing setString method?</p>
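<p>(To make the setString idea concrete, here is a minimal,
illustrative sketch -- the class name and the worst-case sizing are
just for illustration, not a proposal:)</p>
<pre>
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.nio.charset.StandardCharsets;

// Illustrative only: encode a String straight into a target segment with the
// existing MemorySegment::setString, instead of going through an intermediate
// byte[] from String::getBytes. The sizing is a worst-case guess: UTF-8 needs
// at most 3 bytes per Java char, plus 1 for the NUL terminator that setString
// appends.
class SetStringSketch {
    public static void main(String[] args) {
        String s = "hello, panama";
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment target = arena.allocate(s.length() * 3L + 1);
            target.setString(0, s, StandardCharsets.UTF_8); // encode in place
            System.out.println(target.getString(0, StandardCharsets.UTF_8));
        }
    }
}
</pre>
<p>(With an exact encoded-length computation, the worst-case sizing
above could of course be replaced by the exact size -- which ties
back to the other part of this thread.)</p>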
<p>Cheers<br>
Maurizio<br>
</p>
<blockquote type="cite" cite="mid:CAL4Qsgu_53bsnuG0efEQYyQKsCT0xuMdsEDq=kt0507XbkzOWA@mail.gmail.com"><br>
<div class="gmail_quote gmail_quote_container">
<div dir="ltr" class="gmail_attr">On Tue, Nov 11, 2025 at
6:18 PM Maurizio Cimadamore &lt;<a href="mailto:maurizio.cimadamore@oracle.com" moz-do-not-send="true" class="moz-txt-link-freetext">maurizio.cimadamore@oracle.com</a>&gt;
wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div>
<p>Thanks for the detailed reply.</p>
<p>For resizing, I tend to agree with you -- the problem is
that if you don't size correctly upfront, then you will
have to pay the cost (potentially multiple times) to
allocate a bigger buffer and move all the contents over
there.</p>
<p>A bit like how string concat has evolved, where we now
have ways to "guess" the size of each of the concatenation
arguments so we can correctly size the byte[] buffer we
create to hold the result of the concatenation.</p>
<p>In those cases, I agree, paying a small-ish cost to be
able to estimate the size of a sub-element of an
allocation goes a long way in making everything less
dynamic and more deterministic.</p>
<p>Maurizio</p>
<div>On 11/11/2025 17:04, Liam Miller-Cushon wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">
<div class="gmail_quote">
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div>
<p>It seems to me that in this case encoding and
length travel together? E.g. you need to encode
anyway, at which point you also know the byte
size?</p>
<p>(I'm a bit unsure that here there's anything to
be gained by the method you proposed?)</p>
<p>Do you have use cases where you don't want to
decode, you just want to know the byte length?</p>
</div>
</blockquote>
<div>The main use-cases I've seen do want both the
encoding and the length.</div>
<div><br>
</div>
<div>I think there is still a benefit to a fast way to
get the length first. An alternative is to
accumulate into a temporary buffer, and potentially
have to resize it. If there are gigabytes of data
it's expensive to have to make another copy. Knowing
the encoded length up-front allows exactly sizing
the output buffer and avoids the temporary buffer.</div>
<div><br>
</div>
<div>Some slightly more concrete examples:</div>
<div><br>
</div>
<div>Building a byte[] with all of the content of a
lot of data: sizing the byte[] requires knowing the
sum of all the lengths you want to put into it first,
before encoding the strings into it.<br>
<br>
Streaming serialization to the network: the top
level has to know the length of the transitive
contents that it's going to be writing out in the
nested structures. The actual output is streamed; it
never constructs a byte[] of the complete data in
this scenario.</div>
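<div><br>
</div>
<div>(A minimal sketch of that exact-sizing pattern -- the
utf8Length helper below is hypothetical, not an existing JDK
method, and it assumes strings without unpaired surrogates:)</div>
<pre>
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;
import java.util.List;

// Illustrative only: compute the exact UTF-8 size of each string without
// encoding it, allocate the output once, then encode each string directly
// into that buffer (no growable temporary that may need to be resized).
class ExactSizingSketch {

    // Exact UTF-8 byte length of a String, without allocating a byte[].
    // Assumes well-formed strings (no unpaired surrogates).
    static long utf8Length(String s) {
        long bytes = 0;
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (cp < 0x80)          bytes += 1;
            else if (cp < 0x800)    bytes += 2;
            else if (cp < 0x10000)  bytes += 3;
            else                    bytes += 4;
            i += Character.charCount(cp);
        }
        return bytes;
    }

    public static void main(String[] args) {
        var parts = List.of("field-one", "field-two", "some π and ≈ bytes");
        long total = parts.stream().mapToLong(ExactSizingSketch::utf8Length).sum();

        // One exactly-sized buffer instead of accumulating into a temporary
        // buffer that may have to be resized and re-copied.
        ByteBuffer out = ByteBuffer.allocate(Math.toIntExact(total));
        CharsetEncoder encoder = StandardCharsets.UTF_8.newEncoder();
        for (String p : parts) {
            encoder.reset();
            encoder.encode(CharBuffer.wrap(p), out, true); // encode in place
            encoder.flush(out);
        }
        System.out.println(out.position() + " of " + total + " bytes written");
    }
}
</pre>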
<div><br>
</div>
<div>(There are also some public protobuf APIs that
just return an encoded byte length for the data, but
that is a less performance sensitive use-case.)</div>
</div>
</div>
</blockquote>
</div>
</blockquote>
</div>
</blockquote>
</body>
</html>