<!DOCTYPE html><html><head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
  </head>
  <body>
    <p><br>
    </p>
    <div class="moz-cite-prefix">On 02/10/2024 23:58, Anastasiya
      Lisitskaya wrote:<br>
    </div>
    <blockquote type="cite" cite="mid:CAD4WG-tpRrN037wY+A2agiqSZTjkR3vjkNkw9ejDrQ+iK80s6g@mail.gmail.com">
      
      <div dir="ltr">
        <div dir="ltr">Hi,
          <div><br>
          </div>
          <div>It is very helpful! </div>
          <div><br>
          </div>
          So, if I want to use data from the heap without extra copying
          to off-heap (native MemorySegment), should using String be
          avoided? It seems there is no way to use a String without
          copying, as we can't guarantee a trailing null terminator.<br>
        </div>
      </div>
    </blockquote>
    I'm afraid that's the case. The Java String API does not concern
    itself with string terminators because, in Java, every string carries
    its own length. In C that's not the case, so in general you need to
    append a terminator, and that will involve some degree of copying.<br>
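    <p>A minimal sketch of what that copy could look like if the data
      has to stay on the heap: the UTF-8 bytes are copied once into an
      array that is one byte longer, and the extra slot (zero-filled by
      Java) acts as the terminator. The class and method names are
      made up for illustration, and whether such a segment suits a given
      downcall depends on the linker options in use.</p>
<pre>
```java
import java.lang.foreign.MemorySegment;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class HeapCString {
    // Hypothetical helper: wraps the UTF-8 bytes of text plus one trailing zero byte.
    static MemorySegment terminatedHeapSegment(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        // copyOf pads the new array with zeros, so the last byte is the terminator.
        byte[] terminated = Arrays.copyOf(utf8, utf8.length + 1);
        return MemorySegment.ofArray(terminated);
    }

    public static void main(String[] args) {
        MemorySegment segment = terminatedHeapSegment("abc");
        System.out.println(segment.byteSize());   // 3 string bytes + 1 terminator
        System.out.println(segment.getString(0)); // reads back up to the terminator
    }
}
```
</pre>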
    <blockquote type="cite" cite="mid:CAD4WG-tpRrN037wY+A2agiqSZTjkR3vjkNkw9ejDrQ+iK80s6g@mail.gmail.com">
      <div dir="ltr">
        <div dir="ltr"><br>
          One thing still concerns me: is processing an unterminated
          string unpredictable? Only one test from my suite fails
          (returning this extra symbol or crashing).</div>
      </div>
    </blockquote>
    <p>Processing an unterminated string leads to undefined behavior.
      Effectively, your program is scanning _past_ the contents of your
      string, in search of a zero. Because of the way some allocators
      work (e.g. malloc), it is likely that a zero will be found more or
      less where expected. But that behavior is OS/platform dependent
      and absolutely cannot be relied upon.</p>
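    <p>For contrast, a small sketch of the same mistake made entirely on
      the Java side. Here the segment has known bounds, so the scan for
      a terminator is expected to stop with an exception instead of
      running into adjacent memory. Native code receiving a raw pointer
      has no such bounds to check. The class and method names are
      illustrative only.</p>
<pre>
```java
import java.lang.foreign.MemorySegment;
import java.nio.charset.StandardCharsets;

final class UnterminatedDemo {
    // Returns true if reading a string from the raw (unterminated) bytes fails.
    static boolean failsWithoutTerminator(String text) {
        MemorySegment raw = MemorySegment.ofArray(text.getBytes(StandardCharsets.UTF_8));
        try {
            raw.getString(0); // looks for a zero byte inside the segment bounds
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;      // no terminator within the segment
        }
    }

    public static void main(String[] args) {
        System.out.println(failsWithoutTerminator("abc"));
    }
}
```
</pre>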
    <p>Maurizio<br>
    </p>
    <blockquote type="cite" cite="mid:CAD4WG-tpRrN037wY+A2agiqSZTjkR3vjkNkw9ejDrQ+iK80s6g@mail.gmail.com">
      <div dir="ltr">
        <div dir="ltr">
          <div><br>
          </div>
          <div>Many thanks!<br>
          </div>
        </div>
        <br>
        <div class="gmail_quote">
          <div dir="ltr" class="gmail_attr">Wed, 2 Oct 2024 at 13:11,
            Maurizio Cimadamore <<a href="mailto:maurizio.cimadamore@oracle.com" moz-do-not-send="true" class="moz-txt-link-freetext">maurizio.cimadamore@oracle.com</a>>:<br>
          </div>
          <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
            <div>
              <p>Hi, some replies below:<br>
              </p>
              <div>On 01/10/2024 20:40, Anastasiya Lisitskaya wrote:<br>
              </div>
              <blockquote type="cite">
                <div dir="ltr">
                  <div><span style="font-family:arial,sans-serif;color:rgb(33,37,41)">Hi,</span><br>
                  </div>
                  <div><span style="color:rgb(33,37,41)"><font face="arial, sans-serif"><br>
                      </font></span></div>
                  <div><font face="arial, sans-serif"><span style="color:rgb(33,37,41)">I'm trying to use
                        the FFM API </span></font>(jdk 22)<font face="arial, sans-serif"><span style="color:rgb(33,37,41)"> to call my C++
                        method and I need to pass a text</span></font><span style="color:rgb(33,37,41);font-family:arial,sans-serif"> </span><span style="color:rgb(33,37,41);font-family:arial,sans-serif">(java String)</span><font face="arial, sans-serif"><span style="color:rgb(33,37,41)"> and receive a text
                        response</span></font><span style="color:rgb(33,37,41);font-family:arial,sans-serif">. While
                      implementing this, I encountered several issues:</span>
                    <ol style="box-sizing:border-box;padding-left:2rem;margin-top:0px;margin-bottom:1rem;color:rgb(33,37,41)">
                      <li style="box-sizing:border-box">
                        <p style="box-sizing:border-box;margin-top:0px;margin-bottom:1rem"><font face="arial, sans-serif">What are the best
                            practices for defining <code style="box-sizing:border-box">newSize</code> for
                            use in the <code style="box-sizing:border-box">reinterpret(long
                              newSize)</code> method? Can I use
                            constants like <code style="box-sizing:border-box">Long.MAX_VALUE</code> or <code style="box-sizing:border-box">Integer.MAX_VALUE</code> as <code style="box-sizing:border-box">newSize</code>,
                            or could that cause some problems?</font></p>
                      </li>
                    </ol>
                  </div>
                </div>
              </blockquote>
              <p><font face="arial, sans-serif">If the size of the
                  returned string (I assume it's a char*) is known, then
                  use that size. Otherwise, use Long.MAX_VALUE.
                  MemorySegment::getString will read the string bytes up
                  to the null terminator.</font></p>
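              <p>A rough sketch of the unknown-size case. Since no real
                native function is available here, the returned char* is
                simulated by taking the address of a string allocated
                from an arena. The class and method names are
                hypothetical. Note that reinterpret is a restricted
                method, so the JVM may print a native-access warning.</p>
<pre>
```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;

final class ReadCString {
    // Reads a null-terminated UTF-8 string from a zero-length pointer segment,
    // as one might for a char* returned by a downcall.
    static String readUnbounded(MemorySegment pointer) {
        // The new size is trusted, not checked; getString stops at the first zero byte.
        return pointer.reinterpret(Long.MAX_VALUE).getString(0);
    }

    // Stand-in for a native call: allocates a terminated string and reads it back
    // through a bare address, which carries no size information.
    static String roundTrip(String text) {
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment owned = arena.allocateFrom(text);
            MemorySegment pointer = MemorySegment.ofAddress(owned.address());
            return readUnbounded(pointer);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("hello"));
    }
}
```
</pre>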
              <p><font face="arial, sans-serif"><br>
                </font></p>
              <blockquote type="cite">
                <div dir="ltr">
                  <div>
                    <div>
                      <ol style="box-sizing:border-box;padding-left:2rem;margin-top:0px;margin-bottom:1rem;color:rgb(33,37,41)">
                        <li style="box-sizing:border-box">
                          <p style="box-sizing:border-box;margin-top:0px;margin-bottom:1rem"><font face="arial, sans-serif">When I tried to
                              use in-heap <code style="box-sizing:border-box">MemorySegment</code> with
                              the <code style="box-sizing:border-box">Linker.Option.critical(true)</code> 
                              and passed <code style="box-sizing:border-box">MemorySegment.ofArray(text.getBytes())</code>,
                              I started getting an extra symbol such as SOH in
                              the response. What am I doing wrong?
                              (Sample snippets listed below). Changing </font><span style="font-family:monospace">newSize</span> value
                            in <span style="font-family:monospace">reinterpret(long
                              newSize)</span> doesn't help</p>
                        </li>
                      </ol>
                    </div>
                  </div>
                </div>
              </blockquote>
              <blockquote type="cite">
                <div dir="ltr">
                  <div>
                    <div>
                      <ol style="box-sizing:border-box;padding-left:2rem;margin-top:0px;margin-bottom:1rem;color:rgb(33,37,41)">
                        <li style="box-sizing:border-box">
                          <div>If I inline <span style="color:rgb(0,0,0);font-family:"JetBrains Mono",monospace">MemorySegment.</span><span style="color:rgb(0,0,0);font-family:"JetBrains Mono",monospace;font-style:italic">ofArray</span><span style="color:rgb(0,0,0);font-family:"JetBrains Mono",monospace">(text.getBytes())</span> into <font color="#000000"><font face="JetBrains Mono, monospace">invokeExact, </font><font face="arial, sans-serif">I </font></font><span style="color:rgb(34,34,34)">expected : </span><span style="color:rgb(34,34,34);font-family:"JetBrains Mono",monospace"><font color="#000000">"мое все 123 аи92", but</font></span> got:</div>
                          <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">uncaught
                            exception:<br>
                                address -> 0x60000120d710<br>
                                what() -> "util/charset/wide.h:366:
                            failed to decode UTF-8 string at pos 25 in
                            string "\xD0\x9C\xD0\xBE\xD1\x91
                            \xD0\xB2\xD1\x81\xD1\x91 123
                            \xD0\x90\xD0\23092\1\xCF\xFD\xBD_""<br>
                                type -> yexception</blockquote>
                        </li>
                      </ol>
                    </div>
                    <div>I'm definitely doing something wrong. Please
                      help me figure it out and understand. Thanks! <br>
                    </div>
                  </div>
                </div>
              </blockquote>
              <p>I think your problem is that the segment you are
                creating has no null terminator at the end.</p>
              <p>E.g. you take a Java string, get its byte array, and
                turn the byte array into a segment.</p>
              <p>To work with strings safely, I suggest you use
                String-accepting allocation/accessor methods. Either
                Arena::allocateFrom(String), or
                MemorySegment::setString. Those will add the required
                terminator.</p>
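              <p>As a quick illustration of the difference, the sketch
                below compares the size of a segment wrapping the raw
                bytes with one produced by Arena::allocateFrom(String).
                The class and method names are made up for the
                example.</p>
<pre>
```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;
import java.nio.charset.StandardCharsets;

final class TerminatorDemo {
    // Heap segment over the raw UTF-8 bytes: nothing is appended.
    static long rawSize(String text) {
        return MemorySegment.ofArray(text.getBytes(StandardCharsets.UTF_8)).byteSize();
    }

    // Native segment from allocateFrom(String): UTF-8 bytes plus a trailing zero.
    static long terminatedSize(String text) {
        try (Arena arena = Arena.ofConfined()) {
            return arena.allocateFrom(text).byteSize();
        }
    }

    // Last byte of the allocateFrom(String) segment, expected to be the terminator.
    static byte lastByte(String text) {
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment segment = arena.allocateFrom(text);
            return segment.get(ValueLayout.JAVA_BYTE, segment.byteSize() - 1);
        }
    }

    public static void main(String[] args) {
        String text = "abc";
        System.out.println(rawSize(text));        // bytes of the string only
        System.out.println(terminatedSize(text)); // one byte more
        System.out.println(lastByte(text));       // the terminator
    }
}
```
</pre>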
              <p>I think even your first example looks incorrect (where
                you use `allocateFrom(JAVA_BYTE, text.getBytes())`), but
                you are probably saved there by the fact that malloc
                allocated a bigger chunk of memory and a zero just
                happens to be at the end of the string bytes?</p>
              <p>You can't pass the byte array of a Java string to a
                C/C++ function expecting a null-terminated string without
                performing some sort of copy and adding the required
                trailing terminator. Some C/C++ APIs might work with
                unterminated strings, in which case they will probably
                accept a size, e.g. how many characters are expected in
                the char*. But this doesn't seem to be the case here.</p>
              <p>Hope this helps<br>
                Maurizio<br>
                <br>
              </p>
              <br>
              <p><br>
              </p>
            </div>
          </blockquote>
        </div>
        <br clear="all">
        <div><br>
        </div>
        <span class="gmail_signature_prefix">-- </span><br>
        <div dir="ltr" class="gmail_signature">Best regards, Nastya
          Lisitskaya</div>
      </div>
    </blockquote>
  </body>
</html>