<div dir="ltr"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">we do have a mirror internal API (VectorSupport) which perhaps can be used <br></blockquote><div><br></div><div>It's interesting, I'll take a look at it. </div><div><br></div><div>Maybe I can create a PR that optimizes getUtf8String in the near future, but I can't guarantee it.</div><div><br></div><div>Glavo</div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Mon, Mar 27, 2023 at 4:59 PM Maurizio Cimadamore <<a href="mailto:maurizio.cimadamore@oracle.com">maurizio.cimadamore@oracle.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">


  <div>

    <p>Hi Glavo,<br>

      I agree that, from an architectural perspective, doing something

      like this would be preferable to using a native method. There are

      some complications with using the Vector API from java.base (as

      vector is an incubating API), but we do have a mirror internal API

      (VectorSupport) which perhaps can be used - at a lower level - to

      achieve the same thing. I'll ask around.</p>

    <p>Maurizio<br>

    </p>

    <div>On 26/03/2023 18:26, Glavo wrote:<br>

    </div>

    <blockquote type="cite">

      
      <div dir="ltr">

        <div>I made an attempt: <br>

        </div>

        <div><br>

        </div>

        <div>I implemented a method that uses 128-bit SSE/AVX
          instructions (via the Vector API) to find bytes less than or
          equal to 0.<br>

        </div>

        <div>Unlike strlen, which only looks for null terminators, it

          also looks for negative bytes to determine whether the string

          contains non-ASCII characters.<br>

        </div>

        <div>If a string contains only ASCII characters, it can take a
          fast path that calls the String constructor directly,
          without having to decode and copy the array again (thanks to
          compact strings).<br>

        </div>
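        <div>As a rough illustration of the idea above, here is a minimal scalar sketch. It is not the code from the benchmark: it swaps the 128-bit Vector API for a 64-bit SWAR check so that it runs without incubator modules, it works on a plain byte[] instead of a memory segment, and the names Utf8CStringSketch, hasByteLessOrEqualZero and getUtf8String are hypothetical.</div>

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: 64-bit SWAR stand-in for the 128-bit SIMD scan.
final class Utf8CStringSketch {
    private static final long ONES  = 0x0101010101010101L;
    private static final long HIGHS = 0x8080808080808080L;

    /** True if any of the 8 bytes packed in word is zero or negative (high bit set). */
    static boolean hasByteLessOrEqualZero(long word) {
        // A zero byte borrows when 1 is subtracted and ends up with its high bit set;
        // a negative byte already has its high bit set.
        return (((word - ONES) | word) & HIGHS) != 0;
    }

    /**
     * Decodes the NUL-terminated UTF-8 string starting at offset.
     * Assumption of this sketch: if no terminator exists, the string ends at the array end.
     */
    static String getUtf8String(byte[] data, int offset) {
        ByteBuffer buf = ByteBuffer.wrap(data);
        int i = offset;
        // Skip 8 bytes at a time while every byte is in 1..127 (ASCII, no terminator).
        while (i + Long.BYTES <= data.length && !hasByteLessOrEqualZero(buf.getLong(i))) {
            i += Long.BYTES;
        }
        // Find the exact first byte <= 0; this also covers a tail shorter than 8 bytes.
        while (i < data.length && data[i] > 0) {
            i++;
        }
        if (i == data.length || data[i] == 0) {
            // Fast path: only ASCII before the terminator, so no UTF-8 decoding is needed.
            return new String(data, offset, i - offset, StandardCharsets.ISO_8859_1);
        }
        // Slow path: a non-ASCII byte was seen; locate the terminator and decode as UTF-8.
        int end = i;
        while (end < data.length && data[end] != 0) {
            end++;
        }
        return new String(data, offset, end - offset, StandardCharsets.UTF_8);
    }
}
```

        <div>Because a single pass looks for both the terminator and the first negative byte, the ASCII case needs no separate validation step. A real implementation would replace the 8-byte step with a vector comparison and would respect the segment's bounds.</div>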

        <div><br>

        </div>

        <div>I ran the JMH benchmark and the results were satisfactory:</div>

        <div>

          <ul>

            <li>There is only a slight performance regression (&lt;10%)
              for non-ASCII strings smaller than 16 bytes;</li>

            <li>Although SIMD is not used for ASCII strings smaller than

              16 bytes, throughput has increased by 33% due to the new

              fast path;</li>

            <li>For non-ASCII strings larger than 16 bytes, the

              throughput increased by 5% to 465% due to SIMD;<br>

            </li>

            <li>For ASCII strings larger than 16 bytes, the throughput

              increased by 104% to 2207%.<br>

            </li>

          </ul>

        </div>

        <div>For 4 KiB ASCII strings, the new implementation is 22 times

          faster! Even small ASCII strings of only 16 bytes have double

          the performance.<br>

        </div>

        <div>This is a big victory, and even using strlen won't achieve

          such a significant improvement.<br>

        </div>

        <div><br>

        </div>

        <div>Here is the source code:<br>

        </div>

        <div><br>

        </div>

        <blockquote style="margin:0px 0px 0px 40px;border:none;padding:0px"><a href="https://github.com/Glavo/java-ffi-benchmark/blob/main/src/main/java/benchmark/experimental/GetStringUTF8Benchmark.java" target="_blank">https://github.com/Glavo/java-ffi-benchmark/blob/main/src/main/java/benchmark/experimental/GetStringUTF8Benchmark.java</a><br>

          <br>

        </blockquote>

        It's just a simple implementation for experimental purposes. For

        simplicity, I used 128-bit SIMD instructions. 

        <div>In the future, we can consider choosing AVX2 or AVX-512 at

          runtime, and maybe we can get more gains.<br>

        </div>

        <div><br>

        </div>

        <div>Glavo</div>

      </div>

      <br>

      <div class="gmail_quote">

        <div dir="ltr" class="gmail_attr">On Sun, Mar 26, 2023 at

          9:18 PM Sebastian Stenzel <<a href="mailto:sebastian.stenzel@gmail.com" target="_blank">sebastian.stenzel@gmail.com</a>>

          wrote:<br>

        </div>

        <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><br>

          > On 26.03.2023 at 14:46, Maurizio Cimadamore <<a href="mailto:maurizio.cimadamore@oracle.com" target="_blank">maurizio.cimadamore@oracle.com</a>> wrote:<br>

          > <br>

          > Forgot: another problem is that just offloading to

          external "strlen" will not respect the memory segment

          boundaries (e.g. the underlying strlen will keep going even

          past the spatial boundaries of the memory segment).<br>

          <br>

          How about using strnlen? At least for native segments?<br>
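          <br>
          To make the bounded semantics concrete, here is a small sketch of what a strnlen-style search does. It uses a ByteBuffer as a stand-in for a memory segment, with the buffer's limit playing the role of the segment's size; the class and method names are hypothetical, and a real implementation would presumably call the native strnlen with the segment's byte size as the maximum.<br>

```java
import java.nio.ByteBuffer;

// Hypothetical sketch of strnlen semantics; ByteBuffer stands in for a memory segment.
final class BoundedStrlen {
    /**
     * Returns the number of bytes before the first NUL at or after offset.
     * Never reads past the buffer's limit; returns the remaining length if no NUL is found.
     */
    static int strnlen(ByteBuffer segment, int offset) {
        int limit = segment.limit();
        for (int i = offset; i < limit; i++) {
            if (segment.get(i) == 0) {
                return i - offset;
            }
        }
        return limit - offset;
    }
}
```

          Unlike plain strlen, the search stops at the limit even when no terminator is present, so it cannot run past the spatial bounds.<br>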

          <br>

          Improving string conversion efficiency would make a huge

          difference in my FUSE bindings, where virtually every call

          contains a file path.</blockquote>

      </div>

    </blockquote>

  </div>


</blockquote></div>