<div dir="ltr"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">we do have a mirror internal API (VectorSupport) which perhaps can be used <br></blockquote><div><br></div><div>It's interesting, I'll take a look at it. </div><div><br></div><div>Maybe I can create a PR that optimizes getUtf8String in the near future, but I can't guarantee it.</div><div><br></div><div>Glavo</div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Mon, Mar 27, 2023 at 4:59 PM Maurizio Cimadamore <<a href="mailto:maurizio.cimadamore@oracle.com">maurizio.cimadamore@oracle.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div>
<p>Hi Glavo,<br>
I agree that, from an architectural perspective, doing something
like this would be preferable to using a native method. There are
some complications with using the Vector API from java.base (as
vector is an incubating API), but we do have a mirror internal API
(VectorSupport) which perhaps can be used - at a lower level - to
achieve the same thing. I'll ask around.</p>
<p>Maurizio<br>
</p>
<div>On 26/03/2023 18:26, Glavo wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">
<div>I made an attempt: <br>
</div>
<div><br>
</div>
<div>I implemented a method using 128-bit SSE/AVX
instructions (via the Vector API) to find bytes less than or
equal to 0.<br>
</div>
<div>Unlike strlen, which only looks for null terminators, it
also looks for negative bytes to determine whether the string
contains non-ASCII characters.<br>
</div>
<div>If a string contains only ASCII characters, it can use a
fast path to directly call the constructor of the String
without having to decode and copy the array again (thanks to
compact strings).<br>
</div>
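<div>A minimal sketch of the idea, not the benchmark code itself: it swaps the 128-bit Vector API loop for a plain 8-byte SWAR loop over a <code>byte[]</code> so that it runs without the incubator module. The class name <code>Utf8Scan</code>, its method names, and the explicit <code>limit</code> bound are assumptions made for illustration, and the public <code>String</code> constructor stands in for the internal fast path (it still copies the bytes).</div>

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only.
final class Utf8Scan {
    // Reads 8 consecutive bytes of a byte[] as one long; byte order does not
    // matter here because we only ask whether *any* byte matches.
    private static final VarHandle LONG_VIEW =
            MethodHandles.byteArrayViewVarHandle(long[].class, ByteOrder.LITTLE_ENDIAN);

    private static final long HIGH_BITS = 0x8080808080808080L;
    private static final long LOW_ONES  = 0x0101010101010101L;

    /** Index of the first byte <= 0 in [offset, limit), or limit if there is none. */
    static int firstNonPositive(byte[] buf, int offset, int limit) {
        int i = offset;
        // Wide loop: skip 8 bytes at a time while every byte is in 1..127.
        while (i <= limit - Long.BYTES) {
            long w = (long) LONG_VIEW.get(buf, i);
            long negative = w & HIGH_BITS;               // some byte has its sign bit set
            long zero = (w - LOW_ONES) & ~w & HIGH_BITS; // some byte is 0x00
            if ((negative | zero) != 0) {
                break; // let the scalar loop find the exact position
            }
            i += Long.BYTES;
        }
        // Scalar loop: handles the tail and pinpoints the matching byte.
        while (i < limit && buf[i] > 0) {
            i++;
        }
        return i;
    }

    /** Decodes a NUL-terminated UTF-8 string, never reading at or past limit. */
    static String decode(byte[] buf, int offset, int limit) {
        int stop = firstNonPositive(buf, offset, limit);
        if (stop == limit || buf[stop] == 0) {
            // Fast path: only ASCII before the terminator (or the bound),
            // so each byte maps to exactly one Latin-1 char.
            return new String(buf, offset, stop - offset, StandardCharsets.ISO_8859_1);
        }
        // Slow path: a negative byte was seen, so find the terminator and
        // run the full UTF-8 decoder.
        int end = stop;
        while (end < limit && buf[end] != 0) {
            end++;
        }
        return new String(buf, offset, end - offset, StandardCharsets.UTF_8);
    }
}
```

<div>Because the scan takes an explicit <code>limit</code>, it stops at the end of the buffer even when no terminator is present, which a plain strlen call would not do.</div>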
<div><br>
</div>
<div>I ran the JMH benchmark and the results were satisfactory:</div>
<div>
<ul>
<li>There is only a slight performance regression (less than
10%) for non-ASCII strings smaller than 16 bytes;</li>
<li>Although SIMD is not used for ASCII strings smaller than
16 bytes, throughput has increased by 33% due to the new
fast path;</li>
<li>For non-ASCII strings larger than 16 bytes, throughput
increased by 5% to 465% due to SIMD;<br>
</li>
<li>For ASCII strings larger than 16 bytes, throughput
increased by 104% to 2207%.<br>
</li>
</ul>
</div>
<div>For 4KiB ASCII strings, the new implementation is 22 times
faster! Even small ASCII strings of only 16 bytes have double
the performance.<br>
</div>
<div>This is a big win; even offloading to strlen would not achieve
such a significant improvement.<br>
</div>
<div><br>
</div>
<div>Here is the source code:<br>
</div>
<div><br>
</div>
<blockquote style="margin:0px 0px 0px 40px;border:none;padding:0px"><a href="https://urldefense.com/v3/__https://github.com/Glavo/java-ffi-benchmark/blob/main/src/main/java/benchmark/experimental/GetStringUTF8Benchmark.java__;!!ACWV5N9M2RV99hQ!LnjvyYma_gs26cNG3q5GisIhlUP8XXxxEaQeDTw0bDNxaOFlWP3Kf4ZWFnAu9kO6yFCT-V7IaKQfQexp6FjTGsQ3Zw$" target="_blank">https://github.com/Glavo/java-ffi-benchmark/blob/main/src/main/java/benchmark/experimental/GetStringUTF8Benchmark.java</a><br>
<br>
</blockquote>
It's just a simple implementation for experimental purposes. For
simplicity, I used 128-bit SIMD instructions.
<div>In the future, we could consider choosing AVX2 or AVX-512 at
runtime, which may yield further gains.<br>
</div>
<div><br>
</div>
<div>Glavo</div>
</div>
<br>
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr">On Sun, Mar 26, 2023 at
9:18 PM Sebastian Stenzel <<a href="mailto:sebastian.stenzel@gmail.com" target="_blank">sebastian.stenzel@gmail.com</a>>
wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><br>
> On 26.03.2023 at 14:46, Maurizio Cimadamore <<a href="mailto:maurizio.cimadamore@oracle.com" target="_blank">maurizio.cimadamore@oracle.com</a>> wrote:<br>
> <br>
> Forgot: another problem is that just offloading to
external "strlen" will not respect the memory segment
boundaries (e.g. the underlying strlen will keep going even
past the spatial boundaries of the memory segment).<br>
<br>
How about using strnlen? At least for native segments?<br>
<br>
Improving string conversion efficiency would make a huge
difference in my FUSE bindings, where virtually every call
contains a file path.</blockquote>
</div>
</blockquote>
</div>
</blockquote></div>