Comparing the performance of Panama with JNI, JNA, and JNR - based on Java 21

Maurizio Cimadamore maurizio.cimadamore at oracle.com
Mon Mar 27 08:59:17 UTC 2023


Hi Glavo,
I agree that, from an architectural perspective, doing something like 
this would be preferable to using a native method. There are some 
complications with using the Vector API from java.base (as the Vector API 
is still incubating), but we do have a mirror internal API (VectorSupport) 
which could perhaps be used - at a lower level - to achieve the same 
thing. I'll ask around.

Maurizio

On 26/03/2023 18:26, Glavo wrote:
> I made an attempt:
>
> I implemented a method that uses 128-bit SSE/AVX instructions (via the 
> Vector API) to find bytes less than or equal to 0.
> Unlike strlen, which only looks for null terminators, it also looks 
> for negative bytes to determine whether the string contains non-ASCII 
> characters.
> If a string contains only ASCII characters, it can use a fast path to 
> directly call the constructor of the String without having to decode 
> and copy the array again (thanks to compact strings).
>
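
A minimal sketch of the scan described above, for readers who want to see
the shape of the idea. It is not the benchmark's code. Assumptions: it
works on a byte[] rather than a memory segment, it uses a 64-bit
"SIMD within a register" check in place of the 128-bit Vector API (which
needs the incubator module), and it uses the public ISO_8859_1 String
constructor as a stand-in for the internal no-decode constructor. The
names AsciiScanSketch, firstNonPositive and decode are hypothetical.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

public class AsciiScanSketch {
    // Reads 8 bytes of a byte[] as one long (unaligned plain reads are allowed).
    private static final VarHandle LONG_VIEW =
            MethodHandles.byteArrayViewVarHandle(long[].class, ByteOrder.LITTLE_ENDIAN);
    private static final long HIGH_BITS = 0x8080808080808080L;
    private static final long LOW_ONES = 0x0101010101010101L;

    /** Index of the first byte <= 0 in buf[0, limit), or limit if there is none. */
    static int firstNonPositive(byte[] buf, int limit) {
        int i = 0;
        // Word loop: the expression is non-zero iff some byte in the word is
        // zero (NUL terminator) or has its high bit set (non-ASCII).
        for (; i + Long.BYTES <= limit; i += Long.BYTES) {
            long v = (long) LONG_VIEW.get(buf, i);
            if ((((v - LOW_ONES) | v) & HIGH_BITS) != 0) {
                break; // locate the exact byte in the scalar loop below
            }
        }
        // Scalar loop: handles the tail and pinpoints a hit found above.
        for (; i < limit; i++) {
            if (buf[i] <= 0) {
                return i;
            }
        }
        return limit;
    }

    /** Decodes a NUL-terminated UTF-8 string, never reading at or past limit. */
    static String decode(byte[] buf, int limit) {
        int end = firstNonPositive(buf, limit);
        if (end == limit || buf[end] == 0) {
            // Fast path: everything before 'end' is ASCII, so no UTF-8 decoding
            // is needed (stand-in for the internal String constructor).
            return new String(buf, 0, end, StandardCharsets.ISO_8859_1);
        }
        // Slow path: a negative byte was seen first; find the terminator
        // and decode as UTF-8.
        int len = end;
        while (len < limit && buf[len] != 0) {
            len++;
        }
        return new String(buf, 0, len, StandardCharsets.UTF_8);
    }
}
```

The single pass answers two questions at once: where the string ends and
whether the fast path applies, which is what a plain strlen cannot do.
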
> I ran the JMH benchmark and the results were satisfactory:
>
>   * There is only a slight performance regression (< 10%) for
>     non-ASCII strings smaller than 16 bytes;
>   * Although SIMD is not used for ASCII strings smaller than 16 bytes,
>     throughput has increased by 33% due to the new fast path;
>   * For non-ASCII strings larger than 16 bytes, the throughput
>     increased by 5% to 465% due to SIMD;
>   * For ASCII strings larger than 16 bytes, the throughput increased
>     by 104% to 2207%.
>
> For 4KiB ASCII strings, the new implementation is 22 times 
> faster! Even small ASCII strings of only 16 bytes have double the 
> performance.
> This is a big victory, and even using strlen won't achieve such a 
> significant improvement.
>
> Here is the source code:
>
>     https://github.com/Glavo/java-ffi-benchmark/blob/main/src/main/java/benchmark/experimental/GetStringUTF8Benchmark.java
>
> It's just a simple implementation for experimental purposes. For 
> simplicity, I used 128-bit SIMD instructions.
> In the future, we can consider choosing AVX-2 or AVX-512 at runtime, 
> and maybe we can get more gains.
>
> Glavo
>
> On Sun, Mar 26, 2023 at 9:18 PM Sebastian Stenzel 
> <sebastian.stenzel at gmail.com> wrote:
>
>
>     > On 26.03.2023 at 14:46, Maurizio Cimadamore
>     <maurizio.cimadamore at oracle.com> wrote:
>     >
>     > Forgot: another problem is that just offloading to external
>     "strlen" will not respect the memory segment boundaries (e.g. the
>     underlying strlen will keep going even past the spatial boundaries
>     of the memory segment).
>
>     How about using strnlen? At least for native segments?
>
>     Improving string conversion efficiency would make a huge
>     difference in my FUSE bindings, where virtually every call
>     contains a file path.
>
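
For illustration, the strnlen-style contract discussed above (stop at the
terminator or at the spatial bound, whichever comes first) might look like
the following. This is only a sketch under stated assumptions: a direct
ByteBuffer stands in for a native memory segment, with its limit playing
the role of the segment size, and the names BoundedStrlenSketch,
boundedLength and readCString are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class BoundedStrlenSketch {
    /**
     * strnlen-like scan: counts the bytes before the first NUL starting at
     * 'offset', never reading at or past buf.limit(). Returns the number of
     * remaining bytes if no NUL is found within bounds.
     */
    static int boundedLength(ByteBuffer buf, int offset) {
        int max = buf.limit() - offset;
        for (int n = 0; n < max; n++) {
            if (buf.get(offset + n) == 0) {
                return n;
            }
        }
        return max;
    }

    /** Reads a NUL-terminated UTF-8 string, failing if the terminator is out of bounds. */
    static String readCString(ByteBuffer buf, int offset) {
        int len = boundedLength(buf, offset);
        if (len == buf.limit() - offset) {
            // An unbounded strlen would keep reading past the buffer here.
            throw new IllegalArgumentException("no terminator within bounds");
        }
        byte[] bytes = new byte[len];
        buf.get(offset, bytes); // absolute bulk get, available since Java 13
        return new String(bytes, StandardCharsets.UTF_8);
    }
}
```

Whether a missing terminator should throw or simply truncate at the bound
is a design choice; the sketch throws so that the out-of-bounds case is
visible to the caller.
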