Comparing the performance of Panama with JNI, JNA, and JNR - based on Java 21
Glavo
zjx001202 at gmail.com
Mon Mar 27 17:25:51 UTC 2023
>
> The JDK can use CarrierThreadLocal
>
I don't think this is a good use case for CarrierThreadLocal.
But I don't think this is a big problem either: we can provide a cache pool
of native stacks.
When a virtual thread no longer needs its native stack, it only needs to
return the stack to the cache pool.
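A rough sketch of what I mean by a cache pool, assuming JDK 22 or later where java.lang.foreign is final (on Java 21 it needs --enable-preview). The class name NativeStackPool and its methods are hypothetical, made up for illustration. The blocks come from a shared arena because a pooled block may be handed to a different thread than the one that allocated it, which a confined arena would not allow.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.util.concurrent.ConcurrentLinkedDeque;

/** Hypothetical pool of fixed-size native blocks; an illustration, not a JDK API. */
final class NativeStackPool {
    // Shared arena: segments may be used by any thread, not only the allocating one.
    private final Arena arena = Arena.ofShared();
    private final ConcurrentLinkedDeque<MemorySegment> idle = new ConcurrentLinkedDeque<>();
    private final long stackSize;

    NativeStackPool(long stackSize) {
        this.stackSize = stackSize;
    }

    /** Reuses an idle block if one exists, otherwise allocates a new one. */
    MemorySegment acquire() {
        MemorySegment block = idle.pollFirst();
        return block != null ? block : arena.allocate(stackSize, 16);
    }

    /** Hands the block back so that another thread can reuse it. */
    void release(MemorySegment block) {
        idle.offerFirst(block);
    }

    /** Frees every block this pool has ever allocated. */
    void close() {
        arena.close();
    }
}
```

A virtual thread would call acquire() before its native calls, carve temporaries out of the block (for example with SegmentAllocator.slicingAllocator(block)), and call release() in a finally clause. This sketch never shrinks the pool, so the number of blocks grows to the peak number of concurrent users.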
On Tue, Mar 28, 2023 at 12:39 AM Chris Vest <mr.chrisvest at gmail.com> wrote:
> The JDK can use CarrierThreadLocal, so it's not a problem as long as the
> usage scope of the thread-local values is tightly controlled and doesn't
> span virtual thread context switches.
>
> On Mon, Mar 27, 2023 at 8:59 AM Jorn Vernee <jorn.vernee at oracle.com>
> wrote:
>
>> Also note that there are potential issues when combining ThreadLocal and
>> virtual threads, as there might be a very large number of threads,
>> resulting in a very large number of NativeStack instances (along with their
>> native memory stacks).
>>
>> This makes supporting something based on ThreadLocal directly in the JDK
>> more questionable I think, since it depends on the particular application
>> whether this will work well, or not.
>>
>> Jorn
>> On 27/03/2023 16:29, Maurizio Cimadamore wrote:
>>
>>
>> On 27/03/2023 15:18, Glavo wrote:
>>
>> Another idea related to usability and performance: Can Panama provide a
>> thread-local "native stack" to help users allocate local variables?
>>
>> I have heard from others that LWJGL uses this technique. I also
>> tried to implement it:
>>
>>
>> https://github.com/Glavo/java-ffi-benchmark/blob/main/src/main/java/benchmark/experimental/NativeStack.java
>>
>> I ran benchmarks where allocating a small number of local variables was
>> thirty times more efficient than using a confined arena.
>> If Panama can provide such a class, it will be more convenient and faster
>> for users to allocate temporary variables.
>>
>> While the FFM API does not provide anything directly, it is easy to build
>> such an arena on top of FFM.
>>
>>
>> https://github.com/openjdk/panama-foreign/blob/foreign-memaccess%2Babi/test/micro/org/openjdk/bench/java/lang/foreign/StrLenTest.java#L178
>>
>> The above implementation is slightly simpler than what LWJGL does, but it
>> provides a large boost (because it avoids all dynamic allocations).
>>
>> More advanced implementations which allocate dynamically when out of
>> space and then remember said allocation even after a "release" are also
>> possible.
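To make the idea of a custom arena on top of FFM concrete, here is a minimal sketch of a bump allocator over one preallocated block. It assumes JDK 22 or later (on Java 21, --enable-preview). BumpStack, mark() and reset() are hypothetical names. This is not the StrLenTest allocator linked above, nor what LWJGL does; it only shows the general shape.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.SegmentAllocator;

/** Hypothetical stack-style allocator; no malloc/free happens per allocation. */
final class BumpStack implements SegmentAllocator, AutoCloseable {
    private final Arena arena = Arena.ofConfined();
    private final MemorySegment block;
    private long top; // offset of the next free byte in the block

    BumpStack(long capacity) {
        // The base is 16-byte aligned, so alignments up to 16 are honoured below.
        this.block = arena.allocate(capacity, 16);
    }

    /** Remembers the current top so that a later reset() can release everything after it. */
    long mark() {
        return top;
    }

    /** Releases everything allocated since the matching mark(). */
    void reset(long mark) {
        top = mark;
    }

    @Override
    public MemorySegment allocate(long byteSize, long byteAlignment) {
        // Round top up to the requested alignment (a power of two).
        long start = (top + byteAlignment - 1) & -byteAlignment;
        if (start > block.byteSize() || byteSize > block.byteSize() - start) {
            throw new IllegalStateException("native stack exhausted");
        }
        top = start + byteSize;
        return block.asSlice(start, byteSize); // reused memory is not zeroed
    }

    @Override
    public void close() {
        arena.close();
    }
}
```

Typical use is `long m = stack.mark(); try { ... stack.allocate(...) ... } finally { stack.reset(m); }`. The "more advanced" variant described above would fall back to a regular allocation instead of throwing when the block is full.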
>>
>> While we might add some such allocators in the future, the main priority
>> of the FFM API, design-wise, has been to make sure that such custom arenas
>> can be defined by developers directly, when and if needed.
>>
>> And this is indeed the biggest shift from Java 19 (which doesn't allow
>> custom arenas) to Java 20. Java 21 just iterates on the API, making it a
>> little bit simpler to use again, while retaining the capability of defining
>> custom arenas.
>>
>> Maurizio
>>
>>
>> Glavo
>>
>> On Sat, Mar 25, 2023 at 3:07 AM Glavo <zjx001202 at gmail.com> wrote:
>>
>>> I have run a series of benchmarks of Panama, JNI, JNA, and JNR based on
>>> the latest JDK. Here is its GitHub repository:
>>>
>>> https://github.com/Glavo/java-ffi-benchmark
>>>
>>> Here I tested the performance of no-ops, accessing structs, string
>>> conversions, and callbacks, respectively. I also tried the new isTrivial
>>> linker option.
>>> I summarized the results in README and charted them.
>>>
>>> In this email, in addition to sharing the above results, I would also
>>> like to talk about several issues I have encountered:
>>>
>>> 1. MemorySegment.getUtf8String is unexpectedly slow
>>>
>>> Panama is much faster than JNA in most cases, but the operation of
>>> converting C strings to Java strings is an exception.
>>> I checked the source code of JNA and Panama, and the suspicious
>>> difference is that JNA uses strlen from the C standard library, while
>>> Panama uses Java loops.
>>> Perhaps this method can be optimized.
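For comparison, a sketch of reading a C string by asking the C library's strlen for the length and then doing one bulk copy. It assumes JDK 22 or later (on Java 21, --enable-preview), a 64-bit platform where size_t fits a Java long, and a segment that already spans the whole string including its terminator. NativeStrings and readUtf8 are hypothetical names. Whether this actually beats the Java loop would have to be measured.

```java
import java.lang.foreign.FunctionDescriptor;
import java.lang.foreign.Linker;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;
import java.lang.invoke.MethodHandle;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper, not part of the FFM API. */
final class NativeStrings {
    private static final Linker LINKER = Linker.nativeLinker();

    // size_t strlen(const char *s); size_t is modelled as JAVA_LONG (64-bit assumption).
    private static final MethodHandle STRLEN = LINKER.downcallHandle(
            LINKER.defaultLookup().find("strlen").orElseThrow(),
            FunctionDescriptor.of(ValueLayout.JAVA_LONG, ValueLayout.ADDRESS));

    /** Decodes the NUL-terminated UTF-8 string at the start of the segment. */
    static String readUtf8(MemorySegment cString) {
        long length;
        try {
            length = (long) STRLEN.invokeExact(cString);
        } catch (Throwable t) {
            throw new IllegalStateException("strlen call failed", t);
        }
        byte[] bytes = new byte[Math.toIntExact(length)];
        // One bulk copy out of native memory instead of a byte-by-byte loop.
        MemorySegment.copy(cString, ValueLayout.JAVA_BYTE, 0, bytes, 0, bytes.length);
        return new String(bytes, StandardCharsets.UTF_8);
    }
}
```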
>>>
>>>
>>> 2. StructLayout must manually specify all padding
>>>
>>> Can we provide a convenient method for automatically padding between
>>> fields based on alignment?
>>> The current structLayout method is annoying for situations where you
>>> need to manually simulate the layout of a C struct.
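Until something like this exists in the API, a small helper can insert the padding. The sketch below assumes JDK 22 or later (on Java 21, --enable-preview) and plain natural-alignment rules: each field is aligned to its own byteAlignment() and the struct size is rounded up to the largest field alignment. It does not handle bitfields, packed structs or other ABI special cases. PaddedLayouts and cStruct are hypothetical names.

```java
import java.lang.foreign.MemoryLayout;
import java.lang.foreign.StructLayout;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, not part of the FFM API. */
final class PaddedLayouts {
    /** Builds a struct layout, inserting padding wherever a field would be misaligned. */
    static StructLayout cStruct(MemoryLayout... fields) {
        List<MemoryLayout> elements = new ArrayList<>();
        long offset = 0;
        long maxAlign = 1;
        for (MemoryLayout field : fields) {
            long align = field.byteAlignment();
            long padding = (align - offset % align) % align;
            if (padding > 0) {
                elements.add(MemoryLayout.paddingLayout(padding));
                offset += padding;
            }
            elements.add(field);
            offset += field.byteSize();
            maxAlign = Math.max(maxAlign, align);
        }
        // Trailing padding so that arrays of the struct keep every element aligned.
        long tail = (maxAlign - offset % maxAlign) % maxAlign;
        if (tail > 0) {
            elements.add(MemoryLayout.paddingLayout(tail));
        }
        return MemoryLayout.structLayout(elements.toArray(MemoryLayout[]::new));
    }
}
```

For example, cStruct(JAVA_BYTE.withName("tag"), JAVA_INT.withName("count")) places "count" at offset 4, with 3 bytes of padding after "tag".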
>>>
>>>
>>> Glavo
>>>
>>