Re: Comparing the performance of Panama with JNI, JNA, and JNR - based on Java 21
Jorn Vernee
jorn.vernee at oracle.com
Tue Mar 28 17:02:07 UTC 2023
Right, CarrierThreadLocal doesn't work in the presence of virtual thread
context switches, because allocations from different virtual threads end
up mixed together in the same native stack.
Generally, I think an allocator implementation that has to make a bunch
of tradeoffs is not a slam dunk for adding to the JDK.
We've added the tools to define custom allocators, so that users can
implement the allocator with the tradeoffs suited to their application.
If a common pattern emerges, we might consider adding it to the JDK in
the future.
Jorn
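
As a rough illustration of the kind of custom "native stack" allocator discussed in this thread, here is a minimal sketch of a bump-pointer stack over one preallocated off-heap block. It is not the NativeStack class linked below, nor the StrLenTest arena. To stay runnable without preview flags it uses a direct ByteBuffer in place of the FFM API; the class name NativeStackSketch and its methods are hypothetical. With FFM, the same bookkeeping would presumably sit behind a custom SegmentAllocator or Arena.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical bump-pointer "native stack" over one preallocated off-heap block. */
final class NativeStackSketch {
    private final ByteBuffer block; // off-heap backing memory, allocated once up front
    private int top = 0;            // offset of the first free byte

    NativeStackSketch(int capacityBytes) {
        this.block = ByteBuffer.allocateDirect(capacityBytes);
    }

    /** Marks the start of a frame; pass the result to pop() to release the frame. */
    int push() {
        return top;
    }

    /**
     * Carves size bytes out of the block. The alignment must be a power of two
     * and is applied to the offset within the block, not to the absolute address.
     */
    ByteBuffer allocate(int size, int alignment) {
        if (size < 0 || Integer.bitCount(alignment) != 1) {
            throw new IllegalArgumentException("bad size or alignment");
        }
        int start = (top + alignment - 1) & -alignment; // round top up to the alignment
        if (start < top || start > block.capacity() - size) {
            throw new IllegalStateException("native stack exhausted");
        }
        top = start + size;
        // slice(int, int) requires Java 13 or later
        return block.slice(start, size).order(ByteOrder.nativeOrder());
    }

    /** Releases everything allocated since the matching push(). */
    void pop(int frame) {
        top = frame;
    }

    /** Number of bytes currently in use, including alignment gaps. */
    int used() {
        return top;
    }
}
```

No dynamic allocation happens after construction, which is where the speedup over a fresh confined arena would come from. The tradeoffs are visible too: the capacity is fixed, slices handed out earlier stay usable after pop(), and one instance per thread is exactly the ThreadLocal scaling concern raised for virtual threads.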
On 27/03/2023 19:25, Glavo wrote:
>
> The JDK can use CarrierThreadLocal
>
>
> I don't think this is a good use case for CarrierThreadLocal.
>
> But I don't think this is a big problem. We can provide a cache pool
> for native stacks.
> When the native stack is no longer used by the current virtual thread,
> we only need to resubmit the native stack back to the cache pool.
>
> On Tue, Mar 28, 2023 at 12:39 AM Chris Vest <mr.chrisvest at gmail.com>
> wrote:
>
> The JDK can use CarrierThreadLocal, so it's not a problem as long
> as the usage scope of the thread-local values are tightly
> controlled, and don't span across virtual thread context switches.
>
> On Mon, Mar 27, 2023 at 8:59 AM Jorn Vernee
> <jorn.vernee at oracle.com> wrote:
>
> Also note that there are potential issues when combining
> ThreadLocal and virtual threads, as there might be a very
> large number of threads, resulting in a very large number of
> NativeStack instances (along with their native memory stacks).
>
> This makes supporting something based on ThreadLocal directly
> in the JDK more questionable I think, since it depends on the
> particular application whether this will work well, or not.
>
> Jorn
>
> On 27/03/2023 16:29, Maurizio Cimadamore wrote:
>>
>>
>> On 27/03/2023 15:18, Glavo wrote:
>>> Another idea related to usability and performance: Can
>>> Panama provide a thread-local "native stack" to help users
>>> allocate local variables?
>>>
>>> I have heard from others that LWJGL is using this
>>> technology. I also tried to implement it:
>>>
>>> https://github.com/Glavo/java-ffi-benchmark/blob/main/src/main/java/benchmark/experimental/NativeStack.java
>>>
>>> I ran benchmarks where allocating a small number of local
>>> variables was thirty times more efficient than using a
>>> confined arena.
>>> If Panama can provide such a class, it will be more
>>> convenient and faster for users to allocate temporary variables.
>>
>> While the FFM API does not provide anything directly, it is
>> easy to build such an arena on top of FFM.
>>
>> https://github.com/openjdk/panama-foreign/blob/foreign-memaccess%2Babi/test/micro/org/openjdk/bench/java/lang/foreign/StrLenTest.java#L178
>>
>> The above implementation is slightly simpler than what LWJGL
>> does, but it provides a large boost (because it avoids all
>> dynamic allocations).
>>
>> More advanced implementations which allocate dynamically when
>> out of space and then remember said allocation even after a
>> "release" are also possible.
>>
>> While we might add some such allocators in the future, the
>> main priority of the FFM API, design-wise, has been to make
>> sure that such custom arenas can be defined by developers
>> directly, when and if needed.
>>
>> And this is indeed the biggest shift from Java 19 (which
>> doesn't allow custom arenas) to Java 20. Java 21 just
>> iterates on the API, making it a little bit simpler to use
>> again, while retaining the capability of defining custom arenas.
>>
>> Maurizio
>>
>>>
>>> Glavo
>>>
>>> On Sat, Mar 25, 2023 at 3:07 AM Glavo <zjx001202 at gmail.com>
>>> wrote:
>>>
>>> I have run a series of benchmarks of Panama, JNI, JNA,
>>> and JNR based on the latest JDK. Here is its GitHub
>>> repository:
>>>
>>> https://github.com/Glavo/java-ffi-benchmark
>>>
>>> Here I tested the performance of no-ops, accessing
>>> structs, string conversions, and callbacks,
>>> respectively. I also tried the new isTrivial linker option.
>>> I summarized the results in README and charted them.
>>>
>>> In this email, in addition to sharing the above results,
>>> I would also like to talk about several issues I have
>>> encountered:
>>>
>>> 1. MemorySegment.getUtf8String is unexpectedly slow
>>>
>>> Panama is much faster than JNA in most cases, but
>>> the operation of converting C strings to Java
>>> strings is an exception.
>>> I checked the source code of JNA and Panama, and the
>>> suspicious difference is that JNA uses strlen from
>>> the C standard library, while Panama uses Java loops.
>>> Perhaps this method can be optimized.
>>>
>>>
>>> 2. StructLayout must manually specify all padding
>>>
>>> Can we provide a convenient method for automatically
>>> padding between fields based on alignment?
>>> The current structLayout method is annoying for
>>> situations where you need to manually simulate the
>>> layout of a C struct.
>>>
>>>
>>> Glavo
>>>
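
On point 2 of the original message (automatic padding in struct layouts), the arithmetic involved is small. The following is a minimal sketch, assuming natural C alignment rules (each field aligned to its own alignment, total size rounded up to the largest field alignment). The class StructPaddingSketch and its methods are hypothetical and not part of the FFM API; the gaps between the computed offsets are what a user would currently have to spell out as explicit padding in structLayout.

```java
/** Hypothetical helper that computes C-style field offsets from sizes and alignments. */
final class StructPaddingSketch {

    /**
     * Returns one offset per field, followed by the padded total size as the
     * last element. Sizes and alignments are in bytes; alignments must be
     * powers of two.
     */
    static long[] layout(long[] sizes, long[] alignments) {
        long[] out = new long[sizes.length + 1];
        long offset = 0;
        long maxAlign = 1;
        for (int i = 0; i < sizes.length; i++) {
            offset = alignUp(offset, alignments[i]); // insert padding before the field
            out[i] = offset;
            offset += sizes[i];
            maxAlign = Math.max(maxAlign, alignments[i]);
        }
        out[sizes.length] = alignUp(offset, maxAlign); // trailing padding
        return out;
    }

    /** Rounds value up to the next multiple of a power-of-two alignment. */
    static long alignUp(long value, long alignment) {
        return (value + alignment - 1) & -alignment;
    }
}
```

For a struct such as `struct { char c; int i; short s; }` with sizes {1, 4, 2} and alignments {1, 4, 2}, this yields offsets 0, 4, 8 and a total size of 12, so 3 bytes of padding after `c` and 2 bytes at the end. Packed structs and platform-specific ABI rules are out of scope for this sketch.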