<div dir="ltr"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">The JDK can use CarrierThreadLocal<br></blockquote><div><br></div><div>I don't think this is a good use case for CarrierThreadLocal.<br></div><div><br></div><div>But I don't think this is a big problem. We can provide a cache pool for native stacks.</div><div>When a native stack is no longer used by the current virtual thread, we only need to return it to the cache pool.</div><div> </div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Tue, Mar 28, 2023 at 12:39 AM Chris Vest &lt;<a href="mailto:mr.chrisvest@gmail.com" target="_blank">mr.chrisvest@gmail.com</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr">The JDK can use CarrierThreadLocal, so it's not a problem as long as the usage scope of the thread-local values is tightly controlled and doesn't span virtual thread context switches.</div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Mon, Mar 27, 2023 at 8:59 AM Jorn Vernee &lt;<a href="mailto:jorn.vernee@oracle.com" target="_blank">jorn.vernee@oracle.com</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div>
<p>Also note that combining ThreadLocal with virtual threads can be
problematic: an application may create a very large number of
threads, and therefore a very large number of NativeStack
instances (along with their native memory stacks).</p>
<p>This makes supporting something based on ThreadLocal directly in
the JDK more questionable, I think, since whether it works well
depends on the particular application.<br>
</p>
<p>Jorn<br>
</p>
<div>On 27/03/2023 16:29, Maurizio
Cimadamore wrote:<br>
</div>
<blockquote type="cite">
<p><br>
</p>
<div>On 27/03/2023 15:18, Glavo wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">Another idea related to usability and
performance: Can Panama provide a thread-local "native stack"
to help users allocate local variables?
<div><br>
</div>
<div>I have heard from others that LWJGL uses this
technique. I also tried to implement it:</div>
<div><br>
</div>
<blockquote style="margin:0px 0px 0px 40px;border:none;padding:0px">
<div><a href="https://github.com/Glavo/java-ffi-benchmark/blob/main/src/main/java/benchmark/experimental/NativeStack.java" target="_blank">https://github.com/Glavo/java-ffi-benchmark/blob/main/src/main/java/benchmark/experimental/NativeStack.java</a></div>
<div><br>
</div>
</blockquote>
<div>
<div>
<div>I ran benchmarks where allocating a small number of
local variables was thirty times more efficient than
using a confined arena.<br>
</div>
<div>If Panama can provide such a class, it will be more
convenient and faster for users to allocate temporary
variables.</div>
</div>
</div>
</div>
</blockquote>
<p>While the FFM API does not provide anything directly, it is
easy to build such an arena on top of FFM.</p>
<p><a href="https://github.com/openjdk/panama-foreign/blob/foreign-memaccess%2Babi/test/micro/org/openjdk/bench/java/lang/foreign/StrLenTest.java#L178" target="_blank">https://github.com/openjdk/panama-foreign/blob/foreign-memaccess%2Babi/test/micro/org/openjdk/bench/java/lang/foreign/StrLenTest.java#L178</a></p>
<p>The above implementation is slightly simpler than what LWJGL
does, but it provides a large boost (because it avoids all
dynamic allocations).</p>
<p>More advanced implementations which allocate dynamically when
out of space and then remember said allocation even after a
"release" are also possible.</p>
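<p>As a rough illustration of the idea discussed above (not the code from the linked benchmark, and not what LWJGL does), a bump-pointer stack over a single preallocated block might look like the sketch below. It uses a direct <code>ByteBuffer</code> instead of an FFM <code>MemorySegment</code> so that it runs without preview flags; the class <code>NativeStackSketch</code> and its methods <code>push</code>, <code>pop</code>, <code>allocate</code> and <code>used</code> are hypothetical names, and alignment is only relative to the start of the block.</p>

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical bump-pointer "native stack" over one preallocated off-heap block. */
final class NativeStackSketch {
    private final ByteBuffer block; // allocated once, reused by every frame
    private int top;                // offset of the first free byte

    NativeStackSketch(int capacity) {
        this.block = ByteBuffer.allocateDirect(capacity).order(ByteOrder.nativeOrder());
    }

    /** Marks the start of a frame; pass the result to pop() when done. */
    int push() {
        return top;
    }

    /** Releases everything allocated since the matching push(). */
    void pop(int frame) {
        top = frame;
    }

    /** Number of bytes currently in use, including alignment gaps. */
    int used() {
        return top;
    }

    /**
     * Carves out size bytes whose offset is a multiple of alignment.
     * Assumes size is non-negative and alignment is positive.
     */
    ByteBuffer allocate(int size, int alignment) {
        int start = ((top + alignment - 1) / alignment) * alignment;
        if (start + size > block.capacity()) {
            throw new IllegalStateException("native stack exhausted");
        }
        top = start + size;
        // slice(index, length) needs Java 13 or later
        return block.slice(start, size).order(ByteOrder.nativeOrder());
    }

    public static void main(String[] args) {
        NativeStackSketch stack = new NativeStackSketch(1024);
        int frame = stack.push();
        try {
            ByteBuffer a = stack.allocate(4, 4);
            ByteBuffer b = stack.allocate(8, 8);
            a.putInt(0, 42);
            b.putLong(0, 7L);
            System.out.println(a.getInt(0) + " " + b.getLong(0) + " used=" + stack.used());
        } finally {
            stack.pop(frame); // no per-allocation free, only an offset reset
        }
        System.out.println("used after pop=" + stack.used());
    }
}
```

<p>A caller would bracket each native call with <code>push()</code> and <code>pop(frame)</code>, so that a temporary allocation costs only an offset update. Growing the block when it runs out of space, as mentioned above, is left out of this sketch.</p>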
<p>While we might add some such allocators in the future, the main
priority of the FFM API, design-wise, has been to make sure that
such custom arenas can be defined by developers directly, when
and if needed.</p>
<p>And this is indeed the biggest shift from Java 19 (which
doesn't allow custom arenas) to Java 20. Java 21 iterates
on the API once more, making it a little simpler to use while
retaining the capability of defining custom arenas.</p>
<p>Maurizio<br>
</p>
<blockquote type="cite">
<div dir="ltr">
<div>
<div>
<div><br>
</div>
<div>Glavo</div>
</div>
</div>
</div>
<br>
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr">On Sat, Mar 25, 2023 at
3:07 AM Glavo &lt;<a href="mailto:zjx001202@gmail.com" target="_blank">zjx001202@gmail.com</a>&gt;
wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div dir="ltr">I have run a series of benchmarks of Panama,
JNI, JNA, and JNR based on the latest JDK. Here is its
GitHub repository:<br>
<div><br>
</div>
<div><a href="https://github.com/Glavo/java-ffi-benchmark" target="_blank">https://github.com/Glavo/java-ffi-benchmark</a></div>
<div><br>
</div>
<div>Here I tested the performance of no-ops, accessing
structs, string conversions, and callbacks,
respectively. I also tried the new isTrivial linker
option.</div>
<div>I summarized the results in the README and charted them.<br>
</div>
<div><br>
</div>
<div>In this email, in addition to sharing the above
results, I would also like to talk about several issues
I have encountered:<br>
</div>
<div><br>
</div>
<div>1. MemorySegment.getUtf8String is unexpectedly slow</div>
<div><br>
</div>
<blockquote style="margin:0px 0px 0px 40px;border:none;padding:0px">
<div>Panama is much faster than JNA in most cases, but
the operation of converting C strings to Java strings
is an exception.</div>
<div>I checked the source code of JNA and Panama, and
the difference I suspect is that JNA uses strlen from
the C standard library, while Panama uses a Java loop.</div>
<div>Perhaps this method can be optimized.</div>
</blockquote>
<div><br>
</div>
<div>2. StructLayout must manually specify all padding</div>
<div><br>
</div>
<blockquote style="margin:0px 0px 0px 40px;border:none;padding:0px">
<div>Can we provide a convenient method for
automatically padding between fields based on
alignment?</div>
<div>The current structLayout method is tedious to use
when you need to reproduce the layout of a C struct
by hand.</div>
</blockquote>
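<div>For illustration, the automatic padding asked for in point 2 comes down to rounding each field's offset up to that field's alignment. The sketch below shows only that arithmetic and does not call the FFM API; <code>PaddedLayoutSketch</code>, <code>alignUp</code> and <code>offsets</code> are hypothetical names, and the field sizes used in <code>main</code> assume a typical ABI with a 1-byte char, a 4-byte int and a 2-byte short.</div>

```java
import java.util.Arrays;

/** Hypothetical helper: computes C-style field offsets with automatic padding. */
final class PaddedLayoutSketch {

    /** Rounds offset up to the next multiple of alignment (alignment must be positive). */
    static long alignUp(long offset, long alignment) {
        return ((offset + alignment - 1) / alignment) * alignment;
    }

    /**
     * sizes[i] and alignments[i] describe field i, in bytes.
     * Returns one offset per field, followed by the total struct size,
     * which is rounded up to the largest field alignment.
     */
    static long[] offsets(long[] sizes, long[] alignments) {
        long[] result = new long[sizes.length + 1];
        long offset = 0;
        long maxAlign = 1;
        int i = 0;
        for (long size : sizes) {
            long alignment = alignments[i];
            offset = alignUp(offset, alignment); // leading padding for this field
            result[i] = offset;
            offset += size;
            maxAlign = Math.max(maxAlign, alignment);
            i++;
        }
        result[sizes.length] = alignUp(offset, maxAlign); // trailing padding
        return result;
    }

    public static void main(String[] args) {
        // struct { char c; int i; short s; } with assumed sizes 1, 4 and 2 bytes
        long[] r = offsets(new long[] {1, 4, 2}, new long[] {1, 4, 2});
        System.out.println(Arrays.toString(r));
    }
}
```

<div>Any gap between the end of one field and the offset of the next is where a padding layout would have to be inserted before the members are passed to structLayout.</div>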
<div><br>
</div>
<div>Glavo</div>
</div>
</blockquote>
</div>
</blockquote>
</blockquote>
</div>
</blockquote></div>
</blockquote></div>