<div dir="auto">Hi Ilya,<div dir="auto"><br></div><div dir="auto">I am one of the developers who improved Netty/Vertx/Quarkus (by more than 30-50%) for exactly that benchmark (and not only for it: the whole HTTP 1.1 stack was improved). The problem is that people should profile before drawing conclusions about which component to blame (no pun intended; sadly, the same applies to those who wrongly concluded that Loom is good using that same benchmark as their main proof). The TechEmpower plaintext test is highly pipelined (in the worst way, because it uses HTTP 1.1 and NOT HTTP 2, which is designed for that) and CPU bound, due to HTTP encoding/decoding, especially if the framework is a "proper" one that properly materializes the headers (see my rant at <a href="https://github.com/TechEmpower/FrameworkBenchmarks/discussions/7984">https://github.com/TechEmpower/FrameworkBenchmarks/discussions/7984</a>). This means that an improvement in that part can be what produces better numbers in TechEmpower. If the framework is "smart" enough (e.g. by cheating and not decoding the received headers), the bottleneck can then move to the syscall cost (which I have improved in Netty by using io_uring OR by replacing read/write with recv/send), but even then you still hit the physical limits of the NIC, which bound the maximum achievable throughput to ~7 M req/sec, making all high-level frameworks look the same (again: without profiling CPU usage, they look the same).</div><div dir="auto">Helidon, like Quarkus/Vertx/Netty and Undertow (whose internals I know fairly well; it is non-blocking for that test and has very efficient HTTP decoding/encoding, better than Netty's out of the box), is maxing out the CPU, and very little of Loom shows up in the profiling data, hence I would look elsewhere.</div><div dir="auto">You can also profile it fairly easily with a single thread (taking care to disable JVMTI thread state notifications, otherwise they will severely affect Loom) and verify my comments, if needed.</div><div dir="auto"><br></div><div 
dir="auto">Hope this helps,</div><div dir="auto"><br></div><div dir="auto">Franz</div><div dir="auto"><br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Tue, Oct 31, 2023, 20:15 Ilya Starchenko <<a href="mailto:st.ilya.101@gmail.com">st.ilya.101@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div><p style="margin:0px;font-style:normal;font-variant-caps:normal;font-stretch:normal;font-size:12px;line-height:normal;font-family:"Helvetica Neue";font-size-adjust:none;font-kerning:auto;font-variant-alternates:normal;font-variant-ligatures:normal;font-variant-numeric:normal;font-variant-east-asian:normal;font-feature-settings:normal">Hello loom-dev team,</p>

<p style="margin:0px;font-style:normal;font-variant-caps:normal;font-stretch:normal;font-size:12px;line-height:normal;font-family:"Helvetica Neue";font-size-adjust:none;font-kerning:auto;font-variant-alternates:normal;font-variant-ligatures:normal;font-variant-numeric:normal;font-variant-east-asian:normal;font-feature-settings:normal;min-height:14px"><br></p>

<p style="margin:0px;font-style:normal;font-variant-caps:normal;font-stretch:normal;font-size:12px;line-height:normal;font-family:"Helvetica Neue";font-size-adjust:none;font-kerning:auto;font-variant-alternates:normal;font-variant-ligatures:normal;font-variant-numeric:normal;font-variant-east-asian:normal;font-feature-settings:normal">I would like to express my gratitude for the work being done on Project Loom. It's an exciting project with a lot of potential. I have some questions related to its performance that I would appreciate some clarification on.</p>

<p style="margin:0px;font-style:normal;font-variant-caps:normal;font-stretch:normal;font-size:12px;line-height:normal;font-family:"Helvetica Neue";font-size-adjust:none;font-kerning:auto;font-variant-alternates:normal;font-variant-ligatures:normal;font-variant-numeric:normal;font-variant-east-asian:normal;font-feature-settings:normal;min-height:14px"><br></p>

<p style="margin:0px;font-style:normal;font-variant-caps:normal;font-stretch:normal;font-size:12px;line-height:normal;font-family:"Helvetica Neue";font-size-adjust:none;font-kerning:auto;font-variant-alternates:normal;font-variant-ligatures:normal;font-variant-numeric:normal;font-variant-east-asian:normal;font-feature-settings:normal">I recently came across a benchmark presentation by Alan Bateman at Devoxx<<a href="https://youtu.be/XF4XZlPZc_c?si=-Qp2PampTbNGj3a5" target="_blank" rel="noreferrer">https://youtu.be/XF4XZlPZc_c?si=-Qp2PampTbNGj3a5</a>>, where the Helidon Nima framework demonstrated better performance results compared to a reactive framework. However, when I examined the Plaintext benchmark (specifically focusing on Netty and Undertow, which benchmark only plaintext), I noticed that Nima, which operates entirely on virtual threads, failed to outperform even the blocking Undertow. Additionally, I conducted tests with Tomcat and Jetty using Loom's executor, and they also did not exhibit significant improvements compared to a reactive stack. Perhaps I should ask the Helidon team, but my question is, is this the expected performance level for Project Loom, or can we anticipate better performance in the future?</p>

<p style="margin:0px;font-style:normal;font-variant-caps:normal;font-stretch:normal;font-size:12px;line-height:normal;font-family:"Helvetica Neue";font-size-adjust:none;font-kerning:auto;font-variant-alternates:normal;font-variant-ligatures:normal;font-variant-numeric:normal;font-variant-east-asian:normal;font-feature-settings:normal;min-height:14px"><br></p>

<p style="margin:0px;font-style:normal;font-variant-caps:normal;font-stretch:normal;font-size:12px;line-height:normal;font-family:"Helvetica Neue";font-size-adjust:none;font-kerning:auto;font-variant-alternates:normal;font-variant-ligatures:normal;font-variant-numeric:normal;font-variant-east-asian:normal;font-feature-settings:normal">Furthermore, I took a closer look at the Poller implementation<<a href="https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/sun/nio/ch/Poller.java#L436" target="_blank" rel="noreferrer">https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/sun/nio/ch/Poller.java#L436</a>>, and I noticed that it utilizes only one thread (by default) for both read and write polling. I'm curious why there's only one thread, and wouldn't it be more efficient to have pollers matching the number of CPU cores for optimal performance?</p>

<p style="margin:0px;font-style:normal;font-variant-caps:normal;font-stretch:normal;font-size:12px;line-height:normal;font-family:"Helvetica Neue";font-size-adjust:none;font-kerning:auto;font-variant-alternates:normal;font-variant-ligatures:normal;font-variant-numeric:normal;font-variant-east-asian:normal;font-feature-settings:normal;min-height:14px"><br></p>

<p style="margin:0px;font-style:normal;font-variant-caps:normal;font-stretch:normal;font-size:12px;line-height:normal;font-family:"Helvetica Neue";font-size-adjust:none;font-kerning:auto;font-variant-alternates:normal;font-variant-ligatures:normal;font-variant-numeric:normal;font-variant-east-asian:normal;font-feature-settings:normal">I look forward to your insights and guidance regarding these performance concerns. Your expertise and feedback would be greatly appreciated.</p>

<p style="margin:0px;font-style:normal;font-variant-caps:normal;font-stretch:normal;font-size:12px;line-height:normal;font-family:"Helvetica Neue";font-size-adjust:none;font-kerning:auto;font-variant-alternates:normal;font-variant-ligatures:normal;font-variant-numeric:normal;font-variant-east-asian:normal;font-feature-settings:normal;min-height:14px"><br></p>

<p style="margin:0px;font-style:normal;font-variant-caps:normal;font-stretch:normal;font-size:12px;line-height:normal;font-family:"Helvetica Neue";font-size-adjust:none;font-kerning:auto;font-variant-alternates:normal;font-variant-ligatures:normal;font-variant-numeric:normal;font-variant-east-asian:normal;font-feature-settings:normal">- Ilya</p></div>

</blockquote></div>