<div dir="ltr"><div dir="ltr"><div class="gmail_default" style="font-family:arial,helvetica,sans-serif"><br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Fri, Jul 19, 2024 at 6:03 AM Jorn Vernee &lt;<a href="mailto:jorn.vernee@oracle.com">jorn.vernee@oracle.com</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div>
<p>Many (all?) of the nifty solutions can be built on top of the
current one. The problem is the allocation bottleneck, so let's
focus on that first. I think the most straight-forward solution is
to use a better allocator. The one in the default Arenas is quite
generic. For instance, you could create a global free list
specifically dedicated to capture state segments. (Or, as
suggested, use the built-in Java heap allocator).<br></p></div></blockquote><div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif">I agree that we can definitely build fancier APIs on top of the current one (in fact, I'm doing so).</div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif"><br></div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif">What I struggle with, though, is that in general programming the best answer to this kind of small, short-lived allocation is consistently stack allocation. For example, we allocate integer-sized things on the stack all the time, and in C we would stack allocate here as a matter of course. It seems like it should be possible for the downcall procedure to handle this optimally with a stack allocation that is hidden from the user. For that to work, though, the downcall stub would have to eject the values before it returns and loses its frame, which is why I jumped to that solution. In retrospect, I realize that was a bit of an XY problem description. What I really want to solve is allocation overhead, and it feels like stack allocation should be the way to do it.</div></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div><p>&gt; thread locals and virtual threads do not get along very
well, for one thing<br>
</p>
<p>If the thread local was holding some kind of heavy resource,
having one per virtual thread (of which there might be millions)
would be problematic. For holding a single 4 to 12 byte memory
segment, maybe not so much? But, it might also be possible for us
to expose a per platform thread (not vthread) free list.<br></p></div></blockquote><div><br></div><div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif">That could be interesting. When we talk about pooling and virtual threads, the conclusion is usually to use striping to avoid contention and balance problems. I think the Loom team has discussed doing more with the carrier thread, but I believe that topic is somewhat controversial at present (or at least associated with some subtle issues that need to be taken into account).</div></div><div><br></div></div><span class="gmail_signature_prefix">-- </span><br><div dir="ltr" class="gmail_signature"><div dir="ltr">- DML • he/him<br></div></div></div>