Extent-Local memory sharing?
Maurizio Cimadamore
maurizio.cimadamore at oracle.com
Mon Oct 3 09:11:00 UTC 2022
Extent-local memory can be shared across multiple virtual threads created
by the same "scope" (e.g. a StructuredTaskScope).
Depending on how an application is structured, this might be useful or
not. Working with extent-local memory imposes "structure" on your
application - that is, a thread creates an extent (typically with a
try-with-resources) which is then shared by all the threads that are
forked within that extent (e.g. inside the try-with-resources). Only the
original thread can close the extent, and it can only do so when all the
"nested" computation has completed.
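As a rough illustration of that structure, here is a toy model in plain Java. It is not how the JDK implements (or would implement) extent-confined sessions: ExtentSession, fork() and memory() are hypothetical names, an InheritableThreadLocal stands in for a real extent local, and a direct ByteBuffer stands in for a memory segment. It only tries to show the rules described above: threads forked inside the extent can access the memory, threads outside cannot, and only the owner can close, and only once the forked threads are done.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Toy model only: none of these names are JDK APIs.
class ExtentSessionSketch {

    static final class ExtentSession implements AutoCloseable {
        // Fresh per session; set in the owner and inherited by threads created afterwards.
        private final InheritableThreadLocal<Boolean> bound = new InheritableThreadLocal<>();
        private final Thread owner = Thread.currentThread();
        private final List<Thread> forked = new ArrayList<>();
        private final ByteBuffer memory;
        private volatile boolean closed;

        ExtentSession(int bytes) {
            memory = ByteBuffer.allocateDirect(bytes);
            bound.set(Boolean.TRUE); // "bind" the extent in the owner thread
        }

        // Hypothetical helper: start a thread inside the extent.
        Thread fork(Runnable task) {
            checkAccess();
            Thread t = new Thread(task); // inherits the binding from the forking thread
            synchronized (forked) {
                forked.add(t);
            }
            t.start();
            return t;
        }

        ByteBuffer memory() {
            checkAccess();
            return memory;
        }

        private void checkAccess() {
            if (closed || bound.get() == null) {
                throw new IllegalStateException("not inside the extent, or already closed");
            }
        }

        @Override
        public void close() {
            if (Thread.currentThread() != owner) {
                throw new IllegalStateException("only the owner thread can close");
            }
            synchronized (forked) {
                for (Thread t : forked) {
                    if (t.isAlive()) {
                        throw new IllegalStateException("structure violation: forked thread still running");
                    }
                }
            }
            closed = true;
            bound.remove();
        }
    }

    private static void join(Thread t) {
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    // Returns { sum of the bytes written by the forked threads, 1 if the outsider was rejected else 0 }.
    static int[] demo() {
        ExtentSession[] holder = new ExtentSession[1];
        boolean[] rejected = new boolean[1];
        // Constructed before the session exists, so it inherits no binding.
        Thread outsider = new Thread(() -> {
            try {
                holder[0].memory();
            } catch (IllegalStateException e) {
                rejected[0] = true;
            }
        });
        try (ExtentSession session = new ExtentSession(2)) {
            holder[0] = session;
            Thread a = session.fork(() -> session.memory().put(0, (byte) 1));
            Thread b = session.fork(() -> session.memory().put(1, (byte) 2));
            outsider.start();
            join(a);
            join(b);
            join(outsider);
            int sum = session.memory().get(0) + session.memory().get(1);
            return new int[] { sum, rejected[0] ? 1 : 0 };
        } // close() succeeds here: every forked thread has finished
    }

    public static void main(String[] args) {
        int[] r = demo();
        System.out.println("sum=" + r[0] + ", outsiderRejected=" + (r[1] == 1));
    }
}
```

If the owner left the try-with-resources block while a forked thread was still running, close() in this sketch would throw instead of freeing memory under that thread's feet, which is the kind of guarantee the structure buys you.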
So, if your program has some kind of structure, extents help you manage
that structure, and give you guarantees in return (which we can use to
implement more efficient memory sessions). But if your program is
non-structured, using extents will probably feel like fitting a square
peg in a round hole.
To bring up a C/C++ analogy, if a program does a lot of stack
allocation, then it's likely that using extent-local memory would be a
good choice: an extent-local memory session is semantically very
similar to (and probably even better than) a RAII block in C++. But if your
program does a lot of "loose" malloc/free (maybe relying on some other
implicit invariants to enforce correctness), extent-local memory is
probably not going to help.
So, whether extent-local memory helps or not will depend a lot, in
practice, on how the framework implementation works.
Maurizio
On 01/10/2022 21:49, Gavin Ray wrote:
> I was reading this guide today and noticed this section:
>
> https://quarkus.io/guides/virtual-threads#the-netty-problem
>
> Wouldn't extent-local Memory solve this?
> -----
>
> "For JSON serialization, Netty uses their custom implementation of
> thread locals, FastThreadLocal to store buffers. When using virtual
> threads in quarkus, the number of virtual threads simultaneously
> living in the service is directly related to the incoming traffic. It
> is possible to get hundreds of thousands, if not millions, of them.
>
> If they need to serialize some data to JSON they will end up creating
> as many instances of FastThreadLocal, resulting in a massive memory
> consumption as well as exacerbated pressure on the garbage collector.
> This will eventually affect the performance of the application and
> inhibit its scalability."
>
> On Mon, Sep 26, 2022, 1:55 AM Maurizio Cimadamore
> <maurizio.cimadamore at oracle.com> wrote:
>
> Hi Gavin,
> I think such a memory session would work as follows:
>
> * it will create a _fresh_ extent local on creation, and bind it
> * when accessing memory, it will check whether the extent local is
> bound in the thread (meaning it's one of the threads that inherited
> it, e.g. via StructuredTaskScope)
> * when closing, it will try to unbind the extent local, and if a
> structure violation arises (e.g. there are threads still running
> which inherited the extent local), an exception is thrown.
>
> Last time I checked, this did not require a lot of code, and could
> be done entirely in Java (although using a few non-exported
> features of extent locals).
>
> I think the trick is that, by leaning hard on extent locals and
> their inheritance across threads, things will "just work".
>
> The main unknown is how much it costs to check that an extent
> local has been inherited; I was afraid the cost of this was going
> to be prohibitive in a use case such as memory access, but from
> some quick testing I did at the time (which I did when extent
> local was called ScopeLocal), it all seemed to unroll, hoist and
> inline quite nicely (probably thanks to the C2 optimizations baked
> in for extent locals). Of course we will have to retest again as
> the implementation for extent locals becomes more mature.
>
> Maurizio
>
> On 23/09/2022 21:26, Gavin Ray wrote:
>> Maurizio,
>>
>> Going through my inbox after sending the last message, somehow I
>> missed this email.
>>
>> Thanks a ton for such a comprehensive answer
>> (I ought to start collecting these in a scrapbook for good measure!)
>>
>> /Finally, it is possible that we will introduce a new kind of memory
>> session that is confined not to a single thread, but to an _extent_
>> instead. This means that all threads created in a single extent will
>> be able to access a given memory segment, but threads outside that
>> extent will not be able to do so./
>>
>>
>> This sounds like it would be fantastic, especially if you can
>> piggyback off of work & guarantees already provided by the Loom
>> scheduler/executors.
>>
>> It feels like there are a _lot_ of use cases where such "extent-local"
>> memory sharing would be beneficial; maybe it even unlocks some uses
>> that the JVM wasn't viable for before -- who knows?
>>
>> This would be an integration with the compiler, and not just a Java
>> library-side feature, if I understand it correctly?
>>
>> On Mon, Sep 12, 2022 at 8:57 AM Maurizio Cimadamore
>> <maurizio.cimadamore at oracle.com> wrote:
>>
>> Hi Gavin,
>> whether you can access a memory segment from multiple threads or not
>> generally depends on the memory session attached to the segment. If
>> you create a segment with a "shared" memory session, then the
>> resulting segment will be accessible in a "racy" way from multiple
>> threads. If the session associated with the segment is confined, only
>> one thread can access it. When accessing memory in a segment that is
>> shared, some synchronization has to occur between the accessing
>> threads to make sure they don't step on each other's toes. With
>> Panama, if you have a VarHandle for memory access, you have a big set
>> of memory access operations at your disposal, depending on the level
>> of synchronization required (e.g. plain access, acquire/release
>> access, volatile access, atomic access).
>>
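To make the access modes mentioned above a bit more concrete, here is a minimal sketch. It assumes a direct ByteBuffer as a stand-in for an off-heap memory segment and uses the long-standing MethodHandles.byteBufferViewVarHandle factory rather than the Panama layout API; the class and method names are made up for illustration. The first method walks through plain, acquire/release, volatile and atomic access on one thread; the second shows that atomic adds from several threads do not lose updates.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class AccessModesSketch {
    // Views a ByteBuffer as ints; the second coordinate is a byte offset into the buffer.
    static final VarHandle INT =
            MethodHandles.byteBufferViewVarHandle(int[].class, ByteOrder.nativeOrder());

    // Walks through the access modes on a single thread and returns the final value.
    static int accessModes() {
        ByteBuffer buf = ByteBuffer.allocateDirect(Integer.BYTES); // off-heap, zero-filled

        INT.set(buf, 0, 1);                          // plain write
        int plain = (int) INT.get(buf, 0);           // plain read

        INT.setRelease(buf, 0, plain + 1);           // release write
        int acquired = (int) INT.getAcquire(buf, 0); // acquire read

        INT.setVolatile(buf, 0, acquired + 1);       // volatile write
        int vol = (int) INT.getVolatile(buf, 0);     // volatile read

        boolean swapped = INT.compareAndSet(buf, 0, vol, vol + 1); // atomic compare-and-set
        int previous = (int) INT.getAndAdd(buf, 0, 10);            // atomic read-modify-write
        if (!swapped || previous != vol + 1) {
            throw new IllegalStateException("unexpected result on an uncontended buffer");
        }
        return (int) INT.getVolatile(buf, 0);
    }

    // Several threads increment one shared int; atomic adds do not lose updates.
    static int atomicCounter(int threads, int incrementsPerThread) {
        ByteBuffer buf = ByteBuffer.allocateDirect(Integer.BYTES);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int n = 0; n < incrementsPerThread; n++) {
                    INT.getAndAdd(buf, 0, 1);
                }
            });
            workers[i].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return (int) INT.getVolatile(buf, 0);
    }

    public static void main(String[] args) {
        System.out.println(accessModes());
        System.out.println(atomicCounter(4, 1000));
    }
}
```

Replacing getAndAdd with a plain get followed by a plain set in the second method would be the "racy" variant, where increments from different threads can overwrite each other.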
>> Then there's Loom. From Panama's perspective, a virtual thread is just
>> a thread, so if you want to grant multiple virtual threads access to
>> a memory segment, you need a shared session.
>>
>> Then there's the List itself. Reading and writing that list
>> concurrently from multiple threads can itself lead to issues (e.g.
>> missing updates). So, in general, when accessing data structures from
>> multiple threads, you either need a data structure that is concurrent
>> by design (e.g. a concurrent hash map or a blocking queue), or you
>> need to roll your own synchronization code. What the right answer is
>> often depends on the nature of your application.
>>
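One possible shape for the "concurrent by design" option, applied to the BufferPool from the original question, is sketched below. It is only an illustration under a few assumptions: a BlockingQueue replaces the plain List, direct ByteBuffers stand in for MemorySegments, and acquire(), release() and available() are hypothetical method names. Because a buffer is handed to exactly one thread at a time, the threads never write to the same buffer concurrently.

```java
import java.nio.ByteBuffer;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

class BufferPoolSketch {

    static final class BufferPool {
        private final BlockingQueue<ByteBuffer> free;

        BufferPool(int buffers, int bytesEach) {
            free = new ArrayBlockingQueue<>(buffers);
            for (int i = 0; i < buffers; i++) {
                free.add(ByteBuffer.allocateDirect(bytesEach));
            }
        }

        // Blocks until a buffer is free; the caller owns it until release().
        ByteBuffer acquire() {
            try {
                return free.take();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }

        void release(ByteBuffer buffer) {
            buffer.clear();
            free.add(buffer);
        }

        int available() {
            return free.size();
        }
    }

    // Returns { buffers back in the pool at the end, number of times a thread saw another thread's write }.
    static int[] demo(int threads, int rounds) {
        BufferPool pool = new BufferPool(2, Integer.BYTES);
        AtomicInteger clashes = new AtomicInteger();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            final int id = i + 1;
            workers[i] = new Thread(() -> {
                for (int r = 0; r < rounds; r++) {
                    ByteBuffer b = pool.acquire();
                    b.putInt(0, id);
                    if (b.getInt(0) != id) {
                        clashes.incrementAndGet(); // would mean two threads shared a buffer
                    }
                    pool.release(b);
                }
            });
            workers[i].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return new int[] { pool.available(), clashes.get() };
    }

    public static void main(String[] args) {
        int[] r = demo(8, 1000);
        System.out.println("available=" + r[0] + ", clashes=" + r[1]);
    }
}
```

The queue only protects the pool itself. If two threads deliberately shared one buffer, they would still need one of the synchronized access modes, or a lock, for the memory inside it.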
>> Finally, it is possible that we will introduce a new kind of memory
>> session that is confined not to a single thread, but to an _extent_
>> instead. This means that all threads created in a single extent will
>> be able to access a given memory segment, but threads outside that
>> extent will not be able to do so. This would be a good addition when
>> working with virtual threads, because in the case of virtual threads
>> some additional bookkeeping is set up by the JDK runtime so that, e.g.
>> when using a StructuredTaskScope, it is not possible to close the task
>> scope before all the threads forked by that scope have completed. This
>> gives the memory session API very nice semantics for its close()
>> operation, which we'd like to take advantage of at some point.
>>
>> Thanks
>> Maurizio
>>
>> On 10/09/2022 16:11, Gavin Ray wrote:
>> > Reading through the docs for the Extent-Local preview, I was trying
>> > to understand whether this would be usable for sharing a
>> > buffer/memory pool across virtual threads?
>> >
>> > Suppose you have some class:
>> >
>> > class BufferPool {
>> > private List<MemorySegment> buffers;
>> > }
>> >
>> > The document says that the data must be immutable.
>> > But there is "interior" immutability, and "surface" immutability.
>> >
>> > If multiple virtual threads shared the memory and some potentially
>> > performed write operations on the MemorySegments inside the list,
>> > would that be valid behavior?
>> >
>> > Or does this even make sense to do?
>> > (Concurrency/parallelism are probably the things I know the least
>> > about in software)
>> >
>> >
>>