Detecting Thread Local Support

Francesco Nigro nigro.fra at gmail.com
Mon Feb 20 23:44:54 UTC 2023


Last time I checked the OpenJDK sources, there was something that could help
here, i.e.
https://github.com/openjdk/jdk/blob/861cc671e2e4904d94f50710be99a511e2f9bb68/src/java.base/share/classes/sun/nio/ch/Util.java#L56

but it would probably need to be exposed in the right way, and thread-local
pooling is currently outside the purpose of scoped values, meaning that we
would need something "different".
Per-carrier thread locals aren't free, and because virtual threads are
scheduled by work stealing, a pooled instance can be released on a different
carrier from the one it was acquired on; that needs to be considered.
Still, I agree with Carl M that the other solutions are worse, on paper,
because they rely on atomic operations that are likely to be contended (even
if they are wait-free, such as xchg, and even if multiplicative hashing is
used to distribute the contention).
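
To make the atomic-based alternative concrete, here is a minimal sketch of a
striped pool that picks a slot by multiplicative (Fibonacci) hashing of the
thread id and swaps buffers in and out with getAndSet (an xchg on x86). The
class name StripedTraceSlots, the slot count of 64 and the use of
StringBuilder as the pooled object are assumptions made for illustration;
none of this comes from the JDK or from Carl's library.

```java
import java.util.concurrent.atomic.AtomicReferenceArray;

// Hypothetical sketch: a fixed set of slots shared by all threads.
final class StripedTraceSlots {
    static final int SLOTS = 64; // assumed size, must be a power of two
    // Keep the top log2(SLOTS) bits of the 32-bit product.
    private static final int SHIFT = 32 - Integer.numberOfTrailingZeros(SLOTS);

    private final AtomicReferenceArray<StringBuilder> slots =
            new AtomicReferenceArray<>(SLOTS);

    // Multiplicative hashing: spreads consecutive thread ids across slots.
    static int slotFor(long threadId) {
        int h = (int) (threadId ^ (threadId >>> 32));
        return (h * 0x9E3779B9) >>> SHIFT;
    }

    // Take the pooled buffer from this thread's slot, or allocate a new one.
    StringBuilder acquire() {
        // getId() is deprecated in favor of threadId() on newer JDKs.
        int i = slotFor(Thread.currentThread().getId());
        StringBuilder pooled = slots.getAndSet(i, null); // atomic exchange
        return pooled != null ? pooled : new StringBuilder();
    }

    // Return a buffer to this thread's slot, replacing whatever is there.
    void release(StringBuilder buffer) {
        buffer.setLength(0);
        slots.set(slotFor(Thread.currentThread().getId()), buffer);
    }
}
```

Because the slot is derived from the thread id rather than from the carrier,
a release always lands in the same slot as the matching acquire, but two
threads that hash to the same slot still contend on it, which is the cost
described above.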


On Tue, Feb 21, 2023, 00:33 robert engels <rengels at ix.netcom.com> wrote:

> Why do you think that VirtualThreads do not support ThreadLocal? From the
> docs:
>
> Finally, another aspect that works correctly in virtual threads but
> deserves being revisited for better scalability is thread-local variables,
> both regular and inheritable. Virtual threads support thread-local behavior
> the same way as platform threads, but because virtual threads can be very
> numerous, thread locals should be used only after careful consideration.
>
>
> If you expect every thread to have this trace then the memory will be used
> regardless.
>
>
> On Feb 20, 2023, at 5:17 PM, Carl M <java at rkive.org> wrote:
>
> Re sending without HTML
>
> On 02/20/2023 3:15 PM PST Carl M <java at rkive.org> wrote:
>
>
> While testing out Virtual Threads with project Loom, I encountered some
> challenges that I was hoping this mailing list could provide guidance on.
>
> I have a tracing library that uses ThreadLocals for recording events and
> timing info. The concurrency is structured so that each thread is the sole
> writer to its own trace buffer, but separate threads can come in and read
> that data asynchronously. I am using ThreadLocals to avoid contention
> between multiple tracing threads. Secondarily, I depend on threads exiting
> for automatic clean up of the trace data per thread.
>
> Virtual threads present a hard to overcome challenge, because I can't find
> a way to tell if ThreadLocals are supported. One of the value propositions
> of my library is that it has a consistent and low overhead. Specifically,
> calling ThreadLocal.set() throws an UnsupportedOperationException in the
> event that they are not allowed. In the case of using Virtual threads, the
> likelihood of this happening is much higher, since users are now able to
> create threads cheaply. I have explored several work-arounds, but not being
> able to tell is one I can't seem to cleanly overcome. Some ideas that did
> not pan out:
>
> * Use a ConcurrentHashMap to implement my own "threadlocal" like solution.
> Two problems come up: 1. It's easy to accidentally keep the thread alive,
> and 2. When Thread Locals are supported, my library doesn't get the speedup
> from them.
>
> * Use an AtomicReferenceArray and hash into a fixed size of buckets. This
> avoids using the Thread as a Key, and pays a minor cost of synchronizing on
> the bucket for recording trace data. In effect it's a poor man's
> ThreadLocal. However, if I get unlucky there will be contention on a bucket
> that doesn't naturally shard itself like CHM does.
>
> * Do Nothing. This causes callers to allocate a ton of memory since the
> ThreadLocal.initialValue() gets called a ton, leading to unpredictable
> tracer overhead. There is a small but noticeable amount of overhead for
> creating the initial value (like registering with the reader) so this ends
> up not being practical.
>
> * A Hybrid of ThreadLocal when supported and fallback to CHM or ARA as
> mentioned above. This is the solution I came up with, where my ThreadLocal
> calls get() but has no initialValue() override. If the value is null, I
> attempt to set it. If there is an exception, I write the value to the
> CHM/ARA and then check there first for future get() calls. The problem with
> this is that the exception from set() causes an unacceptable amount of
> overhead for something that should have been very cheap. It isn't
> sufficient to check if the thread is virtual to see if TLs are supported,
> so I can't check the class name of the thread a priori. And, since multiple
> types of threads are calling into my library, I can't require callers to
> use TLs.
>
>
> I'm kind of at a loss as to how to efficiently fallback to a slower
> implementation when TLs aren't supported, since I can't tell if they are or
> not. (e.g. can't tell if the electric fence is on without touching it).
> Again, I'd prefer to keep the fast ThreadLocals if they are supported
> though.
>
>
> I'm looking for ideas (or just to register feedback) with this email, and
> have been otherwise very happy with the progress on project Loom.
>
> Carl
>
>
>
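
For reference, the hybrid approach described in the quoted message can be
sketched roughly as follows. This is only an illustration under stated
assumptions: the class name HybridTraceBuffers, the StringBuilder buffer
type, keying the fallback map by thread id, and checking the ThreadLocal
before the map are all choices made for the sketch, not Carl's actual
implementation. On JDKs where thread locals cannot be disabled, the catch
branch is simply never taken.

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of "ThreadLocal when supported, map otherwise".
final class HybridTraceBuffers {
    // No initialValue() override, so get() returns null until set() succeeds.
    private final ThreadLocal<StringBuilder> local = new ThreadLocal<>();
    // Keyed by thread id so the map does not keep the Thread object alive.
    // Entries are never removed here; a real version would need cleanup.
    private final ConcurrentHashMap<Long, StringBuilder> fallback =
            new ConcurrentHashMap<>();

    StringBuilder current() {
        StringBuilder buffer = local.get();
        if (buffer != null) {
            return buffer; // fast path: thread locals work for this thread
        }
        long id = Thread.currentThread().getId();
        buffer = fallback.get(id);
        if (buffer != null) {
            return buffer; // this thread already fell back once
        }
        buffer = new StringBuilder();
        try {
            local.set(buffer);
        } catch (UnsupportedOperationException e) {
            // The expensive case from the message: paid once per thread.
            fallback.put(id, buffer);
        }
        return buffer;
    }

    int fallbackSize() {
        return fallback.size();
    }
}
```

The exception is thrown at most once per thread, but with very many
short-lived threads that is still once per thread, which is the overhead
the message objects to.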