Another virtual threads migration story: ReentrantReadWriteLock

Thu Jan 30 18:23:03 UTC 2025

On 2025-01-29 20:19, Dr Heinz M. Kabutz wrote:
>
> Once the write lock has been requested, no new read locks will be 
> issued (since Java 6, in Java 5 there was an issue with starvation), 
> so it could take a bit of time, depending on how long each of the 
> operations is, but eventually it should do the write.
>
> I'd investigate using StampedLock with tryOptimisticRead() and then 
> writeLock(). The idioms are a bit more complicated, but this will 
> hopefully work.
>
> Regards
>
> Heinz
> -- 
> Dr Heinz M. Kabutz (PhD CompSci)
> Author of "The Java™ Specialists' Newsletter" -www.javaspecialists.eu
> Java Champion -www.javachampions.org
> JavaOne Rock Star Speaker
> Tel: +30 69 75 595 262
> Skype: kabutz
> On 2025-01-29 20:12, robert engels wrote:
>> Given that, it still seems the writer (configuration changer I 
>> assume) is going to be potentially stalled a long time.
>>
>> In my experience, copy on write is ideal for configuration change 
>> management - it doesn’t work for things like db transactions - but I 
>> am not sure you would ever have millions of connections to a db - 
>> more like a request queue would be used by the clients, so it 
>> wouldn’t be an issue.
>>
>> Interestingly, Go doesn’t even have a reentrant lock in their stdlib.
>>
>>> On Jan 29, 2025, at 11:34 AM, Matthew Swift 
>>> <matthew.swift at gmail.com> wrote:
>>>
>>> Just to be clear, the threads are not blocked on the write lock 
>>> here. They have all successfully acquired the read lock.
>>>
>>> But I agree, copy on write is an alternative approach when 
>>> available, otherwise it's semaphores all the way down...
>>>
>>>
>>> On Wed 29 Jan 2025, 18:30 robert engels, <rengels at ix.netcom.com> wrote:
>>>
>>>     But tbh, blocking that many threads doesn’t seem efficient
>>>     or performant. It isn’t cheap. I would think that a copy on
>>>     write for the configuration change would be a better solution.
>>>
>>>     > On Jan 29, 2025, at 11:23 AM, robert engels
>>>     <rengels at ix.netcom.com> wrote:
>>>     >
>>>     > Nice catch! I am not sure you are going to get a resolution on
>>>     this other than using your own implementation.
>>>     >
>>>     > The AbstractQueuedSynchronizer needs to be changed to use a
>>>     long to hold the state - which will break subclasses - so it
>>>     probably won’t happen.
>>>     >
>>>     >> On Jan 29, 2025, at 10:58 AM, Matthew Swift
>>>     <matthew.swift at gmail.com> wrote:
>>>     >>
>>>     >> Hi folks,
>>>     >>
>>>     >> As you may remember from a few months ago, we converted our
>>>     LDAP Directory server/proxy[1] over to using virtual threads.
>>>     It's been going pretty well and we're lucky enough to be able to
>>>     leverage JDK21 as we have full control over most (not all) of
>>>     the code-base, which puts us in the enviable position where we
>>>     can convert code to avoid thread pinning issues. That being
>>>     said, we regularly test using the latest JDK24 EA builds as well.
>>>     >>
>>>     >> We recently hit what I feel is quite a major limitation in
>>>     ReentrantReadWriteLock, which was somewhat hidden before in the
>>>     old world of large-but-not-super-large platform thread pools:
>>>     >>
>>>     >>    Error: Maximum lock count exceeded at
>>>     ReentrantReadWriteLock.java:535,494
>>>     AbstractQueuedSynchronizer.java:1078
>>>     ReentrantReadWriteLock.java:738 ...
>>>     >>
>>>     >> I'm sure that we're not alone in making extensive use of RW
>>>     locks for synchronizing configuration changes to runtime
>>>     components: the write lock ensures that regular processing is
>>>     paused while the configuration change is applied. The component
>>>     in this case could be something that talks to a remote
>>>     microservice over HTTP, a logging backend, etc. In this case,
>>>     there is no configuration change - just a few hundred milliseconds
>>>     of latency in the remote service for some reason (e.g. GC pause?),
>>>     which has caused many virtual threads to get blocked inside the
>>>     component while holding the read lock. The RW lock then fails
>>>     with the above error once there are 64K concurrent threads
>>>     holding the read lock.
>>>     >>
>>>     >> Given that scaling IO to millions of concurrent IO bound
>>>     tasks was one of the key motivations for vthreads, it seems a
>>>     bit surprising to me that a basic concurrency building block of
>>>     many applications is constrained to 64K concurrent accesses. Are
>>>     you aware of this limitation and its implications? A workaround
>>>     now is to go hunting for RW locks in our application and use
>>>     alternative approaches OR, where the lock is in a third party
>>>     library (e.g. logging / telemetry), wrapping the library calls
>>>     in a Semaphore limited to <64K permits. It seems a bit
>>>     unsatisfactory to me. What do you think? Are there plans to
>>>     implement a RW lock based on AbstractQueuedLongSynchronizer?
>>>     >>
>>>     >> Kind regards,
>>>     >> Matt
>>>     >>
>>>     >> [1] for those unfamiliar with the tech, think of it as a
>>>     distributed database for storing identities
>>>     >
>>>
Following on from my suggestion to consider using StampedLock instead of 
ReentrantReadWriteLock (RRWL): a word of warning that we cannot use it as 
a drop-in replacement via new StampedLock().asReadWriteLock(), because 
that view does not have writer starvation protection, whereas RRWL does. 
In your example, with millions of readers and one occasional writer, the 
write lock might never become available. Here is a small example that 
starts overlapping readers and measures how long a writer has to wait:

import java.util.*;
import java.util.concurrent.*;
import java.util.concurrent.locks.*;

// Based on email discussion in loom-dev on 2025-01-29 entitled:
// Another virtual threads migration story: ReentrantReadWriteLock
public class StampedLockRWLockStarvation {
     public static void main(String... args) throws InterruptedException {
         var rwlocks = List.of(new ReentrantReadWriteLock(),
                 new StampedLock().asReadWriteLock());

         for (ReadWriteLock rwlock : rwlocks) {
             // Waiting more than one second (in nanoseconds) counts as starvation
             if (checkForWriterStarvation(rwlock) > 1_000_000_000) {
                 throw new AssertionError("Writer starvation occurred!!!");
             } else {
                 System.out.println("No writer starvation");
             }
         }
     }

     private static long checkForWriterStarvation(ReadWriteLock rwlock)
             throws InterruptedException {
         System.out.println("Checking " + rwlock.getClass());
         try (var mainPool = Executors.newVirtualThreadPerTaskExecutor()) {
             mainPool.submit(() -> {
                 System.out.println("Going to start readers ...");
                 try (var pool = Executors.newVirtualThreadPerTaskExecutor()) {
                     for (int i = 0; i < 10; i++) {
                         int readerNumber = i;
                         pool.submit(() -> {
                             rwlock.readLock().lock();
                             try {
                                 System.out.println("Reader " + readerNumber + " is reading ...");
                                 Thread.sleep(1000);
                             } catch (InterruptedException e) {
                                 throw new CancellationException("interrupted");
                             } finally {
                                 rwlock.readLock().unlock();
                             }
                             System.out.println("Reader " + readerNumber + " is done");
                         });
                         try {
                             Thread.sleep(500);
                         } catch (InterruptedException e) {
                             throw new RuntimeException(e);
                         }
                     }
                 }
             });
             Thread.sleep(1800);
             System.out.println("Going to try to write now ...");
             long timeToAcquireWriteLock = System.nanoTime();
             rwlock.writeLock().lock();
             try {
                 timeToAcquireWriteLock = System.nanoTime() - timeToAcquireWriteLock;
                 System.out.printf("time to acquire write lock = %dms%n",
                         (timeToAcquireWriteLock / 1_000_000));
                 System.out.println("Writer is writing ...");
                 Thread.sleep(1000);
             } catch (InterruptedException e) {
                 throw new CancellationException("interrupted");
             } finally {
                 rwlock.writeLock().unlock();
             }
             System.out.println("Writer is done");
             return timeToAcquireWriteLock;
         }
     }
}

With ReentrantReadWriteLock, once we ask for the write lock, no more 
read locks are issued until the write lock has been acquired and 
released, so the writer only waits for the readers already in flight.

Using the correct idioms for StampedLock with tryOptimisticRead() should 
avoid this starvation, but we do have to be careful: an optimistic read 
might observe in-progress writes, so whatever was read has to be checked 
with validate() before it is used.
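
For reference, here is a minimal sketch of that idiom. The class 
OptimisticConfig and its host/port fields are made up for illustration 
and are not from Matt's code base. It also assumes that readers only need 
to copy a few fields out and do not hold the lock across a blocking call, 
which may not match the original use case:

```java
import java.util.concurrent.locks.StampedLock;

// Hypothetical configuration holder, for illustration only.
public class OptimisticConfig {
    private final StampedLock lock = new StampedLock();
    private String host = "localhost";
    private int port = 389;

    // Readers copy the fields without locking, then validate the stamp.
    public String endpoint() {
        long stamp = lock.tryOptimisticRead();
        String h = host; // might observe a write in progress
        int p = port;
        if (!lock.validate(stamp)) {
            // A writer got in between: fall back to a short pessimistic read.
            stamp = lock.readLock();
            try {
                h = host;
                p = port;
            } finally {
                lock.unlockRead(stamp);
            }
        }
        return h + ":" + p;
    }

    // Writers take the exclusive lock, which invalidates optimistic stamps.
    public void update(String newHost, int newPort) {
        long stamp = lock.writeLock();
        try {
            host = newHost;
            port = newPort;
        } finally {
            lock.unlockWrite(stamp);
        }
    }

    public static void main(String... args) {
        var config = new OptimisticConfig();
        System.out.println(config.endpoint());
        config.update("ldap.example.com", 636);
        System.out.println(config.endpoint());
    }
}
```

The optimistic path takes no lock at all, so it cannot hold a writer 
back. Only the fallback path takes a read lock, and only briefly.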

StampedLock would not have a practical limit on the number of concurrent 
reads AFAIK.
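
For contrast, the RRWL limit from the original report is easy to 
reproduce. The sketch below is a simplification: it uses one thread 
taking the read lock reentrantly, rather than 64K virtual threads each 
holding it once, on the assumption that both paths increment the same 
shared read count. The class and method names are made up for 
illustration:

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class ReadLockLimit {
    // Takes the read lock reentrantly until the lock refuses, then
    // returns the number of read holds that were granted.
    static int maxReadHolds() {
        var rwlock = new ReentrantReadWriteLock();
        try {
            for (int i = 0; i < 100_000; i++) {
                rwlock.readLock().lock();
            }
        } catch (Error e) {
            // RRWL throws java.lang.Error, not an exception
            System.out.println(e.getMessage());
        }
        return rwlock.getReadLockCount();
    }

    public static void main(String... args) {
        System.out.println("read holds = " + maxReadHolds()); // 65535
    }
}
```

The RRWL Javadoc documents this maximum of 65535 read locks, so the 
failure is by design rather than a bug in the lock.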
