all EventHandlerTasks in EPollPort waiting on queue

Jeremiah Ness jness at proofpoint.com
Mon Jan 9 17:16:32 UTC 2017


Using JRE 8u112 on CentOS 7.

I have an application which uses AsynchronousChannelGroup.withFixedThreadPool
with 5 threads. When the application is under high load, creating thousands of
AsynchronousSocketChannels per second while simultaneously closing many
others, all of the EventHandlerTasks in EPollPort occasionally become stuck
waiting for events on the EPollPort.queue. All the threads have the following
stack:

"CompletionThread-5" #33 daemon prio=5 os_prio=0 tid=0x00007f0a2829f800 nid=0x5c41 waiting on condition [0x00007f0a11ee1000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x00000006c4fe8868> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
        at java.util.concurrent.ArrayBlockingQueue.take(ArrayBlockingQueue.java:403)
        at sun.nio.ch.EPollPort$EventHandlerTask.run(EPollPort.java:262)
        at sun.nio.ch.AsynchronousChannelGroupImpl$1.run(AsynchronousChannelGroupImpl.java:112)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)


They all appear to be waiting for the queue to be filled (calling
BlockingQueue.take); however, there are no threads calling the native method
epollWait to get more I/O events. This condition persists until the
application is restarted.
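
To confirm from inside the running JVM that no thread is polling, a check
along the following lines could be used. This is only a sketch: the class
PollerCheck and its methods are hypothetical helpers, and matching frames by
the method name epollWait in a sun.nio.ch class is an assumption about how
the native call shows up in a stack trace.

```java
import java.util.Map;

class PollerCheck {
    // Returns true if any frame of the trace is a method named "epollWait"
    // declared in a sun.nio.ch class. Matching by name is an assumption;
    // the declaring class may differ between JDK builds.
    static boolean isInEpollWait(StackTraceElement[] trace) {
        for (StackTraceElement frame : trace) {
            if (frame.getClassName().startsWith("sun.nio.ch.")
                    && frame.getMethodName().equals("epollWait")) {
                return true;
            }
        }
        return false;
    }

    // Counts the threads in the given snapshot that are inside epollWait.
    static int countPollers(Map<Thread, StackTraceElement[]> traces) {
        int pollers = 0;
        for (StackTraceElement[] trace : traces.values()) {
            if (isInEpollWait(trace)) {
                pollers++;
            }
        }
        return pollers;
    }

    public static void main(String[] args) {
        // Snapshot of all live threads in this JVM.
        int pollers = countPollers(Thread.getAllStackTraces());
        System.out.println("threads in epollWait: " + pollers);
    }
}
```

A count of zero while the channel group still has open channels would match
the stuck condition described above.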

From examining the EPollPort class, my understanding is that one of the
threads should be polling. Is this correct?

Looking at the EPollPort.EventHandlerTask.poll method, I am wondering if there
is a code path that would allow all threads to be waiting on the queue. In
particular, is the following sequence possible within
EPollPort.EventHandlerTask.poll?

1. The native method epollWait returns 512 events.
2. Before the fdToChannelLock.readLock is acquired, the channel associated
   with the 512th event is closed.
3. The fixed size EPollPort.queue is filled to size 511.
4. The 512th event is processed; however, because its channel has been closed,
   the channel is no longer in the fdToChannel map.
5. The thread loops around the for(;;) loop and calls epollWait again.
6. epollWait returns 2 more events.
7. The fixed size queue is filled to its maximum capacity of 512.
8. The queue.offer(NEED_TO_POLL) call in the finally block fails because the
   queue is full.

If this occurs, would all the EventHandlerTasks then eventually become stuck,
as in the stack trace above?
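
Steps 7 and 8 can be modelled in isolation with a plain ArrayBlockingQueue.
The sketch below is not the JDK code: FullQueueSketch, its NEED_TO_POLL
object and the string "events" are hypothetical stand-ins, and the capacity
of 512 is taken from the scenario above. It only shows that offer on a full
bounded queue returns false without blocking, so the sentinel is dropped and
never reaches a consumer.

```java
import java.util.concurrent.ArrayBlockingQueue;

class FullQueueSketch {
    // Hypothetical stand-in for the sentinel that tells a worker to poll.
    static final Object NEED_TO_POLL = new Object();
    // Queue capacity assumed in the scenario above.
    static final int CAPACITY = 512;

    // Builds a queue that is already at capacity (step 7).
    static ArrayBlockingQueue<Object> fullQueue() {
        ArrayBlockingQueue<Object> queue = new ArrayBlockingQueue<>(CAPACITY);
        for (int i = 0; i < CAPACITY; i++) {
            queue.offer("event-" + i);
        }
        return queue;
    }

    // Non-blocking offer of the sentinel (step 8); false means it was dropped.
    static boolean offerSentinel(ArrayBlockingQueue<Object> queue) {
        return queue.offer(NEED_TO_POLL);
    }

    // Drains the queue and reports whether the sentinel was ever dequeued.
    // A real worker would call take() and block once the queue is empty;
    // the non-blocking poll() is used here so the sketch terminates.
    static boolean drainSeesSentinel(ArrayBlockingQueue<Object> queue) {
        Object item;
        while ((item = queue.poll()) != null) {
            if (item == NEED_TO_POLL) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        ArrayBlockingQueue<Object> queue = fullQueue();
        System.out.println("sentinel accepted: " + offerSentinel(queue));
        System.out.println("sentinel seen while draining: " + drainSeesSentinel(queue));
    }
}
```

Both lines print false. In this model, once the 512 queued events have been
consumed, nothing is left in the queue to tell any thread to poll again.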

Thanks,
Jeremiah Ness






More information about the nio-dev mailing list