all EventHandlerTasks in EPollPort waiting on queue
Jeremiah Ness
jness at proofpoint.com
Wed Jan 11 22:11:58 UTC 2017
(Sorry if this is the wrong place to discuss this matter. If you know of a
better place, please let me know.)
I suspect there is a bug in EPollPort.java (and KQueuePort.java) and am looking
for a more informed opinion.
Please reference the source code below from
src//solaris/classes/sun/nio/ch/EPollPort.java.
On line 244 below, the poll() method attempts to insert the NEED_TO_POLL event
into the queue. The queue is a fixed-size ArrayBlockingQueue of capacity
MAX_EPOLL_EVENTS, so if it is full, offer() returns false and the event is
silently dropped. The NEED_TO_POLL event is critical to the operation of
EPollPort because it is the event that signals one of the threads to poll
again.
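To illustrate the silent failure, here is a minimal sketch of how
ArrayBlockingQueue.offer behaves on a full queue. The class name, the small
capacity, and the string events are placeholders of mine and are not taken from
EPollPort.

```java
import java.util.concurrent.ArrayBlockingQueue;

class OfferOnFullQueueDemo {
    // Hypothetical small capacity standing in for MAX_EPOLL_EVENTS.
    static final int CAPACITY = 4;

    /** Fills a bounded queue, then tries one more offer and returns its result. */
    static boolean offerWhenFull() {
        ArrayBlockingQueue<String> queue = new ArrayBlockingQueue<>(CAPACITY);
        for (int i = 0; i < CAPACITY; i++) {
            queue.offer("event-" + i);
        }
        // offer() neither blocks nor throws on a full queue; it returns false.
        return queue.offer("NEED_TO_POLL");
    }

    public static void main(String[] args) {
        System.out.println("NEED_TO_POLL accepted: " + offerWhenFull());
    }
}
```

Since the return value of offer is not checked on line 244, nothing in poll()
would notice the rejected element.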
I believe this is what is causing all of my application’s completion threads to
get stuck.
The queue can become full if:
- the inner loop (lines 203 -> 235) is processing MAX_EPOLL_EVENTS events (512
events)
- the last inner loop iteration gets a null channel on line 223
- in this case we loop around to line 193 instead of returning from the method
- we get 2 more events when calling epollWait again on line 194
- these events fill the queue to its maximum size
- we return from the method but fail to insert NEED_TO_POLL into the queue
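The sequence above can be sketched as a single-threaded simulation. This is
only a model under stated assumptions: no other handler thread drains the queue
between the two epollWait calls, every event except the last one in the first
batch maps to a live channel, and the class, the processBatch helper, and the
string events are hypothetical stand-ins rather than the real EPollPort code.

```java
import java.util.concurrent.ArrayBlockingQueue;

class LostNeedToPollSketch {
    static final int MAX_EPOLL_EVENTS = 512;
    static final String NEED_TO_POLL = "NEED_TO_POLL";

    /**
     * Mimics one pass of the inner loop (lines 203 to 235): every event but the
     * last is queued. Returns true if the last event is handed back to the
     * caller. lastChannelIsNull models fdToChannel.get(fd) returning null.
     */
    static boolean processBatch(ArrayBlockingQueue<String> queue, int n,
                                boolean lastChannelIsNull) {
        while (n-- > 0) {
            if (n > 0) {
                queue.offer("event");
            } else if (!lastChannelIsNull) {
                return true; // the polling thread handles this event itself
            }
        }
        return false; // nothing to return, so the caller polls again
    }

    /** Runs the two-batch scenario and returns the queue as it is left. */
    static ArrayBlockingQueue<String> simulate() {
        ArrayBlockingQueue<String> queue = new ArrayBlockingQueue<>(MAX_EPOLL_EVENTS);
        // First epollWait: 512 events, the last channel already closed (511 queued).
        boolean returned = processBatch(queue, MAX_EPOLL_EVENTS, true);
        if (!returned) {
            // Second epollWait: 2 events, one queued (queue now full), one returned.
            processBatch(queue, 2, false);
        }
        // Models the finally block on line 244; the result is ignored there too.
        queue.offer(NEED_TO_POLL);
        return queue;
    }

    public static void main(String[] args) {
        ArrayBlockingQueue<String> queue = simulate();
        System.out.println("queue size: " + queue.size());
        System.out.println("contains NEED_TO_POLL: " + queue.contains(NEED_TO_POLL));
    }
}
```

In this model the queue ends up full of ordinary events and never holds
NEED_TO_POLL, so once the handlers consume those events they would all block in
take().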
Am I perhaps missing something?
Thanks for your thoughts.
191     private Event poll() throws IOException {
192         try {
193             for (;;) {
194                 int n = epollWait(epfd, address, MAX_EPOLL_EVENTS);
195                 /*
196                  * 'n' events have been read. Here we map them to their
197                  * corresponding channel in batch and queue n-1 so that
198                  * they can be handled by other handler threads. The last
199                  * event is handled by this thread (and so is not queued).
200                  */
201                 fdToChannelLock.readLock().lock();
202                 try {
203                     while (n-- > 0) {
204                         long eventAddress = getEvent(address, n);
205                         int fd = getDescriptor(eventAddress);
206
207                         // wakeup
208                         if (fd == sp[0]) {
209                             if (wakeupCount.decrementAndGet() == 0) {
210                                 // no more wakeups so drain pipe
211                                 drain1(sp[0]);
212                             }
213
214                             // queue special event if there are more events
215                             // to handle.
216                             if (n > 0) {
217                                 queue.offer(EXECUTE_TASK_OR_SHUTDOWN);
218                                 continue;
219                             }
220                             return EXECUTE_TASK_OR_SHUTDOWN;
221                         }
222
223                         PollableChannel channel = fdToChannel.get(fd);
224                         if (channel != null) {
225                             int events = getEvents(eventAddress);
226                             Event ev = new Event(channel, events);
227
228                             // n-1 events are queued; This thread handles
229                             // the last one except for the wakeup
230                             if (n > 0) {
231                                 queue.offer(ev);
232                             } else {
233                                 return ev;
234                             }
235                         }
236                     }
237                 } finally {
238                     fdToChannelLock.readLock().unlock();
239                 }
240             }
241         } finally {
242             // to ensure that some thread will poll when all events have
243             // been consumed
244             queue.offer(NEED_TO_POLL);
245         }
On 1/9/17, 12:16 PM, "nio-dev on behalf of Jeremiah Ness" <nio-dev-bounces at openjdk.java.net on behalf of jness at proofpoint.com> wrote:
>Using jre8u112 on CentOS 7.
>
>I have an application which uses AsynchronousChannelGroup.withFixedThreadPool
>with 5 threads. When the application is under high load creating 1000s of
>AsynchronousSocketChannels per second, simultaneously closing many
>AsynchronousSocketChannels, on occasion all of the EventHandlerTasks in
>EPollPort become stuck waiting for events on the EPollPort.queue. All the
>threads have the following stack:
>
>"CompletionThread-5" #33 daemon prio=5 os_prio=0 tid=0x00007f0a2829f800 nid=0x5c41 waiting on condition [0x00007f0a11ee1000]
> java.lang.Thread.State: WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> - parking to wait for <0x00000006c4fe8868> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
> at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
> at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
> at java.util.concurrent.ArrayBlockingQueue.take(ArrayBlockingQueue.java:403)
> at sun.nio.ch.EPollPort$EventHandlerTask.run(EPollPort.java:262)
> at sun.nio.ch.AsynchronousChannelGroupImpl$1.run(AsynchronousChannelGroupImpl.java:112)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>
>
>They all appear to be waiting for the queue to be filled (calling
>BlockingQueue.take), but no thread is calling the native method epollWait to
>get more IO events. This condition persists until the application is
>restarted.
>
>By examining the EPollPort class I have the understanding that one of the
>threads should be polling. Is this correct?
>
>By examining the EPollPort.EventHandlerTask.poll method I am wondering if there
>is a code path that would allow all threads to be waiting on the queue. In
>particular is the following possible within EPollPort.EventHandlerTask.poll:
>
>1. The native method epollWait returns 512 events.
>2. Before the fdToChannelLock.readLock is acquired the channel associated with
> the 512th event is closed.
>3. The fixed size EPollPort.queue is filled to size 511.
>4. The 512th event is processed; however, because its channel has been closed,
>   it is no longer in the fdToChannel map.
>5. The thread loops around the for(;;) loop and calls epollWait again
>6. epollWait returns 2 more events
>7. The fixed size queue is filled to its maximum capacity of 512.
>8. The finally queue.offer(NEED_TO_POLL) call fails because the queue is full.
>
>If this occurs would all the EventHandlerTasks then eventually be stuck as per
>the stack trace above?
>
>Thanks,
>Jeremiah Ness
>
>
>
>