all EventHandlerTasks in EPollPort waiting on queue

Jeremiah Ness jness at proofpoint.com
Tue Jun 19 12:35:35 UTC 2018


Hi Brian and Alan,

Thanks for considering my report.

Is there anything I could do to help triage or prioritize an investigation?

A couple of thoughts on my mind:

  *   Is the issue understood?
  *   Would it be useful to summarize?
  *   A few of my colleagues are able to reproduce the problem with the provided test program. Would seeing the log output or stack traces be helpful?

Thanks,
Miah

From: Jeremiah Ness <jness at proofpoint.com>
Date: Thursday, June 14, 2018 at 10:56 AM
To: Brian Burkhalter <brian.burkhalter at oracle.com>
Cc: Alan Bateman <Alan.Bateman at oracle.com>, "nio-dev at openjdk.java.net" <nio-dev at openjdk.java.net>
Subject: Re: all EventHandlerTasks in EPollPort waiting on queue

Thank you for filing the ticket Brian.

Recently we have seen an increase in occurrences of this bug: roughly 2-3 times per week for some of our applications/roles/servers. I have attempted to make the test program a little simpler and have removed the reflection hacks. I have attached the new version here. I am able to reproduce this issue on the following platforms:


  *   Linux - java version "1.8.0_172" - build 1.8.0_172-b11
  *   Linux - java version "9.0.4" - build 9.0.4+11
  *   Linux - java version "10" 2018-03-20 - build 10+46
  *   Mac - java version "1.8.0_131" - build 1.8.0_131-b11

Additionally, I have taken a closer look at EPollPort.java and feel that the following change may address this issue:

--- a/src/java.base/linux/classes/sun/nio/ch/EPollPort.java
+++ b/src/java.base/linux/classes/sun/nio/ch/EPollPort.java
@@ -85,6 +85,7 @@ final class EPollPort
     private final ArrayBlockingQueue<Event> queue;
     private final Event NEED_TO_POLL = new Event(null, 0);
     private final Event EXECUTE_TASK_OR_SHUTDOWN = new Event(null, 0);
+    private final Event NOOP = new Event(null, 0);

     EPollPort(AsynchronousChannelProvider provider, ThreadPool pool)
         throws IOException
@@ -194,7 +195,6 @@ final class EPollPort
     private class EventHandlerTask implements Runnable {
         private Event poll() throws IOException {
             try {
-                for (;;) {
                     int n;
                     do {
                         n = EPoll.wait(epfd, address, MAX_EPOLL_EVENTS, -1);
@@ -250,7 +250,8 @@ final class EPollPort
                     } finally {
                         fdToChannelLock.readLock().unlock();
                     }
-                }
+                    // There is no real event to return. So return NOOP.
+                    return NOOP;
             } finally {
                 // to ensure that some thread will poll when all events have
                 // been consumed
@@ -288,6 +289,11 @@ final class EPollPort
                         continue;
                     }

+                    // there is nothing to do
+                    if (ev == NOOP) {
+                        continue;
+                    }
+
                     // handle wakeup to execute task or shutdown
                     if (ev == EXECUTE_TASK_OR_SHUTDOWN) {
                         Runnable task = pollTask();

I believe removing the for(;;) loop is the key change: with that loop in place, it is possible to overflow the queue and lose I/O events. What do you think?
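
To make the overflow concern concrete, here is a minimal, hypothetical sketch. It is not taken from EPollPort.java, and the class and method names are made up for illustration. It assumes events are handed to a bounded ArrayBlockingQueue with a non-blocking offer(); in that case, an event offered to a full queue is rejected, and it is lost unless the caller checks the return value.

```java
import java.util.concurrent.ArrayBlockingQueue;

// Hypothetical illustration only; not the actual EPollPort code.
public class BoundedQueueDemo {

    // Offers each event without blocking and returns how many were
    // rejected because the queue was already full.
    public static int offerAll(ArrayBlockingQueue<String> queue, String... events) {
        int dropped = 0;
        for (String ev : events) {
            // offer() returns false instead of blocking when there is no capacity
            if (!queue.offer(ev)) {
                dropped++;
            }
        }
        return dropped;
    }

    public static void main(String[] args) {
        // Capacity of 2 is an arbitrary value chosen for the demo
        ArrayBlockingQueue<String> queue = new ArrayBlockingQueue<>(2);
        int dropped = offerAll(queue, "ev1", "ev2", "ev3");
        System.out.println("queued=" + queue.size() + " dropped=" + dropped);
    }
}
```

With a capacity of 2 and three offered events, the third offer fails and nothing reports the loss unless the return value is inspected.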

Thanks,
Miah


From: Brian Burkhalter <brian.burkhalter at oracle.com>
Date: Thursday, January 12, 2017 at 8:06 PM
To: Jeremiah Ness <jness at proofpoint.com>
Cc: Alan Bateman <Alan.Bateman at oracle.com>, "nio-dev at openjdk.java.net" <nio-dev at openjdk.java.net>
Subject: Re: all EventHandlerTasks in EPollPort waiting on queue

On Jan 12, 2017, at 8:03 AM, Jeremiah Ness <jness at proofpoint.com> wrote:



There does appear to be an issue here. Can you create a standalone test
case to tickle this so that we can include it in a bug report? That
would really help get to a regression test to include with the fix.

I do have a test program that can trigger the condition for me. I have attached the source code for PortTest.java. PortTest has successfully demonstrated the issue in the following configurations:

- OSX 10.11.6 with java version "1.8.0_112"
- CentOS 7 with kernel 3.10.0-514.2.2.el7.x86_64 and with openjdk version "1.8.0_111"

I have filed https://bugs.openjdk.java.net/browse/JDK-8172750 to track this problem. Thank you for providing a test case.

Thanks,

Brian