Thread hangs reading from process output streams, even though process has terminated. (possible JDK bug?)
I have some code where I start an external process (ProcessBuilder.start(), etc.) and then spawn two worker threads to read the stdout and stderr of the external process. I read the streams provided by process.getInputStream() and process.getErrorStream() directly; I'm not wrapping them with my own streams or anything. The worker threads simply call java.io.InputStream.read(byte[]) in a loop.

I've encountered a situation where the worker threads hang even though the process has already terminated! (Java HotSpot(TM) 64-Bit Server VM (build 24.51-b03, mixed mode), Windows 7)

I was able to catch this while running the Java program under the debugger. I invoked process.exitValue() under the debugger to see whether the JVM had indeed realized the process had terminated. It returned 0, so it seems it knows the process has terminated. Yet the streams are still blocked in a native method.

The stdout worker thread is stuck here:

Daemon Thread [ExternalProcessEclipseHelper.MainWorker] (Suspended)
    owns: BufferedInputStream (id=145)
    FileInputStream.readBytes(byte[], int, int) line: not available [native method]
    FileInputStream.read(byte[], int, int) line: 272
    BufferedInputStream.fill() line: 235
    BufferedInputStream.read1(byte[], int, int) line: 275
    BufferedInputStream.read(byte[], int, int) line: 334
    BufferedInputStream(FilterInputStream).read(byte[]) line: 107
    ExternalProcessNotifyingHelper$1(ExternalProcessHelper$ReadAllBytesTask).doRun() line: 73

The stderr worker thread is similarly stuck:

Daemon Thread [ExternalProcessEclipseHelper.StdErrWorker] (Suspended)
    FileInputStream.readBytes(byte[], int, int) line: not available [native method]
    FileInputStream.read(byte[]) line: 243
    ExternalProcessNotifyingHelper$2(ExternalProcessHelper$ReadAllBytesTask).doRun() line: 73

Could this be a JVM bug? I don't see how this scenario should ever happen, unless some other part of my code committed some violation and messed up the JVM state.
I've added a sample of the relevant code I'm using here: https://github.com/bruno-medeiros/Scratchpad/tree/jvm-processio-issue

However, I haven't yet been able to replicate this bug using the isolated code from there. At the moment, I can only replicate it when I run my full application. The sample code could be simplified further, but I haven't done that yet since I couldn't replicate the bug with it.

One interesting bit is that I can only replicate it when I run the application for the first time per computer session. That is, apparently I need to reboot my computer for the bug to manifest again!

I'd like to narrow this down, but I would appreciate some help or suggestions for that. What could affect the JVM such that subsequent invocations apparently don't cause the bug? Some code cache issue? I also wonder if the OSGi runtime could be a factor here.

-- Bruno Medeiros https://twitter.com/brunodomedeiros
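For readers without access to the linked repository, the setup described in this message can be sketched roughly as follows. This is a minimal illustration, not the poster's code: the class name StreamDrainer and its methods are hypothetical, and the sketch assumes Java 8 or later for the lambda syntax. It shows the essential point that each worker only finishes when read(byte[]) returns -1, so a stream that never reports EOF leaves the worker blocked.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper for illustration; not the code from the linked repository.
final class StreamDrainer {
    private StreamDrainer() {}

    /** Reads the stream until read(byte[]) returns -1 (EOF) and returns everything read. */
    static byte[] drain(InputStream in) throws IOException {
        ByteArrayOutputStream collected = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        int count;
        // read(byte[]) blocks until data arrives, EOF is reached, or an error occurs.
        while ((count = in.read(buffer)) != -1) {
            collected.write(buffer, 0, count);
        }
        return collected.toByteArray();
    }

    /** Starts a daemon thread that drains the stream; the data is discarded for brevity. */
    static Thread startWorker(String name, InputStream in) {
        Thread worker = new Thread(() -> {
            try {
                drain(in);
            } catch (IOException e) {
                // A closed or broken stream simply ends this worker.
            }
        }, name);
        worker.setDaemon(true);
        worker.start();
        return worker;
    }
}
```

With a Process obtained from ProcessBuilder.start(), one worker would be started on process.getInputStream() and another on process.getErrorStream(). The hang reported here corresponds to drain() never returning even after the process has exited.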
After exploring this bug when running my full application, I have a lead on what seems to be a necessary condition/cause for it, and possibly a way to create a short reproducible case. The isolated code I posted originally is not enough.

Here is what I found out. First, let's call the process my Java application starts, the one that terminates and yet whose stream reader threads hang, process A. A necessary condition for the bug to happen is that *another process* is started by the Java application, and similarly some worker threads are spawned to read the streams of that process. Let's call this process B. Process B does not terminate because it is a server program. Here's an interesting bit: if process B is forcibly killed, the reader threads of process A become unstuck! (To be clear, the processes are not related. They are not even the same program.) I should be able to reduce this to a short reproducible example as soon as I have more time.

Also, I tried JDK 8, but was not able to reproduce the issue. But given the fickle nature of this bug, that's no guarantee the bug is not present in JDK 8. So I still want to find the cause of this and see it resolved.

-- Bruno Medeiros https://twitter.com/brunodomedeiros
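A hang like the one described above can also be detected programmatically rather than in a debugger. The following is a minimal sketch under stated assumptions: ReaderWatchdog and reachesEofWithin are hypothetical names, not part of the JDK or of the poster's code, and Java 8 or later is assumed. The idea is to call it on a process stream after process.waitFor() has returned and treat a false result as "the process exited but EOF never arrived".

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical diagnostic helper for illustration only.
final class ReaderWatchdog {
    private ReaderWatchdog() {}

    /**
     * Reads and discards the stream on a daemon thread and reports whether
     * EOF was observed within the given time.
     */
    static boolean reachesEofWithin(final InputStream in, long timeout, TimeUnit unit)
            throws InterruptedException {
        final CountDownLatch eofSeen = new CountDownLatch(1);
        Thread reader = new Thread(() -> {
            byte[] buffer = new byte[4096];
            try {
                while (in.read(buffer) != -1) {
                    // Discard the data; only the arrival of EOF matters here.
                }
                eofSeen.countDown();
            } catch (IOException e) {
                // A read error is not EOF, so the latch is left untouched.
            }
        }, "eof-watchdog-reader");
        // Daemon thread: if the read never returns, it will not keep the JVM alive.
        reader.setDaemon(true);
        reader.start();
        return eofSeen.await(timeout, unit);
    }
}
```

Note that a false result leaves the reader thread blocked in read(); the sketch only reports the condition and does not try to recover from it.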
Very high level: Ensuring that streams get EOF when subprocesses terminate is very tricky, and we did a bunch of work on Linux and Solaris to try to make it reliable. And even now we're not quite there: if grandchildren linger, they might keep file descriptors open. I'm not aware of similar problems on Windows, but it's not at all surprising if the same kinds of effects are seen.
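The remark about lingering holders of a file descriptor rests on a general property of pipes: a reader sees EOF only once the write end has been closed, not merely when the writer stops producing data. The sketch below is an in-process analogy using java.io.PipedInputStream and PipedOutputStream, with hypothetical names (PipeEofModel, run, finishesWithin). It is not a reproduction of the Windows behaviour discussed in this thread; with real OS pipes, every process holding the write end has to close it before EOF is delivered.

```java
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical in-process model; it does not spawn any external process.
final class PipeEofModel {
    private PipeEofModel() {}

    /** Returns { EOF seen while the write end is still open, EOF seen after it is closed }. */
    static boolean[] run() throws Exception {
        PipedOutputStream writeEnd = new PipedOutputStream();
        PipedInputStream readEnd = new PipedInputStream(writeEnd);
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            // The reader counts bytes until read(byte[]) returns -1.
            Future<Integer> total = executor.submit(() -> {
                byte[] buffer = new byte[256];
                int count;
                int sum = 0;
                while ((count = readEnd.read(buffer)) != -1) {
                    sum += count;
                }
                return sum;
            });
            // The writer has nothing more to say, but the write end stays open.
            writeEnd.write("done".getBytes(StandardCharsets.US_ASCII));
            writeEnd.flush();
            boolean eofBeforeClose = finishesWithin(total, 300);
            // Closing the write end is what lets the reader see EOF.
            writeEnd.close();
            boolean eofAfterClose = finishesWithin(total, 5000);
            return new boolean[] { eofBeforeClose, eofAfterClose };
        } finally {
            executor.shutdownNow();
        }
    }

    /** Waits for the future without cancelling it on timeout. */
    private static boolean finishesWithin(Future<?> future, long millis) throws Exception {
        try {
            future.get(millis, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            return false;
        }
    }
}
```

In this model the reader is expected to stay blocked while the write end is open and to finish once it is closed, which mirrors the observation that the readers of process A became unstuck only when process B was killed.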
After much chopping, I have finally narrowed it down to a single source file with no external dependencies (other than starting Windows' cmd.exe process): https://github.com/bruno-medeiros/Scratchpad/blob/jvm-processio-issue/jvm_pr...

In this final form, I am now able to replicate the bug on my machine on many different runs of the program, even beyond the first one after booting. But it doesn't occur every time, only about two thirds of the time. I tried running it again with JVM 8 several times but never got it to reproduce there. Maybe it really isn't present in JVM 8, only 7. Hopefully this should be enough for JVM developers to replicate it.

-- Bruno Medeiros https://twitter.com/brunodomedeiros
Any update on this? Has it been placed in the JVM bug database, and/or have people tried to replicate it? (Again, note that it only appears to happen on JVM 7.)

-- Bruno Medeiros https://twitter.com/brunodomedeiros
Hi Bruno,

Created an issue: JDK-8044321 <https://bugs.openjdk.java.net/browse/JDK-8044321> EOF does not occur reading input from spawned cmd.exe process

I was able to reproduce it on Java 7 but not Java 8.

The implementation of Process does not mix handling of one Process instance with another, so it may be an interaction specific to cmd.exe.

Roger
On Thu, May 29, 2014 at 2:32 PM, roger riggs <roger.riggs@oracle.com> wrote:

> Hi Bruno,
>
> Created an issue: JDK-8044321 <https://bugs.openjdk.java.net/browse/JDK-8044321> EOF does not occur reading input from spawned cmd.exe process
>
> I was able to reproduce it on Java 7 but not Java 8.

Cool, thanks.

> The implementation of Process does not mix handling of one Process instance with another, so it may be an interaction specific to cmd.exe.
>
> Roger

I don't think so: this bug originally occurred with processes other than cmd.exe (the two processes were not even from the same executable, BTW). I only changed it to cmd.exe for the sake of having a simpler example that anyone could try, since cmd.exe is included with Windows.

-- Bruno Medeiros https://twitter.com/brunodomedeiros
Thanks, feel free to add any additional information to the issue.

Roger
On 29/05/2014 14:32, roger riggs wrote:

> Hi Bruno,
>
> Created an issue: JDK-8044321 <https://bugs.openjdk.java.net/browse/JDK-8044321> EOF does not occur reading input from spawned cmd.exe process
>
> I was able to reproduce it on Java 7 but not Java 8.

Bruno - I just had a brief chat with Roger about this one, as I suspect it's a dup of JDK-7147084, fixed in 7u60. It would be good if you could check that update.

-Alan.
Indeed, I gave it a try with 7u60 and couldn't replicate it. Seems like it was a dup of that one.

-- Bruno Medeiros https://twitter.com/brunodomedeiros
participants (4)
- Alan Bateman
- Bruno Medeiros
- Martin Buchholz
- roger riggs