SelectableChannels and Process API

Wed Apr 15 18:14:58 UTC 2015

Hi Martin,

That sounds like finalization of the FileDescriptor, but perhaps a
mechanism built on PhantomReference would be easy enough to maintain
and could leave the data in the pipe.
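
A minimal sketch of what such a PhantomReference-based mechanism might
look like. The class name PhantomCloser and its methods are hypothetical
and are not taken from the JDK or from the webrev; a real version would
track the raw FileDescriptor and would normally run the queue processing
on a dedicated thread. The cleanup simply closes the resource, with no
draining into memory.

```java
import java.io.Closeable;
import java.io.IOException;
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical helper: closes a resource once its owner is unreachable. */
final class PhantomCloser {
    private final ReferenceQueue<Object> queue = new ReferenceQueue<>();
    // Keeps the references themselves strongly reachable until processed.
    private final Set<CloseRef> pending = ConcurrentHashMap.newKeySet();

    private static final class CloseRef extends PhantomReference<Object> {
        final Closeable resource;

        CloseRef(Object owner, Closeable resource, ReferenceQueue<Object> q) {
            super(owner, q);
            this.resource = resource;
        }
    }

    /** Registers 'resource' for closing; it must not reference 'owner'. */
    Reference<Object> register(Object owner, Closeable resource) {
        CloseRef ref = new CloseRef(owner, resource, queue);
        pending.add(ref);
        return ref;
    }

    /** Closes resources whose references were enqueued; returns the count. */
    int closeUnreachable() {
        int closed = 0;
        Reference<?> r;
        while ((r = queue.poll()) != null) {
            CloseRef ref = (CloseRef) r;
            if (pending.remove(ref)) {
                try {
                    ref.resource.close();
                } catch (IOException ignored) {
                    // Nothing useful can be done during cleanup.
                }
                closed++;
            }
        }
        return closed;
    }
}
```

Unlike a finalizer, the owner object is never resurrected here, and the
cleanup runs whenever closeUnreachable() is called rather than on the
finalizer thread.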

Roger

On 4/15/2015 1:59 PM, Martin Buchholz wrote:
> I was at least partly responsible for the pipe buffer cleanup code.
>
> Subprocess terminates, but may have written some data to the pipe 
> buffer (typically 4k on Linux).  Usually the pipe buffer is empty, but 
> in case it's not, you don't want to lose the straggler data, you want 
> to drain it and close the file descriptor, because it's easier to 
> manage the memory than the fd.  Messy, but I didn't see a better way.
>
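
The drain-and-close step described above can be sketched roughly as
follows. This is an illustration under assumptions, not the actual
ProcessImpl code; DrainOnExit and drainAndClose are made-up names. It
relies on the writer having exited, so that read() reports end of stream
once the pipe buffer is empty.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

final class DrainOnExit {
    /**
     * Copies whatever is left in 'pipe' into memory, closes 'pipe', and
     * returns a stream over the copied bytes. Only call this after the
     * writing process has exited, otherwise read() may block.
     */
    static InputStream drainAndClose(InputStream pipe) throws IOException {
        ByteArrayOutputStream stragglers = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        int n;
        while ((n = pipe.read(chunk)) != -1) {
            stragglers.write(chunk, 0, n);
        }
        pipe.close(); // the file descriptor is released here
        return new ByteArrayInputStream(stragglers.toByteArray());
    }
}
```

The caller keeps reading from the returned stream as if nothing had
happened, while the descriptor is already gone.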
> On Tue, Apr 14, 2015 at 11:31 PM, Peter Levart <peter.levart at gmail.com> wrote:
>
>     Hi Roger,
>
>     So I started a new thread...
>
>
>     On 04/14/2015 11:33 PM, Roger Riggs wrote:
>
>
>         On 4/14/2015 11:47 AM, Peter Levart wrote:
>
>             I have been thinking of another small Process API update.
>             Some people find it odd how redirected in/out/err streams
>             are exposed:
>
>             http://blog.headius.com/2013/06/the-pain-of-broken-subprocess.html
>
>         yep, I've read that several times.
>
>
>     To be fair, it's mostly, but not entirely correct. The part that says:
>
>     " So when the child process exits, the any data waiting to be read
>     from its output stream is drained into a buffer. All of it. In memory.
>
>     Did you launch a process that writes a gigabyte of data to its
>     output stream and then terminates? Well, friend, I sure hope you
>     have a gigabyte of memory, because the JDK is going to read that
>     sucker in and there's nothing you can do about it. And let's hope
>     there's not more than 2GB of data, since this code basically just
>     grows a byte[], which in Java can only grow to 2GB. If there's
>     more than 2GB of data on that stream, this logic errors out and
>     the data is lost forever."
>
>     ...is an exaggeration. This does not happen because the pipe has a
>     bounded buffer. When the subprocess exits, there is at most that much
>     data left in the buffer (64k typically); only that much is read
>     into the Java process before the underlying handle is closed.
>
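
The bounded buffer can be observed with a small probe. This sketch uses
java.nio.channels.Pipe as a stand-in for a subprocess pipe (an
assumption; the two need not behave identically on every platform) and
writes without blocking until the OS refuses more data. PipeCapacityProbe
is a made-up name and the result depends on the platform.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;

final class PipeCapacityProbe {
    /** Returns how many bytes a fresh pipe accepts before it is full. */
    static long probe() throws IOException {
        Pipe pipe = Pipe.open();
        try (Pipe.SinkChannel sink = pipe.sink();
             Pipe.SourceChannel source = pipe.source()) {
            sink.configureBlocking(false);
            ByteBuffer chunk = ByteBuffer.allocate(1024);
            long total = 0;
            final long safetyCap = 64L * 1024 * 1024; // give up at 64 MiB
            while (total < safetyCap) {
                chunk.clear(); // position 0, limit 1024: a full chunk of zeros
                int n = sink.write(chunk);
                if (n == 0) {
                    break; // buffer full; a blocking writer would stall here
                }
                total += n;
            }
            return total;
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("pipe accepted " + probe() + " bytes");
    }
}
```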
>
>             They basically don't like:
>
>             - that exposed Input/Output streams are buffered
>             - that underlying streams are File(Input/Output)Streams
>             which, although the backing OS objects are pipes rather
>             than files, don't expose selectable channels, so
>             non-blocking event-based IO cannot be performed on them.
>             - that exposed IO streams are automatically "managed" in
>             UNIX variants of ProcessImpl, which needs subtle "hacks" to
>             do it in a seemingly transparent way (delayed close,
>             draining input on exit and making it available after the
>             underlying handle is already closed, ...)
>
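
For readers unfamiliar with what a selectable channel buys you, here is
a minimal sketch of event-based reading with a Selector. It uses a plain
java.nio.channels.Pipe, not the channels from the prototype, and
SelectablePipeDemo is a made-up name. The message must fit into the pipe
buffer because the write side is left in blocking mode.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.charset.StandardCharsets;

final class SelectablePipeDemo {
    /** Writes msg into a pipe and reads it back driven by a Selector. */
    static String roundTrip(String msg) throws IOException {
        Pipe pipe = Pipe.open();
        ByteArrayOutputStream received = new ByteArrayOutputStream();
        try (Selector selector = Selector.open();
             Pipe.SinkChannel sink = pipe.sink();
             Pipe.SourceChannel source = pipe.source()) {
            source.configureBlocking(false);
            source.register(selector, SelectionKey.OP_READ);

            sink.write(ByteBuffer.wrap(msg.getBytes(StandardCharsets.UTF_8)));
            sink.close(); // signals end of stream to the reader

            ByteBuffer buf = ByteBuffer.allocate(1024);
            boolean eof = false;
            // Wait (up to 5 s per round) until the source is readable.
            while (!eof && selector.select(5000) > 0) {
                selector.selectedKeys().clear();
                int n;
                while ((n = source.read(buf)) > 0) {
                    received.write(buf.array(), 0, buf.position());
                    buf.clear();
                }
                eof = (n == -1);
            }
        }
        return new String(received.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        System.out.println(roundTrip("hello from the pipe"));
    }
}
```

A single thread could register many such channels with one Selector,
which is not possible with the FileInputStream-based streams.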
>             So I've been playing with the idea of exposing the "real"
>             pipe channels in the last couple of days. Here's the prototype
>             I came up with:
>
>             http://cr.openjdk.java.net/~plevart/jdk9-sandbox/JDK-8046092-branch/Process.PipeChannel/webrev.01/
>
>
>             This adds a new Redirect type to the API and 3 new methods
>             to Process that return Pipe channels when this new
>             Redirect type is used. It's interesting that no native
>             code changes were necessary. The behavior of pipes on
>             Windows is a little different (perhaps because the Pipe
>             NIO API uses sockets under the hood on Windows - why is
>             that? Windows does have a pipe equivalent). What bothers
>             me is that file handles opened on files (when redirecting
>             to/from File) can be closed as soon as the subprocess is
>             started and the subprocess is still able to read/write
>             from the files (like with UNIX). It's not the same with
>             pipe (i.e. socket) handles on Windows. They must be closed
>             only after subprocess exits.
>
>             If this subtle difference between file handles and socket
>             handles on Windows could be dealt with (perhaps some
>             options exist that affect subprocess spawning), then the
>             extra waiting thread would not be needed on Windows.
>
>             So what do you think of this API update?
>
>         Definitely worthy of a separate thread.  It looks promising
>         and addresses some of the issues
>         raised, while moving other problems from the implementation to
>         the application,
>         such as closing of the channels and cleanup.  I worry about
>         how the resources are freed
>         if the code spawning the app doesn't do the cleanup.  Will it
>         require hooks (like a finalizer)
>         to do the cleanup?
>         Also, it doesn't help with Martin's goal of being able to
>         implement
>         emacs in Java since it doesn't provide pty control.
>         As you are aware, the complexity in Process is there to ensure
>         timely cleanup,
>         allowing the Process to terminate and release the process
>         resources
>         when it is done without having to wait for the stdout/stderr
>         consumer.
>
>
>     I wonder how this automatic stream cleanup really helps in
>     real-world programs. It doesn't help the Process to terminate and
>     release the process resources any sooner as the process terminates
>     on its own (unless killed) and the OS releases its resources without
>     outside help anyway. Draining and closing the stream after the
>     process has already exited just releases one file handle (the
>     consuming side of the pipe) in a prompt manner. This could be
>     left to the user and/or finalizer. Draining after the process has
>     already exited does not help the process to exit any sooner as it
>     happens after the fact. A program that doesn't consume the stream
>     can cause the process to hang forever as the pipe's buffer is
>     bounded (64k typically). So draining and closing after the process
>     has exited only potentially helps for the last 64k of the stream
>     and only to release one file handle in a potentially more timely
>     manner.
>
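
One common way for the parent to avoid the hang mentioned above is to
consume the output concurrently instead of after waitFor(). A minimal
sketch; OutputPump is a made-up helper name, not a JDK class.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

final class OutputPump {
    /** Starts a daemon thread copying 'in' to 'out' until end of stream. */
    static Thread start(InputStream in, OutputStream out) {
        Thread pump = new Thread(() -> {
            byte[] buf = new byte[8192];
            int n;
            try {
                while ((n = in.read(buf)) != -1) {
                    out.write(buf, 0, n);
                }
            } catch (IOException e) {
                // Stream closed or broken; stop pumping.
            }
        }, "process-output-pump");
        pump.setDaemon(true);
        pump.start();
        return pump;
    }
}
```

Typical use would be to call OutputPump.start(process.getInputStream(),
sink) right after ProcessBuilder.start(), then process.waitFor(), then
join() the returned thread, so the child never stalls on a full pipe.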
>     OTOH, now that ProcessImpl for UNIX does that (and why does the Windows
>     implementation not do that?), sloppy programs might exist that
>     would potentially break if the status quo is not maintained.
>
>     But new functionality need not be so permissive. I'll take a look
>     at how and if Channel(s) do any kind of automatic cleanup based on
>     reachability and whether this can be bolted on for Process use. I
>     doubt it is possible to drain and close a Channel without
>     disturbing the ongoing Selector IO processing...
>
>     Regards, Peter
>
>
>         Thanks, Roger