Library enhancement proposal for copying data from/to Reader/Writer InputStream/OutputStream

Fri Nov 21 07:29:13 UTC 2014

Hi Stuart,

> On 21.11.2014 at 04:05, Stuart Marks <stuart.marks at oracle.com> wrote:
> 
> Hi Patrick,
> 
> Good to meet you in Antwerp!

The pleasure was all mine :-)

> On 11/20/14 10:30 AM, Patrick Reinhart wrote:
>>> On 20.11.2014 at 10:22, Pavel Rappo <pavel.rappo at oracle.com> wrote:
>>> 
>>> There is at least one method in the JDK with similar characteristics: java.nio.file.Files#copy(java.io.InputStream, java.io.OutputStream)
>>> But (1) it has private access, and (2) even if we made it public, I doubt java.nio.file.Files would be a good place for it.
>>> 
>> 
>> I would suggest introducing a separate IOUtil class holding such utility methods. Additionally, copyTo() and copyFrom() methods could be added later for more intuitive usage. Also, the copy method within the Files class could be replaced with a static reference to the IOUtil class.
> 
> Thanks to Pavel for pointing out the existing copy operations in the nio Files class. I think there's a case for the InputStream => OutputStream copy method to be placed there too, although I admit it is somewhat unusual in that it doesn't actually have anything to do with files.
> 
> At my first encounter with the nio.Files class some years ago I saw the following copying methods:
> 
>    copy(istream, targetfile)
>    copy(sourcefile, ostream)
>    copy(sourcefile, targetfile)
> 
> and I immediately thought, "Where is copy(istream, ostream)?" So to me at least, it makes more sense to be in the Files class than in some newly created IOUtils class. (I’d like to hear further from Alan on this.)
> 

The reason not to put those copy operations in the Files class in the first place is that this operation is not specific to files. If I want to copy some database blob stream to the servlet output stream, for example, there is no relation to a file at all.

> As Pavel pointed out, the logic is also already there in the Files class. Probably the way to proceed would be to rename the existing (private) method to be something like copyInternal() and then create a new, public copy(istream, ostream) method that does argument checking before calling copyInternal().
> 

My first idea was in fact to add some sort of copy function to InputStream, like "copyTo(OutputStream)", and to OutputStream something like "copyFrom(InputStream)". Those methods would use the "copyInternal" method on Files if it were given default access.

The same would also apply to Reader/Writer of course, which would need a similar copy method.
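
To make the idea a bit more concrete, here is a minimal sketch of what such copy methods could look like. The class name IOUtil and both method signatures are only assumptions for illustration; nothing like this exists in the JDK today, and the loop is not the actual Files implementation.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.Reader;
import java.io.Writer;
import java.util.Objects;

// Hypothetical utility class; name and methods are assumptions, not JDK API.
final class IOUtil {
    // 8192 is the buffer size mentioned in this thread for nio.Files.
    private static final int BUFFER_SIZE = 8192;

    private IOUtil() {
    }

    // Copies all bytes from source to target and returns the number of bytes copied.
    // Neither stream is closed; that stays the caller's responsibility.
    static long copy(InputStream source, OutputStream target) throws IOException {
        Objects.requireNonNull(source, "source");
        Objects.requireNonNull(target, "target");
        byte[] buffer = new byte[BUFFER_SIZE];
        long total = 0;
        int read;
        while ((read = source.read(buffer)) != -1) {
            target.write(buffer, 0, read);
            total += read;
        }
        return total;
    }

    // Same idea for character streams: returns the number of chars copied.
    static long copy(Reader source, Writer target) throws IOException {
        Objects.requireNonNull(source, "source");
        Objects.requireNonNull(target, "target");
        char[] buffer = new char[BUFFER_SIZE];
        long total = 0;
        int read;
        while ((read = source.read(buffer)) != -1) {
            target.write(buffer, 0, read);
            total += read;
        }
        return total;
    }
}
```

A caller would then write something like IOUtil.copy(blobStream, servletOutputStream), with no file involved anywhere.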

>>> P.S. The thing that confuses me though, is the progress consumer. I believe this feature is rarely needed (at least in a way you described it).
>>> If you want to do something like this, you should probably go asynchronous with full blown solution like what they have in javax.swing.SwingWorker.
>>> 
>>> -Pavel
>> 
>> I have already discussed the method taking an IntConsumer with some colleagues, and they suggested using an IntPredicate instead, so that the copy process can be stopped without having to interrupt any threads. Such a predicate could still be used for counting the total or for other purposes.
> 
> I'd suggest starting simple with a copy(istream, ostream) operation and considering some kind of interruptible, progress-reporting operation separately. It would seem quite limiting if the *only* progress-reporting operation in the system were the stream copier. We’d want a way to apply such a mechanism to other long-running operations.

Agreed. I could implement a FilterInputStream/FilterOutputStream subclass that provides such behavior, without having to put it into the API.
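
Roughly, I am thinking of something like the sketch below. CountingInputStream and getCount() are hypothetical names of my own; the only real API used is java.io.FilterInputStream. The stream just counts the bytes read, so any observer can poll the counter at its own pace, independent of the copy loop.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical wrapper; the class name and getCount() are assumptions.
final class CountingInputStream extends FilterInputStream {
    // AtomicLong so that another thread can poll the progress safely.
    private final AtomicLong count = new AtomicLong();

    CountingInputStream(InputStream in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        int b = super.read();
        if (b != -1) {
            count.incrementAndGet();
        }
        return b;
    }

    // read(byte[]) in FilterInputStream delegates to this method,
    // so overriding it covers both array variants.
    @Override
    public int read(byte[] buffer, int offset, int length) throws IOException {
        int n = super.read(buffer, offset, length);
        if (n > 0) {
            count.addAndGet(n);
        }
        return n;
    }

    // Number of bytes read so far; skipped bytes and mark/reset are ignored
    // in this sketch.
    long getCount() {
        return count.get();
    }
}
```

The copy operation itself would not know anything about progress reporting; it would simply be handed the wrapped stream.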

> I think the progress update reports also need to be decoupled from the actual I/O operations. For example, the current buffer size in nio.Files is 8192 bytes. If the streams are connected to a fast network connection, this will result in thousands of calls to the predicate per second. On the other hand, if the buffer size were raised, and the streams are connected to a slow network connection -- like my home internet connection :-) -- that might result in too few callbacks per second.

I see your point about decoupling there. So would it be more practical to let the buffer size be specified optionally?
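
What I mean is an additional overload along these lines. Again only a sketch: the class name StreamCopy, the constant and the three-argument signature are assumptions of mine, not an existing API.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.Objects;

// Hypothetical class; shows only the optional buffer size idea.
final class StreamCopy {
    // Default taken from the 8192 bytes mentioned for nio.Files in this thread.
    static final int DEFAULT_BUFFER_SIZE = 8192;

    private StreamCopy() {
    }

    // Simple variant: uses the default buffer size.
    static long copy(InputStream source, OutputStream target) throws IOException {
        return copy(source, target, DEFAULT_BUFFER_SIZE);
    }

    // Variant with a caller-chosen buffer size, e.g. smaller for slow connections.
    static long copy(InputStream source, OutputStream target, int bufferSize)
            throws IOException {
        Objects.requireNonNull(source, "source");
        Objects.requireNonNull(target, "target");
        if (bufferSize <= 0) {
            throw new IllegalArgumentException("bufferSize must be positive: " + bufferSize);
        }
        byte[] buffer = new byte[bufferSize];
        long total = 0;
        int read;
        while ((read = source.read(buffer)) != -1) {
            target.write(buffer, 0, read);
            total += read;
        }
        return total;
    }
}
```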

By the way, I don't have a really fast internet connection either ;-)

> How to report progress from a running operation is an interesting problem and it’s worthy of considering, but a copying utility doesn't seem like quite the right place for it.

I see your point there.

Patrick