RFR (JDK11/NIO) 8201276: (fs) Add methods to Files for reading/writing a string from/to a file

Joe Wang huizhe.wang at oracle.com
Tue Jun 12 16:52:09 UTC 2018

Hi all,

It's been a while since the 1st round of reviews. Thank you all for the 
valuable comments and suggestions!  Note that Roger's last response went 
to nio-dev, but not core-libs-dev, I've therefore attached it below.

CSR [2]: the CSR is now approved. Note the write method has been changed 
to writeString.

Impl [3]: for performance reason and the different requirement from 
byte-string conversion for malformed/unmappable character handing, the 
implementation uses a specific method separate from the existing ones. 
Both Sherman and Alan preferred specific method than adding parameters 
to the APIs that might be error prone. Thanks Sherman for the code!

JMH benchmark tests showed the new read method performs better than an 
operation that reads the bytes and convert to string. The gains start to 
be significant even at quite reasonable sizes (< 10K).

[1] JBS: https://bugs.openjdk.java.net/browse/JDK-8201276
[2] CSR: https://bugs.openjdk.java.net/browse/JDK-8202055
[3] webrevs: http://cr.openjdk.java.net/~joehw/jdk11/8201276/webrev/

Please review.


On 5/15/18, 7:51 AM, Roger Riggs wrote:
> Hi Joe, et. al.
> My $.02 on line separators:
>  - We should avoid clever tricks trying to solve problems that are 
> infrequent such as cross OS file line operations.
>     Most files will be read/written on a single system so the line 
> separators will be as expected.
>  - Have/use APIs that split lines consistently accepting both line 
> separators so developer code can be agnostic to line separators.
>     aka BufferedReader.readLine for developers that are processing the 
> contents *as lines*.
>    Those other methods already exist; if there are any gaps in line 
> oriented processing that's a separate task.
>  - These new file methods are defined to handle Charset 
> encoding/decoding and buffering.
>     Since there are other methods to deal with files as lines these 
> methods should not look for or break to lines.
>  - Performance: adding code to look for line characters will slow it 
> down and in most cases would have no impact
>     since the line endings will match the system.
>  - The strings passed to writeString (CharSequence) should have been 
> constructed using the proper line separators.
>     There are any number of methods that insert the os specific line 
> terminator and its easy to write File.separator as needed.
> As for the method naming:
> I'd prefer "writeString" as a familiar term that isn't stretched too 
> far by making the argument be a CharSequence.
> its fine to call a CharSequence a string (with a lower case s).
> $.02, Roger
> On 4/27/18 6:33 PM, Joe Wang wrote:
>> On 4/27/2018 12:50 PM, forax at univ-mlv.fr wrote:
>>> ----- Mail original -----
>>>> De: "Joe Wang" <huizhe.wang at oracle.com>
>>>> À: "Remi Forax" <forax at univ-mlv.fr>, "Alan Bateman" 
>>>> <Alan.Bateman at oracle.com>
>>>> Cc: "nio-dev" <nio-dev at openjdk.java.net>, "core-libs-dev" 
>>>> <core-libs-dev at openjdk.java.net>
>>>> Envoyé: Vendredi 27 Avril 2018 21:21:01
>>>> Objet: Re: RFR (JDK11/NIO) 8201276: (fs) Add methods to Files for 
>>>> reading/writing a string from/to a file
>>>> Hi Rémi, Alan,
>>> Hi Joe,
>>>> I'm not sure we'd want to replace system dependent line separators 
>>>> with
>>>> '\n'. There are cases as you described where the replacement makes
>>>> sense. However,  there are other cases too. For example, the 
>>>> purpose is
>>>> to read, edit a text file and then write it back. If this is on 
>>>> Windows,
>>>> and if line separators are replaced with '\n', that would cause the
>>>> resulting file not formatted properly in for example Notepad. There 
>>>> are
>>>> cases where users write text files on Windows using Java, and only 
>>>> found
>>>> the lines were not separated in Notepad.
>>> I agree that why the counterpart of readString() that write the 
>>> string should inserts the platform's line separator.
>>> BTW, i just found that those methods are not named writeString, or 
>>> writeCharSequence, i think they should.
>> While readString() does not modify the original content (e.g. by 
>> replacing the platform's line separator with '\n'), write(String) 
>> won't either, by adding extra characters such as the line separator.
>> I would think interfaces shall read along with the parameters.
>>     readString(Path) == read as a String from the specified Path (one 
>> could argue for readToString, readAsString, but we generally omit the 
>> preps)
>>     write(Path, CharSequence) == write the CharSequence to the file, 
>> since CharSequence is already in the method signature as a parameter, 
>> we probably don't want to add that to the name, otherwise it would 
>> read like repeating the word CharSequence.
>> It is in a similar situation as write(Path, Iterable) where it was 
>> defined as writeLines(Path, Iterable).
>>>> Files.write(Path, Iterable) is also specified to terminate each line
>>>> with the platform's line separator. If readString does the 
>>>> replacement,
>>>> it would be inconsistent.
>>> Anyway, if we look for consistency the methods writeCharSequence 
>>> should transform the \n in the CharSequence to the platform's line 
>>> separator.
>>> Files.write(Path, Iterable) is is not a counterpart of readString(), 
>>> it's consistent with Files.lines() or Files.readLines() (or 
>>> BufferedReader.readLine()) that all suppress the line separators. 
>>> Anyway, Files.write(path, readString(path)::line) will be consistent 
>>> if you replace the line separators or not because String.line() 
>>> suppresses the line separators.
>> readString pairs with write(String), therefore it's more like 
>> Files.write(path, readString(path)) than readString(path)::line. The 
>> use case:
>>     String s = Files.read(path);
>>     Files.write(path, s.replace("/config/location", "/new/location"));
>> would then work as expected.
>> These two methods are one-off (open/read/write/close file) operation. 
>> write(String) therefore is not intended for adding/appending multiple 
>> Strings from other operations such as String.line(). If an app needs 
>> to put the result of String.line() or any other processes into a file 
>> using this method, it needs to take care of adding the line separator 
>> itself when necessary. "when necessary" would be a judgement call by 
>> the developer.
>> That said, if there's a consensus on the idea of terminating the 
>> string with a line separator, then readString method would need to 
>> strip it off to go with the write method.
>>>> I would think readString behaves like readAllBytes, that would 
>>>> simply read all content faithfully.
>>> readString does an interpolation due to the Charset so it's not the 
>>> real content, again, the idea is that developers should not have to 
>>> care about the platform's line separators (they have more important 
>>> things to think).
>>> The other solution is to just not introduce those new methods, after 
>>> all Files.lines().collect(Collectors.joining("\n")) already does the 
>>> job, no ?
>> While there are many ways to do it, none is as straight-forward. 
>> "Read (entire) file to String"/"Write String to file" are popular 
>> requests from users. Read to string -> do some String manipulation -> 
>> write it back is such a simple use case, it really needs to be very 
>> easy to do as illustrated in the above code snippet.
>> Cheers,
>> Joe
>>>> Best,
>>>> Joe
>>> regards,
>>> Rémi
>>>> On 4/27/2018 4:43 AM, forax at univ-mlv.fr wrote:
>>>>> Hi Alan,
>>>>> People do not read the documentation.
>>>>> So adding something in the documentation about when a method 
>>>>> should be used or
>>>>> not is never a solution.
>>>>> Here the user want a String and provides a charset so you have no 
>>>>> way but to
>>>>> decode the content to substitute the line separator.
>>>>> cheers,
>>>>> Rémi
>>>>> ----- Mail original -----
>>>>>> De: "Alan Bateman" <Alan.Bateman at oracle.com>
>>>>>> À: "Remi Forax" <forax at univ-mlv.fr>, "Joe Wang" 
>>>>>> <huizhe.wang at oracle.com>
>>>>>> Cc: "nio-dev" <nio-dev at openjdk.java.net>, "core-libs-dev"
>>>>>> <core-libs-dev at openjdk.java.net>
>>>>>> Envoyé: Vendredi 27 Avril 2018 13:34:12
>>>>>> Objet: Re: RFR (JDK11/NIO) 8201276: (fs) Add methods to Files for
>>>>>> reading/writing a string from/to a file
>>>>>> On 27/04/2018 12:29, Remi Forax wrote:
>>>>>>> I think that having a readString that includes OS dependent line 
>>>>>>> separators is a
>>>>>>> mistake,
>>>>>>> Java does a great job to try to shield the developer from the 
>>>>>>> kind of things
>>>>>>> that makes programs behave differently on different platforms.
>>>>>>> readString should subtitute (\r)?\n to \n otherwise either 
>>>>>>> people will do a call
>>>>>>> replace() which is less efficient or will learn the lesson the 
>>>>>>> hard way.
>>>>>>> raw string literal does the same substitution for the same reason.
>>>>>> Yes, there are several discussion points around this and somewhat 
>>>>>> timely
>>>>>> with multi-string support.
>>>>>> One thing that I think Joe will need in this API is some note to 
>>>>>> make it
>>>>>> clearer what the intended usage is (as I think the intention is 
>>>>>> simple
>>>>>> cases with mostly single lines of text).
>>>>>> -Alan.

More information about the core-libs-dev mailing list