RFR (JDK11/NIO) 8201276: (fs) Add methods to Files for reading/writing a string from/to a file

Roger Riggs Roger.Riggs at Oracle.com
Tue Jun 12 17:37:56 UTC 2018


Hi Joe,

Looks good to me.

Typo: StringCoding.java:1026 "unmappble"  (no new webrev needed)

Regards, Roger


On 6/12/2018 12:52 PM, Joe Wang wrote:
> Hi all,
>
> It's been a while since the 1st round of reviews. Thank you all for 
> the valuable comments and suggestions!  Note that Roger's last 
> response went to nio-dev, but not core-libs-dev, I've therefore 
> attached it below.
>
> CSR [2]: the CSR is now approved. Note the write method has been 
> changed to writeString.
>
> Impl [3]: for performance reason and the different requirement from 
> byte-string conversion for malformed/unmappable character handing, the 
> implementation uses a specific method separate from the existing ones. 
> Both Sherman and Alan preferred specific method than adding parameters 
> to the APIs that might be error prone. Thanks Sherman for the code!
>
> JMH benchmark tests showed the new read method performs better than an 
> operation that reads the bytes and convert to string. The gains start 
> to be significant even at quite reasonable sizes (< 10K).
>
>
> [1] JBS: https://bugs.openjdk.java.net/browse/JDK-8201276
> [2] CSR: https://bugs.openjdk.java.net/browse/JDK-8202055
> [3] webrevs: http://cr.openjdk.java.net/~joehw/jdk11/8201276/webrev/
>
> Please review.
>
> Thanks,
> Joe
>
> On 5/15/18, 7:51 AM, Roger Riggs wrote:
>> Hi Joe, et. al.
>>
>> My $.02 on line separators:
>>
>>  - We should avoid clever tricks trying to solve problems that are 
>> infrequent such as cross OS file line operations.
>>     Most files will be read/written on a single system so the line 
>> separators will be as expected.
>>
>>  - Have/use APIs that split lines consistently accepting both line 
>> separators so developer code can be agnostic to line separators.
>>     aka BufferedReader.readLine for developers that are processing 
>> the contents *as lines*.
>>    Those other methods already exist; if there are any gaps in line 
>> oriented processing that's a separate task.
>>
>>  - These new file methods are defined to handle Charset 
>> encoding/decoding and buffering.
>>     Since there are other methods to deal with files as lines these 
>> methods should not look for or break to lines.
>>
>>  - Performance: adding code to look for line characters will slow it 
>> down and in most cases would have no impact
>>     since the line endings will match the system.
>>
>>  - The strings passed to writeString (CharSequence) should have been 
>> constructed using the proper line separators.
>>     There are any number of methods that insert the os specific line 
>> terminator and its easy to write File.separator as needed.
>>
>> As for the method naming:
>>
>> I'd prefer "writeString" as a familiar term that isn't stretched too 
>> far by making the argument be a CharSequence.
>> its fine to call a CharSequence a string (with a lower case s).
>>
>> $.02, Roger
>>
>>
>> On 4/27/18 6:33 PM, Joe Wang wrote:
>>>
>>>
>>> On 4/27/2018 12:50 PM, forax at univ-mlv.fr wrote:
>>>> ----- Mail original -----
>>>>> De: "Joe Wang" <huizhe.wang at oracle.com>
>>>>> À: "Remi Forax" <forax at univ-mlv.fr>, "Alan Bateman" 
>>>>> <Alan.Bateman at oracle.com>
>>>>> Cc: "nio-dev" <nio-dev at openjdk.java.net>, "core-libs-dev" 
>>>>> <core-libs-dev at openjdk.java.net>
>>>>> Envoyé: Vendredi 27 Avril 2018 21:21:01
>>>>> Objet: Re: RFR (JDK11/NIO) 8201276: (fs) Add methods to Files for 
>>>>> reading/writing a string from/to a file
>>>>> Hi Rémi, Alan,
>>>> Hi Joe,
>>>>
>>>>> I'm not sure we'd want to replace system dependent line separators 
>>>>> with
>>>>> '\n'. There are cases as you described where the replacement makes
>>>>> sense. However,  there are other cases too. For example, the 
>>>>> purpose is
>>>>> to read, edit a text file and then write it back. If this is on 
>>>>> Windows,
>>>>> and if line separators are replaced with '\n', that would cause the
>>>>> resulting file not formatted properly in for example Notepad. 
>>>>> There are
>>>>> cases where users write text files on Windows using Java, and only 
>>>>> found
>>>>> the lines were not separated in Notepad.
>>>> I agree that why the counterpart of readString() that write the 
>>>> string should inserts the platform's line separator.
>>>> BTW, i just found that those methods are not named writeString, or 
>>>> writeCharSequence, i think they should.
>>>
>>> While readString() does not modify the original content (e.g. by 
>>> replacing the platform's line separator with '\n'), write(String) 
>>> won't either, by adding extra characters such as the line separator.
>>>
>>> I would think interfaces shall read along with the parameters.
>>>     readString(Path) == read as a String from the specified Path 
>>> (one could argue for readToString, readAsString, but we generally 
>>> omit the preps)
>>>     write(Path, CharSequence) == write the CharSequence to the file, 
>>> since CharSequence is already in the method signature as a 
>>> parameter, we probably don't want to add that to the name, otherwise 
>>> it would read like repeating the word CharSequence.
>>>
>>> It is in a similar situation as write(Path, Iterable) where it was 
>>> defined as writeLines(Path, Iterable).
>>>
>>>>
>>>>> Files.write(Path, Iterable) is also specified to terminate each line
>>>>> with the platform's line separator. If readString does the 
>>>>> replacement,
>>>>> it would be inconsistent.
>>>>
>>>> Anyway, if we look for consistency the methods writeCharSequence 
>>>> should transform the \n in the CharSequence to the platform's line 
>>>> separator.
>>>>
>>>> Files.write(Path, Iterable) is is not a counterpart of 
>>>> readString(), it's consistent with Files.lines() or 
>>>> Files.readLines() (or BufferedReader.readLine()) that all suppress 
>>>> the line separators. Anyway, Files.write(path, 
>>>> readString(path)::line) will be consistent if you replace the line 
>>>> separators or not because String.line() suppresses the line 
>>>> separators.
>>>
>>> readString pairs with write(String), therefore it's more like 
>>> Files.write(path, readString(path)) than readString(path)::line. The 
>>> use case:
>>>
>>>     String s = Files.read(path);
>>>     Files.write(path, s.replace("/config/location", "/new/location"));
>>>
>>> would then work as expected.
>>>
>>> These two methods are one-off (open/read/write/close file) 
>>> operation. write(String) therefore is not intended for 
>>> adding/appending multiple Strings from other operations such as 
>>> String.line(). If an app needs to put the result of String.line() or 
>>> any other processes into a file using this method, it needs to take 
>>> care of adding the line separator itself when necessary. "when 
>>> necessary" would be a judgement call by the developer.
>>>
>>> That said, if there's a consensus on the idea of terminating the 
>>> string with a line separator, then readString method would need to 
>>> strip it off to go with the write method.
>>>
>>>>
>>>>> I would think readString behaves like readAllBytes, that would 
>>>>> simply read all content faithfully.
>>>> readString does an interpolation due to the Charset so it's not the 
>>>> real content, again, the idea is that developers should not have to 
>>>> care about the platform's line separators (they have more important 
>>>> things to think).
>>>>
>>>> The other solution is to just not introduce those new methods, 
>>>> after all Files.lines().collect(Collectors.joining("\n")) already 
>>>> does the job, no ?
>>>
>>> While there are many ways to do it, none is as straight-forward. 
>>> "Read (entire) file to String"/"Write String to file" are popular 
>>> requests from users. Read to string -> do some String manipulation 
>>> -> write it back is such a simple use case, it really needs to be 
>>> very easy to do as illustrated in the above code snippet.
>>>
>>> Cheers,
>>> Joe
>>>
>>>>
>>>>> Best,
>>>>> Joe
>>>> regards,
>>>> Rémi
>>>>
>>>>> On 4/27/2018 4:43 AM, forax at univ-mlv.fr wrote:
>>>>>> Hi Alan,
>>>>>>
>>>>>> People do not read the documentation.
>>>>>> So adding something in the documentation about when a method 
>>>>>> should be used or
>>>>>> not is never a solution.
>>>>>>
>>>>>> Here the user want a String and provides a charset so you have no 
>>>>>> way but to
>>>>>> decode the content to substitute the line separator.
>>>>>>
>>>>>> cheers,
>>>>>> Rémi
>>>>>>
>>>>>> ----- Mail original -----
>>>>>>> De: "Alan Bateman" <Alan.Bateman at oracle.com>
>>>>>>> À: "Remi Forax" <forax at univ-mlv.fr>, "Joe Wang" 
>>>>>>> <huizhe.wang at oracle.com>
>>>>>>> Cc: "nio-dev" <nio-dev at openjdk.java.net>, "core-libs-dev"
>>>>>>> <core-libs-dev at openjdk.java.net>
>>>>>>> Envoyé: Vendredi 27 Avril 2018 13:34:12
>>>>>>> Objet: Re: RFR (JDK11/NIO) 8201276: (fs) Add methods to Files for
>>>>>>> reading/writing a string from/to a file
>>>>>>> On 27/04/2018 12:29, Remi Forax wrote:
>>>>>>>> I think that having a readString that includes OS dependent 
>>>>>>>> line separators is a
>>>>>>>> mistake,
>>>>>>>> Java does a great job to try to shield the developer from the 
>>>>>>>> kind of things
>>>>>>>> that makes programs behave differently on different platforms.
>>>>>>>>
>>>>>>>> readString should subtitute (\r)?\n to \n otherwise either 
>>>>>>>> people will do a call
>>>>>>>> replace() which is less efficient or will learn the lesson the 
>>>>>>>> hard way.
>>>>>>>>
>>>>>>>> raw string literal does the same substitution for the same reason.
>>>>>>>>
>>>>>>> Yes, there are several discussion points around this and 
>>>>>>> somewhat timely
>>>>>>> with multi-string support.
>>>>>>>
>>>>>>> One thing that I think Joe will need in this API is some note to 
>>>>>>> make it
>>>>>>> clearer what the intended usage is (as I think the intention is 
>>>>>>> simple
>>>>>>> cases with mostly single lines of text).
>>>>>>>
>>>>>>> -Alan.
>>>
>>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/nio-dev/attachments/20180612/44d9b945/attachment-0001.html>


More information about the nio-dev mailing list