RFR: 8288589: Files.readString ignores encoding errors for UTF-16 [v2]
Alan Bateman
alanb at openjdk.org
Tue Jun 21 08:59:54 UTC 2022
On Sat, 18 Jun 2022 00:31:06 GMT, Naoto Sato <naoto at openjdk.org> wrote:
>> This is a regression caused by the fix to [JDK-8286287](https://bugs.openjdk.org/browse/JDK-8286287), which assumed the method `String.decodeWithDecoder()` was only invoked with cs.REPLACE mode based on the comment "should not happen". Possibly this refers to the `String(byte[], int, int, Charset)` constructor, which specifically mentions the `REPLACE` mode. However, the method is invoked with `String.newStringNoRepl()` and it should NOT replace the malformed input (duh!). The fix is to throw an `Error` for the former case as before the regression, and `CharacterCodingException` for the latter via an `IllegalArgumentException`.
>> In fact, `Files.readString()` stopped throwing a `MalformedInputException` since JDK17 with the fix to JDK-8259842, which started throwing an `Error`.
>
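The difference between replacing and reporting malformed input, as described in the quoted PR text, can be sketched with public APIs only. This is a minimal illustration, not the code touched by the PR: the class name `DecodeModes` and its two helper methods are hypothetical, and a `CharsetDecoder` configured with `CodingErrorAction.REPORT` stands in for the internal no-replacement path.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
class DecodeModes {
    // The String(byte[], Charset) constructor always replaces malformed
    // input with the charset's default replacement string.
    static String decodeReplacing(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_16);
    }

    // A decoder configured with REPORT throws instead of replacing.
    static String decodeStrict(byte[] bytes) throws CharacterCodingException {
        return StandardCharsets.UTF_16.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT)
                .decode(ByteBuffer.wrap(bytes))
                .toString();
    }

    public static void main(String[] args) {
        // Big-endian 'A' followed by a lone low surrogate (U+DC00),
        // which is malformed UTF-16.
        byte[] malformed = {0x00, 0x41, (byte) 0xDC, 0x00};

        System.out.println("replacing: " + decodeReplacing(malformed));
        try {
            System.out.println("strict: " + decodeStrict(malformed));
        } catch (CharacterCodingException e) {
            System.out.println("strict decode failed: " + e);
        }
    }
}
```

The expectation discussed in this thread is that `Files.readString()` behaves like the strict variant and surfaces a `CharacterCodingException` rather than silently substituting U+FFFD.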
> Naoto Sato has updated the pull request incrementally with one additional commit since the last revision:
>
> Added c2b test
> I looked for similar test cases but found none, so I created a new test case here. The problem is that these methods are invoked through `SharedSecrets`; they are effectively _APIs_ but are treated as private methods, which leads to insufficient testing. I now think that not only a b2c test but also a c2b test (for the `getBytesNoRepl()` method) is needed. I will modify the test case to include it.
My comment was mostly asking if we need to add more tests for Files.writeString. I would have expected a test for that method to fail with this bug. Maybe we need to create a new issue to expand the tests for this method.
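As a rough idea of what an additional `Files.writeString` check could look like, here is a minimal sketch. It is not the test from the PR: the class name `WriteStringEncodingCheck` and the helper `failsToEncode` are hypothetical, and US-ASCII is used only because it makes an unencodable character easy to construct; it does not exercise the UTF-16 path discussed in this issue.

```java
import java.io.IOException;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper class, for illustration only.
class WriteStringEncodingCheck {
    // Returns true if Files.writeString rejects the text with a
    // CharacterCodingException, false if the write succeeds.
    static boolean failsToEncode(String text, Charset cs) throws IOException {
        Path tmp = Files.createTempFile("writeString", ".txt");
        try {
            Files.writeString(tmp, text, cs);
            return false;
        } catch (CharacterCodingException e) {
            // The text cannot be encoded in the given charset.
            return true;
        } finally {
            Files.deleteIfExists(tmp);
        }
    }

    public static void main(String[] args) throws IOException {
        // U+00E9 has no US-ASCII encoding, so the write should be rejected.
        System.out.println(failsToEncode("caf\u00e9", StandardCharsets.US_ASCII));
        // Plain ASCII text should be written without error.
        System.out.println(failsToEncode("cafe", StandardCharsets.US_ASCII));
    }
}
```

A real regression test would presumably loop over several charsets and over both malformed and unmappable inputs, which is beyond this sketch.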
> BTW, I found a spec bug in `Files.writeString()` w/o `Charset` argument, where the `@throws` clause reads: "[IOException](https://download.java.net/java/early_access/jdk19/docs/api/java.base/java/io/IOException.html) - if an I/O error occurs writing to or creating the file, or the text cannot be encoded using the specified charset", but there is no specified charset there.
It looks like the description for IOException was copied from the 4-arg writeString to the 3-arg writeString. I've created JDK-8288836 to track this.
One other thing, this is a regression in 19 so I assume the PR should be against openjdk/jdk19 rather than the main line.
-------------
PR: https://git.openjdk.org/jdk/pull/9193