RFR: 7904021: Parsing group files using non-UTF-8 encoding fails

Christian Stein cstein at openjdk.org
Mon Jun 16 16:28:41 UTC 2025


On Thu, 5 Jun 2025 04:35:14 GMT, Pasam Soujanya <duke at openjdk.org> wrote:

> We use jtreg to run OpenJDK tests for JDK 11/17/21 releases on platforms whose default charset is not UTF-8. We found that the latest jtreg code uses Files.newBufferedReader(path) to read group file data (TEST.GROUPS) from OpenJDK via GroupManager (https://github.com/openjdk/jtreg/blob/master/src/share/classes/com/sun/javatest/regtest/config/GroupManager.java#L102C44-L102C61).
> 
> This call always returns a BufferedReader that decodes the file as UTF-8. We see discrepancies when using this version of jtreg on platforms where Charset.defaultCharset() is not UTF-8 (JDK 11 and JDK 17).
> 
> Hence, we would like to propose a fix: pass Charset.defaultCharset() explicitly, using Files.newBufferedReader(Path path, Charset cs) instead of Files.newBufferedReader(path), and Files.readString(Path, Charset cs) instead of Files.readString(Path), in the following jtreg files:
> https://github.com/openjdk/jtreg/blob/master/src/share/classes/com/sun/javatest/regtest/config/GroupManager.java#L102C44-L102C61
> https://github.com/openjdk/jtreg/blob/master/src/share/classes/com/sun/javatest/regtest/config/ExtraPropDefns.java#L309
> 
> We've also tested this fix on OpenJDK-supported platforms such as Linux, Windows, and macOS.
> 
> ---------
> ### Progress
> - [ ] Change must be properly reviewed (1 review required, with at least 1 [Reviewer](https://openjdk.org/bylaws#reviewer))
> - [x] Change must not contain extraneous whitespace
> - [x] Commit message must refer to an issue
> 
> 
> 
> ### Reviewing
> <details><summary>Using <code>git</code></summary>
> 
> Checkout this PR locally: \
> `$ git fetch https://git.openjdk.org/jtreg.git pull/267/head:pull/267` \
> `$ git checkout pull/267`
> 
> Update a local copy of the PR: \
> `$ git checkout pull/267` \
> `$ git pull https://git.openjdk.org/jtreg.git pull/267/head`
> 
> </details>
> <details><summary>Using Skara CLI tools</summary>
> 
> Checkout this PR locally: \
> `$ git pr checkout 267`
> 
> View PR using the GUI difftool: \
> `$ git pr show -t 267`
> 
> </details>
> <details><summary>Using diff file</summary>
> 
> Download this PR as a diff file: \
> <a href="https://git.openjdk.org/jtreg/pull/267.diff">https://git.openjdk.org/jtreg/pull/267.diff</a>
> 
> </details>
> <details><summary>Using Webrev</summary>
> 
> [Link to Webrev Comment](https://git.openjdk.org/jtreg/pull/267#issuecomment-2942734568)
> </details>

The two places changed in this PR aren't the only places in which `jtreg` reads input from files. Not touching those other places might lead to unexpected or divergent behaviour.

With JEP 400 (https://openjdk.org/jeps/400), UTF-8 is the default charset of the standard Java APIs. Yes, that applies to Java 18+ only, but did you try storing those group files in UTF-8 encoding in your local environment?

Did you try passing `file.encoding` as a system property to the `jtreg` runtime? For example: `jtreg -J-Dfile.encoding=ISO-8859-1 ...`
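
To make the difference concrete, here is a minimal standalone sketch. It is not jtreg code: the class name, the temporary file and the group entry are made up for illustration. It writes a line containing a non-ASCII character in ISO-8859-1, then reads it back once with the one-argument `Files.newBufferedReader(path)` (always UTF-8) and once with an explicit charset.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.MalformedInputException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class GroupFileCharsetDemo {

    // Reads the first line of a file, decoding it with the given charset.
    static String firstLine(Path file, Charset cs) throws IOException {
        try (BufferedReader reader = Files.newBufferedReader(file, cs)) {
            return reader.readLine();
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical stand-in for a group file; not a real jtreg path.
        Path file = Files.createTempFile("TEST", ".groups");
        try {
            // Store a made-up group entry with a non-ASCII character as ISO-8859-1.
            byte[] bytes = "tier1 = caf\u00e9/Test.java\n".getBytes(StandardCharsets.ISO_8859_1);
            Files.write(file, bytes);

            // The one-argument overload decodes as UTF-8 regardless of file.encoding
            // and reports the lone 0xE9 byte as malformed input.
            try (BufferedReader reader = Files.newBufferedReader(file)) {
                System.out.println("UTF-8 read: " + reader.readLine());
            } catch (MalformedInputException e) {
                System.out.println("UTF-8 read failed: " + e);
            }

            // With the matching charset passed explicitly, the read succeeds.
            System.out.println("ISO-8859-1 read: " + firstLine(file, StandardCharsets.ISO_8859_1));

            // On JDK 17 and earlier this follows the locale or -Dfile.encoding;
            // on JDK 18+ it is UTF-8 unless overridden.
            System.out.println("default charset: " + Charset.defaultCharset());
        } finally {
            Files.delete(file);
        }
    }
}
```

Running this with and without `-Dfile.encoding=ISO-8859-1` on JDK 11 or 17 shows that only the last line of output depends on the system property; the one-argument reader does not.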

-------------

PR Comment: https://git.openjdk.org/jtreg/pull/267#issuecomment-2977271129


More information about the jtreg-dev mailing list