RFR: 8356893: Use "stdin.encoding" for reading System.in with InputStreamReader/Scanner
Volkan Yazici
vyazici at openjdk.org
Thu May 22 07:59:09 UTC 2025
On Wed, 21 May 2025 21:37:07 GMT, Roger Riggs <rriggs at openjdk.org> wrote:
>> There are several locations in the JDK source where `System.in` and `FileDescriptor.in` are read with `InputStreamReader` and `Scanner` using the default charset. As recommended by the recently merged [JDK-8356420](https://bugs.openjdk.org/browse/JDK-8356420), this PR replaces the default charset with the one provided by the `stdin.encoding` system property.
>>
>> ### Fixing strategy
>>
>> * Where it is obvious that `System.in` is passed to `InputStreamReader`/`Scanner` ctors, `stdin.encoding` is employed.
>> * Where it is difficult to determine whether the `InputStream` passed to `InputStreamReader`/`Scanner` ctors can ever be `System.in`, `assert` expressions are placed.
>> * Where the odds of receiving `System.in` are low, yet it is technically possible (e.g., `Process::getInputStream`, `URL::openConnection`, `Class::getResourceAsStream`), nothing is done.
>>
>> @naotoj was kind enough to guide me in this PR, and stated that the `assert` expressions can be skipped, since there are many ways one can circumvent those checks: wrapping `System.in`, usage of `System::setIn`, etc. Yet we decided to leave them as is to collect feedback from other reviewers too.
>>
>> ### Scanning strategy
>>
>> The following ~alien technology~ advanced static analysis tools are used to scan the code for potentially affected places:
>>
>>
>> # Perl is used for multi-line matching
>> find . -name "*.java" -exec perl -0777 -ne 'my $r = (/(InputStreamReader|Scanner)\(\s*System\.in\)/) ? 0 : 1; exit $r' {} \; -print
>> git grep -H 'FileDescriptor.in' "*.java"
>>
>>
>> All calls to `InputStreamReader::new` and `Scanner::new` are checked too.
>>
>> ### Problems encountered
>>
>> 1. Due to an irregular or non-existent license header, the copyright year could not be updated for the following classes:
>>
>> ```
>> DOMImplementationRegistry
>> InputRC
>> ListingErrorHandler
>> PandocFilter
>> ```
>> 2. Could not employ `stdin.encoding` in `PandocFilter`, since the bootstrap VM running that class returns an empty value for that system property.
>
> There are too many changes in too many different areas to be in a single PR.
> Please break it down by review areas. Client, core libs, tools, etc.
@RogerRiggs, @AlanBateman, thanks so much for the quick review. I see your points. I will
1. withdraw this PR,
2. convert certain changes to spec clarifications,
3. and break it down into multiple PRs.
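
For readers following along, the replacement pattern from the quoted description can be sketched roughly as below. This is only an illustration, not code from the PR: the class `StdinEncodingSketch` and the helpers `stdinCharset` and `readFirstLine` are hypothetical names, and falling back to the default charset when `stdin.encoding` is missing, empty, or unsupported is an assumption made here to cover cases like the `PandocFilter` one.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;

// Hypothetical sketch: read System.in using the charset named by "stdin.encoding".
class StdinEncodingSketch {

    // Resolve the charset for standard input.
    // Assumption: fall back to the default charset if the property is
    // absent, empty, or names a charset this JVM does not support.
    static Charset stdinCharset() {
        String name = System.getProperty("stdin.encoding");
        if (name == null || name.isEmpty()) {
            return Charset.defaultCharset();
        }
        try {
            return Charset.forName(name);
        } catch (IllegalArgumentException e) {
            // Covers IllegalCharsetNameException and UnsupportedCharsetException.
            return Charset.defaultCharset();
        }
    }

    // Read the first line from the given stream, decoding with an explicit charset
    // instead of relying on the single-argument InputStreamReader constructor.
    static String readFirstLine(InputStream in, Charset charset) {
        try {
            BufferedReader reader = new BufferedReader(new InputStreamReader(in, charset));
            return reader.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String line = readFirstLine(System.in, stdinCharset());
        System.out.println("Read: " + line);
    }
}
```

The same explicit-charset idea applies to `Scanner`, which also has a constructor taking an `InputStream` and a `Charset`.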
-------------
PR Comment: https://git.openjdk.org/jdk/pull/25368#issuecomment-2900262422