RFR: 8356165: System.in in jshell replace supplementary characters with ?? [v3]
Tatsunori Uchino
duke at openjdk.org
Mon May 12 22:56:51 UTC 2025
On Mon, 12 May 2025 16:37:10 GMT, Jan Lahoda <jlahoda at openjdk.org> wrote:
>> When reading from `System.in` in a JShell snippet, JShell first reads the whole line (getting a `String`), and then converts the characters of this `String` to bytes on demand. However, it does not convert supplementary code points (those represented by a surrogate pair) correctly: it tries to convert each surrogate separately, which cannot work.
>>
>> The proposal is that, when the current character is a high surrogate, JShell peeks at the next character and, if it is a low surrogate, converts the high and low surrogates to bytes together.
>
> Jan Lahoda has updated the pull request incrementally with one additional commit since the last revision:
>
> (Attempting to) fix the test on Windows.
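For readers following along, the approach described in the quoted text can be sketched as below. This is a minimal illustration only, not the code from the pull request: the class name `SurrogateAwareEncoder` and the method `encodeLine` are hypothetical, and the sketch encodes a whole line eagerly, whereas JShell converts on demand.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of JShell.
class SurrogateAwareEncoder {

    // Encodes the line piece by piece, keeping a high surrogate together
    // with the low surrogate that follows it.
    static byte[] encodeLine(String line, Charset charset) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int i = 0;
        while (i < line.length()) {
            char current = line.charAt(i);
            int units = 1;
            // Peek at the next char: a high surrogate followed by a low
            // surrogate forms one code point and must be encoded as a unit.
            if (Character.isHighSurrogate(current)
                    && i + 1 < line.length()
                    && Character.isLowSurrogate(line.charAt(i + 1))) {
                units = 2;
            }
            byte[] encoded = line.substring(i, i + units).getBytes(charset);
            out.write(encoded, 0, encoded.length);
            i += units;
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        String line = "a\uD83D\uDE00b"; // 'a', U+1F600, 'b'
        byte[] bytes = encodeLine(line, StandardCharsets.UTF_8);
        System.out.println(bytes.length); // prints 6 (1 + 4 + 1 bytes in UTF-8)
    }
}
```

An unpaired surrogate still goes through `String.getBytes` on its own in this sketch, so it is replaced by the charset's replacement bytes, as before.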
I forgot to explain the context:
- The number of input bytes to read is incremented on each read to ensure that no byte or code unit is read twice or skipped
- Extra input is provided to prevent the test from hanging on input starvation (especially in builds without this fix)
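One way to read the first point is as a check of the following shape. This is a sketch under assumptions, not the actual test: the class `IncrementalReadCheck` and the method `readWithGrowingRequests` are hypothetical, and a `ByteArrayInputStream` stands in for the snippet's `System.in`. Requesting 1, 2, 3, ... bytes in turn makes the read boundaries fall at different offsets inside multi-byte sequences, so a duplicated or dropped byte would show up as a mismatch.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical check, not the test from the pull request.
class IncrementalReadCheck {

    // Reads the stream with request sizes 1, 2, 3, ... until end of stream
    // and returns every byte received, in order.
    static byte[] readWithGrowingRequests(InputStream in) {
        ByteArrayOutputStream collected = new ByteArrayOutputStream();
        try {
            int requested = 1;
            while (true) {
                byte[] buffer = new byte[requested];
                int read = in.read(buffer, 0, requested);
                if (read < 0) {
                    break; // end of stream
                }
                collected.write(buffer, 0, read);
                requested++; // grow the request so boundaries shift
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return collected.toByteArray();
    }

    public static void main(String[] args) {
        byte[] expected = "x\uD83D\uDE00y\uD83D\uDE00z\n".getBytes(StandardCharsets.UTF_8);
        byte[] actual = readWithGrowingRequests(new ByteArrayInputStream(expected));
        // If nothing was read twice or skipped, the arrays are identical.
        System.out.println(Arrays.equals(expected, actual));
    }
}
```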
-------------
PR Comment: https://git.openjdk.org/jdk/pull/25079#issuecomment-2874398072