RFR: 8356165: System.in in jshell replace supplementary characters with ?? [v3]

Tatsunori Uchino duke at openjdk.org
Mon May 12 22:56:51 UTC 2025


On Mon, 12 May 2025 16:37:10 GMT, Jan Lahoda <jlahoda at openjdk.org> wrote:

>> When reading from `System.in` in a JShell snippet, JShell first reads the whole line (getting a `String`), and then converts the characters from this `String` to bytes on demand. However, it does not convert code points represented by surrogate pairs correctly: it tries to convert each surrogate separately, which cannot work.
>> 
>> The proposal is, when the current character is a high surrogate, to peek at the next character and, if it is a low surrogate, convert the high and low surrogates to bytes together.
>
> Jan Lahoda has updated the pull request incrementally with one additional commit since the last revision:
> 
>   (Attempting to) fix the test on Windows.

I forgot to explain the context: 

- Incrementing the number of input bytes to be read ensures that no byte or code unit is read twice or skipped
- Providing extra input prevents execution from stalling due to input starvation (especially for builds without this fix)
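
For reference, the pairing logic described in the quoted proposal can be sketched roughly as follows. This is only an illustration, not the actual JShell code: the class `SurrogateAwareEncoder` and the method `encode` are hypothetical names, and the sketch converts the whole line at once rather than on demand.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration; not taken from the JShell sources.
class SurrogateAwareEncoder {

    // Converts a line to bytes, encoding a high/low surrogate pair as one unit.
    static byte[] encode(String line, Charset charset) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int i = 0;
        while (i < line.length()) {
            char current = line.charAt(i);
            int units = 1;
            // Peek at the next char: only a high surrogate followed by a
            // low surrogate is consumed as a two-unit code point.
            if (Character.isHighSurrogate(current)
                    && i + 1 < line.length()
                    && Character.isLowSurrogate(line.charAt(i + 1))) {
                units = 2;
            }
            byte[] bytes = line.substring(i, i + units).getBytes(charset);
            out.write(bytes, 0, bytes.length);
            // Advance by exactly the number of code units consumed, so that
            // nothing is converted twice or skipped.
            i += units;
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        String input = "a\uD83D\uDE00b"; // 'a', U+1F600, 'b'
        byte[] encoded = encode(input, StandardCharsets.UTF_8);
        System.out.println(encoded.length); // prints 6 (1 + 4 + 1 bytes)
    }
}
```

Advancing the index by the number of code units actually consumed is the same bookkeeping concern as in the first point above.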

-------------

PR Comment: https://git.openjdk.org/jdk/pull/25079#issuecomment-2874398072

