RFR: 8356165: System.in in jshell replace supplementary characters with ??

Christian Stein cstein at openjdk.org
Wed May 7 10:33:15 UTC 2025


On Wed, 7 May 2025 06:44:54 GMT, Jan Lahoda <jlahoda at openjdk.org> wrote:

> When reading from `System.in` in a JShell snippet, JShell first reads the whole line (getting a `String`), and then converts the characters from this `String` to bytes on demand. However, it does not correctly convert code points that are represented by surrogate pairs: it tries to convert each surrogate separately, which cannot work.
> 
> The proposal herein is to, when the current character is a high surrogate, peek at the next character, and if it is a low surrogate, convert both the high and low surrogates to bytes together.

src/jdk.jshell/share/classes/jdk/internal/jshell/tool/ConsoleIOContext.java line 980:

> 978:         if (pendingBytes == null || pendingBytes.length <= pendingBytesPointer) {
> 979:             char userChar = readUserInputChar();
> 980:             StringBuilder dataToConvert = new StringBuilder();

Perhaps add the comment from the PR description here, for future readers:

> [...] when the current character is a high surrogate, peek at the next character, and if it is a low surrogate, convert both the high and low surrogates to bytes together.

The (internal) API used in the implementation doesn't express that at first sight.
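
To illustrate the peek-ahead idea from the PR description, here is a minimal, self-contained sketch. It is not the `ConsoleIOContext` code: the class `SurrogateAwareEncoder` and its `encode` method are hypothetical names, the whole line is passed in as a `String` rather than read character by character, and UTF-8 is assumed as the target charset.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not part of JShell.
final class SurrogateAwareEncoder {

    // Converts a line to UTF-8 bytes one code point at a time.
    static byte[] encode(String line) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int i = 0;
        while (i < line.length()) {
            char current = line.charAt(i++);
            StringBuilder dataToConvert = new StringBuilder();
            dataToConvert.append(current);
            // When the current character is a high surrogate, peek at the
            // next character; if it is a low surrogate, consume it too so
            // that both surrogates are converted to bytes together.
            if (Character.isHighSurrogate(current)
                    && i < line.length()
                    && Character.isLowSurrogate(line.charAt(i))) {
                dataToConvert.append(line.charAt(i++));
            }
            byte[] bytes = dataToConvert.toString().getBytes(StandardCharsets.UTF_8);
            out.write(bytes, 0, bytes.length);
        }
        return out.toByteArray();
    }
}
```

With this sketch, `SurrogateAwareEncoder.encode("\uD83D\uDE00")` yields the four UTF-8 bytes of U+1F600, whereas an unpaired surrogate is still encoded as a single `?`, since `String.getBytes(Charset)` substitutes the charset's replacement bytes for malformed input.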

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25079#discussion_r2077326550
