RFR: 8354908: javac mishandles supplementary character in character literal [v2]

Jan Lahoda jlahoda at openjdk.org
Mon May 12 11:16:09 UTC 2025


> Some Unicode characters are encoded as a surrogate pair, i.e. two `char`s. Such characters cannot be part of a character literal, because a `char` holds only a single UTF-16 code unit. However, javac currently accepts code with such characters and puts only the first `char`, the high surrogate, into the literal, silently ignoring the second one.
> 
> For example, the JDK 24 behavior is:
> 
> $ cat /tmp/T.java 
> public class T {
>     public static void main(String... args) {
>        char c = '😊';
>        System.err.println(Integer.toHexString((int) c));
>        System.err.println(Character.isHighSurrogate(c));
>     }
> }
> $ java /tmp/T.java
> d83d
> true
> 
> 
> In JDK 11, however, such literals were rejected:
> 
> $ java /tmp/T.java
> /tmp/T.java:3: error: unclosed character literal
>        char c = '😊';
>                 ^
> /tmp/T.java:3: error: illegal character: '\ude0a'
>        char c = '😊';
>                   ^
> /tmp/T.java:3: error: unclosed character literal
>        char c = '😊';
>                    ^
> 3 errors
> error: compilation failed
> 
> 
> The proposal in this PR is to explicitly check for this case when scanning a character literal, and to report an explicit error when a character that requires a surrogate pair is used. javac will produce an error like:
> 
> $ java /tmp/T.java
> /tmp/T.java:3: error: character literal contains more than one UTF-16 code point
>        char c = '😊';
>                 ^
> 1 error
> error: compilation failed
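
For illustration only, the condition being checked can be sketched with the public `java.lang.Character` and `String` APIs. This is not the javac scanner code from the PR; the class `SurrogateDemo` and the helper `fitsInCharLiteral` are hypothetical names, and the sketch only assumes that the text between the quotes is available as a `String`.

```java
final class SurrogateDemo {
    // Hypothetical helper (not part of javac): true when the text between
    // the quotes is exactly one UTF-16 code unit, so it can be stored in a char.
    static boolean fitsInCharLiteral(String body) {
        if (body.isEmpty()) {
            return false;
        }
        int codePoint = body.codePointAt(0);
        // A supplementary code point (above U+FFFF) needs two chars.
        return body.length() == 1 && !Character.isSupplementaryCodePoint(codePoint);
    }

    public static void main(String[] args) {
        String smiley = "\uD83D\uDE0A"; // U+1F60A, written as its surrogate pair

        // One code point, but two UTF-16 code units.
        System.out.println(smiley.codePointCount(0, smiley.length()));
        System.out.println(smiley.length());

        // The first code unit is the high surrogate that JDK 24 kept in the literal.
        System.out.println(Integer.toHexString(smiley.charAt(0)));
        System.out.println(Character.isHighSurrogate(smiley.charAt(0)));

        System.out.println(fitsInCharLiteral("a"));
        System.out.println(fitsInCharLiteral(smiley));
    }
}
```

The first code unit printed here, `d83d`, matches the value shown in the JDK 24 output above, which is why the literal ended up holding only the high surrogate.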

Jan Lahoda has updated the pull request incrementally with one additional commit since the last revision:

  Reflecting review comment: using UTF-16 code unit

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/24964/files
  - new: https://git.openjdk.org/jdk/pull/24964/files/ae1201b3..1c768046

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=24964&range=01
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24964&range=00-01

  Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod
  Patch: https://git.openjdk.org/jdk/pull/24964.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/24964/head:pull/24964

PR: https://git.openjdk.org/jdk/pull/24964

