RFR: 8356980: Better handling of non-breaking space

Magnus Ihse Bursie ihse at openjdk.org
Mon May 26 08:24:36 UTC 2025


On Thu, 22 May 2025 21:26:08 GMT, Phil Race <prr at openjdk.org> wrote:

>> FYI, the style guide for France [recommends](https://fr.wikipedia.org/wiki/Espace_ins%C3%A9cable#En_France):
>> 
>> - U+202F (Narrow No-Break Space NNBSP) preceding semicolon, question mark, and exclamation mark.
>> - U+00A0 (No-Break Space NBSP) preceding colon.
>> 
>> Similar conventions are used in other French speaking countries.
>
>> No, it doesn't. I still agree with that fix -- the overwhelming majority of characters should indeed be UTF-8 instead of Unicode escape sequences.
> 
>> This is about a very specific character, one that is visually indistinguishable on screen from an ordinary space.
> 
> 
> I didn't say it reversed that entire changeset. I am saying that the previous changeset for L10N changed
> the Java Unicode escape to UTF-8 for the localised message string.
> You propose restoring it to a Java escape.
> 
> I wouldn't be surprised if the next message drop reverses what you reversed.
> I don't know what tools the L10N team use, but there's a chance they don't handle Java escapes,
> since that is very much a Java thing. So you are probably making the translation job harder.
> 
> 
> So I am suggesting you leave all of the translation files as is.
> Which might mean withdrawing this PR.

Fair enough. This was meant to be an improvement in readability, not a hindrance to working efficiently. I'll withdraw this PR.
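
For readers following the thread, here is a minimal sketch of the trade-off discussed above. It is not code from the PR: the class name NbspDemo, the key "msg" and the French message text are all made up for illustration. It assumes the properties text is read through a Reader, so the literal character arrives intact. It shows that the escaped and the literal forms of U+00A0 load to the same string, and that a small helper can make the otherwise invisible character visible.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

public class NbspDemo {
    // Loads a properties snippet and returns the value of the
    // hypothetical key "msg" (the key name is an assumption).
    static String load(String propertiesText) throws IOException {
        Properties p = new Properties();
        p.load(new StringReader(propertiesText));
        return p.getProperty("msg");
    }

    // Replaces NBSP (U+00A0) and NNBSP (U+202F) with a visible
    // escape notation so they can be told apart from a plain space.
    static String reveal(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            if (c == '\u00A0' || c == '\u202F') {
                sb.append(String.format("\\u%04X", (int) c));
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Escaped form: the properties text contains a backslash escape.
        String escaped = load("msg=Erreur\\u00A0: fichier introuvable\n");
        // Literal form: the properties text contains the character itself.
        String literal = load("msg=Erreur\u00A0: fichier introuvable\n");

        System.out.println(escaped.equals(literal));
        System.out.println(reveal(literal));
    }
}
```

Both forms are identical once loaded, so the choice only affects how the source file reads to humans and to whatever tooling processes it, which is the concern Phil raises.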

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25234#discussion_r2106804794

