<i18n dev> RFR: 8356980: Better handling of non-breaking space
Naoto Sato
naoto at openjdk.org
Thu May 15 03:20:51 UTC 2025
On Thu, 15 May 2025 02:25:30 GMT, Sergey Bylokhov <serb at openjdk.org> wrote:
>> I believe it is OK to leave these as UTF-8 native characters, as these files are l10n resource bundles. If we wanted to replace those look-alike spaces with Unicode escapes, other characters might also need the same treatment, such as hyphen-minus, quotation marks, etc. In fact, there are a lot more look-alikes listed by the Unicode Consortium (https://www.unicode.org/Public/security/latest/confusables.txt), and I don't think we would want to convert them all.
>
> Maybe this is just a translation error and a simple space can be used instead, as in all the other properties in these files?
Maybe, but sometimes it is intentional. CLDR once switched normal spaces to NBSP/NNBSP for certain locales (https://unicode-org.atlassian.net/browse/CLDR-14032), and we cannot tell whether a given occurrence is intentional or not.
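
For illustration only (this is not code from the PR), a minimal sketch of how such occurrences could be flagged for a human to review. The class name `SpaceLookalikeScanner` and the method `findLookalikeSpaces` are hypothetical, and the sketch assumes we only care about NBSP (U+00A0) and NNBSP (U+202F):

```java
import java.util.ArrayList;
import java.util.List;

public class SpaceLookalikeScanner {

    // Hypothetical helper: reports each NBSP (U+00A0) or NNBSP (U+202F)
    // in the text as "index:U+XXXX". It only reports; it does not decide
    // whether the character was intended by the translator.
    public static List<String> findLookalikeSpaces(String text) {
        List<String> hits = new ArrayList<>();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '\u00A0' || c == '\u202F') {
                hits.add(i + ":" + String.format("U+%04X", (int) c));
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Made-up sample value with an NNBSP group separator and an NBSP before '%'.
        String value = "1\u202F234,5\u00A0%";
        System.out.println(findLookalikeSpaces(value));
    }
}
```

A scan like this can only list candidates. Note also that `Character.isWhitespace` returns false for both of these characters, so a whitespace check alone would not find them.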
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/25234#discussion_r2090140891