<i18n dev> RFR: 8356980: Better handling of non-breaking space

Sergey Bylokhov serb at openjdk.org
Thu May 15 02:27:50 UTC 2025


On Wed, 14 May 2025 17:34:45 GMT, Naoto Sato <naoto at openjdk.org> wrote:

>> For the l10n files, they are synced by the translation team and we don't edit them. IMO, I think it's fine leaving those ones as is. Especially because language rules can cause different spacing and punctuation characters, so generally we don't ensure translations are equivalent to the original file's value in that regard. (So viewing them as a Unicode escape sequence vs UTF-8 literal may not bring much benefit.)
>
> I believe it is OK to leave these as UTF-8 native characters, as these files are l10n resource bundles. If we wanted to replace those look-alike spaces with Unicode escapes, other characters might also need the same treatment, such as hyphen-minus, quotation marks, etc. In fact, there are a lot more look-alikes defined by the Unicode Consortium (https://www.unicode.org/Public/security/latest/confusables.txt), and I don't think we would want to convert them.

Maybe this is just a translation error, and a plain space can be used instead, as in all the other properties in these files?
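
One way to check whether other values in these files carry such characters is a small scan along the lines of the sketch below. `LookAlikeSpaces` is a hypothetical helper, not something in the JDK or in this PR, and it assumes that "look-alike space" means any Unicode space separator (general category Zs) other than the plain U+0020 space.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of the JDK or of this PR.
final class LookAlikeSpaces {

    // True for space separators (category Zs) other than the plain U+0020
    // space, e.g. U+00A0 NO-BREAK SPACE or U+202F NARROW NO-BREAK SPACE.
    static boolean isLookAlikeSpace(int codePoint) {
        return codePoint != ' '
                && Character.getType(codePoint) == Character.SPACE_SEPARATOR;
    }

    // Returns the char indices in value at which a look-alike space starts.
    static List<Integer> find(String value) {
        List<Integer> hits = new ArrayList<>();
        for (int i = 0; i < value.length(); ) {
            int cp = value.codePointAt(i);
            if (isLookAlikeSpace(cp)) {
                hits.add(i);
            }
            i += Character.charCount(cp);
        }
        return hits;
    }
}
```

Running `LookAlikeSpaces.find` over each value loaded from a properties file would show whether the non-breaking space is a one-off or a pattern the translators use deliberately.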

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25234#discussion_r2090083320

