<i18n dev> RFR: 8195686: ISO-8859-8-i charset cannot be decoded, should be mapped to ISO-8859-8

Thu Oct 10 16:35:17 UTC 2024

On Mon, 16 Sep 2024 17:40:38 GMT, Naoto Sato <naoto at openjdk.org> wrote:

>> Mapping ISO-8859-8-I charset to ISO-8859-8.
>> Below mentioned 2 aliases are added as part of this:-
>> **ISO-8859-8-I**
>> **ISO8859-8-I**
>> 
>> The bug report for the same:- https://bugs.openjdk.org/browse/JDK-8195686
>
> Sorry, but I cannot speak for Jakarta Mail. If they see ISO-8859-8-I encoding important, they may introduce it as a new charset (again it is not an alias to ISO-8859-8)

> @naotoj does it make sense?

Sorry, but I still don't believe that making "ISO-8859-8-I" as an alias to "ISO-8859-8" is the right solution, per the IANA character sets definition (https://www.iana.org/assignments/character-sets/character-sets.xhtml). The current PR would make "ISO-8859-8-I" charset appear in `Charset.forName("ISO-8859-8").aliases()`, but not in `Charset.availableCharsets()` which is deemed incorrect to me.

That said, I just wonder if this issue can better be addressed exploiting the Charset SPI. This way mail servers can install "ISO-8859-8-I" charset by themselves. This means that mail servers do not need to rely on the underlying JDK which may or may not have that charset.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/20690#issuecomment-2405570637