<i18n dev> RFR: 8195686: ISO-8859-8-i charset cannot be decoded, should be mapped to ISO-8859-8

Tue Aug 27 17:04:04 UTC 2024

On Fri, 23 Aug 2024 10:38:38 GMT, Pratiksha.Sawant <duke at openjdk.org> wrote:

> Mapping ISO-8859-8-I charset to ISO-8859-8.
> The following two aliases are added as part of this change:
> **ISO-8859-8-I**
> **ISO8859-8-I**
> 
> Bug report: https://bugs.openjdk.org/browse/JDK-8195686

I looked at this issue a bit more. In the IANA Charset registry (https://www.iana.org/assignments/character-sets/character-sets.xhtml), on which the `Charset` class is based, `ISO-8859-8-I` is not an alias of `ISO-8859-8`; it is listed with its own distinct `Preferred MIME name`. So I don't think the currently proposed solution is correct, since it would report `ISO-8859-8-I` as an alias of `ISO-8859-8`. Also, RFC 1556, which defines the ISO-8859-8-I encoding, defines other encodings as well, namely ISO-8859-6-I, ISO-8859-6-E, and ISO-8859-8-E. Why is ISO-8859-8-I relevant when those are not?
Given this, I am still not convinced that we should introduce these new encodings now, partly because there have been no requests for them since Bill Shannon worked on this (circa 2018). That could change if the Arabic- or Hebrew-speaking communities step in and provide a rationale for supporting them.
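
For anyone who wants to check how a particular JDK build resolves these names, a minimal sketch along the following lines may help. The class and method names (`CharsetAliasCheck`, `canonicalName`) are hypothetical and not part of the PR; it uses only `Charset.forName`, `Charset.name`, and `Charset.aliases`. What it prints for the `-I` and `-E` names depends on the JDK build and on whether this change is applied, so no output is assumed here.

```java
import java.nio.charset.Charset;
import java.util.Optional;

// Hypothetical helper class, for illustration only.
public class CharsetAliasCheck {

    /**
     * Returns the canonical name the runtime resolves the given name to,
     * or an empty Optional if the name is unsupported or illegal.
     */
    static Optional<String> canonicalName(String name) {
        try {
            return Optional.of(Charset.forName(name).name());
        } catch (IllegalArgumentException e) {
            // UnsupportedCharsetException and IllegalCharsetNameException
            // are both subclasses of IllegalArgumentException.
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        // The four names defined in RFC 1556, plus the two base charsets.
        String[] names = {
            "ISO-8859-6", "ISO-8859-6-I", "ISO-8859-6-E",
            "ISO-8859-8", "ISO-8859-8-I", "ISO-8859-8-E"
        };
        for (String name : names) {
            Optional<String> canonical = canonicalName(name);
            if (canonical.isPresent()) {
                // If the canonical name differs from the requested one,
                // the runtime is treating the requested name as an alias.
                System.out.println(name + " -> " + canonical.get()
                        + " aliases=" + Charset.forName(name).aliases());
            } else {
                System.out.println(name + " -> not supported");
            }
        }
    }
}
```

With the proposed change applied, `ISO-8859-8-I` would be expected to resolve to `ISO-8859-8`, which is exactly the alias relationship that the IANA registry does not define.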

-------------

PR Comment: https://git.openjdk.org/jdk/pull/20690#issuecomment-2313086694