<i18n dev> RFR: 8195686: ISO-8859-8-i charset cannot be decoded, should be mapped to ISO-8859-8
Naoto Sato
naoto at openjdk.org
Fri Sep 13 16:34:09 UTC 2024
On Fri, 13 Sep 2024 03:12:39 GMT, Jason Mehrens <duke at openjdk.org> wrote:
>> I looked at this issue a bit more. According to the IANA Charset registry (https://www.iana.org/assignments/character-sets/character-sets.xhtml), which the `Charset` class is based on, `ISO-8859-8-I` is not an alias of `ISO-8859-8`; it is defined as a distinct `Preferred MIME name`. So I don't think the currently proposed solution is correct (it would return ISO-8859-8-I as an alias of ISO-8859-8). Also, RFC 1556, in which the ISO-8859-8-I encoding is defined, defines other encodings as well, i.e., ISO-8859-6-I, ISO-8859-6-E, and ISO-8859-8-E. Why are they not relevant, but ISO-8859-8-I is?
>> Considering these, I am still not sure about introducing these new encodings now, also because there has not been any request since Bill Shannon worked on this (circa 2018), unless the Arabic/Hebrew-speaking communities jump in and provide a rationale for supporting them.
>
> @naotoj does the mapping need to be removed from:
>
> https://github.com/openjdk/jdk/blob/5e5942a282e14846404b68c65d43594d6b9226d9/src/java.xml/share/classes/com/sun/org/apache/xerces/internal/util/EncodingMap.java#L770
>
> I ask because Jakarta Mail/Angus Mail is a similar use case to this code.
@jmehrens I would like to, but I don't know the possible issues that would be caused by the removal. So my take is no.
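
For readers following along, here is a minimal sketch of how a caller could handle `ISO-8859-8-I` itself, without the JDK treating it as an alias. It assumes that decoding such content as `ISO-8859-8` is acceptable to the application. The class `CharsetFallback`, its `FALLBACKS` table, and the `resolve` method are hypothetical names made up for illustration; they are not part of the JDK, the PR, or Jakarta Mail. Only `Charset.forName`, `Charset.isSupported`, and `UnsupportedCharsetException` are real JDK APIs.

```java
import java.nio.charset.Charset;
import java.nio.charset.UnsupportedCharsetException;
import java.util.Locale;
import java.util.Map;

public class CharsetFallback {
    // Hypothetical application-level table (an assumption, not JDK behavior):
    // names the runtime may not know, mapped to a charset to try instead.
    private static final Map<String, String> FALLBACKS =
            Map.of("ISO-8859-8-I", "ISO-8859-8");

    // Returns the charset the runtime registers under 'name'. If the name is
    // unsupported, tries the fallback from the table above; if there is no
    // fallback entry, rethrows the original exception.
    static Charset resolve(String name) {
        try {
            return Charset.forName(name);
        } catch (UnsupportedCharsetException e) {
            String fallback = FALLBACKS.get(name.toUpperCase(Locale.ROOT));
            if (fallback == null) {
                throw e;
            }
            // May itself throw if the fallback charset is not available
            // in this runtime (e.g. the module providing it is absent).
            return Charset.forName(fallback);
        }
    }

    public static void main(String[] args) {
        // Whether the name is supported depends on the JDK in use.
        System.out.println(Charset.isSupported("ISO-8859-8-I"));
        System.out.println(resolve("UTF-8"));
    }
}
```

Keeping the mapping in the application leaves the `Charset` registry aligned with IANA, at the cost of each caller deciding for itself whether the substitution is appropriate.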
-------------
PR Comment: https://git.openjdk.org/jdk/pull/20690#issuecomment-2349349814