<i18n dev> RFR: 8195686: ISO-8859-8-i charset cannot be decoded, should be mapped to ISO-8859-8

Pratiksha.Sawant duke at openjdk.org
Mon Aug 26 13:06:03 UTC 2024


On Fri, 23 Aug 2024 10:38:38 GMT, Pratiksha.Sawant <duke at openjdk.org> wrote:

> Mapping ISO-8859-8-I charset to ISO-8859-8.
> The following two aliases are added as part of this change:
> **ISO-8859-8-I**
> **ISO8859-8-I**
> 
> Bug report: https://bugs.openjdk.org/browse/JDK-8195686

"ISO-8859-8-I" is a charset name for the character encoding "ISO-8859-8" (https://en.wikipedia.org/wiki/ISO-8859-8-I).

We found two files in the JDK code base where charset aliases are defined:

"src/java.xml/share/classes/com/sun/org/apache/xerces/internal/util/EncodingMap.java"
"/make/data/charsetmapping/charsets"


Some email clients declare "ISO-8859-8-I" in the message headers as the charset of the email contents when the email originates from the Middle East or China. Since it is supposed to be a duplicate of ISO-8859-8, and ISO-8859-8-I is already supported in EncodingMap.java, supporting it in the charsets file as well makes the behaviour consistent throughout the JDK.
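
To illustrate what the alias means for callers, here is a minimal sketch of looking up the charset by the label found in such a header. It is not part of the patch. The class `HebrewCharsetLookup` and its `resolve` helper are hypothetical names used only for this example. On a JDK without the alias, `Charset.forName("ISO-8859-8-I")` throws `UnsupportedCharsetException`, so the sketch falls back to ISO-8859-8 by hand; with the alias in place, the fallback branch should never be reached.

```java
import java.nio.charset.Charset;
import java.nio.charset.UnsupportedCharsetException;

public class HebrewCharsetLookup {

    // Hypothetical helper: resolve a charset label taken from, say, a mail header.
    // If the running JDK does not know the "-I" label, fall back to ISO-8859-8,
    // which is the mapping this change proposes to build into the JDK.
    static Charset resolve(String label) {
        try {
            return Charset.forName(label);
        } catch (UnsupportedCharsetException e) {
            if (label.equalsIgnoreCase("ISO-8859-8-I")
                    || label.equalsIgnoreCase("ISO8859-8-I")) {
                return Charset.forName("ISO-8859-8");
            }
            throw e; // any other unknown label is still an error
        }
    }

    public static void main(String[] args) {
        Charset cs = resolve("ISO-8859-8-I");
        // 0xE0 is the Hebrew letter Alef in ISO-8859-8.
        String decoded = new String(new byte[] {(byte) 0xE0}, cs);
        System.out.println("canonical name: " + cs.name());
        System.out.println("decoded code point: U+"
                + String.format("%04X", (int) decoded.charAt(0)));
    }
}
```

Either way the lookup resolves to the same canonical charset, which is the consistency argument made above: callers that read the label from a header would no longer need a workaround like the `catch` branch.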

A ticket has been raised in angus-mail for a similar issue: https://github.com/eclipse-ee4j/angus-mail/issues/147

-------------

PR Comment: https://git.openjdk.org/jdk/pull/20690#issuecomment-2310164523

