<i18n dev> RFR: 8195686: ISO-8859-8-i charset cannot be decoded, should be mapped to ISO-8859-8

Mon Aug 26 13:06:03 UTC 2024

On Fri, 23 Aug 2024 10:38:38 GMT, Pratiksha.Sawant <duke at openjdk.org> wrote:

> Mapping ISO-8859-8-I charset to ISO-8859-8.
> Below mentioned 2 aliases are added as part of this:-
> **ISO-8859-8-I**
> **ISO8859-8-I**
> 
> The bug report for the same:- https://bugs.openjdk.org/browse/JDK-8195686

"ISO-8859-8-I" is a charset name for the character encoding "ISO-8859-8" (https://en.wikipedia.org/wiki/ISO-8859-8-I).

We found two files in the JDK code base where charset aliases are defined:

“src/java.xml/share/classes/com/sun/org/apache/xerces/internal/util/EncodingMap.java”
“make/data/charsetmapping/charsets”

Some email clients put the “ISO-8859-8-I” charset in the message headers as the charset of the email contents when the email originates from the Middle East or China. Since it is meant to be a duplicate of ISO-8859-8, and ISO-8859-8-I is already supported in EncodingMap.java, supporting it in the charsets file as well makes the behaviour consistent throughout the JDK.
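
To illustrate the behaviour in question, here is a minimal sketch of how a caller might resolve the label today. The class `CharsetAliasSketch` and its `resolve` and `decode` helpers are hypothetical and not part of the JDK or of this change. The sketch only uses `Charset.forName`, and it assumes ISO-8859-8 itself is available in the runtime. On a JDK without the alias, `Charset.forName("ISO-8859-8-I")` throws `UnsupportedCharsetException` and the sketch falls back to ISO-8859-8. With the alias added, the first lookup should succeed directly.

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;
import java.nio.charset.UnsupportedCharsetException;
import java.util.Locale;

public class CharsetAliasSketch {

    // Hypothetical helper: resolve a charset label taken from a mail header.
    // If the runtime does not know the "-I" label, fall back to ISO-8859-8,
    // which uses the same byte-to-character mapping.
    public static Charset resolve(String label) {
        try {
            return Charset.forName(label);
        } catch (UnsupportedCharsetException | IllegalCharsetNameException e) {
            String upper = label.toUpperCase(Locale.ROOT);
            if (upper.equals("ISO-8859-8-I") || upper.equals("ISO8859-8-I")) {
                return Charset.forName("ISO-8859-8");
            }
            throw e; // unrelated label: let the caller handle it
        }
    }

    // Decode raw bytes using the resolved charset.
    public static String decode(byte[] bytes, String label) {
        return new String(bytes, resolve(label));
    }

    public static void main(String[] args) {
        // 0xE0 is HEBREW LETTER ALEF (U+05D0) in ISO-8859-8.
        byte[] alef = { (byte) 0xE0 };
        String text = decode(alef, "ISO-8859-8-I");
        System.out.println(Integer.toHexString(text.charAt(0)));
    }
}
```

Once the aliases are in the charsets file, a workaround of this kind should no longer be needed for these two labels.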

A ticket has been raised in angus-mail for a similar issue: https://github.com/eclipse-ee4j/angus-mail/issues/147

-------------

PR Comment: https://git.openjdk.org/jdk/pull/20690#issuecomment-2310164523