<i18n dev> RFR: 8235834 IBM-943 charset encoder needs updating

Ichiroh Takiguchi takiguc at linux.vnet.ibm.com
Thu Feb 20 10:47:55 UTC 2020


Hello Naoto.

I appreciate your comment.

The definition has not changed recently.
Applying the change will prevent characters that are used on 
Japanese PCs from being converted to "?".
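
As a rough illustration of how such gaps could be found, the sketch below lists every BMP character that one charset's encoder accepts and another's rejects. The class and method names (EncoderDiff, onlyInFirst) are hypothetical and not part of the webrev. The default names "MS932" and "x-IBM943" are assumed to be available in the runtime, so main() checks for them before use.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of the proposed change.
class EncoderDiff {
    // Returns every BMP char that charset `a` can encode but charset `b` cannot.
    static List<Character> onlyInFirst(Charset a, Charset b) {
        CharsetEncoder ea = a.newEncoder();
        CharsetEncoder eb = b.newEncoder();
        List<Character> result = new ArrayList<>();
        for (int c = 0; c <= 0xFFFF; c++) {
            char ch = (char) c;
            if (ea.canEncode(ch) && !eb.canEncode(ch)) {
                result.add(ch);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Assumed charset names; they may be missing from a reduced runtime image.
        String first = args.length > 0 ? args[0] : "MS932";
        String second = args.length > 1 ? args[1] : "x-IBM943";
        if (!Charset.isSupported(first) || !Charset.isSupported(second)) {
            System.out.println("charset not available in this runtime");
            return;
        }
        for (char ch : onlyInFirst(Charset.forName(first), Charset.forName(second))) {
            System.out.printf("U+%04X%n", (int) ch);
        }
    }
}
```

The list it prints shows all encoder differences between the two charsets, so it is not necessarily limited to the entries touched by this fix.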

I checked the IBM-943 definitions and their changelog.

The definitions and their creation dates are as follows
(all definitions are valid):
03AF34B0.TPMAP120: May 20 1997 (Current OpenJDK)
34B003AF.RPMAP130: May 20 1997 (Proposed change, upward compatible)
34B003AF.RPMAP14A: Jul 29 1998 (Same as 34B003AF.RPMAP130)
34B003AF.RPMAP15A: Jan  8 2003 (Additionally, 13 characters are changed)

According to IBM943.map, OpenJDK refers to 03AF34B0.TPMAP120.
03AF34B0.TPMAP120 has only a B2C (byte-to-char) conversion table and no 
C2B (char-to-byte) definition.
34B003AF.RPMAP130, which has the C2B definition, was released on the same date.
I assume the C2B definition was not implemented at that time.
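
To make the B2C/C2B distinction concrete, here is a minimal sketch that encodes a string (C2B) and decodes the result (B2C) with the same charset. If the output differs from the input, either the character had no C2B entry and was replaced (typically with "?"), or the C2B entry is one-way and has no matching B2C entry. The class and method names are hypothetical.

```java
import java.nio.charset.Charset;

// Hypothetical helper for illustration only.
class C2BCheck {
    // Encodes with the charset's C2B table, then decodes with its B2C table.
    // String.getBytes(Charset) substitutes the charset's replacement bytes
    // for characters that have no C2B entry.
    static String encodeThenDecode(Charset cs, String s) {
        byte[] bytes = s.getBytes(cs);
        return new String(bytes, cs);
    }
}
```

For example, C2BCheck.encodeThenDecode(StandardCharsets.US_ASCII, "A\u3042") returns "A?", because US-ASCII has no mapping for U+3042.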

The C2B definitions in 34B003AF.RPMAP130 and 34B003AF.RPMAP14A are the same; 
only the replacement character differs.
34B003AF.RPMAP15A is the latest, but it is almost the same as MS932.
If 34B003AF.RPMAP15A is applied, 03AF34B0.TPMAP14A is also required.
For compatibility, I'd like to add the C2B definition without changing 
the B2C definition.
I don't want to apply the 03AF34B0.TPMAP14A B2C definition.

So I'd like to apply the 34B003AF.RPMAP130 definition.

Thanks,
Ichiroh Takiguchi

On 2020-02-19 07:33, naoto.sato at oracle.com wrote:
> Hi Takiguchi-san,
> 
> Can you please elaborate on the rationale for the change? It looks like
> the IBM943 charset hasn't changed for a long time, at least since JDK8. Has
> the mapping definition recently changed?
> 
> Naoto
> 
> On 2/17/20 10:23 AM, Ichiroh Takiguchi wrote:
>> Hello.
>> 
>> Could you review the fix?
>> 
>> Bug:    https://bugs.openjdk.java.net/browse/JDK-8235834
>> Change: https://cr.openjdk.java.net/~itakiguchi/8235834/webrev.00/
>> 
>> IBM-943 is the IBM Japanese PC encoding.
>> MS932 is the Microsoft Japanese Windows encoding.
>> The IBM-943 charset encoder is missing 5 compatible entries that the 
>> MS932 charset has.
>> 
>> Thanks,
>> Ichiroh Takiguchi
>> IBM Japan, Ltd.

