<i18n dev> RFR: 8235834 IBM-943 charset encoder needs updating

Ichiroh Takiguchi takiguc at linux.vnet.ibm.com
Mon Feb 24 14:57:06 UTC 2020


Hello Naoto.

Yes, I agree with your suggestion.
I'm very happy if you are the sponsor of this issue.

Thanks,
Ichiroh Takiguchi

On 2020-02-22 01:53, naoto.sato at oracle.com wrote:
> Two subtle comments to the new webrev:
> 
> - I'd add "private" to those static finals.
> - "cs.name()" in the exception messages can be replaced with "csName".
> 
> Otherwise it looks good. If you agree with the above, no further
> review is needed. Also, if you need a sponsor, I can sponsor your
> changeset.
> 
> Naoto
> 
> On 2/21/20 5:10 AM, Ichiroh Takiguchi wrote:
>> Hello Naoto.
>> 
>> I appreciate your suggestions.
>> I applied your suggestions into new patch.
>> Could you review the fix again ?
>> 
>> Bug:    https://bugs.openjdk.java.net/browse/JDK-8235834
>> Change: https://cr.openjdk.java.net/~itakiguchi/8235834/webrev.01/
>> 
>> Thanks,
>> Ichiroh Takiguchi
>> IBM Japan, Ltd.
>> 
>> On 2020-02-21 02:51, naoto.sato at oracle.com wrote:
>>> Thanks for the explanation. Here are comments to your fix:
>>> 
>>> - Can you add some comments about the gist of the change to
>>> IBM943.c2b? Summarizing the explanation below would be fine.
>>> 
>>> For the test case:
>>> 
>>> - Please wrap the @bug line appropriately.
>>> 
>>> - Those byte array/Strings can be "static final" outside the test 
>>> method.
>>> 
>>> - Looks like the mapping is exactly the same with IBM943C for those
>>> code points. Is that correct? If so, would you please add some 
>>> comment
>>> as such?
>>> 
>>> - Typo: "round-tip" -> "round-trip"
>>> 
>>> Naoto
>>> 
>>> On 2/20/20 2:47 AM, Ichiroh Takiguchi wrote:
>>>> Hello Naoto.
>>>> 
>>>> I appreciate your comment.
>>>> 
>>>> The definition has not changed recently.
>>>> Applying the change will prevent the characters which are used on 
>>>> Japanese PC from being converted to "?".
>>>> 
>>>> I checked IBM-943 definition and changelog.
>>>> 
>>>> Definitions and creation date are as follows:
>>>> (All definitions are valid)
>>>> 03AF34B0.TPMAP120: May 20 1997 (Current OpenJDK)
>>>> 34B003AF.RPMAP130: May 20 1997 (Proposed change, upward compatible)
>>>> 34B003AF.RPMAP14A: Jul 29 1998 (Same as 34B003AF.RPMAP130)
>>>> 34B003AF.RPMAP15A: Jan  8 2003 (Additionally, 13 characters are 
>>>> changed)
>>>> 
>>>> According to IBM943.map, OpenJDK JDK refers 03AF34B0.TPMAP120.
>>>> 03AF34B0.TPMAP120 just has B2C conversion table only, no C2B 
>>>> definition.
>>>> 34B003AF.RPMAP130 which has C2B definition was released on same 
>>>> date.
>>>> I assume C2B definition was not implemented at that time.
>>>> 
>>>> C2B definition for 34B003AF.RPMAP130 and 34B003AF.RPMAP14A are same, 
>>>> only replacement character is changed.
>>>> 34B003AF.RPMAP15A is the latest, but it's almost same as MS932.
>>>> If 34B003AF.RPMAP15A is applied, 03AF34B0.TPMAP14A is also required.
>>>> I'd like to add C2B definition without B2C definition because of 
>>>> compatibility.
>>>> I don't want to apply 03AF34B0.TPMAP14A B2C definition.
>>>> 
>>>> So I'd like to apply 34B003AF.RPMAP130 definition.
>>>> 
>>>> Thanks,
>>>> Ichiroh Takiguchi
>>>> 
>>>> On 2020-02-19 07:33, naoto.sato at oracle.com wrote:
>>>>> Hi Takiguchi-san,
>>>>> 
>>>>> Can you please elaborate the rationale for the change? It looks 
>>>>> like
>>>>> IBM943 chaset hasn't changed for a long time, at least from JDK8. 
>>>>> Has
>>>>> the mapping definition recently changed?
>>>>> 
>>>>> Naoto
>>>>> 
>>>>> On 2/17/20 10:23 AM, Ichiroh Takiguchi wrote:
>>>>>> Hello.
>>>>>> 
>>>>>> Could you review the fix ?
>>>>>> 
>>>>>> Bug:    https://bugs.openjdk.java.net/browse/JDK-8235834
>>>>>> Change: https://cr.openjdk.java.net/~itakiguchi/8235834/webrev.00/
>>>>>> 
>>>>>> IBM-943 is for IBM Japanese PC encoding.
>>>>>> MS932 is for Microsoft Japanese Windows encoding.
>>>>>> IBM-943 charset encoder does not contain 5 compatible entries 
>>>>>> compared to MS932 charset.
>>>>>> 
>>>>>> Thanks,
>>>>>> Ichiroh Takiguchi
>>>>>> IBM Japan, Ltd.


More information about the core-libs-dev mailing list