<i18n dev> RFR: 8235834 IBM-943 charset encoder needs updating

naoto.sato at oracle.com naoto.sato at oracle.com
Fri Feb 21 16:53:59 UTC 2020


Two subtle comments to the new webrev:

- I'd add "private" to those static finals.
- "cs.name()" in the exception messages can be replaced with "csName".

Otherwise it looks good. If you agree with the above, no further review 
is needed. Also, if you need a sponsor, I can sponsor your changeset.

Naoto

On 2/21/20 5:10 AM, Ichiroh Takiguchi wrote:
> Hello Naoto.
> 
> I appreciate your suggestions.
> I applied your suggestions into new patch.
> Could you review the fix again ?
> 
> Bug:    https://bugs.openjdk.java.net/browse/JDK-8235834
> Change: https://cr.openjdk.java.net/~itakiguchi/8235834/webrev.01/
> 
> Thanks,
> Ichiroh Takiguchi
> IBM Japan, Ltd.
> 
> On 2020-02-21 02:51, naoto.sato at oracle.com wrote:
>> Thanks for the explanation. Here are comments to your fix:
>>
>> - Can you add some comments about the gist of the change to
>> IBM943.c2b? Summarizing the explanation below would be fine.
>>
>> For the test case:
>>
>> - Please wrap the @bug line appropriately.
>>
>> - Those byte array/Strings can be "static final" outside the test method.
>>
>> - Looks like the mapping is exactly the same with IBM943C for those
>> code points. Is that correct? If so, would you please add some comment
>> as such?
>>
>> - Typo: "round-tip" -> "round-trip"
>>
>> Naoto
>>
>> On 2/20/20 2:47 AM, Ichiroh Takiguchi wrote:
>>> Hello Naoto.
>>>
>>> I appreciate your comment.
>>>
>>> The definition has not changed recently.
>>> Applying the change will prevent the characters which are used on 
>>> Japanese PC from being converted to "?".
>>>
>>> I checked IBM-943 definition and changelog.
>>>
>>> Definitions and creation date are as follows:
>>> (All definitions are valid)
>>> 03AF34B0.TPMAP120: May 20 1997 (Current OpenJDK)
>>> 34B003AF.RPMAP130: May 20 1997 (Proposed change, upward compatible)
>>> 34B003AF.RPMAP14A: Jul 29 1998 (Same as 34B003AF.RPMAP130)
>>> 34B003AF.RPMAP15A: Jan  8 2003 (Additionally, 13 characters are changed)
>>>
>>> According to IBM943.map, OpenJDK JDK refers 03AF34B0.TPMAP120.
>>> 03AF34B0.TPMAP120 just has B2C conversion table only, no C2B definition.
>>> 34B003AF.RPMAP130 which has C2B definition was released on same date.
>>> I assume C2B definition was not implemented at that time.
>>>
>>> C2B definition for 34B003AF.RPMAP130 and 34B003AF.RPMAP14A are same, 
>>> only replacement character is changed.
>>> 34B003AF.RPMAP15A is the latest, but it's almost same as MS932.
>>> If 34B003AF.RPMAP15A is applied, 03AF34B0.TPMAP14A is also required.
>>> I'd like to add C2B definition without B2C definition because of 
>>> compatibility.
>>> I don't want to apply 03AF34B0.TPMAP14A B2C definition.
>>>
>>> So I'd like to apply 34B003AF.RPMAP130 definition.
>>>
>>> Thanks,
>>> Ichiroh Takiguchi
>>>
>>> On 2020-02-19 07:33, naoto.sato at oracle.com wrote:
>>>> Hi Takiguchi-san,
>>>>
>>>> Can you please elaborate the rationale for the change? It looks like
>>>> IBM943 chaset hasn't changed for a long time, at least from JDK8. Has
>>>> the mapping definition recently changed?
>>>>
>>>> Naoto
>>>>
>>>> On 2/17/20 10:23 AM, Ichiroh Takiguchi wrote:
>>>>> Hello.
>>>>>
>>>>> Could you review the fix ?
>>>>>
>>>>> Bug:    https://bugs.openjdk.java.net/browse/JDK-8235834
>>>>> Change: https://cr.openjdk.java.net/~itakiguchi/8235834/webrev.00/
>>>>>
>>>>> IBM-943 is for IBM Japanese PC encoding.
>>>>> MS932 is for Microsoft Japanese Windows encoding.
>>>>> IBM-943 charset encoder does not contain 5 compatible entries 
>>>>> compared to MS932 charset.
>>>>>
>>>>> Thanks,
>>>>> Ichiroh Takiguchi
>>>>> IBM Japan, Ltd.


More information about the i18n-dev mailing list