<i18n dev> RFR: 8232161 Unexpected 1-way trip conversion entries on MS950 charset

Ichiroh Takiguchi takiguc at linux.vnet.ibm.com
Fri Nov 1 13:56:34 UTC 2019


Hello Sato-san.

I could not find any specific reason for this issue.
I said "1-way trip conversion", but I should have said "round trip conversion".
bestfit950.txt has many 1-way trip entries for Unicode to MS950 
conversion.
At least, I think round trip conversion needs to be consistent between 
Windows and Java.
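
As a rough illustration of what "round trip" means here, the following is a
minimal sketch, not the test from the webrev. The class name RoundTripCheck
and its two helper methods are hypothetical names chosen for this example. It
assumes the runtime provides the MS950 charset (it may be absent if the
extended charsets are not installed) and probes the four BOX DRAWINGS code
points quoted later in this thread. The printed bytes depend on the JDK
version, so no expected output is shown.

```java
import java.nio.charset.Charset;
import java.util.Arrays;

// Hypothetical helper for illustration only; not part of the JDK or the webrev.
final class RoundTripCheck {

    // True if text -> bytes -> text yields the original text.
    static boolean charRoundTrips(Charset cs, String text) {
        byte[] encoded = text.getBytes(cs);
        return text.equals(new String(encoded, cs));
    }

    // True if bytes -> text -> bytes yields the original bytes.
    static boolean byteRoundTrips(Charset cs, byte[] bytes) {
        String decoded = new String(bytes, cs);
        return Arrays.equals(bytes, decoded.getBytes(cs));
    }

    // Formats a byte array as upper-case hex, e.g. "A2A4".
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String name = "MS950";
        if (!Charset.isSupported(name)) {
            System.out.println(name + " is not available in this runtime");
            return;
        }
        Charset ms950 = Charset.forName(name);

        // The four BOX DRAWINGS code points discussed in this thread.
        int[] codePoints = {0x2550, 0x255E, 0x2561, 0x256A};
        for (int cp : codePoints) {
            String s = new String(Character.toChars(cp));
            byte[] encoded = s.getBytes(ms950);
            System.out.printf("U+%04X -> 0x%s, char round trip: %b, byte round trip: %b%n",
                    cp, hex(encoded),
                    charRoundTrips(ms950, s),
                    byteRoundTrips(ms950, encoded));
        }
    }
}
```

Running the same probe on Windows through the native conversion APIs would be
needed to compare the two platforms; this sketch only shows the Java side.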

Thanks,
Ichiroh Takiguchi

On 2019-10-31 01:25, naoto.sato at oracle.com wrote:
> Takiguchi-san,
> 
> Personally I am reluctant to make this change. If we were to introduce
> this, it would be a different encoding from the existing MS950, so
> either 1) we need a new encoding, or 2) make some switch between the
> encodings, possibly a system property. But neither seems worth doing,
> as:
> 
> 1) JDK's conversion is not a bug per se.
> 2) Seems that Unicode.org's "best fit" was introduced around 2006
> (judging from the date on the unicode.org file,
> https://unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WindowsBestFit/bestfit950.txt),
> so JDK's mapping predates it.
> 3) Those code points are not common ones (BOX DRAWINGS), and no
> customer has complained about them.
> 
> Please let me know if there is some rationale for fixing it.
> 
> BTW, as to the CSR, I don't see that it was created.
> 
> Naoto
> 
> 
> On 10/29/19 11:35 AM, Ichiroh Takiguchi wrote:
>> Thanks, Sato-san.
>> 
>> There is no special meaning to the word "until now".
>> I rewrote charset related testcases, then I found this issue.
>> 
>> I read "Frequently Asked Questions about the CSR" [1],
>> I tried the "Create CSR" operation, but I could not determine whether it
>> worked or not...
>> (Select "Create CSR" from "More" menu)
>> Did it work?
>> 
>> [1] https://wiki.openjdk.java.net/display/csr/CSR+FAQs
>> 
>> Thanks,
>> Ichiroh Takiguchi
>> 
>> On 2019-10-29 03:03, naoto.sato at oracle.com wrote:
>>> Hi Takiguchi-san,
>>> 
>>> On 10/28/19 9:51 AM, Ichiroh Takiguchi wrote:
>>>> Hello.
>>>> 
>>>> I have no idea about compatibility impact.
>>>> 
>>>> But according to 
>>>> ftp://ftp.unicode.org/Public/12.1.0/ucd/UnicodeData.txt
>>>> These are BOX DRAWINGS characters.
>>>> 
>>>> 2550;BOX DRAWINGS DOUBLE HORIZONTAL;So;0;ON;;;;;N;FORMS DOUBLE HORIZONTAL;;;;
>>>> 255E;BOX DRAWINGS VERTICAL SINGLE AND RIGHT DOUBLE;So;0;ON;;;;;N;FORMS VERTICAL SINGLE AND RIGHT DOUBLE;;;;
>>>> 2561;BOX DRAWINGS VERTICAL SINGLE AND LEFT DOUBLE;So;0;ON;;;;;N;FORMS VERTICAL SINGLE AND LEFT DOUBLE;;;;
>>>> 256A;BOX DRAWINGS VERTICAL SINGLE AND HORIZONTAL DOUBLE;So;0;ON;;;;;N;FORMS VERTICAL SINGLE AND HORIZONTAL DOUBLE;;;;
>>>> 
>>>> I don't think it was used as valuable data until now.
>>>> I think it's necessary to evaluate compatibility.
>>> 
>>> What do you mean by "until now"? Are there customers claiming that it
>>> should be corrected? Since the current JDK's mapping is not incorrect
>>> per se (not just "best match"), I would like to know why this needs to
>>> be fixed now.
>>> 
>>> Naoto
>>> 
>>>> 
>>>> To Sato-san,
>>>> if you have any questions or suggestions, please let me know.
>>>> 
>>>> To other reviewers,
>>>> please let me know if you have any questions or concerns.
>>>> 
>>>> Thanks,
>>>> Ichiroh Takiguchi
>>>> 
>>>> On 2019-10-19 16:36, Alan Bateman wrote:
>>>>> On 14/10/2019 16:53, Ichiroh Takiguchi wrote:
>>>>>> Hello.
>>>>>> 
>>>>>> Could you review the fix ?
>>>>>> 
>>>>>> Bug:    https://bugs.openjdk.java.net/browse/JDK-8232161
>>>>>> Change: https://cr.openjdk.java.net/~itakiguchi/8232161/webrev.00/
>>>>>> 
>>>>>> I have a concern about 1-way trip conversion entries (4 entries) 
>>>>>> on MS950 charset.
>>>>>> Detailed information is in JDK-8232161 [1]
>>>>>> 
>>>>> Do you have any sense of the compatibility impact of changing this? I
>>>>> think Naoto has the same question, and we aren't sure whether this one
>>>>> will need a compatibility property. I think it will need a CSR.
>>>>> 
>>>>> -Alan

