<i18n dev> RFR: 8232161 Unexpected 1-way trip conversion entries on MS950 charset
Ichiroh Takiguchi
takiguc at linux.vnet.ibm.com
Fri Nov 1 13:56:34 UTC 2019
Hello Sato-san.
I could not find any specific reason for this issue.
I said "1-way trip conversion", but I should have said "round trip conversion".
bestfit950.txt has many 1-way trip entries for Unicode to MS950
conversion.
At least, I think round trip conversion needs to be consistent between
Windows and Java.
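
As an illustration of the round trip check being discussed, here is a minimal sketch. It is not taken from the webrev; the class name Ms950RoundTrip and the helper roundTrips are hypothetical names chosen for this example. It encodes each of the four BOX DRAWINGS code points from JDK-8232161 with the runtime's MS950 charset, decodes the bytes again, and reports whether the original character comes back. The result depends on the JDK version in use, so no expected output is shown.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;

final class Ms950RoundTrip {

    // Returns true only if encoding the code point and decoding the
    // resulting bytes yields the original character again.
    // Unmappable or malformed input is reported and counts as "no round trip".
    static boolean roundTrips(Charset cs, int codePoint) {
        String original = new String(Character.toChars(codePoint));
        try {
            ByteBuffer bytes = cs.newEncoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .encode(CharBuffer.wrap(original));
            String decoded = cs.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(bytes)
                    .toString();
            return original.equals(decoded);
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // "MS950" is the charset name used in this thread; it may be
        // missing from runtimes built without the extended charsets.
        String name = "MS950";
        if (!Charset.isSupported(name)) {
            System.out.println(name + " is not available in this runtime");
            return;
        }
        Charset ms950 = Charset.forName(name);
        // The four code points listed in the thread.
        int[] codePoints = {0x2550, 0x255E, 0x2561, 0x256A};
        for (int cp : codePoints) {
            System.out.printf("U+%04X round trip: %b%n", cp, roundTrips(ms950, cp));
        }
    }
}
```

Running the same check against the Windows conversion tables would show whether the two platforms agree on these entries.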
Thanks,
Ichiroh Takiguchi
On 2019-10-31 01:25, naoto.sato at oracle.com wrote:
> Takiguchi-san,
>
> Personally I am reluctant to make this change. If we were to introduce
> this, it would be a different encoding from the existing MS950, so
> either 1) we need a new encoding, or 2) make some switch between the
> encodings, possibly a system property. But neither seems worth doing,
> as :-
>
> 1) JDK's conversion is not a bug per se.
> 2) It seems that Unicode.org's "best fit" was introduced around 2006
> (judging from the date on the unicode.org file,
> https://unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WindowsBestFit/bestfit950.txt),
> so JDK's mapping predates it.
> 3) Those code points are not common ones (BOX DRAWINGS), and no
> customer has complained about it.
>
> Please let me know if there is some rationale for fixing it.
>
> BTW, as to the CSR, I don't see it was created.
>
> Naoto
>
>
> On 10/29/19 11:35 AM, Ichiroh Takiguchi wrote:
>> Thanks, Sato-san.
>>
>> There is no special meaning to the word "until now".
>> I rewrote charset related testcases, then I found this issue.
>>
>> I read "Frequently Asked Questions about the CSR" [1],
>> I tried the "Create CSR" operation, but I could not determine whether it
>> worked or not...
>> (Select "Create CSR" from the "More" menu)
>> Did it work?
>>
>> [1] https://wiki.openjdk.java.net/display/csr/CSR+FAQs
>>
>> Thanks,
>> Ichiroh Takiguchi
>>
>> On 2019-10-29 03:03, naoto.sato at oracle.com wrote:
>>> Hi Takiguchi-san,
>>>
>>> On 10/28/19 9:51 AM, Ichiroh Takiguchi wrote:
>>>> Hello.
>>>>
>>>> I have no idea about compatibility impact.
>>>>
>>>> But according to
>>>> ftp://ftp.unicode.org/Public/12.1.0/ucd/UnicodeData.txt
>>>> These are BOX DRAWINGS characters.
>>>>
>>>> 2550;BOX DRAWINGS DOUBLE HORIZONTAL;So;0;ON;;;;;N;FORMS DOUBLE HORIZONTAL;;;;
>>>> 255E;BOX DRAWINGS VERTICAL SINGLE AND RIGHT DOUBLE;So;0;ON;;;;;N;FORMS VERTICAL SINGLE AND RIGHT DOUBLE;;;;
>>>> 2561;BOX DRAWINGS VERTICAL SINGLE AND LEFT DOUBLE;So;0;ON;;;;;N;FORMS VERTICAL SINGLE AND LEFT DOUBLE;;;;
>>>> 256A;BOX DRAWINGS VERTICAL SINGLE AND HORIZONTAL DOUBLE;So;0;ON;;;;;N;FORMS VERTICAL SINGLE AND HORIZONTAL DOUBLE;;;;
>>>>
>>>> I don't think it was used as valuable data until now.
>>>> I think it's necessary to evaluate compatibility.
>>>
>>> What do you mean by "until now"? Are there customers claiming that it
>>> should be corrected? Since the current JDK's mapping is not incorrect
>>> per se (not just "best match"), I would like to know why this needs
>>> to be fixed now.
>>>
>>> Naoto
>>>
>>>>
>>>> To Sato-san,
>>>> if you have any questions or suggestions, please let me know.
>>>>
>>>> To other reviewers,
>>>> please let me know if you have any questions or concerns.
>>>>
>>>> Thanks,
>>>> Ichiroh Takiguchi
>>>>
>>>> On 2019-10-19 16:36, Alan Bateman wrote:
>>>>> On 14/10/2019 16:53, Ichiroh Takiguchi wrote:
>>>>>> Hello.
>>>>>>
>>>>>> Could you review the fix?
>>>>>>
>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8232161
>>>>>> Change: https://cr.openjdk.java.net/~itakiguchi/8232161/webrev.00/
>>>>>>
>>>>>> I have a concern about 1-way trip conversion entries (4 entries)
>>>>>> on MS950 charset.
>>>>>> The detailed information is in JDK-8232161 [1]
>>>>>>
>>>>> Do you have any sense of the compatibility impact of changing this?
>>>>> I think Naoto has the same question and we aren't sure if this one
>>>>> will need a compatibility property. I think it will need a CSR.
>>>>>
>>>>> -Alan