RFR(XS): 8202329 [AIX] Fix codepage mappings for IBM-943 and Big5
Thomas Stüfe
thomas.stuefe at gmail.com
Mon May 21 07:53:07 UTC 2018
On Mon, May 21, 2018 at 9:52 AM, Thomas Stüfe <thomas.stuefe at gmail.com> wrote:
> Hi Bhaktavatsal,
>
> sorry, I have not had time to look at it yet. Too much unforeseen work.
>
> If there is another one wanting to review
(preferably someone from Oracles core libs group? Someone?)
, please do. If noone steps
> up, I'll try to do it next week, but unfortunately I cannot promise
> anything.
>
> ..Thomas
>
> On Mon, May 21, 2018 at 9:47 AM, Bhaktavatsal R Maram
> <bhamaram at in.ibm.com> wrote:
>> Hi Thomas,
>>
>> Is this fix ready to merge?
>>
>> Thanks,
>> Bhaktavatsal
>>
>>
>>
>>
>> -----"Thomas Stüfe" <thomas.stuefe at gmail.com> wrote: -----
>> To: Ichiroh Takiguchi <takiguc at linux.vnet.ibm.com>, Bhaktavatsal R Maram <bhamaram at in.ibm.com>
>> From: "Thomas Stüfe" <thomas.stuefe at gmail.com>
>> Date: 05/11/2018 11:49AM
>> Cc: Volker Simonis <volker.simonis at gmail.com>, Java Core Libs <core-libs-dev at openjdk.java.net>
>> Subject: Re: RFR(XS): 8202329 [AIX] Fix codepage mappings for IBM-943 and Big5
>>
>> Hi,
>>
>> I'll test and review next week. We also have some in-house tests which
>> I'd like to run.
>>
>> You IBM folks should really apply for authorship so that this
>> contribution process gets streamlined. After all, if something breaks
>> in this code, you want to be able to fix it, yes? So even if you do
>> not contribute much else, more patches may be forthcoming.
>>
>> Of course I hope these are not your last contributions :)
>>
>> Best, Thomas
>>
>>
>>
>> On Fri, May 11, 2018 at 7:57 AM, Ichiroh Takiguchi
>> <takiguc at linux.vnet.ibm.com> wrote:
>>> Hi.
>>>
>>> I tested this fix on AIX.
>>>
>>> I got following results.
>>> $ LANG=Ja_JP ~/jdk/bin/java PrintDefaultCharset
>>> Ja_JP x-IBM943C IBM-943C IBM-943C
>>> $ LANG=Ja_JP.IBM-943 ~/jdk/bin/java PrintDefaultCharset
>>> Ja_JP.IBM-943 x-IBM943C IBM-943C IBM-943C
>>> $ LANG=Zh_TW ~/jdk/bin/java PrintDefaultCharset
>>> Zh_TW x-IBM950 IBM-950 IBM-950
>>> $ LANG=Zh_TW.big5 ~/jdk/bin/java PrintDefaultCharset
>>> Zh_TW.big5 x-IBM950 IBM-950 IBM-950
>>>
>>> Also I reviewed source code, it's fine
>>>
>>> Since this testing requires locale installation for Ja_JP and Zh_TW,
>>> so it's not easy to test it...
>>> (At least, I think bos.loc.pc.Ja_JP and bos.loc.iso.Zh_TW filesets are
>>> required)
>>>
>>>
>>> On 2018-05-02 18:32, Volker Simonis wrote:
>>>>
>>>> Hi Bhaktavatsal Reddy,
>>>>
>>>> your change looks good. I can sponsor it.
>>>>
>>>> Just waiting for a second review...
>>>>
>>>> Thank you and best regards,
>>>> Volker
>>>>
>>>>
>>>> On Mon, Apr 30, 2018 at 11:29 AM, Bhaktavatsal R Maram
>>>> <bhamaram at in.ibm.com> wrote:
>>>>>
>>>>> Hi All,
>>>>>
>>>>> Please review the fix.
>>>>>
>>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8202329
>>>>> webrev: http://cr.openjdk.java.net/~aleonard/8202329/webrev.00/
>>>>>
>>>>> Thanks,
>>>>> Bhaktavatsal Reddy
>>>>>
>>>>> -----"core-libs-dev" <core-libs-dev-bounces at openjdk.java.net> wrote:
>>>>> -----
>>>>> To: Volker Simonis <volker.simonis at gmail.com>
>>>>> From: "Bhaktavatsal R Maram"
>>>>> Sent by: "core-libs-dev"
>>>>> Date: 04/26/2018 09:31PM
>>>>> Cc: Java Core Libs <core-libs-dev at openjdk.java.net>
>>>>> Subject: Re: [AIX] Fix codepage mappings in Java for IBM-943 and Big5
>>>>>
>>>>> Hi Volker,
>>>>>
>>>>> Thank you. I will address your review comments and send webrev for
>>>>> review.
>>>>>
>>>>> - Bhaktavatsal Reddy
>>>>>
>>>>>
>>>>>
>>>>> -----Volker Simonis <volker.simonis at gmail.com> wrote: -----
>>>>> To: Bhaktavatsal R Maram <bhamaram at in.ibm.com>
>>>>> From: Volker Simonis <volker.simonis at gmail.com>
>>>>> Date: 04/26/2018 09:12PM
>>>>> Cc: Java Core Libs <core-libs-dev at openjdk.java.net>
>>>>> Subject: Re: [AIX] Fix codepage mappings in Java for IBM-943 and Big5
>>>>>
>>>>> Hi Bhaktavatsal Reddy,
>>>>>
>>>>> I've opened the following issue for this problem:
>>>>>
>>>>> 8202329: [AIX] Fix codepage mappings for IBM-943 and Big5
>>>>>
>>>>> https://bugs.openjdk.java.net/browse/JDK-8202329
>>>>>
>>>>> Looking at you fix, can you please replace the "#elif AIX" by "#ifdef
>>>>> AIX" and the original "#else" by "#ifdef __solaris__". The original
>>>>> else branch contains Solaris-only code anyway and it is an historical
>>>>> omission that there are still a lot of places in the code where "not
>>>>> Linux" implicitly means "Solaris", but that's often wrong.
>>>>>
>>>>> Regards,
>>>>> Volker
>>>>>
>>>>>
>>>>> On Thu, Apr 26, 2018 at 4:02 PM, Bhaktavatsal R Maram
>>>>> <bhamaram at in.ibm.com> wrote:
>>>>>>
>>>>>> Oops! Looks like there is problem with attachment (might be because I
>>>>>> attached .class file as well). I'm pasting the fix and test program here in
>>>>>> mail.
>>>>>>
>>>>>> Test Program:
>>>>>>
>>>>>> import java.nio.charset.*;
>>>>>> class PrintDefaultCharset {
>>>>>> public static void main(String[] args) {
>>>>>> System.out.println("LANG = "+System.getenv("LANG"));
>>>>>> System.out.println("Default charset =
>>>>>> "+Charset.defaultCharset().name());
>>>>>> System.out.println("file.encoding =
>>>>>> "+System.getProperty("file.encoding"));
>>>>>> System.out.println("sun.jnu.encoding =
>>>>>> "+System.getProperty("sun.jnu.encoding"));
>>>>>> }
>>>>>> }
>>>>>>
>>>>>>
>>>>>> Fix:
>>>>>>
>>>>>> diff --git a/src/java.base/unix/native/libjava/java_props_md.c
>>>>>> b/src/java.base/unix/native/libjava/java_props_md.c
>>>>>> --- a/src/java.base/unix/native/libjava/java_props_md.c
>>>>>> +++ b/src/java.base/unix/native/libjava/java_props_md.c
>>>>>> @@ -1,5 +1,5 @@
>>>>>> /*
>>>>>> - * Copyright (c) 1998, 2016, Oracle and/or its affiliates. All rights
>>>>>> reserved.
>>>>>> + * Copyright (c) 1998, 2018, Oracle and/or its affiliates. All rights
>>>>>> reserved.
>>>>>> * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
>>>>>> *
>>>>>> * This code is free software; you can redistribute it and/or modify it
>>>>>> @@ -297,6 +297,18 @@
>>>>>> if (strcmp(p, "EUC-JP") == 0) {
>>>>>> *std_encoding = "EUC-JP-LINUX";
>>>>>> }
>>>>>> +#elif defined _AIX
>>>>>> + if (strcmp(p, "big5") == 0) {
>>>>>> + /* On AIX Traditional Chinese Big5 codeset is mapped to
>>>>>> IBM-950 */
>>>>>> + *std_encoding = "IBM-950";
>>>>>> + } else if (strcmp(p, "IBM-943") == 0) {
>>>>>> + /*
>>>>>> + * On AIX, IBM-943 is mapped to IBM-943C in which symbol
>>>>>> 'yen' and
>>>>>> + * 'overline' are replaced with 'backslash' and 'tilde'
>>>>>> from ASCII
>>>>>> + * making first 96 code points same as ASCII.
>>>>>> + */
>>>>>> + *std_encoding = "IBM-943C";
>>>>>> + }
>>>>>> #else
>>>>>> if (strcmp(p,"eucJP") == 0) {
>>>>>> /* For Solaris use customized vendor defined character
>>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>> Bhaktavatsal Reddy
>>>>>>
>>>>>>
>>>>>> -----"core-libs-dev" <core-libs-dev-bounces at openjdk.java.net> wrote:
>>>>>> -----
>>>>>> To: "Java Core Libs" <core-libs-dev at openjdk.java.net>
>>>>>> From: "Bhaktavatsal R Maram"
>>>>>> Sent by: "core-libs-dev"
>>>>>> Date: 04/26/2018 07:26PM
>>>>>> Subject: [AIX] Fix codepage mappings in Java for IBM-943 and Big5
>>>>>>
>>>>>> Hi All,
>>>>>>
>>>>>> This issue is continuation to bug 8201540 (Extend the set of supported
>>>>>> charsets in java.base on AIX) in which we have moved default charsets of
>>>>>> most of the locales supported by Operating System to java.base module thus
>>>>>> enabling OpenJDK on those locales for AIX platform.
>>>>>>
>>>>>> As part of that, charsets for locales Ja_JP (IBM-943) and Zh_TW (big5)
>>>>>> also have been moved. However, corresponding charsets mapped in Java is not
>>>>>> correct for them on AIX. Following are the details:
>>>>>>
>>>>>> 1. IBM-943 [1] for locale Ja_JP should be mapped to IBM-943C [2]
>>>>>>
>>>>>> Fundamental difference between IBM-943 and IBM-943C is that IBM-943C is
>>>>>> ASCII compatible which means code points 'yen' and 'overline' of IBM-943 is
>>>>>> replaced with 'backslash' and 'tilde' from ASCII character set.
>>>>>>
>>>>>>
>>>>>> 2. Big5 for locale Zh_TW should be mapped to IBM-950 [3]
>>>>>>
>>>>>> I've attached simple test program to print the default charset along
>>>>>> with fix for this issue. When run test program (PrintDefaultCharset) with
>>>>>> IBM JDK 8 (on AIX) for locales Ja_JP & Zh_TW, following is output.
>>>>>>
>>>>>> -bash-4.4$ LANG=Ja_JP ~/JDKs/IBM/80/ON/sdk/jre/bin/java
>>>>>> PrintDefaultCharset
>>>>>> LANG = Ja_JP
>>>>>> Default charset = x-IBM943C
>>>>>> file.encoding = IBM-943C
>>>>>> sun.jnu.encoding = IBM-943C
>>>>>>
>>>>>> -bash-4.4$ LANG=Zh_TW ~/JDKs/IBM/80/ON/sdk/jre/bin/java
>>>>>> PrintDefaultCharset
>>>>>> LANG = Zh_TW
>>>>>> Default charset = x-IBM950
>>>>>> file.encoding = IBM-950
>>>>>> sun.jnu.encoding = IBM-950
>>>>>>
>>>>>>
>>>>>> Same test run with openJDK 11 gives following output
>>>>>>
>>>>>> -bash-4.4$ LANG=Ja_JP ~/jdk/bin/java PrintDefaultCharset
>>>>>> LANG = Ja_JP
>>>>>> Default charset = x-IBM943
>>>>>> file.encoding = IBM-943
>>>>>> sun.jnu.encoding = IBM-943
>>>>>>
>>>>>> -bash-4.4$ LANG=Zh_TW ~/jdk/bin/java PrintDefaultCharset
>>>>>> LANG = Zh_TW
>>>>>> Default charset = Big5
>>>>>> file.encoding = big5
>>>>>> sun.jnu.encoding = big5
>>>>>>
>>>>>> I will get webrev hosted in
>>>>>> http://cr.openjdk.java.net
>>>>>> for this change and send it for review once JIRA bug is created.
>>>>>>
>>>>>> [1]
>>>>>> http://demo.icu-project.org/icu-bin/convexp?conv=ibm-943_P130-1999&s=JAVA
>>>>>> [2]
>>>>>> http://demo.icu-project.org/icu-bin/convexp?conv=ibm-943_P15A-2003&s=ALL
>>>>>> [3]
>>>>>> https://www.ibm.com/support/knowledgecenter/en/ssw_aix_71/com.ibm.aix.nlsgdrf/big5.htm
>>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>> Bhaktavatsal Reddy
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>
>>
>>
More information about the core-libs-dev
mailing list