RFR(XS): 8202329 [AIX] Fix codepage mappings for IBM-943 and Big5

Ichiroh Takiguchi takiguc at linux.vnet.ibm.com
Fri May 11 05:57:34 UTC 2018


Hi.

I tested this fix on AIX.

I got the following results.
$ LANG=Ja_JP ~/jdk/bin/java PrintDefaultCharset
Ja_JP   x-IBM943C       IBM-943C        IBM-943C
$ LANG=Ja_JP.IBM-943 ~/jdk/bin/java PrintDefaultCharset
Ja_JP.IBM-943   x-IBM943C       IBM-943C        IBM-943C
$ LANG=Zh_TW ~/jdk/bin/java PrintDefaultCharset
Zh_TW   x-IBM950        IBM-950 IBM-950
$ LANG=Zh_TW.big5 ~/jdk/bin/java PrintDefaultCharset
Zh_TW.big5      x-IBM950        IBM-950 IBM-950
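
The lines above are one-line, tab-separated summaries rather than the four
labelled lines printed by the test program quoted further down. A minimal
sketch of a variant that prints that form follows. The class name
PrintDefaultCharsetLine and the helper formatLine are hypothetical, and the
column order (LANG, default charset, file.encoding, sun.jnu.encoding) is an
assumption read off the results, not something confirmed in this thread.

```java
import java.nio.charset.Charset;

// Hypothetical one-line variant of the quoted PrintDefaultCharset program.
class PrintDefaultCharsetLine {
    // Joins the four values with tabs (assumed column order, see lead-in).
    static String formatLine(String lang, String charset,
                             String fileEncoding, String jnuEncoding) {
        return String.join("\t", lang, charset, fileEncoding, jnuEncoding);
    }

    public static void main(String[] args) {
        // String.valueOf turns an unset variable or property into "null".
        System.out.println(formatLine(
                String.valueOf(System.getenv("LANG")),
                Charset.defaultCharset().name(),
                String.valueOf(System.getProperty("file.encoding")),
                String.valueOf(System.getProperty("sun.jnu.encoding"))));
    }
}
```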

I also reviewed the source code, and it looks fine.
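
For readers who do not want to read the C diff, the AIX branch of the patch
quoted below can be paraphrased in Java roughly as follows. This is only an
illustrative sketch: the class AixCodesetMapping and the method stdEncoding
are hypothetical names, and returning other codeset names unchanged is a
simplification of this sketch, not a statement about the rest of
java_props_md.c.

```java
import java.util.Map;

// Illustrative paraphrase of the AIX-only overrides in the quoted patch.
// Not JDK API; names are hypothetical.
class AixCodesetMapping {
    // Codeset name reported by the AIX locale -> encoding name used by Java.
    private static final Map<String, String> OVERRIDES = Map.of(
            "big5", "IBM-950",       // Traditional Chinese Big5 on AIX
            "IBM-943", "IBM-943C");  // ASCII-compatible variant of IBM-943

    // Exact, case-sensitive match, like the strcmp() calls in the patch.
    static String stdEncoding(String codeset) {
        return OVERRIDES.getOrDefault(codeset, codeset);
    }

    public static void main(String[] args) {
        for (String cs : new String[] {"big5", "IBM-943", "ISO8859-1"}) {
            System.out.println(cs + " -> " + stdEncoding(cs));
        }
    }
}
```

The lookup is deliberately case-sensitive, so "Big5" would not be remapped
by this sketch, just as it would not match strcmp(p, "big5") in the patch.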

Because this test requires the Ja_JP and Zh_TW locales to be installed,
it is not easy to run.
(At least, I think the bos.loc.pc.Ja_JP and bos.loc.iso.Zh_TW filesets are 
required.)
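
Without those filesets, the byte-level difference between IBM-943 and
IBM-943C described in the quoted report can still be inspected on any
platform whose JDK ships both charsets. The following is a minimal sketch,
assuming the charset names x-IBM943 and x-IBM943C as shown in the outputs in
this thread; the class CompareIbm943 and the helper decodeByte are
hypothetical. If the report's description holds, bytes 0x5C and 0x7E should
decode to yen and overline under x-IBM943, and to backslash and tilde under
x-IBM943C.

```java
import java.nio.charset.Charset;

// Hypothetical helper to compare how two charsets decode single bytes.
class CompareIbm943 {
    // Decodes one byte with the named charset and returns its code point,
    // or -1 if this runtime does not provide the charset.
    static int decodeByte(String charsetName, int b) {
        if (!Charset.isSupported(charsetName)) {
            return -1;
        }
        String s = new String(new byte[] {(byte) b}, Charset.forName(charsetName));
        return s.codePointAt(0);
    }

    public static void main(String[] args) {
        for (String name : new String[] {"x-IBM943", "x-IBM943C"}) {
            for (int b : new int[] {0x5C, 0x7E}) {
                int cp = decodeByte(name, b);
                if (cp < 0) {
                    System.out.printf("%s: not available in this runtime%n", name);
                } else {
                    System.out.printf("%s: 0x%02X -> U+%04X%n", name, b, cp);
                }
            }
        }
    }
}
```

This only checks the decoder tables. It does not exercise the locale
detection in java_props_md.c, which still needs a real AIX locale.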

On 2018-05-02 18:32, Volker Simonis wrote:
> Hi Bhaktavatsal Reddy,
> 
> your change looks good. I can sponsor it.
> 
> Just waiting for a second review...
> 
> Thank you and best regards,
> Volker
> 
> 
> On Mon, Apr 30, 2018 at 11:29 AM, Bhaktavatsal R Maram
> <bhamaram at in.ibm.com> wrote:
>> Hi All,
>> 
>> Please review the fix.
>> 
>> bug: https://bugs.openjdk.java.net/browse/JDK-8202329
>> webrev: http://cr.openjdk.java.net/~aleonard/8202329/webrev.00/
>> 
>> Thanks,
>> Bhaktavatsal Reddy
>> 
>> -----"core-libs-dev" <core-libs-dev-bounces at openjdk.java.net> wrote: 
>> -----
>> To: Volker Simonis <volker.simonis at gmail.com>
>> From: "Bhaktavatsal R Maram"
>> Sent by: "core-libs-dev"
>> Date: 04/26/2018 09:31PM
>> Cc: Java Core Libs <core-libs-dev at openjdk.java.net>
>> Subject: Re: [AIX] Fix codepage mappings in Java for IBM-943 and Big5
>> 
>> Hi Volker,
>> 
>> Thank you. I will address your review comments and send webrev for 
>> review.
>> 
>> - Bhaktavatsal Reddy
>> 
>> 
>> 
>> -----Volker Simonis <volker.simonis at gmail.com> wrote: -----
>> To: Bhaktavatsal R Maram <bhamaram at in.ibm.com>
>> From: Volker Simonis <volker.simonis at gmail.com>
>> Date: 04/26/2018 09:12PM
>> Cc: Java Core Libs <core-libs-dev at openjdk.java.net>
>> Subject: Re: [AIX] Fix codepage mappings in Java for IBM-943 and Big5
>> 
>> Hi Bhaktavatsal Reddy,
>> 
>> I've opened the following issue for this problem:
>> 
>> 8202329: [AIX] Fix codepage mappings for IBM-943 and Big5
>> https://bugs.openjdk.java.net/browse/JDK-8202329
>> 
>> Looking at your fix, can you please replace the "#elif AIX" by "#ifdef
>> AIX" and the original "#else" by "#ifdef __solaris__". The original
>> else branch contains Solaris-only code anyway, and it is a historical
>> omission that there are still a lot of places in the code where "not
>> Linux" implicitly means "Solaris", but that's often wrong.
>> 
>> Regards,
>> Volker
>> 
>> 
>> On Thu, Apr 26, 2018 at 4:02 PM, Bhaktavatsal R Maram
>> <bhamaram at in.ibm.com> wrote:
>>> Oops! It looks like there is a problem with the attachment (possibly 
>>> because I attached a .class file as well). I'm pasting the fix and 
>>> the test program here in the mail.
>>> 
>>> Test Program:
>>> 
>>> import java.nio.charset.*;
>>> class PrintDefaultCharset {
>>>      public static void main(String[] args) {
>>>         System.out.println("LANG = "+System.getenv("LANG"));
>>>         System.out.println("Default charset = "+Charset.defaultCharset().name());
>>>         System.out.println("file.encoding = "+System.getProperty("file.encoding"));
>>>         System.out.println("sun.jnu.encoding = "+System.getProperty("sun.jnu.encoding"));
>>>      }
>>> }
>>> 
>>> 
>>> Fix:
>>> 
>>> diff --git a/src/java.base/unix/native/libjava/java_props_md.c b/src/java.base/unix/native/libjava/java_props_md.c
>>> --- a/src/java.base/unix/native/libjava/java_props_md.c
>>> +++ b/src/java.base/unix/native/libjava/java_props_md.c
>>> @@ -1,5 +1,5 @@
>>>  /*
>>> - * Copyright (c) 1998, 2016, Oracle and/or its affiliates. All rights reserved.
>>> + * Copyright (c) 1998, 2018, Oracle and/or its affiliates. All rights reserved.
>>>   * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
>>>   *
>>>   * This code is free software; you can redistribute it and/or modify it
>>> @@ -297,6 +297,18 @@
>>>          if (strcmp(p, "EUC-JP") == 0) {
>>>              *std_encoding = "EUC-JP-LINUX";
>>>          }
>>> +#elif defined _AIX
>>> +        if (strcmp(p, "big5") == 0) {
>>> +            /* On AIX Traditional Chinese Big5 codeset is mapped to IBM-950 */
>>> +            *std_encoding = "IBM-950";
>>> +        } else if (strcmp(p, "IBM-943") == 0) {
>>> +            /*
>>> +             * On AIX, IBM-943 is mapped to IBM-943C in which symbol 'yen' and
>>> +             * 'overline' are replaced with 'backslash' and 'tilde' from ASCII
>>> +             * making first 96 code points same as ASCII.
>>> +             */
>>> +            *std_encoding = "IBM-943C";
>>> +        }
>>>  #else
>>>          if (strcmp(p,"eucJP") == 0) {
>>>              /* For Solaris use customized vendor defined character
>>> 
>>> 
>>> Thanks,
>>> Bhaktavatsal Reddy
>>> 
>>> 
>>> -----"core-libs-dev" <core-libs-dev-bounces at openjdk.java.net> wrote: 
>>> -----
>>> To: "Java Core Libs" <core-libs-dev at openjdk.java.net>
>>> From: "Bhaktavatsal R Maram"
>>> Sent by: "core-libs-dev"
>>> Date: 04/26/2018 07:26PM
>>> Subject: [AIX] Fix codepage mappings in Java for IBM-943 and Big5
>>> 
>>> Hi All,
>>> 
>>> This issue is a continuation of bug 8201540 (Extend the set of 
>>> supported charsets in java.base on AIX), in which we moved the 
>>> default charsets of most locales supported by the operating system 
>>> into the java.base module, thus enabling OpenJDK in those locales on 
>>> the AIX platform.
>>> 
>>> As part of that, the charsets for locales Ja_JP (IBM-943) and Zh_TW 
>>> (big5) were also moved. However, the charsets they are mapped to in 
>>> Java are not correct on AIX. The details are as follows:
>>> 
>>> 1. IBM-943 [1] for locale Ja_JP should be mapped to IBM-943C [2]
>>> 
>>> The fundamental difference between IBM-943 and IBM-943C is that 
>>> IBM-943C is ASCII compatible: the 'yen' and 'overline' code points of 
>>> IBM-943 are replaced with 'backslash' and 'tilde' from the ASCII 
>>> character set.
>>> 
>>> 
>>> 2. Big5 for locale Zh_TW should be mapped to IBM-950 [3]
>>> 
>>> I've attached a simple test program that prints the default charset, 
>>> along with the fix for this issue. When the test program 
>>> (PrintDefaultCharset) is run with IBM JDK 8 (on AIX) for locales 
>>> Ja_JP and Zh_TW, the output is as follows.
>>> 
>>> -bash-4.4$ LANG=Ja_JP ~/JDKs/IBM/80/ON/sdk/jre/bin/java PrintDefaultCharset
>>> LANG = Ja_JP
>>> Default charset = x-IBM943C
>>> file.encoding = IBM-943C
>>> sun.jnu.encoding = IBM-943C
>>> 
>>> -bash-4.4$ LANG=Zh_TW ~/JDKs/IBM/80/ON/sdk/jre/bin/java PrintDefaultCharset
>>> LANG = Zh_TW
>>> Default charset = x-IBM950
>>> file.encoding = IBM-950
>>> sun.jnu.encoding = IBM-950
>>> 
>>> 
>>> The same test run with OpenJDK 11 gives the following output:
>>> 
>>> -bash-4.4$ LANG=Ja_JP ~/jdk/bin/java PrintDefaultCharset
>>> LANG = Ja_JP
>>> Default charset = x-IBM943
>>> file.encoding = IBM-943
>>> sun.jnu.encoding = IBM-943
>>> 
>>> -bash-4.4$ LANG=Zh_TW ~/jdk/bin/java PrintDefaultCharset
>>> LANG = Zh_TW
>>> Default charset = Big5
>>> file.encoding = big5
>>> sun.jnu.encoding = big5
>>> 
>>> I will get a webrev hosted on http://cr.openjdk.java.net for this 
>>> change and send it for review once the JIRA bug is created.
>>> 
>>> [1] 
>>> http://demo.icu-project.org/icu-bin/convexp?conv=ibm-943_P130-1999&s=JAVA
>>> [2] 
>>> http://demo.icu-project.org/icu-bin/convexp?conv=ibm-943_P15A-2003&s=ALL
>>> [3] 
>>> https://www.ibm.com/support/knowledgecenter/en/ssw_aix_71/com.ibm.aix.nlsgdrf/big5.htm
>>> 
>>> 
>>> Thanks,
>>> Bhaktavatsal Reddy
>>> 
>>> 
>>> 
>>> 
>> 
>> 
>> 


